Ch284 closeout — LD (Load Doubleword); next blocker is syscall $v1=0x40
Status: Closed. Verdict from re-running qbert.elf:
elf_first_unhandled_syscall (pc=0x00111D24 $v1=0x40 (=64)). qbert
got past the function-epilogue ld $ra, 0($sp) at PC 0x00113378
plus 23 more instructions, then hit a SYSCALL whose $v1=64 isn't
in Ch273's HLE dispatcher (which handles 0x3C / 0x3D / 0x64).
retire_count: 27,067 → 27,091 (+24).
What landed
LD as the structural read-side of SD. The same sq_beat counter that
SD reuses (terminal beat = 1) now drives LD; the same beat-addressed
map_rd_addr = ea + sq_beat*4 already in place for LQ also serves
LD. Beat 0 captures mem[ea+0] into gpr128[rt][31:0] and mirrors
to regfile[rt]; beat 1 captures mem[ea+4] into
gpr128[rt][63:32]. gpr128[rt][127:64] is left untouched (LD loads only
a doubleword; the upper 64 bits of $rt are architecturally preserved
on R5900).
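The beat sequencing described above can be sketched as a small Python behavioral model. This is only an illustration of the lane capture, not the RTL: the function name `ld_model`, the dict-backed memory, and the sample upper-half value are assumptions made for the example, while `sq_beat`, `map_rd_addr`, and the lane positions come from the description.

```python
MASK32 = 0xFFFFFFFF
MASK128 = (1 << 128) - 1

def ld_model(mem, ea, old_rt128):
    """Model a 2-beat LD. Returns (new 128-bit $rt value, regfile mirror)."""
    value = old_rt128 & MASK128
    regfile_rt = None
    for sq_beat in range(2):                # beats 0 and 1 (terminal beat = 1)
        map_rd_addr = ea + sq_beat * 4      # beat-addressed read
        word = mem[map_rd_addr] & MASK32
        lane_shift = 32 * sq_beat           # beat 0 -> [31:0], beat 1 -> [63:32]
        value &= ~(MASK32 << lane_shift) & MASK128
        value |= word << lane_shift
        if sq_beat == 0:
            regfile_rt = word               # beat 0 also mirrors to regfile[rt]
    return value, regfile_rt

# Hypothetical starting state: upper 64 bits hold a marker, lower 64 are zero.
mem = {0x80000400: 0xAABBCCDD, 0x80000404: 0x11223344}
old = 0xDEADBEEF_CAFEF00D_00000000_00000000
new, mirror = ld_model(mem, 0x80000400, old)
```

In this model the upper 64 bits of `old` pass through unchanged, which is the property the focused TB would need to hold for any pre-existing upper-half contents.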
RTL — surgical edits in ee_core_stub.sv
- `localparam OP_LD = 6'h37` alongside `OP_SD`.
- `logic ... is_ld;` decl + `is_ld = (opcode == OP_LD)` decode.
- `is_dword_access = is_sd || is_ld`: picks up the existing 8-byte alignment fault path. AdEL is emitted for misaligned LD (since `is_align_store` stays SD-only).
- `is_nop_class` exclusion adds `!is_ld`.
- `map_rd_addr` beat-stepping condition broadened from `is_lq` to `(is_lq || is_ld)`.
- Dispatch arm: when `is_ld`, set `sq_beat <= 0` then go to `S_MEM_REQ` (parallel to LQ).
- `S_MEM_WAIT` multi-beat branch generalized from "LQ only" to "LQ || LD" with a `terminal_beat` local: `is_lq ? 3 : 1`. Both ops share the same lane-capture case statement.
Seven RTL touchpoints: purely structural reuse of the Ch283 gpr128
+ Ch271 sq_beat machinery.
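As a rough illustration of the decode-side conditions listed above, the following Python sketch models the terminal-beat selection and the doubleword alignment check. The function names are hypothetical, and reporting AdES for a misaligned SD is an assumption based on the standard MIPS store address-error exception; only the AdEL case for LD is stated in this chapter.

```python
def terminal_beat(is_lq: bool) -> int:
    """Last beat index of a multi-beat load: 3 for LQ (4 words), 1 for LD (2 words)."""
    return 3 if is_lq else 1

def beat_addresses(ea: int, is_lq: bool) -> list:
    """Addresses issued on map_rd_addr, one 32-bit word per beat."""
    return [ea + sq_beat * 4 for sq_beat in range(terminal_beat(is_lq) + 1)]

def dword_alignment_fault(ea: int, is_sd: bool, is_ld: bool):
    """Return the fault name for a misaligned doubleword access, else None."""
    is_dword_access = is_sd or is_ld
    if not is_dword_access or ea % 8 == 0:
        return None
    is_align_store = is_sd   # stays SD-only, so a misaligned LD reports AdEL
    return "AdES" if is_align_store else "AdEL"  # AdES for SD is an assumption
```

For example, `beat_addresses(0x80000400, False)` yields the two word addresses an LD reads, and `dword_alignment_fault(0x80000404, False, True)` returns `"AdEL"`.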
Focused TB — tb_ee_core_ld.sv
- Case 2 (round-trip, runs first): SD $ra(=0xABCD1234), 0($v0). LD $t2, 0($v0). Verify regfile[$t2]=0xABCD1234 and gpr128[$t2][63:32]=0 (SD beat 1 wrote 0).
- Case 1 (exact qbert encoding, runs LAST so $ra holds the LD result): $sp set to 0x80000400; RAM pre-poked with (0xAABBCCDD, 0x11223344) at ea/ea+4. Encoder asserts `enc_i(OP_LD, 29, 31, 0) === 0xDFBF0000` (matches qbert's exact PC 0x00113378 instruction). LD executes; in-program BNE compares $ra to 0xAABBCCDD; post-halt peeks confirm `gpr128[$ra][31:0] = 0xAABBCCDD` and `gpr128[$ra][63:32] = 0x11223344`.
(Initial draft of the TB mis-decoded 0xDFBF0000 as ld $ra, 0($ra);
the encoder-output assertion caught the mistake immediately — the
same pattern that caught Ch278 PCPYLD's mis-decode. The correct
encoding is ld $ra, 0($sp) — function epilogue restoring $ra from
the stack frame.)
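The encoder-output check can be reproduced outside the TB with a few lines of Python. This is a sketch that assumes the TB's `enc_i` takes (opcode, rs, rt, imm) in the standard MIPS I-type field order, which is consistent with the asserted value; `dec_i` is a hypothetical helper added here for the reverse direction.

```python
OP_LD = 0x37

def enc_i(opcode: int, rs: int, rt: int, imm: int) -> int:
    """Pack a MIPS I-type instruction: opcode[31:26] rs[25:21] rt[20:16] imm[15:0]."""
    return ((opcode & 0x3F) << 26) | ((rs & 0x1F) << 21) | ((rt & 0x1F) << 16) | (imm & 0xFFFF)

def dec_i(word: int) -> dict:
    """Unpack the same four fields from a 32-bit instruction word."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,   # base register
        "rt": (word >> 16) & 0x1F,   # destination register
        "imm": word & 0xFFFF,
    }

# ld $ra, 0($sp): base = $sp (29), destination = $ra (31)
assert enc_i(OP_LD, 29, 31, 0) == 0xDFBF0000
# The mis-decoded form ld $ra, 0($ra) would encode differently.
assert enc_i(OP_LD, 31, 31, 0) == 0xDFFF0000
fields = dec_i(0xDFBF0000)
```

Decoding 0xDFBF0000 gives rs=29 and rt=31, so the base register is $sp, not $ra; comparing the encoder output against the raw word from the ELF is what exposes the difference.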
Result: retired=20 halt=1 trap=0 errors=0 PASS.
Makefile + regression
- `tb_ee_core_ld` target.
- Added to both PHONY list and `run:` master list.
- Regression: 171 → 172.
qbert progression
| Chapter | Blocker | qbert retire_count |
|---|---|---|
| Post-Ch282 (PAND) | PCPYUD at 0x00112CA0 | 27,024 |
| Post-Ch283 (PCPYUD + gpr128) | LD at 0x00113378 | 27,067 |
| Post-Ch284 (LD) | SYSCALL $v1=0x40 at 0x00111D24 | 27,091 |
qbert is now executing through function returns. The next blocker is
syscall #64 with $a0 = 0x001DFFC0 (looks like a heap-top
address — possibly a memory-management or thread-context call) and
$a1 = 0x0011C326. Ch285 framing: add the 0x40 case to Ch273's
syscall HLE dispatcher (mirror the existing EndOfHeap / InitMainThread
/ FlushCache pattern). Open question for Codex: what is syscall 64?
The standard PS2 kernel syscall table is well-documented; Codex can
identify the exact service and the right stub-return semantics.
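The shape of the Ch285 change can be sketched with a Python model of a gated dispatcher. This is a hedged illustration only: the set-based lookup and the function name `dispatch_syscall` are assumptions, the handled numbers are the three listed in this chapter, and the return semantics for 0x40 are deliberately left out because the service is still unidentified.

```python
# Syscall numbers the Ch273 dispatcher handles, per this chapter.
HANDLED_SYSCALLS = {0x3C, 0x3D, 0x64}

def dispatch_syscall(pc: int, v1: int, handled=HANDLED_SYSCALLS) -> str:
    """Return 'handled' for known syscalls, else a first-unhandled verdict string."""
    if v1 in handled:
        return "handled"
    return f"elf_first_unhandled_syscall (pc=0x{pc:08X} $v1=0x{v1:02X} (={v1}))"

# Current behavior: syscall 0x40 at PC 0x00111D24 is reported as unhandled.
verdict = dispatch_syscall(0x00111D24, 0x40)

# Hypothetical Ch285 state: the same call once 0x40 has a stub.
after_ch285 = dispatch_syscall(0x00111D24, 0x40, HANDLED_SYSCALLS | {0x40})
```

With the three-entry set, `verdict` reproduces the string reported by the qbert run; adding 0x40 to the set only models the gating, not what the stub should write back to $v0.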
Files changed
- `rtl/ee/ee_core_stub.sv`: 7 surgical edits (decode, alignment, dispatch, multi-beat S_MEM_WAIT generalization).
- `sim/tb/integration/tb_ee_core_ld.sv`: new focused TB.
- `sim/Makefile`: target + both regression lists.
Pattern review (14 chapters)
| Ch | Blocker | Edits | Pattern |
|---|---|---|---|
| 271 | SQ | 5 | NEW 4-beat write |
| 272 | DADDU | 4 | NEW ALU-low-32 |
| 273 | SYSCALL HLE | 2 | NEW gated dispatcher |
| 274 | BEQL | 6 | NEW branch+squash |
| 275 | SD | 7 | REUSE SQ counter |
| 276 | DSLL | 4 | REUSE DADDU |
| 277 | BNEL | 6 | REUSE BEQL squash |
| 278 | PCPYLD | 4 | NEW MMI narrow-decode |
| 279 | LQ | 5 | REUSE LW path |
| 280 | PSUBB | 5 | REUSE MMI narrow (byte-SIMD new) |
| 281 | PNOR | 5 | REUSE MMI narrow + NOR arm |
| 282 | PAND | 5 | REUSE MMI narrow + AND arm |
| 283 | PCPYUD + gpr128 | architectural | NEW 128-bit shadow |
| 284 | LD | 7 | REUSE Ch283 multi-beat path |
Ch283's "one-time architectural investment" is already paying off: LD landed by extending the LQ/SQ/SD multi-beat machinery, not by inventing new infrastructure. Future doubleword/multi-beat ops are expected to follow the same pattern.
Regression
172/172 PASS (was 171/171 in Ch283).