# Ch284 closeout: LD (Load Doubleword); next blocker is syscall $v1=0x40

**Status:** Closed. **Verdict from re-running qbert.elf:**
`elf_first_unhandled_syscall (pc=0x00111D24 $v1=0x40 (=64))`. qbert
got past the function-epilogue `ld $ra, 0($sp)` at PC 0x00113378
plus 23 more instructions, then hit a SYSCALL whose `$v1=64` isn't
in Ch273's HLE dispatcher (which handles 0x3C / 0x3D / 0x64).
retire_count: 27,067 → **27,091** (+24).

## What landed

LD is the structural read-side counterpart of SD. The same `sq_beat`
counter that SD reuses (terminal beat = 1) now drives LD, and the
beat-addressed `map_rd_addr = ea + sq_beat*4` already in place for LQ
also serves LD. Beat 0 captures `mem[ea+0]` into `gpr128[rt][31:0]`
and mirrors it to `regfile[rt]`; beat 1 captures `mem[ea+4]` into
`gpr128[rt][63:32]`. `gpr128[rt][127:64]` is left untouched (LD loads
only a doubleword; the upper 64 bits of $rt are architecturally
preserved on R5900).

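The two-beat capture described above can be sketched as a small behavioral model. This is not the RTL: the dictionary-backed memory, the `ld_two_beat` helper, the address `0x400`, and the sentinel value are all assumptions made for illustration. Only the beat addressing and the lane placement come from the text.

```python
MASK32 = 0xFFFFFFFF

def ld_two_beat(mem, gpr128, regfile, rt, ea):
    """Behavioral model of the two-beat LD capture (hypothetical helper)."""
    terminal_beat = 1                          # LD ends on beat 1
    for sq_beat in range(terminal_beat + 1):
        map_rd_addr = ea + sq_beat * 4         # beat-addressed read
        word = mem[map_rd_addr] & MASK32
        shift = 32 * sq_beat                   # beat 0 -> [31:0], beat 1 -> [63:32]
        gpr128[rt] = (gpr128[rt] & ~(MASK32 << shift)) | (word << shift)
        if sq_beat == 0:
            regfile[rt] = word                 # low word mirrored to the 32-bit regfile

# Memory words taken from TB Case 1; the address 0x400 is illustrative.
mem = {0x400: 0xAABBCCDD, 0x404: 0x11223344}
gpr128 = [0] * 32
regfile = [0] * 32

# Sentinel in the upper 64 bits to show that LD leaves them alone.
UPPER_SENTINEL = 0xDEADBEEFCAFEF00D
gpr128[31] = (UPPER_SENTINEL << 64) | 0xFFFFFFFFFFFFFFFF

ld_two_beat(mem, gpr128, regfile, rt=31, ea=0x400)

low_word = gpr128[31] & MASK32
high_word = (gpr128[31] >> 32) & MASK32
upper_64 = gpr128[31] >> 64
```

After the call, `low_word` and `high_word` hold the two memory words, and `upper_64` still holds the sentinel, which is the property the focused TB checks through its post-halt peeks.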
## RTL: surgical edits in `ee_core_stub.sv`

1. `localparam OP_LD = 6'h37` alongside OP_SD.
2. `logic ... is_ld;` declaration + `is_ld = (opcode == OP_LD)` decode.
3. `is_dword_access = is_sd || is_ld`, which picks up the existing
8-byte alignment fault path. AdEL is emitted for misaligned LD
(since `is_align_store` stays SD-only).
4. `is_nop_class` exclusion adds `!is_ld`.
5. `map_rd_addr` beat-stepping condition broadened from `is_lq` to
`(is_lq || is_ld)`.
6. Dispatch arm: when `is_ld`, set `sq_beat <= 0` then go to
`S_MEM_REQ` (parallel to LQ).
7. `S_MEM_WAIT` multi-beat branch generalized from "LQ only" to
"LQ || LD" with a `terminal_beat` local: `is_lq ? 3 : 1`. Both
ops share the same lane-capture case statement.

Seven RTL touchpoints, all structural reuse of the Ch283 gpr128
and Ch271 sq_beat machinery.

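As a rough illustration of edits 2, 3, and 7, the decode, alignment, and terminal-beat selection can be modeled in a few lines. The function name `decode_mem_op`, the returned dictionary, and the base-register argument are hypothetical. `OP_SD` and `OP_LQ` are taken from the MIPS/R5900 opcode map rather than from this chapter, and `AdES` is assumed as the store-side counterpart of AdEL.

```python
OP_LD = 0x37   # from the chapter: localparam OP_LD = 6'h37
OP_SD = 0x3F   # assumption: standard MIPS SD major opcode
OP_LQ = 0x1E   # assumption: R5900 LQ major opcode

def decode_mem_op(instr, base_value):
    """Model of decode + 8-byte alignment check + terminal beat (hypothetical)."""
    opcode = (instr >> 26) & 0x3F
    imm = instr & 0xFFFF
    if imm & 0x8000:                       # sign-extend the 16-bit offset
        imm -= 0x10000
    ea = (base_value + imm) & 0xFFFFFFFF

    is_ld = opcode == OP_LD
    is_sd = opcode == OP_SD
    is_lq = opcode == OP_LQ

    is_dword_access = is_sd or is_ld       # shared alignment fault path
    is_align_store = is_sd                 # stays SD-only, so LD reports AdEL

    fault = None
    if is_dword_access and (ea & 0x7):
        fault = "AdES" if is_align_store else "AdEL"

    # Only multi-beat loads get a terminal beat in this model.
    terminal_beat = (3 if is_lq else 1) if (is_lq or is_ld) else None
    return {"is_ld": is_ld, "ea": ea, "fault": fault, "terminal_beat": terminal_beat}

aligned = decode_mem_op(0xDFBF0000, 0x80000400)      # ld $ra, 0($sp)
misaligned_ld = decode_mem_op(0xDFBF0000, 0x80000404)
misaligned_sd = decode_mem_op(0xFFBF0000, 0x80000404)  # sd $ra, 0($sp)
```

The model keeps `is_align_store` separate from `is_dword_access` for the same reason the RTL does: the two ops share one alignment check but must report different exception codes.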
## Focused TB: `tb_ee_core_ld.sv`

- **Case 2 (round-trip, runs first):** SD $ra(=0xABCD1234), 0($v0).
LD $t2, 0($v0). Verify regfile[$t2]=0xABCD1234 and
gpr128[$t2][63:32]=0 (SD beat 1 wrote 0).
- **Case 1 (exact qbert encoding, runs LAST so $ra holds the LD
result):** $sp set to 0x80000400; RAM pre-poked with
`(0xAABBCCDD, 0x11223344)` at ea/ea+4. Encoder asserts
`enc_i(OP_LD, 29, 31, 0) === 0xDFBF0000` (matches qbert's exact
PC 0x00113378 instruction). LD executes; in-program BNE compares
$ra to 0xAABBCCDD; post-halt peeks confirm
`gpr128[$ra][31:0] = 0xAABBCCDD` and `gpr128[$ra][63:32] = 0x11223344`.

(The initial draft of the TB mis-decoded 0xDFBF0000 as `ld $ra, 0($ra)`;
the encoder-output assertion caught the mistake immediately, the
same pattern that caught Ch278 PCPYLD's mis-decode. The correct
decoding is `ld $ra, 0($sp)`: a function epilogue restoring $ra from
the stack frame.)

Result: `retired=20 halt=1 trap=0 errors=0 PASS`.

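The encoder assertion is easy to reproduce outside the simulator. The sketch below assumes `enc_i` takes `(opcode, rs, rt, imm)` in that order, inferred from the call shown above; the TB's actual helper may differ, and `dec_i` is a hypothetical inverse added for illustration.

```python
OP_LD = 0x37
SP, RA = 29, 31    # MIPS register numbers for $sp and $ra

def enc_i(opcode, rs, rt, imm):
    """Pack a MIPS I-type instruction: opcode[31:26] rs[25:21] rt[20:16] imm[15:0]."""
    return ((opcode & 0x3F) << 26) | ((rs & 0x1F) << 21) | ((rt & 0x1F) << 16) | (imm & 0xFFFF)

def dec_i(word):
    """Unpack an I-type word into (opcode, rs, rt, signed imm)."""
    imm = word & 0xFFFF
    if imm & 0x8000:
        imm -= 0x10000
    return ((word >> 26) & 0x3F, (word >> 21) & 0x1F, (word >> 16) & 0x1F, imm)

qbert_ld = enc_i(OP_LD, SP, RA, 0)      # ld $ra, 0($sp)
wrong_base = enc_i(OP_LD, RA, RA, 0)    # ld $ra, 0($ra), the mis-decode

print(f"ld $ra, 0($sp) -> 0x{qbert_ld:08X}")
print(f"ld $ra, 0($ra) -> 0x{wrong_base:08X}")
```

The two words differ only in the `rs` field (bits 25:21), which is why pinning the encoder output to the exact word from the ELF exposes a wrong base register immediately.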
## Makefile + regression

- `tb_ee_core_ld` target.
- Added to both PHONY list and `run:` master list.
- Regression: 171 → **172**.

## qbert progression

| Chapter | Blocker | qbert retire_count |
|---------|---------|---------------------|
| Post-Ch282 (PAND) | PCPYUD at 0x00112CA0 | 27,024 |
| Post-Ch283 (PCPYUD + gpr128) | LD at 0x00113378 | 27,067 |
| **Post-Ch284 (LD)** | **SYSCALL $v1=0x40 at 0x00111D24** | **27,091** |

qbert is now executing through function returns. The next blocker is
**syscall #64** with `$a0 = 0x001DFFC0` (looks like a heap-top
address, possibly a memory-management or thread-context call) and
`$a1 = 0x0011C326`. Ch285 framing: add the 0x40 case to Ch273's
syscall HLE dispatcher (mirror the existing EndOfHeap / InitMainThread
/ FlushCache pattern). Open question for Codex: what is syscall 64?
The standard PS2 kernel syscall table is well-documented; Codex can
identify the exact service and the right stub-return semantics.

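To make the Ch285 framing concrete, here is a minimal model of a gated dispatcher keyed on `$v1`. It is a sketch only: the `dispatch_syscall` function and the `"handled"` return value are hypothetical, and it deliberately assigns no meaning to 0x40, since the service behind that number is still the open question. The set of handled numbers and the log format are taken from this chapter.

```python
# Syscall numbers the Ch273 dispatcher is described as handling.
HANDLED_SYSCALLS = frozenset({0x3C, 0x3D, 0x64})

def dispatch_syscall(pc, v1, handled=HANDLED_SYSCALLS):
    """Return 'handled' for known numbers, else the first-unhandled report line."""
    if v1 in handled:
        return "handled"
    return f"elf_first_unhandled_syscall (pc=0x{pc:08X} $v1=0x{v1:02X} (={v1}))"

# Current state: 0x40 falls through to the unhandled report.
before = dispatch_syscall(0x00111D24, 0x40)

# After a Ch285-style change that adds a 0x40 case (stub semantics still to be decided).
after = dispatch_syscall(0x00111D24, 0x40, handled=HANDLED_SYSCALLS | {0x40})

print(before)
print(after)
```

Keeping the handled set explicit means each new chapter adds one entry, and any number outside the set still surfaces as the next blocker rather than being silently ignored.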
## Files changed

- `rtl/ee/ee_core_stub.sv`: 7 surgical edits (decode, alignment,
dispatch, multi-beat S_MEM_WAIT generalization).
- `sim/tb/integration/tb_ee_core_ld.sv`: new focused TB.
- `sim/Makefile`: target + both regression lists.

## Pattern review (14 chapters)

| Ch | Blocker | Edits | Pattern |
|-----|--------------|-------|---------|
| 271 | SQ | 5 | NEW 4-beat write |
| 272 | DADDU | 4 | NEW ALU-low-32 |
| 273 | SYSCALL HLE | 2 | NEW gated dispatcher |
| 274 | BEQL | 6 | NEW branch+squash |
| 275 | SD | 7 | REUSE SQ counter |
| 276 | DSLL | 4 | REUSE DADDU |
| 277 | BNEL | 6 | REUSE BEQL squash |
| 278 | PCPYLD | 4 | NEW MMI narrow-decode |
| 279 | LQ | 5 | REUSE LW path |
| 280 | PSUBB | 5 | REUSE MMI narrow (byte-SIMD new) |
| 281 | PNOR | 5 | REUSE MMI narrow + NOR arm |
| 282 | PAND | 5 | REUSE MMI narrow + AND arm |
| 283 | PCPYUD + gpr128 | architectural | NEW 128-bit shadow |
| **284** | **LD** | **7** | **REUSE Ch283 multi-beat path** |

Ch283's "one-time architectural investment" is already paying off:
LD landed by extending the LQ/SQ/SD multi-beat machinery, not by
inventing new infrastructure. Future doubleword/multi-beat ops should
follow the same pattern.

## Regression

**172/172 PASS** (was 171/171 in Ch283).