# Ch286 closeout: narrow EI accept; verdict shape flips to unmapped-MMIO

**Status:** Closed. **Verdict from re-running qbert.elf:**
`elf_first_unmapped_mmio (ea=0x1000E010 pc=0x001123A8)`. qbert
advanced 27,239 → **27,907 retires (+668)**. Together with the
preceding +148, these are the two largest consecutive jumps on the
qbert track.

**Verdict shape changed for the first time.** Every prior chapter
hit `elf_first_unsupported_opcode` or `elf_first_unhandled_syscall`.
Ch286 closes out the opcode-by-opcode era for qbert: the next
blocker is a device-side MMIO access, not a missing instruction.
qbert has graduated to "talking to hardware."

## What landed

A narrow exact-32-bit decode of R5900 `EI` (encoding 0x42000038), and
nothing else. Per Codex's framing principle ("decode the EXACT
32-bit instruction, do NOT NOP-class all COP0/CO"):

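```sv
// Exact 32-bit encoding of R5900 EI.
localparam logic [31:0] EI_INSTR_R5900 = 32'h4200_0038;
// ...
// Match the full instruction word, not a field subset.
assign is_ei = (instr == EI_INSTR_R5900);
// ...
// EI is excluded from the COP0 catch-all term.
|| (is_cop0 && !is_mfc0 && !is_mtc0 && !is_rfe && !is_ei)
```
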
3 RTL edits. The decode falls through every recognized arm in the
S_EXECUTE block and hits the `else begin` default execute path. None
of the writeback predicates match, so no GPR is touched. The path
still calls `retire_advance()` (PC += 4) and goes back to
S_IFETCH_REQ. This is exactly the "side-effect-free retire" Codex
specified.

The companion `DI` (encoding 0x42000039) is left trapping under strict
mode; the next ELF that needs it will add a one-line decode in its
chapter.

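The retire behaviour described here can be sketched as a small Python behavioral model. This is an illustration, not the RTL: `step`, `CoreTrap`, and the list-of-GPRs representation are hypothetical names introduced for the sketch; only the two encodings and the PC += 4 retire come from this note.

```python
EI_INSTR_R5900 = 0x42000038  # exact 32-bit EI encoding (from this note)
DI_INSTR_R5900 = 0x42000039  # companion DI, deliberately not decoded


class CoreTrap(Exception):
    """Hypothetical stand-in for the strict-mode trap."""

    def __init__(self, pc, instr):
        super().__init__(f"trap pc=0x{pc:08X} instr=0x{instr:08X}")
        self.pc = pc
        self.instr = instr


def step(pc, gprs, instr):
    """Retire one instruction and return (next_pc, gprs)."""
    if instr == EI_INSTR_R5900:
        # Side-effect-free retire: no GPR is written, PC advances by 4.
        return pc + 4, list(gprs)
    # Everything else (including DI) traps in this toy model.
    raise CoreTrap(pc, instr)
```

Calling `step` with the EI word returns the GPR list unchanged and a PC four bytes further on; calling it with the DI word raises `CoreTrap` carrying the trapping PC and instruction.
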
## TB: `tb_ee_core_ei.sv`

Verifies four correctness properties in a single run:

1. **Retire happens at all.** A latch keyed on
   `u_core.retired_pc == B_EI_slot_PC` captures `seen_ei_retire = 1`
   and snapshots `$v0`/`$t0` at that exact cycle.
2. **EI is side-effect-free.** The snapshot shows `$v0=SENTINEL_A` and
   `$t0=SENTINEL_B`, unchanged from the LUI+ORI setup. End-of-sim
   confirms they still hold those values.
3. **Decode is narrow.** DI (0x42000039) placed immediately after
   EI must trap. The TB asserts `core_trap_events == 1`,
   `trap_pc == DI slot`, `trap_instr == 0x42000039`. If the EI
   decode had been widened (e.g.
   `is_cop0 && rs==CO && funct[5:1] == 5'b11100`), DI would have
   been accepted too.
4. **Post-EI code runs.** `$t1=SENTINEL_C` at end-of-sim proves the
   LUI+ORI sequence after EI executed (i.e. EI didn't halt the core).

Result: `retired=10 halt=0 trap=1 errors=0 PASS`.

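To see why property 3 catches a widened decode, the two predicates can be compared in a short Python sketch. The function names are hypothetical; the field positions assume the standard MIPS layout (opcode in bits 31:26, rs in bits 25:21, funct in bits 5:0), and the widened form is the one quoted in the list.

```python
EI_INSTR_R5900 = 0x42000038
DI_INSTR_R5900 = 0x42000039

COP0_OPCODE = 0b010000  # instr[31:26]
RS_CO = 0b10000         # instr[25:21], the "CO" value in the widened example


def is_ei_narrow(instr):
    """Exact 32-bit match, as landed in Ch286."""
    return instr == EI_INSTR_R5900


def is_ei_widened(instr):
    """The rejected alternative: match on fields and drop funct bit 0."""
    opcode = (instr >> 26) & 0x3F
    rs = (instr >> 21) & 0x1F
    funct = instr & 0x3F
    return opcode == COP0_OPCODE and rs == RS_CO and (funct >> 1) == 0b11100


for name, word in (("EI", EI_INSTR_R5900), ("DI", DI_INSTR_R5900)):
    print(name, is_ei_narrow(word), is_ei_widened(word))
```

The widened predicate ignores funct bit 0 (and bits 20:6), so it accepts DI as well as EI, which is the failure the DI trap assertion is meant to detect.
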
## Makefile + regression

- `tb_ee_core_ei` target.
- Added to both the PHONY list and the `run:` master list.
- Regression: 172 → **173**.

## qbert progression

| Chapter | Blocker | retire_count |
|---|---|---|
| Post-Ch284 (LD) | syscall $v1=0x40 | 27,091 |
| Post-Ch285 (syscall 0x40) | EI (0x42000038) | 27,239 |
| **Post-Ch286 (EI)** | **unmapped MMIO 0x1000E010 at PC 0x001123A8** | **27,907** |

Back-to-back +148, +668. qbert appears to be past the init phase and
into mainline game code: the +668 retires after EI cover whatever
post-init setup qbert does (likely clearing buffers, building tables,
initial DMAC config) before it hits a DMAC register read at
0x1000E010.

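The deltas quoted for the progression table can be re-derived with a few lines of Python. The retire counts are copied from the table; `retire_deltas` is a hypothetical helper name.

```python
# (chapter label, retire_count) rows copied from the progression table.
retires = [
    ("Post-Ch284 (LD)", 27_091),
    ("Post-Ch285 (syscall 0x40)", 27_239),
    ("Post-Ch286 (EI)", 27_907),
]


def retire_deltas(rows):
    """Return (label, gain over the previous row) for each row after the first."""
    return [(cur[0], cur[1] - prev[1]) for prev, cur in zip(rows, rows[1:])]


deltas = retire_deltas(retires)
for label, delta in deltas:
    print(f"{label}: +{delta}")
```
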
## Ch287 framing: first DMAC MMIO touch

EA `0x1000E010` decodes to the EE DMAC control register region:

| Address | Reg | Purpose |
|--------------|--------------|---------|
| 0x1000E000 | D_CTRL | DMAC enable / cycle-stealing config |
| **0x1000E010** | **D_STAT** | **DMAC interrupt status (per-channel CIS + per-channel CIM)** |
| 0x1000E020 | D_PCR | Per-channel priority + W1C enable |
| 0x1000E030 | D_SQWC | Stall/skip cycles |
| 0x1000E040 | D_RBSR | Ring-buffer size |
| 0x1000E050 | D_RBOR | Ring-buffer base |

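The table amounts to a fixed 0x10-stride decode from a base of 0x1000E000. A minimal Python sketch of that lookup follows; `decode_dmac_ctrl` and the constant names are hypothetical, and only the six addresses listed in the table are covered.

```python
DMAC_CTRL_BASE = 0x1000E000

# Offset from the base -> register name, taken from the table.
DMAC_CTRL_REGS = {
    0x00: "D_CTRL",
    0x10: "D_STAT",
    0x20: "D_PCR",
    0x30: "D_SQWC",
    0x40: "D_RBSR",
    0x50: "D_RBOR",
}


def decode_dmac_ctrl(ea):
    """Return the register name for an effective address, or None if not listed."""
    return DMAC_CTRL_REGS.get(ea - DMAC_CTRL_BASE)


print(decode_dmac_ctrl(0x1000E010))
```
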
PC 0x001123A8 reading D_STAT during init is the standard PS2 game
pattern: "clear any pending DMAC channel-completion bits before we
start." A minimal stub:

- D_STAT reads return 0 (no pending interrupts in our model).
- D_STAT writes are W1C (write-1-clears); accept and discard.
- D_CTRL/D_PCR/D_SQWC/D_RBSR/D_RBOR: accept any write, return last
  written value on read.

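The three stub rules can be written down as a small Python behavioral model before committing to RTL. This is a sketch under assumptions: the class name `DmacCtrlStub` is hypothetical, the stored registers are assumed to reset to 0, and values are assumed to be truncated to 32 bits; none of that is specified in this note.

```python
D_STAT = 0x1000E010

# D_CTRL, D_PCR, D_SQWC, D_RBSR, D_RBOR: plain read-back registers.
STORED_REGS = (0x1000E000, 0x1000E020, 0x1000E030, 0x1000E040, 0x1000E050)


class DmacCtrlStub:
    """Hypothetical model of the minimal DMAC control-register stub."""

    def __init__(self):
        # Assumption: stored registers reset to 0.
        self._regs = {addr: 0 for addr in STORED_REGS}

    def read(self, addr):
        if addr == D_STAT:
            return 0  # no pending interrupts in this model
        if addr in self._regs:
            return self._regs[addr]  # last written value
        raise KeyError(f"unmapped read 0x{addr:08X}")

    def write(self, addr, value):
        if addr == D_STAT:
            return  # W1C with nothing pending: accept and discard
        if addr in self._regs:
            self._regs[addr] = value & 0xFFFFFFFF  # assumption: 32-bit regs
            return
        raise KeyError(f"unmapped write 0x{addr:08X}")
```

A write of any value to D_STAT leaves later D_STAT reads at 0, while a write to D_CTRL is returned unchanged by the next D_CTRL read.
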
The runner's hot_pc=0x00112350 with count=29 suggests qbert is
sitting in a polling loop waiting on D_STAT; the loop won't exit
until reads return the expected bits. So Ch287 needs at least
enough state to make the polling loop terminate.

For Codex to frame: is the right answer (a) a new
`ee_dmac_ctrl_mmio_stub.sv` parallel to `ee_dmac_ch2_*`, (b)
extending the existing DMAC channel stubs to cover the control regs,
or (c) widening `ee_memory_map_stub` to silently accept the
0x1000E000-0x1000EFFF region with read-as-zero / write-discarded
defaults until a specific behavior is needed?

I lean (c) for the first pass: Ch263 established that adding
silent accept regions is the standard way to advance past a
"first-touch" MMIO blocker without committing to full device
modeling. The expectation: when a read returns 0, the polling loop
*should* exit because "no pending interrupt" is the natural quiet
state.

But Codex may have a stronger view; the DMAC is heavily used by
qbert downstream (every GIF transfer on channel 2 goes through it),
so a proper stub may be warranted now rather than incrementally.

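Option (c) and the expectation that the polling loop exits on a zero read can be checked against a toy Python model. Everything here is an assumption made for illustration: the function names are hypothetical, and the loop condition ("spin while any masked D_STAT bit is set") is a guess at what the code near the hot PC does, not something read out of the ELF.

```python
REGION_LO = 0x1000E000
REGION_HI = 0x1000EFFF


def in_silent_region(ea):
    return REGION_LO <= ea <= REGION_HI


def mmio_read(ea):
    """Option (c): read-as-zero inside the region, unmapped outside it."""
    if in_silent_region(ea):
        return 0
    raise LookupError(f"unmapped MMIO read ea=0x{ea:08X}")


def mmio_write(ea, value):
    """Option (c): writes inside the region are accepted and discarded."""
    if in_silent_region(ea):
        return
    raise LookupError(f"unmapped MMIO write ea=0x{ea:08X}")


def poll_until_quiet(ea, mask, max_iters=100):
    """Hypothetical guest loop: spin while any masked bit reads as pending.

    Returns the number of reads taken, or None if the loop never exits.
    """
    for reads in range(1, max_iters + 1):
        if (mmio_read(ea) & mask) == 0:
            return reads
    return None
```

Under this assumed loop shape, read-as-zero terminates the poll on the first read. If the real loop instead waits for a bit to become set, the same model returns `None`, which is the case where option (c) would not be enough.
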
## Files changed

- `rtl/ee/ee_core_stub.sv`: 3 edits (localparam, decode, nop-class
  exclusion).
- `sim/tb/integration/tb_ee_core_ei.sv`: new focused TB.
- `sim/Makefile`: target + both regression lists.

## Pattern review (16 chapters)

| Ch | Blocker | Edits | Pattern |
|-----|--------------|-------|---------|
| 271 | SQ | 5 | NEW 4-beat write |
| 272 | DADDU | 4 | NEW ALU-low-32 |
| 273 | SYSCALL HLE | 2 | NEW gated dispatcher |
| 274 | BEQL | 6 | NEW branch+squash |
| 275 | SD | 7 | REUSE SQ counter |
| 276 | DSLL | 4 | REUSE DADDU |
| 277 | BNEL | 6 | REUSE BEQL squash |
| 278 | PCPYLD | 4 | NEW MMI narrow-decode |
| 279 | LQ | 5 | REUSE LW path |
| 280 | PSUBB | 5 | REUSE MMI narrow |
| 281 | PNOR | 5 | REUSE MMI narrow + NOR arm |
| 282 | PAND | 5 | REUSE MMI narrow + AND arm |
| 283 | PCPYUD + gpr128 | architectural | NEW 128-bit shadow |
| 284 | LD | 5 | REUSE Ch283 multi-beat path |
| 285 | syscall 0x40 | ~1 | REUSE Ch273 dispatcher |
| **286** | **EI** | **3** | **NEW narrow exact-32 decode** |

The Ch271..Ch286 stretch took qbert from 12 retires (entry) to
27,907, a 2,326× advance through 16 chapters. With Ch286 the
opcode era closes; Ch287 opens the MMIO era.

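The chapter count and the advance factor follow directly from the figures in this section; a short Python check, with hypothetical variable names:

```python
ENTRY_RETIRES = 12        # retire count at Ch271 entry (from this note)
CURRENT_RETIRES = 27_907  # retire count after Ch286 (from this note)

chapters = 286 - 271 + 1  # Ch271..Ch286 inclusive
advance = CURRENT_RETIRES / ENTRY_RETIRES

print(f"{chapters} chapters, {advance:.1f}x advance")
```

The ratio is about 2,325.6, which rounds to the 2,326× quoted.
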
## Regression

In flight; expected **173/173**.