retroDE_ps2/docs/ch286_closeout.md
thejayman77 ec82764bef Initial commit: retroDE_ps2 — first-of-its-kind PS2 GS FPGA core (DE25-Nano / Agilex 5)
RTL (GS rasterizer, EE core stub, platform bridge, LPDDR4B path), sim regression
(272 TBs), docs, and tooling. Copyrighted PS2 content (BIOS, game code, GS dumps,
and all dump-derived textures/traces) is excluded via .gitignore and stays local.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-29 20:10:50 -04:00

# Ch286 closeout — narrow EI accept; verdict shape flips to unmapped-MMIO
**Status:** Closed. **Verdict from re-running qbert.elf:**
`elf_first_unmapped_mmio (ea=0x1000E010 pc=0x001123A8)`. qbert
advanced 27,239 → **27,907 retires (+668)**. Back-to-back gains of
+148 then +668 are the largest two consecutive jumps on the qbert track.

**Verdict shape changed for the first time.** Every prior chapter
hit `elf_first_unsupported_opcode` or `elf_first_unhandled_syscall`.
Ch286 closes out the opcode-by-opcode era for qbert: the next
blocker is a device-side MMIO access, not a missing instruction.
qbert has graduated to "talking to hardware."
## What landed
A narrow, exact 32-bit decode of R5900 `EI` (encoding 0x42000038),
and nothing else. Per Codex's framing principle ("decode the EXACT
32-bit instruction, do NOT NOP-class all COP0/CO"):
```sv
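// Exact 32-bit encoding of R5900 EI (COP0, CO=1, funct=0x38).
localparam logic [31:0] EI_INSTR_R5900 = 32'h4200_0038;
...
// Match the whole instruction word, not a subset of its fields.
assign is_ei = (instr == EI_INSTR_R5900);
...
// Third edit: exclude EI from this COP0 term (the nop-class exclusion).
|| (is_cop0 && !is_mfc0 && !is_mtc0 && !is_rfe && !is_ei)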
```
Three RTL edits in total. `EI` matches none of the recognized arms in
the S_EXECUTE block, so it lands in the `else begin` default execute
path. None of the writeback predicates match, so no GPR is touched.
The path still calls `retire_advance()` (PC += 4) and returns to
S_IFETCH_REQ: exactly the "side-effect-free retire" Codex specified.

The companion `DI` (encoding 0x42000039) is left trapping under
strict mode; the next ELF that needs it will add a one-line decode in
its chapter.
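
To make the "narrow" property concrete, here is a minimal self-checking sketch that contrasts the exact-word match with a widened field-based match. The module and function names are hypothetical and do not come from `ee_core_stub.sv`; only the two encodings and the `EI_INSTR_R5900` constant are taken from the text.

```sv
// Hypothetical sketch: module and function names are illustrative only.
module ei_decode_sketch;
  localparam logic [31:0] EI_INSTR_R5900 = 32'h4200_0038;
  localparam logic [31:0] DI_INSTR_R5900 = 32'h4200_0039; // name assumed

  // Narrow decode: the whole 32-bit word must match.
  function automatic logic is_ei_narrow(input logic [31:0] instr);
    return instr == EI_INSTR_R5900;
  endfunction

  // Widened decode (NOT what landed): COP0 opcode, CO bit set,
  // and only funct[5:1] compared, so funct bit 0 is a don't-care.
  function automatic logic is_ei_wide(input logic [31:0] instr);
    return (instr[31:26] == 6'b010000) && instr[25]
        && (instr[5:1] == 5'b11100);
  endfunction

  initial begin
    assert (is_ei_narrow(EI_INSTR_R5900))
      else $error("narrow decode must accept EI");
    assert (!is_ei_narrow(DI_INSTR_R5900))
      else $error("narrow decode must reject DI");
    assert (is_ei_wide(DI_INSTR_R5900))
      else $error("widened decode is expected to alias DI");
    $display("ei_decode_sketch: checks done");
    $finish;
  end
endmodule
```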
## TB — `tb_ee_core_ei.sv`
Verifies all four correctness properties simultaneously:
1. **Retire happens at all** — a latch keyed on
`u_core.retired_pc == B_EI_slot_PC` captures `seen_ei_retire = 1`
and snapshots `$v0`/`$t0` at that exact cycle.
2. **EI is side-effect-free** — the snapshot shows $v0=SENTINEL_A,
$t0=SENTINEL_B unchanged from the LUI+ORI setup. End-of-sim
confirms they're still those values.
3. **Decode is narrow** — DI (0x42000039) placed immediately after
EI must trap. The TB asserts `core_trap_events == 1`,
`trap_pc == DI slot`, `trap_instr == 0x42000039`. If the EI
decode had been widened (e.g. `is_cop0 && rs==CO && funct[5:1] ==
5'b11100`), DI would have been accepted too.
4. **Post-EI code runs** — $t1=SENTINEL_C end-of-sim proves the
LUI+ORI sequence after EI executed (i.e. EI didn't halt the core).

Result: `retired=10 halt=0 trap=1 errors=0 PASS`.
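
The retire latch in property 1 can be sketched as a sticky flag plus a same-cycle register snapshot. This is not the code in `tb_ee_core_ei.sv`: the slot PC, the sentinel value, and every signal except `retired_pc` and `seen_ei_retire` are assumptions chosen for illustration, and the core is replaced by hand-driven stimulus.

```sv
// Hypothetical sketch: constants and most signal names are assumed.
module retire_latch_sketch;
  localparam logic [31:0] EI_SLOT_PC = 32'h0000_0010; // assumed PC of the EI slot
  localparam logic [31:0] SENTINEL_A = 32'hA5A5_0001; // assumed sentinel value

  logic        clk            = 1'b0;
  logic [31:0] retired_pc     = '0;         // stand-in for u_core.retired_pc
  logic [31:0] v0             = SENTINEL_A; // stand-in for the $v0 register
  logic        seen_ei_retire = 1'b0;
  logic [31:0] v0_at_ei       = '0;

  always #5 clk = ~clk;

  // Sticky latch: set once when the EI slot retires, and snapshot
  // the register value on that same clock edge.
  always @(posedge clk) begin
    if (!seen_ei_retire && retired_pc == EI_SLOT_PC) begin
      seen_ei_retire <= 1'b1;
      v0_at_ei       <= v0;
    end
  end

  initial begin
    // Drive a short sequence of fake retires on falling edges.
    @(negedge clk) retired_pc = 32'h0000_0008;
    @(negedge clk) retired_pc = 32'h0000_000C;
    @(negedge clk) retired_pc = EI_SLOT_PC;
    @(negedge clk) retired_pc = 32'h0000_0014;
    @(negedge clk);
    assert (seen_ei_retire)
      else $error("EI slot never retired");
    assert (v0_at_ei == SENTINEL_A)
      else $error("v0 changed by the time EI retired");
    $finish;
  end
endmodule
```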
## Makefile + regression
- `tb_ee_core_ei` target.
- Added to both the PHONY list and the `run:` master list.
- Regression: 172 → **173**.
## qbert progression
| Chapter | Blocker | retire_count |
|---|---|---|
| Post-Ch284 (LD) | syscall $v1=0x40 | 27,091 |
| Post-Ch285 (syscall 0x40) | EI (0x42000038) | 27,239 |
| **Post-Ch286 (EI)** | **unmapped MMIO 0x1000E010 at PC 0x001123A8** | **27,907** |

Back-to-back gains of +148 and +668. qbert appears to be past the
init phase and into mainline game code: the 668 retires after EI
cover whatever post-init setup qbert does (clearing buffers, building
tables, initial DMAC config) before hitting a DMAC register read at
0x1000E010.
## Ch287 framing — first DMAC MMIO touch
EA `0x1000E010` decodes to the EE DMAC control register region:

| Address | Reg | Purpose |
|---|---|---|
| 0x1000E000 | D_CTRL | DMAC enable / cycle-stealing config |
| **0x1000E010** | **D_STAT** | **DMAC interrupt status (per-channel CIS) and mask (per-channel CIM)** |
| 0x1000E020 | D_PCR | Priority control (per-channel CPC and CDE bits, PCE enable) |
| 0x1000E030 | D_SQWC | Interleave skip / transfer quadword counts |
| 0x1000E040 | D_RBSR | Ring-buffer size |
| 0x1000E050 | D_RBOR | Ring-buffer base |

PC 0x001123A8 reading D_STAT during init is the standard PS2 game
pattern: "clear any pending DMAC channel-completion bits before we
start." A minimal stub:
- D_STAT reads return 0 (no pending interrupts in our model).
- D_STAT writes are accepted and discarded. The CIS status bits are
  W1C (write-1-clears), and no bit is ever set in this model, so
  there is nothing to clear. On real hardware the CIM mask bits
  toggle on write-1; this stub ignores that.
- D_CTRL/D_PCR/D_SQWC/D_RBSR/D_RBOR: accept any write, return last
  written value on read.

The runner's hot_pc=0x00112350 with count=29 suggests qbert is
sitting in a polling loop, possibly waiting on D_STAT. If so, the
loop won't exit until reads return the bits it expects, so Ch287
needs at least enough state to make that loop terminate.
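
The minimal stub described above could look roughly like the following. This is a sketch under stated assumptions: the module name, the port list, and the single-cycle read/write handshake are all invented for illustration and are not the repo's actual MMIO interface. Only the register addresses and the read-as-zero / discard / last-written behavior come from the text.

```sv
// Hypothetical sketch: module name, ports, and handshake are assumptions.
module dmac_ctrl_stub_sketch (
  input  logic        clk,
  input  logic        rst_n,
  input  logic        wr_en,
  input  logic        rd_en,
  input  logic [31:0] addr,
  input  logic [31:0] wdata,
  output logic [31:0] rdata
);
  // addr[7:4] selects the register:
  // 0=D_CTRL 1=D_STAT 2=D_PCR 3=D_SQWC 4=D_RBSR 5=D_RBOR.
  localparam logic [3:0] IDX_D_STAT = 4'd1;
  localparam logic [3:0] IDX_LAST   = 4'd5;

  logic [31:0] regs [0:5]; // entry 1 (D_STAT) stays 0 after reset
  logic [3:0]  idx;
  logic        hit;        // access lands on one of the six registers
  logic        is_stat;

  assign idx     = addr[7:4];
  assign hit     = (addr[31:8] == 24'h1000E0) && (idx <= IDX_LAST);
  assign is_stat = (idx == IDX_D_STAT);

  // Writes: D_STAT is accepted and discarded; every other register
  // simply stores the last value written.
  always_ff @(posedge clk or negedge rst_n) begin
    if (!rst_n) begin
      for (int i = 0; i <= 5; i++) regs[i] <= '0;
    end else if (wr_en && hit && !is_stat) begin
      regs[idx[2:0]] <= wdata;
    end
  end

  // Reads: D_STAT (and anything outside the window) returns 0;
  // the other registers return their last written value.
  always_comb begin
    rdata = '0;
    if (rd_en && hit && !is_stat) rdata = regs[idx[2:0]];
  end
endmodule
```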
For Codex to frame: is the right answer (a) a new
`ee_dmac_ctrl_mmio_stub.sv` parallel to `ee_dmac_ch2_*`, or (b)
extend the existing DMAC channel stubs to cover the control regs,
or (c) widen `ee_memory_map_stub` to silently accept the
0x1000E000-0x1000EFFF region with read-as-zero / write-discarded
defaults until a specific behavior is needed?
I lean (c) for the first pass. Ch263 established that adding
silent accept regions is the standard way to advance past a
"first-touch" MMIO blocker without committing to full device
modeling. The expectation is that a read of 0 lets the polling loop
exit, because "no pending interrupt" is the natural quiet state. If
the loop instead waits for a status bit to be set, read-as-zero will
not be enough and a stateful stub is needed.
But Codex may have a stronger view; the DMAC is heavily used by
qbert downstream (every CH2 GIF transfer goes through it), so a
proper stub may be warranted now rather than incrementally.
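
For reference, the address predicate behind option (c) is small. The sketch below checks the 0x1000E000-0x1000EFFF window boundaries. The module and function names are hypothetical, and the real interface of `ee_memory_map_stub` is not shown in this document, so this covers only the range test, not how it would be wired in.

```sv
// Hypothetical sketch: names are illustrative only.
module dmac_ctrl_region_sketch;
  // 0x1000E000-0x1000EFFF is a 4 KiB-aligned window, so the top
  // 20 address bits identify it exactly.
  function automatic logic in_dmac_ctrl_region(input logic [31:0] ea);
    return ea[31:12] == 20'h1000E;
  endfunction

  initial begin
    assert (in_dmac_ctrl_region(32'h1000_E000))
      else $error("window base should be inside");
    assert (in_dmac_ctrl_region(32'h1000_E010))
      else $error("D_STAT should be inside");
    assert (in_dmac_ctrl_region(32'h1000_EFFF))
      else $error("window top should be inside");
    assert (!in_dmac_ctrl_region(32'h1000_DFFF))
      else $error("address below the window should be outside");
    assert (!in_dmac_ctrl_region(32'h1000_F000))
      else $error("address above the window should be outside");
    $display("dmac_ctrl_region_sketch: checks done");
    $finish;
  end
endmodule
```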
## Files changed
- `rtl/ee/ee_core_stub.sv` — 3 edits (localparam, decode, nop-class
exclusion).
- `sim/tb/integration/tb_ee_core_ei.sv` — new focused TB.
- `sim/Makefile` — target + both regression lists.
## Pattern review (16 chapters)
| Ch | Blocker | Edits | Pattern |
|-----|--------------|-------|---------|
| 271 | SQ | 5 | NEW 4-beat write |
| 272 | DADDU | 4 | NEW ALU-low-32 |
| 273 | SYSCALL HLE | 2 | NEW gated dispatcher |
| 274 | BEQL | 6 | NEW branch+squash |
| 275 | SD | 7 | REUSE SQ counter |
| 276 | DSLL | 4 | REUSE DADDU |
| 277 | BNEL | 6 | REUSE BEQL squash |
| 278 | PCPYLD | 4 | NEW MMI narrow-decode |
| 279 | LQ | 5 | REUSE LW path |
| 280 | PSUBB | 5 | REUSE MMI narrow |
| 281 | PNOR | 5 | REUSE MMI narrow + NOR arm |
| 282 | PAND | 5 | REUSE MMI narrow + AND arm |
| 283 | PCPYUD + gpr128 | architectural | NEW 128-bit shadow |
| 284 | LD | 5 | REUSE Ch283 multi-beat path |
| 285 | syscall 0x40 | ~1 | REUSE Ch273 dispatcher |
| **286** | **EI** | **3** | **NEW narrow exact-32 decode** |

The Ch271..Ch286 stretch took qbert from 12 retires (entry) to
27,907, a 2,326× advance through 16 chapters. With Ch286 the
opcode era closes; Ch287 opens the MMIO era.
## Regression
In flight; expected **173/173**.