retroDE_ps2/docs/ch300_closeout.md
thejayman77 ec82764bef Initial commit: retroDE_ps2 — first-of-its-kind PS2 GS FPGA core (DE25-Nano / Agilex 5)
RTL (GS rasterizer, EE core stub, platform bridge, LPDDR4B path), sim regression
(272 TBs), docs, and tooling. Copyrighted PS2 content (BIOS, game code, GS dumps,
and all dump-derived textures/traces) is excluded via .gitignore and stays local.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-29 20:10:50 -04:00

Ch300 closeout — MMI3 PCPYH; another adjacent-syscall surfaces

Status: Closed. Codex's PCPYH semantics (sa=0x1B, $rs-ignored, broadcast low halfword of each $rt doubleword) implemented and tested. Verdict from re-running qbert.elf: elf_first_unhandled_syscall (pc=0x00112A84 $v1=0x17 (=23)). qbert advanced 28,655 → 28,708 retires (+53) through PCPYH and into another syscall.

What landed — rtl/ee/ee_core_stub.sv

Five surgical edits via the Ch278/281/283 MMI narrow-decode pattern:

  1. localparam MMI3_PCPYH = 5'h1B alongside MMI3_PCPYUD (0x0E).
  2. is_pcpyh decode flag.
  3. is_pcpyh added to is_rtype_alu and is_mmi_wb; !is_pcpyh added to MMI nop_class exclusion.
  4. Low-32 mirror arm: rtype_alu_wb = {rt128_val[15:0], rt128_val[15:0]}, which broadcasts h0 across the low 32 bits (the regfile mirror sees {h0,h0}).
  5. Full-128 writeback: rtype_alu128_wb = {{4{rt128_val[79:64]}}, {4{rt128_val[15:0]}}}, which broadcasts h0 across the four halfword lanes of the low 64 bits and h4 across the four halfword lanes of the high 64 bits. This matches Codex's spec exactly.

$rs is architecturally ignored — the decode uses opcode+funct+sa only, no rs check. The TB's Case 2 verifies this.
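
The two writeback arms above can be cross-checked against a small software model. The following is a minimal Python sketch, not part of the repository; the function names `pcpyh` and `low32_mirror` are hypothetical and simply mirror the two SystemVerilog concatenations quoted in edits 4 and 5.

```python
MASK16 = 0xFFFF
MASK128 = (1 << 128) - 1
REPLICATE4 = 0x0001_0001_0001_0001  # multiplying a 16-bit value by this yields {4{value}}


def pcpyh(rt128: int) -> int:
    """Model of the full-128 arm: {{4{rt[79:64]}}, {4{rt[15:0]}}}."""
    rt128 &= MASK128
    h0 = rt128 & MASK16           # rt128_val[15:0]
    h4 = (rt128 >> 64) & MASK16   # rt128_val[79:64]
    low64 = h0 * REPLICATE4       # h0 in all four low halfword lanes
    high64 = h4 * REPLICATE4      # h4 in all four high halfword lanes
    return (high64 << 64) | low64


def low32_mirror(rt128: int) -> int:
    """Model of the low-32 mirror arm: {rt[15:0], rt[15:0]}."""
    h0 = rt128 & MASK16
    return (h0 << 16) | h0


# Example input: h4 = 0xABCD, h0 = 0x1234, other halfwords zero.
rt = (0xABCD << 64) | 0x1234
print(f"full128 = {pcpyh(rt):032x}")
print(f"low32   = {low32_mirror(rt):08x}")
```

Because the model takes only the $rt value, it also captures the point that $rs plays no part in the result.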

Focused TB — tb_ee_core_pcpyh.sv

Three cases:

  1. Exact qbert encoding asserted == 0x70081EE9. Seeds gpr128[$t0] via PCPYLD($t0, $t1, $t2) where $t1 low 16 = 0xABCD (→ h4) and $t2 low 16 = 0x1234 (→ h0). Then PCPYH $v1, $t0. Verified:
    • regfile[$v1] = 0x12341234
    • gpr128[$v1][63:0] = 0x1234_1234_1234_1234
    • gpr128[$v1][127:64] = 0xABCD_ABCD_ABCD_ABCD
  2. $rs-ignored check: PCPYH $t3, $t0 with rs=$v1 (non-zero). Asserts gpr128[$t3] == gpr128[$v1] (same full 128-bit result; $rs change has no effect).
  3. Narrow decode: neighbor MMI3 sa=0x1C (unallocated) still traps under strict mode.

Result: retired=16 halt=0 trap=1 errors=0 PASS. The TB also verifies the full SUMMARY line shows the broadcast pattern in hex.
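
The encoding asserted in Case 1 and the narrow decode in Case 3 can be reproduced with a short field-level model. This is an illustrative Python sketch rather than the TB's code: the helper names are hypothetical, the MMI major opcode (0x1C) and MMI3 funct (0x29) are the standard R5900 values and are consistent with 0x70081EE9, and the register numbers follow the usual MIPS convention ($v1=3, $t0=8, $t3=11).

```python
from typing import NamedTuple

OPCODE_MMI = 0x1C  # bits 31:26
FUNCT_MMI3 = 0x29  # bits 5:0
SA_PCPYH = 0x1B    # bits 10:6

V1, T0, T3 = 3, 8, 11  # conventional MIPS register numbers


class Fields(NamedTuple):
    opcode: int
    rs: int
    rt: int
    rd: int
    sa: int
    funct: int


def decode(word: int) -> Fields:
    """Split a 32-bit instruction word into R-type style fields."""
    return Fields(
        opcode=(word >> 26) & 0x3F,
        rs=(word >> 21) & 0x1F,
        rt=(word >> 16) & 0x1F,
        rd=(word >> 11) & 0x1F,
        sa=(word >> 6) & 0x1F,
        funct=word & 0x3F,
    )


def encode_mmi3(rd: int, rt: int, sa: int, rs: int = 0) -> int:
    """Assemble an MMI3 instruction word from its fields."""
    return (
        (OPCODE_MMI << 26) | (rs << 21) | (rt << 16)
        | (rd << 11) | (sa << 6) | FUNCT_MMI3
    )


def is_pcpyh(word: int) -> bool:
    """Narrow decode: opcode + funct + sa only; rs is deliberately not checked."""
    f = decode(word)
    return f.opcode == OPCODE_MMI and f.funct == FUNCT_MMI3 and f.sa == SA_PCPYH


case1 = encode_mmi3(rd=V1, rt=T0, sa=SA_PCPYH)         # PCPYH $v1, $t0
case2 = encode_mmi3(rd=T3, rt=T0, sa=SA_PCPYH, rs=V1)  # same op, non-zero rs
case3 = encode_mmi3(rd=V1, rt=T0, sa=0x1C)             # neighbouring sa value
print(hex(case1), is_pcpyh(case1), is_pcpyh(case2), is_pcpyh(case3))
```

Leaving rs out of `is_pcpyh` is the software analogue of the "no rs check" decode described above: Case 2's word still decodes as PCPYH, while the sa=0x1C neighbour does not.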

Makefile + regression

  • tb_ee_core_pcpyh target.
  • Added to both the PHONY list and the run: master list.
  • Regression: 176 → 177.

qbert progression

| Chapter | Blocker | retire_count |
|---|---|---|
| Post-Ch299 (gate poke) | MMI3 PCPYH at 0x00110BB4 | 28,655 |
| Post-Ch300 (PCPYH) | syscall $v1=0x17 at 0x00112A84 | 28,708 (+53) |

The advance is small (+53 retires) because qbert runs only a short stretch of code between the PCPYH and the next unhandled syscall. The new blocker is in a different code region from the PCPYH site at 0x00110BB4: PC 0x00112A84 sits near earlier syscall sites, 0x20 bytes below the Ch289 0x78 site at 0x00112AA4.

Ch301 framing — syscall 0x17

$v1 = 0x17 (= 23)
$a0 = 0x00000005  (channel id 5, same as Ch290/291)
$a1 = 0x00000000
$a2 = 0xFFFFFFFF  (-1, sentinel?)
$a3 = 0x00137568  (NEW context pointer, NOT the global ctx 0x001328C0)

Notable shifts from earlier syscalls:

  • $a3 has changed again: earlier syscalls passed 0x001328C0 (the global ctx); this one passes 0x00137568, a different region that looks like a per-channel state buffer. It shares the 0x001375xx prefix with the Ch299 halt's $a0=0x00137540 and sits 0x28 bytes above it.
  • $a0 = 5 matches the channel id used in Ch290/291 (the DMAC handler-install pair). So qbert is doing channel-5-specific cleanup or query.
  • $a2 = -1 is unusual — often a "no filter" or "all" sentinel.

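PS2 syscall 0x17 (= 23) is listed in the commonly circulated EE kernel syscall tables (for example the ps2sdk numbering) as _DisableDmac, which takes a DMAC channel number in $a0, rather than as SetVTLBRefillHandler or iWakeupThread. The name should still be confirmed against the table this project uses, but the $a0=channel pattern fits a per-channel kernel call.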

Mechanical recipe: 9th narrow $v0=0 case in the dispatcher + runner observer with full arg snapshot. Standard Ch289-pattern extension.
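
The recipe can be pictured with a small software model of the narrow-dispatch pattern. This is a hedged Python sketch only: the real dispatcher lives in RTL and the observer in the sim runner, and every name here (`SyscallSnapshot`, `NARROW_V0_ZERO`, `dispatch`, `as_signed32`) is hypothetical. The handled set lists just the two syscall numbers named in this chapter, not the project's full list of narrow cases.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SyscallSnapshot:
    """Full argument snapshot captured by the observer at the syscall."""
    pc: int
    v1: int
    a0: int
    a1: int
    a2: int
    a3: int


class UnhandledSyscall(Exception):
    """Raised for any syscall number outside the narrow accepted set."""


# Illustrative subset: 0x78 (Ch289) plus the 0x17 case proposed for Ch301.
NARROW_V0_ZERO = {0x78, 0x17}


def dispatch(snap: SyscallSnapshot, observed: List[SyscallSnapshot]) -> int:
    """Accept only listed syscalls, log the snapshot, and return $v0 = 0."""
    if snap.v1 not in NARROW_V0_ZERO:
        raise UnhandledSyscall(f"pc=0x{snap.pc:08X} $v1=0x{snap.v1:02X}")
    observed.append(snap)
    return 0


def as_signed32(value: int) -> int:
    """Interpret a 32-bit register value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


log: List[SyscallSnapshot] = []
snap = SyscallSnapshot(pc=0x00112A84, v1=0x17, a0=5, a1=0,
                       a2=0xFFFFFFFF, a3=0x00137568)
v0 = dispatch(snap, log)
print(v0, as_signed32(snap.a2), hex(log[0].a3))
```

Keeping the accepted set explicit means any syscall number not yet reviewed still surfaces as an unhandled-syscall verdict instead of being silently accepted.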

Pattern review (30 chapters)

| Era | Chapters | Effect |
|---|---|---|
| Opcode-blocker | Ch271..Ch286 | R5900 opcodes |
| MMIO stubs | Ch287..Ch288 | DMAC ctrl + per-channel |
| Syscall HLE narrow | Ch273/285/289/290/291/293/296/297 | $v0=0 narrow cases |
| Narrow NOP-class | Ch286/292 | side-effect-free accepts |
| Inflection #1 | Ch293 | first wait loop |
| Investigation #1 | Ch294 | bit-17 syscall poll |
| Experimental unblock #1 | Ch295 | $a0-aware HLE |
| Inflection #2 | Ch297 | second wait loop |
| Investigation #2 | Ch298 | memory poll at 0x001329C0 |
| Experimental unblock #2 | Ch299 | TB-side gate poke |
| MMI op | Ch300 (PCPYH) | mechanical MMI extension |

The chapter cadence is now well-mixed: opcode chapters, MMIO chapters, syscall HLE chapters, narrow NOP-class chapters, and investigation/unblock 3-chapter cycles. All productive.

Files changed

  • rtl/ee/ee_core_stub.sv — 5 surgical edits (localparam, decode flag, is_rtype_alu/is_mmi_wb/nop_class wiring, two writeback arms).
  • sim/tb/integration/tb_ee_core_pcpyh.sv — new focused TB.
  • sim/Makefile — target + both regression lists.

Regression

177/177 PASS (was 176 in Ch299; +1 for the new tb_ee_core_pcpyh).