retroDE_ps2/docs/ch295_closeout.md
thejayman77 ec82764bef Initial commit: retroDE_ps2 — first-of-its-kind PS2 GS FPGA core (DE25-Nano / Agilex 5)
RTL (GS rasterizer, EE core stub, platform bridge, LPDDR4B path), sim regression
(272 TBs), docs, and tooling. Copyrighted PS2 content (BIOS, game code, GS dumps,
and all dump-derived textures/traces) is excluded via .gitignore and stays local.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-29 20:10:50 -04:00


Ch295 closeout — Strategy A worked: wait loop exited in one iteration

Status: Closed. Codex's Strategy A (an $a0-aware experimental HLE patch) worked on the first try. Re-running qbert.elf now yields the verdict elf_first_unhandled_syscall (pc=0x00111D94, $v1=0x79 (=121)): qbert exited the Ch294 wait loop after exactly one iteration and advanced into new code, where it hits the next syscall blocker.

The Ch294 hypothesis confirmed

Ch294's diagnosis: qbert spins forever because syscall 0x7A($a0=4) returns 0, so (retval & 0x00020000) == 0 on every pass and bit 17 is never set. Ch295 patched the HLE to return 0x00020000 when $a0 == 4.
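The gate can be modeled in a few lines of Python. This is a behavioral sketch only, not the RTL or qbert's actual code: the function names and the `max_iters` watchdog limit are hypothetical, and the loop is assumed to make one syscall 0x7A($a0=4) call per iteration.

```python
from typing import Callable, Optional

READY_BIT = 0x0002_0000  # bit 17, the mask the wait loop tests


def hle_0x7a_ch293(a0: int) -> int:
    """Pre-Ch295 behavior: syscall 0x7A returns 0 for every $a0."""
    return 0


def hle_0x7a_ch295(a0: int) -> int:
    """Ch295 behavior: return bit 17 when $a0 == 4, otherwise 0."""
    return READY_BIT if a0 == 4 else 0


def poll_until_ready(hle: Callable[[int], int], max_iters: int) -> Optional[int]:
    """Model the wait loop: poll syscall 0x7A($a0=4) until bit 17 is set.

    Returns the number of iterations taken, or None if the (hypothetical)
    watchdog limit max_iters is reached first.
    """
    for iteration in range(1, max_iters + 1):
        if hle(4) & READY_BIT:
            return iteration
    return None


print(poll_until_ready(hle_0x7a_ch293, 1000))  # None
print(poll_until_ready(hle_0x7a_ch295, 1000))  # 1
```

With the constant-zero HLE the model never leaves the loop; with the $a0-aware HLE it leaves on the first poll, which is the behavior the runner observer reports.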

Result: the wait loop iterated exactly once and exited. The runner observer's syscall_0x7A_split line tells the whole story:

syscall_0x7A_split = count_a0_4=1 count_a0_0x80000000=1 count_a0_other=1
                     last_a0=0x00000002
| $a0 class | Calls | Match Ch294 |
|---|---|---|
| 0x80000000 (init) | 1 | yes: the call at PC 0x00112408 before the loop |
| 0x00000004 (poll) | 1 | yes: the loop iterated exactly once and exited |
| other (= 2) | 1 | the post-loop call at PC 0x00112434 with $a0=2 |

Loop iterations dropped from 181,494 to 1, a collapse of roughly 181,000×. Ch294's gate identification was exactly right.

What landed

rtl/ee/ee_core_stub.sv — $a0-aware HLE

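32'h0000_007A: begin
    if (regfile[4] == 32'h0000_0004) begin    // $a0 == 4: the poll call
        regfile[2] <= 32'h0002_0000;          // $v0 = bit 17 set
        gpr128[2]  <= {96'd0, 32'h0002_0000}; // same value, zero-extended to 128 bits
    end else begin                            // any other $a0
        regfile[2] <= 32'd0;                  // $v0 = 0 (pre-Ch295 behavior)
        gpr128[2]  <= 128'd0;
    end
    pc           <= pc + 32'd4;               // step past the syscall
    retire_pulse <= 1'b1;                     // signal one retired instruction
    state        <= S_IFETCH_REQ;             // fetch the next instruction
end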

The HLE branches on regfile[4] (= $a0). For $a0 == 4 it returns 0x00020000 (bit 17 set); for any other $a0 it returns 0. The RTL comment documents this as an experimental unblock, not architectural truth. If qbert misbranches downstream, this gets rolled back in favor of SDK semantics or interrupt-delivery modeling.
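For reference, here is a minimal Python model of the register updates that dispatcher arm performs. It is a sketch under stated assumptions, not a translation of the RTL: `hle_syscall_0x7a`, `run_once`, and the starting PC are hypothetical, and each register file is modeled as a plain list of integers.

```python
MASK32 = 0xFFFF_FFFF
V0, A0 = 2, 4  # MIPS GPR indices: $v0 = r2, $a0 = r4


def hle_syscall_0x7a(regfile, gpr128, pc):
    """Apply the 0x7A arm's register updates and return the next PC."""
    value = 0x0002_0000 if regfile[A0] == 0x0000_0004 else 0
    regfile[V0] = value       # 32-bit $v0
    gpr128[V0] = value        # 128-bit copy; the upper 96 bits stay zero
    return (pc + 4) & MASK32  # step past the syscall instruction


def run_once(a0, pc=0x0000_1000):  # pc is an arbitrary illustrative address
    regfile = [0] * 32
    gpr128 = [0] * 32
    regfile[A0] = a0
    next_pc = hle_syscall_0x7a(regfile, gpr128, pc)
    return regfile[V0], gpr128[V0], next_pc


print([hex(x) for x in run_once(4)])           # ['0x20000', '0x20000', '0x1004']
print([hex(x) for x in run_once(0x80000000)])  # ['0x0', '0x0', '0x1004']
```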

tb_ee_core_syscall_hle.sv — extended with the $a0=4 subcase

Six new BIOS slots (S_ORI_A0_4, S_ORI_V1_7A_4, S_SYS_7A_4, S_LUI_EXP_4, S_BNE_7A_4, S_DS_7A_4) cover the $a0=4 case:

ori   $a0, $0, 4         ; $a0 = 4
ori   $v1, $0, 0x7A      ; $v1 = 0x7A
syscall                   ; → $v0 = 0x00020000
lui   $t1, 0x2           ; $t1 = 0x00020000 (expected)
bne   $v0, $t1, FAIL     ; verify
nop                       ; branch delay slot

Plus a new latch (v0_after_7A_a0_4 / seen_7A_a0_4_return) + assertion + display field. Existing 0x7A subcase ($a0=0, $v0=0) unchanged. Result:

$v0_after_7A=0x00000000  $v0_after_7A_a0_4=0x00020000
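As a cross-check on the six slots, the sequence can be encoded with the standard MIPS I-type layout. The sketch below is illustrative and not taken from the testbench: the helper names are hypothetical, and the BNE offset to FAIL depends on the slot layout, so it is left as a parameter.

```python
ZERO, V0, V1, A0, T1 = 0, 2, 3, 4, 9  # MIPS GPR numbers
ORI, LUI, BNE = 0x0D, 0x0F, 0x05      # primary opcodes
SYSCALL = 0x0000_000C                 # SPECIAL opcode, funct 0x0C, code field 0
NOP = 0x0000_0000                     # sll $0, $0, 0


def i_type(opcode, rs, rt, imm):
    """Encode an I-type instruction: opcode | rs | rt | 16-bit immediate."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def subcase_words(fail_offset_words):
    """Encode the $a0=4 subcase; the branch offset is layout-dependent."""
    return [
        i_type(ORI, ZERO, A0, 4),                 # ori $a0, $0, 4
        i_type(ORI, ZERO, V1, 0x7A),              # ori $v1, $0, 0x7A
        SYSCALL,                                  # syscall
        i_type(LUI, ZERO, T1, 0x2),               # lui $t1, 0x2
        i_type(BNE, V0, T1, fail_offset_words),   # bne $v0, $t1, FAIL
        NOP,                                      # branch delay slot
    ]


print([f"{w:08X}" for w in subcase_words(1)])
# ['34040004', '3403007A', '0000000C', '3C090002', '14490001', '00000000']
```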

tb_ee_core_elf_runner.sv — per-$a0-class counters

New syscall_0x7A_split SUMMARY line shows count_a0_4 / count_a0_0x80000000 / count_a0_other separately, plus first_v0_after and last_v0_after for the actual returned $v0, sampled one cycle after retire (once the nonblocking assignment has committed).

These counters are the key Ch295 instrumentation: at a glance you can see whether qbert's $a0-class distribution matches expectations and whether the wait loop is collapsing or still spinning.
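The classification behind those counters is simple enough to sketch in Python. This models the bookkeeping only, not the SystemVerilog observer: `a0_class` and `split_counts` are hypothetical names, and the spinning trace is synthetic.

```python
def a0_class(a0):
    """Map a syscall 0x7A $a0 value to its counter name."""
    if a0 == 0x0000_0004:
        return "count_a0_4"
    if a0 == 0x8000_0000:
        return "count_a0_0x80000000"
    return "count_a0_other"


def split_counts(a0_trace):
    """Count calls per $a0 class and report the last $a0 seen."""
    counts = {"count_a0_4": 0, "count_a0_0x80000000": 0, "count_a0_other": 0}
    for a0 in a0_trace:
        counts[a0_class(a0)] += 1
    last_a0 = a0_trace[-1] if a0_trace else None
    return counts, last_a0


# The three calls reported for the post-Ch295 qbert run: init, one poll, post-loop.
print(split_counts([0x8000_0000, 4, 2]))
# ({'count_a0_4': 1, 'count_a0_0x80000000': 1, 'count_a0_other': 1}, 2)

# A synthetic spinning trace: the poll class dominates and the 'other' class stays 0.
print(split_counts([0x8000_0000] + [4] * 5))
# ({'count_a0_4': 5, 'count_a0_0x80000000': 1, 'count_a0_other': 0}, 4)
```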

qbert progression

| Chapter | Blocker | retire_count | Notes |
|---|---|---|---|
| Post-Ch293 (syscall 0x7A returns 0) | wait-loop spin | 1,661,413 (watchdog) | hot_pc=0x0011242C |
| Post-Ch295 ($a0-aware 0x7A) | syscall $v1=0x79 at 0x00111D94 | 27,996 | hot_pc=0x00112354 |

The 1.66M → 27,996 retire-count drop is misleading on its own — the Ch293 number was a watchdog total that included 181k spinning loop iterations. The MEANINGFUL signal is:

  • Wait loop iterations: 181,494 → 1
  • Next blocker shape: from elf_timeout_with_hot_pc (no progress) → elf_first_unhandled_syscall (concrete next demand)

That's a clean phase change from "stuck" to "next problem."

Ch296 framing — syscall 0x79

The new blocker:

  • $v1 = 0x79 (= 121)
  • $a0 = 0x80000000 (kseg0 base — same as the 0x7A init call!)
  • $a1 = 0x00000000
  • $a2 = 0x00000000
  • $a3 = 0x001328C0 (same global context pointer)
  • PC = 0x00111D94

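In the publicly documented PS2 syscall numbering (for example the ps2sdk table), slot 0x79 (121) is listed as SifSetReg, directly below SifGetReg at 0x7A; ResetEE is syscall 1 and does not belong in this slot. This should be confirmed against the table before Ch296 relies on it. The arg shape ($a0 = 0x80000000, $a3 = ctx) mirrors the earlier 0x7A init call, which suggests a cleanup/finalize call symmetric to that init. Note that PC 0x00111D94 is only 0x70 bytes (28 instruction words) past 0x00111D24 (the Ch289 syscall 0x78 site), so both sit in the same kernel-wrapper neighborhood.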

Per the Ch285/289/290/291/293 precedent: another narrow $v0=0 extension + runner observer for syscall 0x79. Probably one chapter. If qbert misbranches downstream, examine $a0/$a3 shapes for hints.
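The narrow-extension pattern can be sketched as a small Python model of the dispatcher. It is an assumption-laden illustration: `hle_dispatch` is a hypothetical name, and the handled set holds only 0x78 because this note does not list the syscall numbers added by the other chapters.

```python
def hle_dispatch(v1, a0, returns_zero):
    """Model one syscall: return ("ok", $v0) if handled, else the verdict."""
    if v1 == 0x7A:
        # Ch295: context-dependent return value.
        return ("ok", 0x0002_0000 if a0 == 4 else 0)
    if v1 in returns_zero:
        # Narrow constant case: $v0 = 0.
        return ("ok", 0)
    return ("elf_first_unhandled_syscall", v1)


handled_now = {0x78}                  # partial: only the number named in this note
handled_ch296 = handled_now | {0x79}  # the proposed narrow extension

print(hle_dispatch(0x79, 0x8000_0000, handled_now))
# ('elf_first_unhandled_syscall', 121)
print(hle_dispatch(0x79, 0x8000_0000, handled_ch296))
# ('ok', 0)
```

Adding one entry to the handled set is the whole extension in this model; the accompanying runner observer is what would show whether qbert then misbranches.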

Notes on the experimental nature of Ch295

This chapter explicitly violates one principle: the HLE return value for syscall 0x7A is now a hardcoded answer to qbert's specific question, not a model of any real PS2 kernel state. If a different ELF calls syscall 0x7A($a0=4), it'll get bit 17 set unconditionally — which may or may not be correct for that ELF.

Codex framed this as acceptable for the falsifiable experiment: "if it advances meaningfully, Ch296 identifies what bit 17 represents." We did advance meaningfully. The semantic question ("what does bit 17 actually mean in real PS2 kernel state?") is deferred to whenever a second consumer of syscall 0x7A surfaces.

Risks logged:

  • A different ELF might call syscall 0x7A($a0=4) expecting bit 17 to be 0 (e.g., a "not ready yet" semantic). For qbert, "ready" = bit-17-set works. For other ELFs, the answer might differ.
  • If qbert's downstream code reads syscall 0x7A($a0=4) more than once per "event," we might see the same "ready" response too many times — possibly causing duplicate event handling.

The runner observer's count_a0_4=1 for qbert mitigates the second risk for this specific run.
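A regression script could turn that observation into a check by parsing the SUMMARY line. The Python sketch below assumes the line keeps the key=value layout shown earlier in this note; `parse_split`, `polls_within_budget`, and the `max_polls` threshold are hypothetical and not part of the existing tooling.

```python
import re

SUMMARY = (
    "syscall_0x7A_split = count_a0_4=1 count_a0_0x80000000=1 count_a0_other=1\n"
    "                     last_a0=0x00000002"
)


def parse_split(summary):
    """Extract key=value fields (decimal or 0x-prefixed hex) as integers."""
    fields = {}
    for key, value in re.findall(r"(\w+)=(0x[0-9A-Fa-f]+|\d+)", summary):
        base = 16 if value.lower().startswith("0x") else 10
        fields[key] = int(value, base)
    return fields


def polls_within_budget(summary, max_polls=1):
    """True if the $a0=4 poll count does not exceed the chosen budget."""
    return parse_split(summary)["count_a0_4"] <= max_polls


print(parse_split(SUMMARY))
# {'count_a0_4': 1, 'count_a0_0x80000000': 1, 'count_a0_other': 1, 'last_a0': 2}
print(polls_within_budget(SUMMARY))  # True
```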

Files changed

  • rtl/ee/ee_core_stub.sv — 1 dispatcher case modified ($a0-aware branch, ~10 LOC delta).
  • sim/tb/integration/tb_ee_core_syscall_hle.sv — 6 new slots + 1 latch + 1 assertion + 1 display field.
  • sim/tb/integration/tb_ee_core_elf_runner.sv — 3 new counter signals + observer arm + SUMMARY line.

No new TB, no new Makefile target; regression count unchanged at 176/176.

Pattern review (25 chapters)

| Ch | Pattern | Effect on qbert |
|---|---|---|
| 286 EI / 292 SYNC | narrow opcode accept | -- |
| 287/288 DMAC MMIO | new stubs | unmapped_mmio → 0 |
| 285/289/290/291/293 syscall HLE | narrow $v0=0 cases | each unlocks +few retires to +1.6M |
| 294 wait autopsy | observation-only | named the gate |
| 295 experimental $a0-aware HLE | falsifiable patch | loop iterations: 181,494 → 1 |

Ch295 is the first chapter where the HLE return value is context-dependent rather than constant. The runner observer's per-arg-class split made this falsifiable: the count_a0_4=1 fact proves the patch worked, and the verdict shape change (timeout → unhandled_syscall) proves qbert progressed semantically.

Regression

176/176 PASS (unchanged from Ch294; no new TB).