Ch295 closeout — Strategy A worked: wait loop exited in one iteration
Status: Closed. Codex's Strategy A ($a0-aware experimental HLE
patch) worked first try. Verdict from re-running qbert.elf:
elf_first_unhandled_syscall (pc=0x00111D94 $v1=0x79 (=121)) —
qbert exited the Ch294 wait loop after exactly one iteration and
advanced into new code, hitting the next syscall blocker.
The Ch294 hypothesis confirmed
Ch294 diagnosed: qbert spins forever because syscall 0x7A($a0=4)
returns 0, so (retval & 0x00020000) == 0 always and bit 17 is never
set. Ch295 patched the HLE to return 0x00020000 when $a0 == 4.
Result: the wait loop iterated exactly once and exited. The
runner observer's syscall_0x7A_split line tells the whole story:
syscall_0x7A_split = count_a0_4=1 count_a0_0x80000000=1 count_a0_other=1
last_a0=0x00000002
| $a0 class | Calls | Matches Ch294? |
|---|---|---|
| 0x80000000 (init) | 1 | yes — the call at PC 0x00112408 before the loop |
| 0x00000004 (poll) | 1 | yes — the loop iterated exactly once and exited |
| other (= 2) | 1 | the post-loop call at PC 0x00112434 with $a0=2 |
Loop iterations dropped from 181,494 to 1, a roughly 181,000× collapse. Ch294's gate identification was exactly right.
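The collapse can be reproduced in a few lines with a behavioral model of the wait loop. This is a minimal Python sketch, not the RTL: the function names and the `max_iters` cap (a stand-in for the runner watchdog) are assumptions made for illustration; only the bit-17 test and the two return behaviors come from the text above.

```python
BIT17 = 0x0002_0000  # the flag the qbert wait loop tests in $v0


def hle_0x7a_before(a0: int) -> int:
    """Pre-Ch295 behavior: syscall 0x7A returns 0 for every $a0."""
    return 0


def hle_0x7a_ch295(a0: int) -> int:
    """Ch295 behavior: bit 17 is returned only for the $a0 == 4 poll."""
    return BIT17 if a0 == 4 else 0


def wait_loop_iterations(hle, max_iters: int) -> int:
    """Poll syscall 0x7A($a0=4) until bit 17 is set or the cap is reached."""
    for iteration in range(1, max_iters + 1):
        if hle(4) & BIT17:
            return iteration
    return max_iters  # hypothetical watchdog cap, never satisfied


print(wait_loop_iterations(hle_0x7a_before, 1000))  # 1000 (spins to the cap)
print(wait_loop_iterations(hle_0x7a_ch295, 1000))   # 1 (exits on first poll)
```

With the old return value the loop can only end at the cap, which is the shape of the Ch293 watchdog timeout; with the Ch295 value it exits on the first poll.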
What landed
rtl/ee/ee_core_stub.sv — $a0-aware HLE
    32'h0000_007A: begin
        if (regfile[4] == 32'h0000_0004) begin
            regfile[2] <= 32'h0002_0000;
            gpr128[2]  <= {96'd0, 32'h0002_0000};
        end else begin
            regfile[2] <= 32'd0;
            gpr128[2]  <= 128'd0;
        end
        pc           <= pc + 32'd4;
        retire_pulse <= 1'b1;
        state        <= S_IFETCH_REQ;
    end
The HLE branches on regfile[4] (= $a0). For $a0 == 4 it returns
0x00020000 (bit 17 set); otherwise it returns 0. The RTL comment
documents this as an experimental unblock, not architectural truth.
If qbert misbranches downstream, this gets rolled back in favor of
SDK semantics or interrupt-delivery modeling.
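As a cross-check on the dispatcher arm, the sketch below mirrors its register writes in Python. It is an illustrative model under stated assumptions: the list-based `regfile` and `gpr128`, the function name, and the stale test values are hypothetical, and the model ignores `pc`, `retire_pulse`, and `state`.

```python
MASK32 = 0xFFFF_FFFF
MASK128 = (1 << 128) - 1


def syscall_0x7a(regfile: list, gpr128: list) -> None:
    """Model of the 0x7A arm: write $v0 (index 2) based on $a0 (index 4)."""
    value = 0x0002_0000 if regfile[4] == 0x0000_0004 else 0
    regfile[2] = value & MASK32
    # Zero-extended like {96'd0, 32'h0002_0000}: upper 96 bits are cleared.
    gpr128[2] = value & MASK128


regfile = [0] * 32
gpr128 = [0] * 32
regfile[2] = 0xDEAD_BEEF  # stale $v0 that the syscall must overwrite
gpr128[2] = MASK128       # stale 128-bit view, all ones
regfile[4] = 4            # $a0 = 4, the poll case

syscall_0x7a(regfile, gpr128)
print(f"{regfile[2]:08X}")  # 00020000
print(gpr128[2] >> 32)      # 0, so no stale upper bits survive
```

The point of modeling both views is that the 32-bit and 128-bit copies of $v0 have to agree after the syscall, including when stale upper bits were present beforehand.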
tb_ee_core_syscall_hle.sv — extended with the $a0=4 subcase
Six new BIOS slots (S_ORI_A0_4, S_ORI_V1_7A_4, S_SYS_7A_4,
S_LUI_EXP_4, S_BNE_7A_4, S_DS_7A_4) cover the $a0=4 case:
    ori  $a0, $0, 4        ; $a0 = 4
    ori  $v1, $0, 0x7A     ; $v1 = 0x7A
    syscall                ; → $v0 = 0x00020000
    lui  $t1, 0x2          ; $t1 = 0x00020000 (expected)
    bne  $v0, $t1, FAIL    ; verify
    nop                    ; branch delay slot
Plus a new latch (v0_after_7A_a0_4 / seen_7A_a0_4_return) +
assertion + display field. Existing 0x7A subcase ($a0=0, $v0=0)
unchanged. Result:
$v0_after_7A=0x00000000 $v0_after_7A_a0_4=0x00020000
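For readers hand-checking the BIOS slots, the following Python sketch encodes the non-branch instructions of the subcase using the standard MIPS I-type layout and confirms that `lui $t1, 0x2` produces exactly the bit-17 constant. The helper name `i_type` and the dictionary are assumptions for illustration; the testbench's actual slot encoding is not shown in this document, and the `bne` is omitted because its offset depends on where FAIL sits.

```python
def i_type(opcode: int, rs: int, rt: int, imm: int) -> int:
    """Encode a MIPS I-type instruction: opcode | rs | rt | 16-bit immediate."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


# Register numbers and opcodes from the MIPS ISA.
ZERO, V0, V1, A0, T1 = 0, 2, 3, 4, 9
ORI, LUI = 0x0D, 0x0F

words = {
    "ori $a0, $0, 4":    i_type(ORI, ZERO, A0, 0x0004),
    "ori $v1, $0, 0x7A": i_type(ORI, ZERO, V1, 0x007A),
    "syscall":           0x0000_000C,
    "lui $t1, 0x2":      i_type(LUI, ZERO, T1, 0x0002),
}

for asm, word in words.items():
    print(f"{word:08X}  {asm}")

# lui places its immediate in the upper halfword, so 0x2 becomes bit 17.
expected_v0 = 0x2 << 16
print(f"{expected_v0:08X}")  # 00020000
```

The last line is the arithmetic behind the expected value: 0x2 shifted left by 16 is 0x00020000, which is 1 << 17.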
tb_ee_core_elf_runner.sv — per-$a0-class counters
New syscall_0x7A_split SUMMARY line shows count_a0_4 /
count_a0_0x80000000 / count_a0_other separately, plus
first_v0_after and last_v0_after for the actual returned $v0,
sampled one cycle after retire (after the nonblocking assignment commits).
These counters are the key Ch295 instrumentation: at a glance you can see whether qbert's $a0-class distribution matches expectations and whether the wait loop is collapsing or still spinning.
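The per-class split is simple enough to model directly. The Python sketch below classifies each observed $a0 into the three classes and formats a SUMMARY-style line; the function names are hypothetical, the `first_v0_after` / `last_v0_after` fields are left out, and the call order is taken from the $a0-class table earlier in this chapter.

```python
def classify_a0(a0: int) -> str:
    """Map an observed $a0 to one of the three observer counters."""
    if a0 == 0x0000_0004:
        return "count_a0_4"
    if a0 == 0x8000_0000:
        return "count_a0_0x80000000"
    return "count_a0_other"


def summarize(calls: list) -> str:
    """Build a syscall_0x7A_split line from the sequence of $a0 values."""
    counts = {"count_a0_4": 0, "count_a0_0x80000000": 0, "count_a0_other": 0}
    last_a0 = 0
    for a0 in calls:
        counts[classify_a0(a0)] += 1
        last_a0 = a0
    fields = " ".join(f"{name}={value}" for name, value in counts.items())
    return f"syscall_0x7A_split = {fields} last_a0=0x{last_a0:08X}"


# qbert's three 0x7A calls: init, one poll, then the post-loop call.
qbert_calls = [0x8000_0000, 0x0000_0004, 0x0000_0002]
print(summarize(qbert_calls))
```

A still-spinning run would show up as a large count_a0_4 with last_a0 stuck at 0x00000004, which is the at-a-glance distinction the counters are meant to give.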
qbert progression
| Chapter | Blocker | retire_count | Notes |
|---|---|---|---|
| Post-Ch293 (syscall 0x7A returns 0) | wait-loop spin | 1,661,413 (watchdog) | hot_pc=0x0011242C |
| Post-Ch295 ($a0-aware 0x7A) | syscall $v1=0x79 at 0x00111D94 | 27,996 | hot_pc=0x00112354 |
The 1.66M → 27,996 retire-count drop is misleading on its own: the Ch293 number was a watchdog total that included 181,494 spinning loop iterations. The meaningful signal is:
- Wait loop iterations: 181,494 → 1
- Next blocker shape: from elf_timeout_with_hot_pc (no progress) → elf_first_unhandled_syscall (concrete next demand)
That's a clean phase change from "stuck" to "next problem."
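A quick arithmetic check shows why the raw retire counts are not comparable. The Python sketch below divides the watchdog total by the loop iteration count. It rests on an assumption not stated in the source: that every iteration retires the same whole number of instructions, so the quotient is only an upper bound on retires per iteration and the remainder only an estimate of retires spent outside the loop.

```python
WATCHDOG_RETIRES = 1_661_413   # Post-Ch293 retire_count at watchdog
LOOP_ITERATIONS = 181_494      # wait-loop iterations in that run
POST_CH295_RETIRES = 27_996    # retire_count at the syscall 0x79 blocker

# Assumption: a fixed whole number of retires per iteration.
per_iteration, outside_loop = divmod(WATCHDOG_RETIRES, LOOP_ITERATIONS)
print(per_iteration, outside_loop)  # 9 27967

# Difference between the post-Ch295 total and the estimated non-loop work.
print(POST_CH295_RETIRES - outside_loop)  # 29
```

Under that assumption, nearly all of the 1.66M retires were loop spin, and the non-loop remainder (27,967) is within a few dozen retires of the post-Ch295 total. That fits the reading that the smaller number reflects forward progress rather than less work done.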
Ch296 framing — syscall 0x79
The new blocker:
- $v1 = 0x79 (= 121)
- $a0 = 0x80000000 (kseg0 base; same as the 0x7A init call!)
- $a1 = 0x00000000
- $a2 = 0x00000000
- $a3 = 0x001328C0 (same global context pointer)
- PC = 0x00111D94
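The slot name needs checking before Ch296 relies on it: the ps2sdk
syscall table lists ResetEE as syscall 1, and lists 0x79 (121) as
SifSetReg, the setter paired with SifGetReg at 0x7A. The arg shape
($a0 = 0x80000000, $a3 = ctx) matches the earlier 0x7A init call,
which may indicate a set/get pair on the same id rather than the
cleanup/finalize call the shape first suggests. Note PC 0x00111D94
is close to 0x00111D24 (the Ch289 syscall 0x78 site): adjacent in
the same kernel-wrapper neighborhood.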
Per the Ch285/289/290/291/293 precedent: another narrow $v0=0 extension + runner observer for syscall 0x79. Probably one chapter. If qbert misbranches downstream, examine $a0/$a3 shapes for hints.
Notes on the experimental nature of Ch295
This chapter explicitly violates one principle: the HLE return value for syscall 0x7A is now a hardcoded answer to qbert's specific question, not a model of any real PS2 kernel state. If a different ELF calls syscall 0x7A($a0=4), it'll get bit 17 set unconditionally — which may or may not be correct for that ELF.
Codex framed this as acceptable for the falsifiable experiment: "if it advances meaningfully, Ch296 identifies what bit 17 represents." We did advance meaningfully. The semantic question ("what does bit 17 actually mean in real PS2 kernel state?") is deferred to whenever a second consumer of syscall 0x7A surfaces.
Risks logged:
- A different ELF might call syscall 0x7A($a0=4) expecting bit 17 to be 0 (e.g., a "not ready yet" semantic). For qbert, "ready" = bit-17-set works. For other ELFs, the answer might differ.
- If qbert's downstream code reads syscall 0x7A($a0=4) more than once per "event," we might see the same "ready" response too many times — possibly causing duplicate event handling.
The runner observer's count_a0_4=1 for qbert mitigates the second
risk (repeated "ready" responses) for this specific run.
Files changed
- rtl/ee/ee_core_stub.sv: 1 dispatcher case modified ($a0-aware branch, ~10 LOC delta).
- sim/tb/integration/tb_ee_core_syscall_hle.sv: 6 new slots + 1 latch + 1 assertion + 1 display field.
- sim/tb/integration/tb_ee_core_elf_runner.sv: 3 new counter signals + observer arm + SUMMARY line.
No new TB, no new Makefile target; regression count unchanged at 176/176.
Pattern review (25 chapters)
| Ch | Pattern | Effect on qbert |
|---|---|---|
| 286 EI / 292 SYNC | narrow opcode accept | -- |
| 287/288 DMAC MMIO | new stubs | unmapped_mmio → 0 |
| 285/289/290/291/293 syscall HLE | narrow $v0=0 cases | each unlocks +few retires to +1.6M |
| 294 wait autopsy | observation-only | named the gate |
| 295 experimental $a0-aware HLE | falsifiable patch | loop iterations: 181,494 → 1 |
Ch295 is the first chapter where the HLE return value is context-dependent rather than constant. The runner observer's per-arg-class split made this falsifiable: the count_a0_4=1 fact proves the patch worked, and the verdict shape change (timeout → unhandled_syscall) proves qbert progressed semantically.
Regression
176/176 PASS (unchanged from Ch294; no new TB).