Ch297 closeout — syscall 0x77 HLE; richer observer pays off; another wait loop surfaces
Status: Closed. Verdict from re-running qbert.elf:
elf_timeout_with_hot_pc (watchdog after 50000000 ns, 1469235 retires, hot_pc=0x00112554 count=26/256) — qbert advanced
28,101 → 1,469,235 retires (+1,441,134), then hit another
steady-state wait loop at a NEW hot_pc.
This is the second time the runner has surfaced elf_timeout_with_hot_pc
on qbert (the first was Ch293). The Ch293→Ch294 pattern is repeating:
a mechanical syscall HLE chapter unlocks a big advance, then a new
wait loop surfaces that requires investigation.
What landed
Dispatcher case — rtl/ee/ee_core_stub.sv
8th narrow $v0=0 case in the Ch273 dispatcher:
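32'h0000_0077: begin
    regfile[2]   <= 32'd0;         // $v0 = 0 in the 32-bit register file
    gpr128[2]    <= 128'd0;        // $v0 = 0 in the 128-bit register file
    pc           <= pc + 32'd4;    // step past the syscall instruction
    retire_pulse <= 1'b1;          // count the syscall as one retire
    state        <= S_IFETCH_REQ;  // go back to instruction fetch
end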
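As a cross-check on the intended semantics, the case above can be modeled in a few lines of Python. This is a sketch, not the RTL: `CoreModel` and `hle_syscall_0x77` are hypothetical names, the 128-bit `gpr128` view is not modeled, and the only effects assumed are the ones visible in the case body ($v0 cleared, pc advanced by 4, one retire).

```python
from dataclasses import dataclass, field

MASK32 = 0xFFFF_FFFF


@dataclass
class CoreModel:
    """Hypothetical software stand-in for the stub's architectural state."""
    pc: int = 0
    regfile: list = field(default_factory=lambda: [0] * 32)
    retires: int = 0


def hle_syscall_0x77(core: CoreModel) -> None:
    """Mirror the narrow case: force $v0 = 0, skip the syscall, retire once."""
    core.regfile[2] = 0                 # $v0 is GPR 2
    core.pc = (core.pc + 4) & MASK32    # 32-bit wraparound, like the RTL add
    core.retires += 1                   # stands in for retire_pulse


core = CoreModel(pc=0x0010_0000)        # arbitrary illustrative pc
core.regfile[2] = 0xDEAD_BEEF           # stale $v0 that the HLE must clear
hle_syscall_0x77(core)
print(hex(core.pc), core.regfile[2], core.retires)
```

The stale `$v0` value is there only to show that the case overwrites it; every other register is left untouched, which is what "narrow" means here.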
TB extension — tb_ee_core_syscall_hle.sv
Standard 4-slot subcase. The TB now covers nine known syscall numbers plus the unknown-halt path. All assertions pass.
Runner observer — RICHER than prior observers
Per Codex's framing, the 0x77 observer captures more than just "first call" — it tracks up to 4 distinct ($a0,$a1,$a2,$a3) tuples with per-tuple count. Implementation:
logic syscall_0x77_tuple_valid [0:3];
logic [31:0] syscall_0x77_tuple_a0 [0:3];
// ... (a1, a2, a3)
int syscall_0x77_tuple_count [0:3];
int syscall_0x77_distinct_tuples;
On every syscall 0x77 retire, the observer:
- Bumps total count.
- Snapshots first/last args.
- Looks up the current ($a0,$a1,$a2,$a3) tuple in the table. If found, increments its count. If not found and a slot is free, records the new tuple.
This means: at end-of-sim, the SUMMARY block shows whether qbert
made the same call repeatedly (count > 1 with distinct_tuples = 1) or iterated over a table (count > 1 with distinct_tuples > 1,
with per-tuple counts visible).
Cost: ~50 LOC. Value: decisive answer to "is qbert calling this syscall in a loop?"
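The observer's bookkeeping can be sketched as a small Python reference model. It follows the three steps listed above; the class and method names are hypothetical, and two behaviors are assumptions rather than facts from the TB: the distinct-tuple counter saturates at the table depth, and a fifth distinct tuple is silently dropped while still counting toward the total.

```python
MAX_TUPLES = 4  # table depth stated in the text


class Syscall0x77Observer:
    """Software model of the distinct-tuple observer (hypothetical names)."""

    def __init__(self):
        self.total = 0
        self.first_args = None
        self.last_args = None
        self.table = []  # entries are [args_tuple, count]

    def on_retire(self, a0, a1, a2, a3):
        args = (a0, a1, a2, a3)
        self.total += 1                      # bump total count
        if self.first_args is None:
            self.first_args = args           # snapshot first call
        self.last_args = args                # snapshot last call
        for entry in self.table:
            if entry[0] == args:             # tuple already known
                entry[1] += 1
                return
        if len(self.table) < MAX_TUPLES:     # free slot: record new tuple
            self.table.append([args, 1])
        # Assumption: when the table is full the tuple is not recorded.

    @property
    def distinct_tuples(self):
        return len(self.table)


obs = Syscall0x77Observer()
obs.on_retire(0x001DFD50, 1, 0, 20)  # the two calls seen in the qbert run
obs.on_retire(0x001DFDB0, 1, 0, 16)
print(f"count={obs.total} distinct_tuples={obs.distinct_tuples}")
```

A model like this is mainly useful for checking the SUMMARY output against a known call sequence before trusting it on a real run.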
The qbert run's smoking gun
syscall_0x77 = count=2 distinct_tuples=2
tuple[0] = ($a0=0x001dfd50, $a1=1, $a2=0, $a3=20) count=1
tuple[1] = ($a0=0x001dfdb0, $a1=1, $a2=0, $a3=16) count=1
Two distinct calls. The arg-pattern is striking:
- $a0 increments by 0x60 = 96 bytes (0x001dfd50 → 0x001dfdb0).
- $a3 is a count: 20 then 16.
- $a1 = 1 and $a2 = 0 are constant across calls.
This shape strongly fits a registration-iteration call:
- $a0 = base address of registration record (heap-resident buffer at 0x001dfd50, then a second record 96 bytes later).
- $a1 = 1 = mode flag (constant).
- $a3 = number of entries in the record (20 first, 16 second).
Candidate names for syscall 0x77 (= 119), such as
RegisterLibraryEntries, are consistent with this 4-tuple shape, but
the name is unconfirmed and should be checked against a PS2 EE
syscall table before the interpretation is relied on.
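The arithmetic behind the arg pattern is easy to re-derive from the two observed tuples. The snippet below is a sketch that uses only the values printed in the SUMMARY block; the variable names are illustrative.

```python
# ($a0, $a1, $a2, $a3) for the two observed syscall 0x77 calls
calls = [
    (0x001DFD50, 1, 0, 20),
    (0x001DFDB0, 1, 0, 16),
]

stride = calls[1][0] - calls[0][0]          # distance between the $a0 values
a1_values = {c[1] for c in calls}           # one element if $a1 is constant
a2_values = {c[2] for c in calls}           # one element if $a2 is constant
a3_values = [c[3] for c in calls]

print(hex(stride), stride)                  # 0x60 96
print(a1_values, a2_values, a3_values)      # {1} {0} [20, 16]
```

With only two samples, the 0x60 stride is a single data point; it supports the record-array reading but does not prove it.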
qbert progression
| Chapter | retire_count | Verdict | Note |
|---|---|---|---|
| Post-Ch296 (0x79) | 28,101 | elf_first_unhandled_syscall | $v1=0x77 |
| Post-Ch297 (0x77) | 1,469,235 | elf_timeout_with_hot_pc | new wait loop at hot_pc=0x00112554 |
+1.44M retire jump (about 52×, 28,101 → 1,469,235), comparable to the 60× jump at Ch293's inflection. qbert is back in steady-state-loop territory at a different hot_pc. This is Ch298 investigation territory.
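For reference, the delta and ratio between the two rows of the table work out as follows (a plain arithmetic check using the retire counts above):

```python
before, after = 28_101, 1_469_235   # retire_count post-Ch296 and post-Ch297

delta = after - before              # absolute advance in retires
ratio = after / before              # multiplicative jump

print(delta, round(ratio, 1))       # 1441134 52.3
```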
Cross-observation: syscall 0x7A traffic changed too
syscall_0x7A = count=4 (was 3 in Ch295/Ch296)
syscall_0x7A_split = count_a0_4=1 count_a0_0x80000000=1 count_a0_other=2 (was 1)
last_a0=0x80000002
qbert called 0x7A four times this run vs three times in Ch295/296. The extra call is in the "other" bucket ($a0=0x80000002, which is close to 0x80000000 but matches neither 0x80000000 nor 4).
So syscall 0x7A is being used with more arg shapes as qbert
progresses further. The Ch295 $a0-aware fix may not generalize:
$a0=0x80000002 takes the else path and returns 0,
which may or may not be what qbert expects. Worth keeping in mind
for downstream debugging.
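The three-way split reported by syscall_0x7A_split can be sketched as a small classifier. The bucket names are taken from the SUMMARY line; the function name is hypothetical, and the snippet models only the bucketing, not the values the HLE returns.

```python
def bucket_a0(a0: int) -> str:
    """Map a syscall 0x7A $a0 value to its SUMMARY bucket name."""
    a0 &= 0xFFFF_FFFF                 # treat $a0 as an unsigned 32-bit value
    if a0 == 4:
        return "count_a0_4"
    if a0 == 0x8000_0000:
        return "count_a0_0x80000000"
    return "count_a0_other"           # everything else, incl. 0x80000002


for a0 in (4, 0x8000_0000, 0x8000_0002):
    print(hex(a0), bucket_a0(a0))
```

Because the match is exact, 0x80000002 falls through to the catch-all bucket even though it differs from 0x80000000 by only two.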
Ch298 framing — investigation of the new wait loop
Hot_pc = 0x00112554 with count = 26/256. This is NOT 0x0011242C (Ch293's hot_pc), so it's a different wait loop. Ch298 should mirror Ch294's autopsy approach:
- Disassemble 0x00112540..0x001125A0 (~24 instructions around the new hot_pc).
- Classify reads/writes in that PC window from the trace file.
- Identify the branch condition.
- Pick one of Codex's verdicts:
  - qbert_waiting_on_dmac_handler
  - qbert_waiting_on_vblank
  - qbert_waiting_on_thread_scheduler
  - qbert_waiting_on_memory_flag (likely, by analogy with Ch294)
  - qbert_wait_loop_unknown
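The first three autopsy steps can be prototyped offline before touching the runner. The Python sketch below enumerates the disassembly window, tallies memory accesses inside it, and decodes a MIPS conditional branch to find backward (loop-closing) branches. The trace record layout `(pc, kind, addr)` is an assumption, since the real trace format is not described here, and the sample instruction word and records are synthetic, not taken from qbert.elf.

```python
from collections import Counter

HOT_PC = 0x0011_2554
WIN_LO, WIN_HI = 0x0011_2540, 0x0011_25A0   # window from the plan above

# Primary opcodes of the basic MIPS conditional branches.
BRANCH_OPCODES = {0x04: "beq", 0x05: "bne", 0x06: "blez", 0x07: "bgtz"}


def window_pcs(lo, hi):
    """All 4-byte-aligned PCs in [lo, hi)."""
    return list(range(lo, hi, 4))


def classify_accesses(records, lo, hi):
    """Count (kind, addr) pairs for accesses issued from PCs in [lo, hi).

    Assumed record layout: (pc, kind, addr) with kind "R" or "W".
    """
    hits = Counter()
    for pc, kind, addr in records:
        if lo <= pc < hi:
            hits[(kind, addr)] += 1
    return hits


def branch_info(pc, word):
    """Return (mnemonic, target) for a basic conditional branch, else None."""
    op = (word >> 26) & 0x3F
    if op not in BRANCH_OPCODES:
        return None
    imm = word & 0xFFFF
    if imm & 0x8000:                  # sign-extend the 16-bit offset
        imm -= 0x1_0000
    target = (pc + 4 + (imm << 2)) & 0xFFFF_FFFF
    return BRANCH_OPCODES[op], target


pcs = window_pcs(WIN_LO, WIN_HI)
print(len(pcs), HOT_PC in pcs)

# Synthetic example: "bne $v0, $zero, -3" placed at the hot PC.
mnemonic, target = branch_info(HOT_PC, 0x1440_FFFD)
print(mnemonic, hex(target), "backward" if target <= HOT_PC else "forward")

# Synthetic trace records; only the first two fall inside the window.
sample = [(0x0011_2550, "R", 0x0020_0000),
          (0x0011_2550, "R", 0x0020_0000),
          (0x0010_0000, "W", 0x0020_0004)]
print(classify_accesses(sample, WIN_LO, WIN_HI))
```

A backward branch whose condition depends on a register loaded from the most frequently read address in the window is the natural first candidate for the polled flag.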
The richer-observer pattern's tuple machinery is reusable for
ANY future investigation chapter — it can be retargeted at
whatever syscall or function the new loop polls.
Pattern review (27 chapters)
| Phase | Chapters |
|---|---|
| Opcode-blocker | Ch271..Ch286 |
| MMIO stubs | Ch287..Ch288 |
| Syscall HLE narrow | Ch273/285/289/290/291/293/296/297 |
| Narrow NOP-class | Ch286/292 |
| Inflection #1 | Ch293 — first wait loop surfaces |
| Investigation #1 | Ch294 — bit-17 polled flag identified |
| Experimental unblock | Ch295 — $a0-aware HLE |
| Inflection #2 | Ch297 — second wait loop surfaces |
| (Investigation #2) | Ch298 — autopsy required |
The Ch293→Ch294→Ch295 cycle (inflection → autopsy → unblock) took 3 chapters and resulted in a 60× retire-count jump. Ch297 has surfaced an inflection of comparable magnitude (+1.44M retires). Ch298 should be the analogous autopsy.
Files changed
- rtl/ee/ee_core_stub.sv: 1 new HLE case (~25 LOC with comment).
- sim/tb/integration/tb_ee_core_syscall_hle.sv: 4 new slots + 1 latch + 1 assertion + 1 display field.
- sim/tb/integration/tb_ee_core_elf_runner.sv: 6 new state signals + observer block with distinct-tuple table + SUMMARY display lines.
No new TB, no new Makefile target; regression count unchanged at 176/176.
Regression
176/176 PASS (unchanged from Ch296; no new TB).