retroDE_ps2/docs/ch297_closeout.md
thejayman77 ec82764bef Initial commit: retroDE_ps2 — first-of-its-kind PS2 GS FPGA core (DE25-Nano / Agilex 5)
RTL (GS rasterizer, EE core stub, platform bridge, LPDDR4B path), sim regression
(272 TBs), docs, and tooling. Copyrighted PS2 content (BIOS, game code, GS dumps,
and all dump-derived textures/traces) is excluded via .gitignore and stays local.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-29 20:10:50 -04:00


Ch297 closeout — syscall 0x77 HLE; richer observer pays off; another wait loop surfaces

Status: Closed. Verdict from re-running qbert.elf: elf_timeout_with_hot_pc (watchdog after 50000000 ns, 1469235 retires, hot_pc=0x00112554 count=26/256) — qbert advanced 28,101 → 1,469,235 retires (+1,441,134), then hit another steady-state wait loop at a NEW hot_pc.

This is the second time the runner has surfaced elf_timeout_with_hot_pc on qbert (the first was Ch293). The Ch293→Ch294 pattern is repeating: a mechanical syscall-HLE chapter unlocks a big advance, then a new wait loop surfaces and needs investigation.

What landed

Dispatcher case — rtl/ee/ee_core_stub.sv

8th narrow $v0=0 case in the Ch273 dispatcher:

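32'h0000_0077: begin
    regfile[2]   <= 32'd0;          // $v0 = 0 in the 32-bit register file
    gpr128[2]    <= 128'd0;         // keep the 128-bit GPR copy of $v0 in sync
    pc           <= pc + 32'd4;     // resume at the instruction after SYSCALL
    retire_pulse <= 1'b1;           // count the SYSCALL as a retired instruction
    state        <= S_IFETCH_REQ;   // return to instruction fetch
end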

TB extension — tb_ee_core_syscall_hle.sv

Standard 4-slot subcase. The TB now covers nine known syscall numbers plus the unknown-halt path. All assertions pass.

Runner observer — RICHER than prior observers

Per Codex's framing, the 0x77 observer captures more than just "first call" — it tracks up to 4 distinct ($a0,$a1,$a2,$a3) tuples with per-tuple count. Implementation:

logic         syscall_0x77_tuple_valid [0:3];
logic [31:0]  syscall_0x77_tuple_a0    [0:3];
... (a1, a2, a3)
int           syscall_0x77_tuple_count [0:3];
int           syscall_0x77_distinct_tuples;

On every syscall 0x77 retire, the observer:

  1. Bumps total count.
  2. Snapshots first/last args.
  3. Looks up the current ($a0,$a1,$a2,$a3) tuple in the table. If found, increments its count. If not found and a slot is free, records the new tuple.

This means: at end-of-sim, the SUMMARY block shows whether qbert made the same call repeatedly (count > 1 with distinct_tuples = 1) or iterated over a table (count > 1 with distinct_tuples > 1, with per-tuple counts visible).
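The three observer steps can be mirrored in a small host-side model, which is handy for checking expected SUMMARY output before a long sim run. This is a behavioral sketch in Python, not the testbench code: the class and method names are made up for illustration, and the choice to silently drop a fifth distinct tuple when the table is full is an assumption, since the text only describes the free-slot case.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Args = Tuple[int, int, int, int]
MAX_TUPLES = 4  # table depth stated in the text


@dataclass
class Syscall77Observer:
    """Hypothetical Python model of the runner's 0x77 observer."""
    total: int = 0
    first_args: Optional[Args] = None
    last_args: Optional[Args] = None
    # dicts keep insertion order, so key position doubles as the slot index
    table: Dict[Args, int] = field(default_factory=dict)

    def on_retire(self, a0: int, a1: int, a2: int, a3: int) -> None:
        args = (a0, a1, a2, a3)
        self.total += 1                    # step 1: bump total count
        if self.first_args is None:        # step 2: snapshot first/last args
            self.first_args = args
        self.last_args = args
        if args in self.table:             # step 3: look up the tuple
            self.table[args] += 1
        elif len(self.table) < MAX_TUPLES:
            self.table[args] = 1
        # ASSUMPTION: a new tuple arriving with all slots full is not recorded

    def summary(self) -> List[str]:
        lines = [f"syscall_0x77 = count={self.total} "
                 f"distinct_tuples={len(self.table)}"]
        for i, ((a0, a1, a2, a3), n) in enumerate(self.table.items()):
            lines.append(f"  tuple[{i}] = ($a0=0x{a0:08x}, $a1={a1}, "
                         f"$a2={a2}, $a3={a3}) count={n}")
        return lines


obs = Syscall77Observer()
obs.on_retire(0x001DFD50, 1, 0, 20)
obs.on_retire(0x001DFDB0, 1, 0, 16)
print("\n".join(obs.summary()))
```

A loop that repeats one call shows up as count > 1 with a single table entry, while an iteration over a table shows up as several entries, which is the distinction the SUMMARY block is meant to expose.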

Cost: ~50 LOC. Value: decisive answer to "is qbert calling this syscall in a loop?"

The qbert run's smoking gun

syscall_0x77 = count=2 distinct_tuples=2
  tuple[0] = ($a0=0x001dfd50, $a1=1, $a2=0, $a3=20) count=1
  tuple[1] = ($a0=0x001dfdb0, $a1=1, $a2=0, $a3=16) count=1

Two distinct calls. The arg-pattern is striking:

  • $a0 increments by 0x60 = 96 bytes (0x001dfd50 → 0x001dfdb0).
  • $a3 is a count: 20 then 16.
  • $a1 = 1 and $a2 = 0 constant across calls.
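The three observations above are plain arithmetic on the two captured tuples, so they can be re-derived mechanically. The sketch below is illustrative only; `arg_deltas` is a hypothetical helper name, and the input is just the two tuples from the SUMMARY output.

```python
ARG_NAMES = ("a0", "a1", "a2", "a3")

# The two tuples reported by the 0x77 observer in this run.
calls = [
    (0x001DFD50, 1, 0, 20),
    (0x001DFDB0, 1, 0, 16),
]


def arg_deltas(calls):
    """Per-argument differences between consecutive calls."""
    report = {}
    for idx, name in enumerate(ARG_NAMES):
        column = [c[idx] for c in calls]
        report[name] = [b - a for a, b in zip(column, column[1:])]
    return report


deltas = arg_deltas(calls)
for name, d in deltas.items():
    if all(x == 0 for x in d):
        label = "constant"
    else:
        label = ", ".join(f"{x:+#x}" for x in d)
    print(f"${name}: {label}")
```

With only two samples, a delta of 0x60 on $a0 and -4 on $a3 is suggestive rather than conclusive; the same helper can be rerun if later chapters capture more tuples.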

This shape strongly fits a registration-iteration call:

  • $a0 = base address of registration record (heap-resident buffer at 0x001dfd50, then a second record 96 bytes later).
  • $a1 = 1 = mode flag (constant).
  • $a3 = number of entries in the record (20 first, 16 second).

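Syscall 0x77 is decimal 119. The reading above is inferred from the argument shape alone, and names such as RegisterLibraryEntries are guesses. ps2sdk's EE syscall numbering lists 119 as SifSetDma, which takes a pointer to transfer descriptors and a descriptor count; that would also fit $a0 = pointer and $a1 = 1, with $a2/$a3 unused, so the name should be confirmed against a syscall table before later chapters build on it.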

qbert progression

| Chapter | retire_count | Verdict | Note |
| --- | --- | --- | --- |
| Post-Ch296 (0x79) | 28,101 | elf_first_unhandled_syscall | $v1=0x77 |
| Post-Ch297 (0x77) | 1,469,235 | elf_timeout_with_hot_pc | new wait loop at hot_pc=0x00112554 |

A +1.44M retire jump, roughly 52× the post-Ch296 count and comparable to the 60× jump at the Ch293 inflection. qbert is back in steady-state-loop territory at a different hot_pc, which makes this Ch298 investigation territory.
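For the record, the delta and ratio follow directly from the two retire counts in the progression table. A minimal check, using only those two figures:

```python
# Retire counts taken from the qbert progression rows.
runs = [
    ("Post-Ch296 (0x79)", 28_101),
    ("Post-Ch297 (0x77)", 1_469_235),
]

(_, before), (_, after) = runs
delta = after - before
ratio = after / before

print(f"delta = {delta:+,} retires, ratio = {ratio:.1f}x")
# prints: delta = +1,441,134 retires, ratio = 52.3x
```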

Cross-observation: syscall 0x7A traffic changed too

syscall_0x7A = count=4 (was 3 in Ch295/Ch296)
syscall_0x7A_split = count_a0_4=1 count_a0_0x80000000=1 count_a0_other=2 (was 1)
                     last_a0=0x80000002

qbert called 0x7A four times this run versus three times in Ch295/Ch296. The extra call lands in the "other" bucket: $a0=0x80000002, which is adjacent to 0x80000000 but matches neither special-cased value (0x80000000 or 4).

So syscall 0x7A is being used with more arg shapes as qbert progresses further. The Ch295 $a0-aware fix may not generalize: $a0=0x80000002 takes the else path and returns 0, which may or may not be what qbert expects. Worth keeping in mind for downstream debugging.
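The split counters amount to a three-way classification on $a0. The following Python sketch models that bucketing under stated assumptions: the function names are hypothetical, and the call sequence is a stand-in, because the run only reports the bucket totals and last_a0, not the order or the value of the other "other" call.

```python
def bucket_for(a0):
    """Map a 0x7A $a0 value onto the SUMMARY bucket names."""
    if a0 == 4:
        return "count_a0_4"
    if a0 == 0x80000000:
        return "count_a0_0x80000000"
    return "count_a0_other"


def split_0x7a(a0_values):
    """Return (bucket counts, last $a0 seen) for a sequence of calls."""
    counts = {
        "count_a0_4": 0,
        "count_a0_0x80000000": 0,
        "count_a0_other": 0,
    }
    for a0 in a0_values:
        counts[bucket_for(a0)] += 1
    last_a0 = a0_values[-1] if a0_values else None
    return counts, last_a0


# HYPOTHETICAL sequence: 0xDEADBEEF is a placeholder for the unreported
# "other" call, and the ordering is assumed. Only 0x80000002 being the
# last call is taken from the run output.
observed = [4, 0x80000000, 0xDEADBEEF, 0x80000002]
counts, last_a0 = split_0x7a(observed)
print(counts, hex(last_a0))
```

If more $a0 shapes keep appearing, promoting the "other" bucket to a distinct-value table like the 0x77 one would show which values the else path is actually absorbing.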

Ch298 framing — investigation of the new wait loop

Hot_pc = 0x00112554 with count = 26/256. This is NOT 0x0011242C (Ch293's hot_pc), so it's a different wait loop. Ch298 should mirror Ch294's autopsy approach:

  1. Disassemble 0x00112540..0x001125A0 (~24 instructions around the new hot_pc).
  2. Classify reads/writes in that PC window from the trace file.
  3. Identify the branch condition.
  4. Pick one of Codex's verdicts:
    • qbert_waiting_on_dmac_handler
    • qbert_waiting_on_vblank
    • qbert_waiting_on_thread_scheduler
    • qbert_waiting_on_memory_flag (likely, by analogy with Ch294)
    • qbert_wait_loop_unknown
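Steps 1 and 2 can be prototyped on the host before touching the runner. The sketch below assumes a trace already parsed into (pc, kind, addr) records; that record shape, the function names, and the sample addresses are all hypothetical, since the trace file format is not described here. The window end is treated as exclusive, which gives exactly 24 instruction slots.

```python
from collections import Counter

WINDOW_LO = 0x00112540
WINDOW_HI = 0x001125A0  # treated as an exclusive end in this sketch
HOT_PC = 0x00112554


def window_words(lo, hi):
    """Number of 4-byte instruction slots in [lo, hi)."""
    return (hi - lo) // 4


def classify_accesses(records, lo, hi):
    """Count (kind, addr) pairs for memory accesses issued from [lo, hi)."""
    hits = Counter()
    for pc, kind, addr in records:
        if lo <= pc < hi and kind in ("read", "write"):
            hits[(kind, addr)] += 1
    return hits


# HYPOTHETICAL records: the addresses below are invented for illustration.
sample = [
    (0x00112550, "read", 0x0020_0000),
    (0x00112550, "read", 0x0020_0000),
    (0x00112560, "write", 0x0020_0010),
    (0x00100000, "read", 0x0020_0000),  # outside the window, ignored
]

print(window_words(WINDOW_LO, WINDOW_HI), "instruction slots")
for (kind, addr), n in classify_accesses(sample, WINDOW_LO, WINDOW_HI).most_common():
    print(f"{kind:5s} 0x{addr:08x} x{n}")
```

An address that is read many times inside the window and never written there is the usual signature of a polled flag, which is what would point toward the qbert_waiting_on_memory_flag verdict.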

The richer-observer pattern's tuple machinery is reusable for ANY future investigation chapter — it can be retargeted at whatever syscall or function the new loop polls.

Pattern review (27 chapters)

| Phase | Chapters | Effect |
| --- | --- | --- |
| Opcode-blocker | Ch271..Ch286 | |
| MMIO stubs | Ch287..Ch288 | |
| Syscall HLE narrow | Ch273/285/289/290/291/293/296/297 | |
| Narrow NOP-class | Ch286/292 | |
| Inflection #1 | Ch293 | first wait loop surfaces |
| Investigation #1 | Ch294 | bit-17 polled flag identified |
| Experimental unblock | Ch295 | $a0-aware HLE |
| Inflection #2 | Ch297 | second wait loop surfaces |
| (Investigation #2) | Ch298 | autopsy required |

The Ch293→Ch294→Ch295 cycle (inflection → autopsy → unblock) took 3 chapters and resulted in a 60× retire-count jump. Ch297 has surfaced an inflection of comparable magnitude (+1.44M retires). Ch298 should be the analogous autopsy.

Files changed

  • rtl/ee/ee_core_stub.sv — 1 new HLE case (~25 LOC with comment).
  • sim/tb/integration/tb_ee_core_syscall_hle.sv — 4 new slots + 1 latch + 1 assertion + 1 display field.
  • sim/tb/integration/tb_ee_core_elf_runner.sv — 6 new state signals + observer block with distinct-tuple table + SUMMARY display lines.

No new TB, no new Makefile target; regression count unchanged at 176/176.

Regression

176/176 PASS (unchanged from Ch296; no new TB).