retroDE_ps2/docs/ch297_closeout.md
thejayman77 ec82764bef Initial commit: retroDE_ps2 — first-of-its-kind PS2 GS FPGA core (DE25-Nano / Agilex 5)
RTL (GS rasterizer, EE core stub, platform bridge, LPDDR4B path), sim regression
(272 TBs), docs, and tooling. Copyrighted PS2 content (BIOS, game code, GS dumps,
and all dump-derived textures/traces) is excluded via .gitignore and stays local.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-29 20:10:50 -04:00

# Ch297 closeout — syscall 0x77 HLE; richer observer pays off; another wait loop surfaces
**Status:** Closed. **Verdict from re-running qbert.elf:**
`elf_timeout_with_hot_pc (watchdog after 50000000 ns, 1469235
retires, hot_pc=0x00112554 count=26/256)` — qbert advanced
**28,101 → 1,469,235 retires (+1,441,134)**, then hit another
steady-state wait loop at a NEW hot_pc.
This is the second time the runner has surfaced `elf_timeout_with_hot_pc`
on qbert (the first was Ch293). The Ch293→Ch294 pattern is repeating:
a mechanical syscall HLE chapter unlocks a big advance, then a new
wait loop surfaces that requires investigation.
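
The verdict line is regular enough to parse mechanically when comparing runs. The following is a minimal Python sketch, assuming the runner always prints the fields in the order seen in the one line quoted above. `parse_verdict` and its field names are hypothetical helpers, not part of the repo's tooling.

```python
import re

# Pattern derived from the single verdict line quoted in this closeout;
# other verdict kinds may print different fields (assumption).
VERDICT_RE = re.compile(
    r"(?P<verdict>\w+)\s*\(watchdog after (?P<ns>\d+) ns,\s*"
    r"(?P<retires>\d+)\s+retires,\s*"
    r"hot_pc=(?P<hot_pc>0x[0-9a-fA-F]+)\s+count=(?P<hits>\d+)/(?P<window>\d+)\)"
)

def parse_verdict(text):
    """Return the verdict fields as a dict, or None if the text does not match."""
    # The line may be wrapped, so collapse every whitespace run first.
    flat = " ".join(text.split())
    m = VERDICT_RE.search(flat)
    if m is None:
        return None
    return {
        "verdict": m["verdict"],
        "watchdog_ns": int(m["ns"]),
        "retires": int(m["retires"]),
        "hot_pc": int(m["hot_pc"], 16),
        "hot_hits": int(m["hits"]),
        "hot_window": int(m["window"]),
    }

line = """elf_timeout_with_hot_pc (watchdog after 50000000 ns, 1469235
retires, hot_pc=0x00112554 count=26/256)"""
v = parse_verdict(line)
prev_retires = 28_101  # post-Ch296 retire count, from this document
delta = v["retires"] - prev_retires
ratio = round(v["retires"] / prev_retires, 1)
print(v["verdict"], hex(v["hot_pc"]), delta, ratio)
```

Collapsing whitespace before matching keeps the sketch tolerant of wrapped log lines; a line that does not fit the pattern returns `None` rather than raising.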
## What landed
### Dispatcher case — `rtl/ee/ee_core_stub.sv`
8th narrow $v0=0 case in the Ch273 dispatcher:
```sv
32'h0000_0077: begin
    regfile[2]   <= 32'd0;        // $v0 = 0
    gpr128[2]    <= 128'd0;       // keep the 128-bit copy of $v0 in sync
    pc           <= pc + 32'd4;   // step past the SYSCALL instruction
    retire_pulse <= 1'b1;
    state        <= S_IFETCH_REQ;
end
```
### TB extension — `tb_ee_core_syscall_hle.sv`
Standard 4-slot subcase. The TB now covers nine known syscall
numbers plus the unknown-halt path. All assertions pass.
### Runner observer — RICHER than prior observers
Per Codex's framing, the 0x77 observer captures more than just
"first call" — it tracks up to **4 distinct ($a0,$a1,$a2,$a3)
tuples** with per-tuple count. Implementation:
```sv
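logic        syscall_0x77_tuple_valid [0:3];
logic [31:0] syscall_0x77_tuple_a0    [0:3];
logic [31:0] syscall_0x77_tuple_a1    [0:3];
logic [31:0] syscall_0x77_tuple_a2    [0:3];
logic [31:0] syscall_0x77_tuple_a3    [0:3];
int          syscall_0x77_tuple_count [0:3];  // per-tuple hit count
int          syscall_0x77_distinct_tuples;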
```
On every syscall 0x77 retire, the observer:
1. Bumps total count.
2. Snapshots first/last args.
3. Looks up the current ($a0,$a1,$a2,$a3) tuple in the table.
If found, increments its count. If not found and a slot is
free, records the new tuple.
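
The three steps above can be modelled in software to sanity-check the expected SUMMARY output. This is a hedged Python sketch of the table logic, not the RTL observer: `TupleObserver` and its attribute names are hypothetical, and dropping a new tuple when all four slots are full is an assumption, since the text does not say what happens in that case.

```python
from dataclasses import dataclass, field

MAX_TUPLES = 4  # table depth stated in this document

@dataclass
class TupleObserver:
    """Software model of the distinct-tuple table; not the RTL itself."""
    total: int = 0
    first_args: tuple = None
    last_args: tuple = None
    tuples: list = field(default_factory=list)  # distinct (a0, a1, a2, a3)
    counts: list = field(default_factory=list)  # per-tuple hit counts

    def on_retire(self, a0, a1, a2, a3):
        args = (a0, a1, a2, a3)
        self.total += 1                  # step 1: bump total count
        if self.first_args is None:      # step 2: snapshot first/last args
            self.first_args = args
        self.last_args = args
        if args in self.tuples:          # step 3: look the tuple up
            self.counts[self.tuples.index(args)] += 1
        elif len(self.tuples) < MAX_TUPLES:
            self.tuples.append(args)
            self.counts.append(1)
        # Assumption: a new tuple arriving with the table full is dropped.

    @property
    def distinct_tuples(self):
        return len(self.tuples)

obs = TupleObserver()
obs.on_retire(0x001DFD50, 1, 0, 20)
obs.on_retire(0x001DFDB0, 1, 0, 16)
print(f"count={obs.total} distinct_tuples={obs.distinct_tuples}")
```

Fed the two calls from the qbert run, the model reports two calls and two distinct tuples, each with a per-tuple count of 1.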
This means: at end-of-sim, the SUMMARY block shows whether qbert
made the same call repeatedly (count > 1 with `distinct_tuples =
1`) or iterated over a table (count > 1 with `distinct_tuples > 1`,
with per-tuple counts visible).
**Cost:** ~50 LOC. **Value:** decisive answer to "is qbert calling
this syscall in a loop?"
## The qbert run's smoking gun
```
syscall_0x77 = count=2 distinct_tuples=2
tuple[0] = ($a0=0x001dfd50, $a1=1, $a2=0, $a3=20) count=1
tuple[1] = ($a0=0x001dfdb0, $a1=1, $a2=0, $a3=16) count=1
```
Two distinct calls. The arg-pattern is striking:
- `$a0` increments by **0x60 = 96 bytes** (0x001dfd50 → 0x001dfdb0).
- `$a3` looks like a count: 20 then 16.
- `$a1 = 1` and `$a2 = 0` constant across calls.
This shape strongly fits a **registration-iteration** call:
- `$a0` = base address of registration record (heap-resident
buffer at 0x001dfd50, then a second record 96 bytes later).
- `$a1 = 1` = mode flag (constant).
- `$a3` = number of entries in the record (20 first, 16 second).
PS2 references for syscall 0x77 (= 119) suggest plausible names
such as `RegisterLibraryEntries`, which would be consistent with
this 4-tuple shape. The name is unconfirmed and should be checked
against a published EE syscall table.
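
The stride and the constant registers can be read straight off the two logged tuples. This is a small Python check; it uses only the values printed in the SUMMARY block, and the variable names are illustrative.

```python
# The two calls observed in the qbert run, as (a0, a1, a2, a3).
calls = [
    (0x001DFD50, 1, 0, 20),
    (0x001DFDB0, 1, 0, 16),
]

# Distance between the two $a0 pointers.
stride = calls[1][0] - calls[0][0]

# Registers whose value is identical in every observed call.
constant_regs = [
    name
    for idx, name in enumerate(("a0", "a1", "a2", "a3"))
    if len({c[idx] for c in calls}) == 1
]

print(f"a0 stride = {stride:#x} ({stride} bytes)")
print("constant across calls:", constant_regs)
```

With only two samples this confirms the arithmetic, not the interpretation; the record-layout reading remains a hypothesis.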
## qbert progression
| Chapter | retire_count | Verdict | Note |
|---------|--------------|---------|------|
| Post-Ch296 (0x79) | 28,101 | elf_first_unhandled_syscall | $v1=0x77 |
| **Post-Ch297 (0x77)** | **1,469,235** | **elf_timeout_with_hot_pc** | **new wait loop at hot_pc=0x00112554** |
**+1.44M retire jump (about 52×).** That is comparable to the 60× jump associated with the Ch293 inflection.
qbert is back in steady-state-loop territory at a different
hot_pc. This is **Ch298 investigation territory.**
## Cross-observation: syscall 0x7A traffic changed too
```
syscall_0x7A = count=4 (was 3 in Ch295/Ch296)
syscall_0x7A_split = count_a0_4=1 count_a0_0x80000000=1 count_a0_other=2 (was 1)
last_a0=0x80000002
```
qbert called 0x7A four times this run vs three times in
Ch295/296. The extra call is in the "other" bucket
($a0=0x80000002 — close to but not equal to 0x80000000 or 4).
So syscall 0x7A is being used with more arg shapes as qbert
progresses further. The Ch295 $a0-aware fix may *not* generalize:
$a0=0x80000002 takes the `else` path and returns 0, and it is not
yet known whether that is what qbert expects. Worth keeping in mind
for downstream debugging.
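
The three-way split in the SUMMARY line amounts to a simple classification of `$a0`. This is a Python sketch of that bucketing, assuming the bucket names printed above map to exact-match tests on 4 and 0x80000000; `bucket_0x7a` is a hypothetical helper, and the sample list is illustrative rather than the run's actual call order.

```python
from collections import Counter

def bucket_0x7a(a0):
    """Map a syscall 0x7A $a0 value to the bucket names used in the SUMMARY line."""
    a0 &= 0xFFFF_FFFF  # treat the register as an unsigned 32-bit value
    if a0 == 4:
        return "count_a0_4"
    if a0 == 0x8000_0000:
        return "count_a0_0x80000000"
    return "count_a0_other"

# Illustrative values only: one per bucket, not the recorded call sequence.
samples = [4, 0x8000_0000, 0x8000_0002]
split = Counter(bucket_0x7a(a0) for a0 in samples)
print(dict(split))
```

Any value that is near 0x80000000 but not equal to it, such as 0x80000002, lands in the "other" bucket, which is why the new call shows up there.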
## Ch298 framing — investigation of the new wait loop
Hot_pc = 0x00112554 with count = 26/256. **This is NOT 0x0011242C**
(Ch293's hot_pc), so it's a different wait loop. Ch298 should
mirror Ch294's autopsy approach:
1. Disassemble 0x00112540..0x001125A0 (~24 instructions around
the new hot_pc).
2. Classify reads/writes in that PC window from the trace file.
3. Identify the branch condition.
4. Pick one of Codex's verdicts:
- `qbert_waiting_on_dmac_handler`
- `qbert_waiting_on_vblank`
- `qbert_waiting_on_thread_scheduler`
- `qbert_waiting_on_memory_flag` (likely, by analogy with Ch294)
- `qbert_wait_loop_unknown`
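
The disassembly window in step 1 can be checked with a few lines of arithmetic. This is a Python sketch, assuming the end address 0x001125A0 is exclusive (which is what yields 24 instructions); `pc_window` is a hypothetical helper.

```python
INSN_BYTES = 4  # MIPS instructions on the EE are fixed-width 32-bit words

def pc_window(start, end):
    """Word-aligned PCs in [start, end); the end address is treated as exclusive."""
    return list(range(start, end, INSN_BYTES))

window = pc_window(0x00112540, 0x001125A0)
hot_pc = 0x00112554          # new hot_pc from this run
old_hot_pc = 0x0011242C      # Ch293's hot_pc

print(len(window), "instructions")
print("hot_pc index in window:", window.index(hot_pc))
print("Ch293 hot_pc inside window:", old_hot_pc in window)
```

The same list can serve as a PC filter when classifying trace reads and writes in step 2.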
The richer-observer pattern's `tuple` machinery is reusable for
ANY future investigation chapter — it can be retargeted at
whatever syscall or function the new loop polls.
## Pattern review (27 chapters)
| Phase | Chapters and outcome |
|-------|--------|
| Opcode-blocker | Ch271..Ch286 |
| MMIO stubs | Ch287..Ch288 |
| Syscall HLE narrow | Ch273/285/289/290/291/293/296/297 |
| Narrow NOP-class | Ch286/292 |
| **Inflection #1** | **Ch293 — first wait loop surfaces** |
| **Investigation #1** | **Ch294 — bit-17 polled flag identified** |
| **Experimental unblock** | **Ch295 — $a0-aware HLE** |
| **Inflection #2** | **Ch297 — second wait loop surfaces** |
| **(Investigation #2)** | **Ch298 — autopsy required** |
The Ch293→Ch294→Ch295 cycle (inflection → autopsy → unblock) took
3 chapters and resulted in a 60× retire-count jump. Ch297 has
surfaced an inflection of comparable magnitude (+1.44M retires).
Ch298 should be the analogous autopsy.
## Files changed
- `rtl/ee/ee_core_stub.sv` — 1 new HLE case (~25 LOC with comment).
- `sim/tb/integration/tb_ee_core_syscall_hle.sv` — 4 new slots +
1 latch + 1 assertion + 1 display field.
- `sim/tb/integration/tb_ee_core_elf_runner.sv` — 6 new state
signals + observer block with distinct-tuple table + SUMMARY
display lines.
No new TB, no new Makefile target; regression count unchanged at
**176/176**.
## Regression
**176/176 PASS** (unchanged from Ch296; no new TB).