# Ch297 closeout: syscall 0x77 HLE; richer observer pays off; another wait loop surfaces

**Status:** Closed. **Verdict from re-running qbert.elf:**
`elf_timeout_with_hot_pc (watchdog after 50000000 ns, 1469235 retires, hot_pc=0x00112554 count=26/256)`.
qbert advanced **28,101 → 1,469,235 retires (+1,441,134)**, then hit another
steady-state wait loop at a new hot_pc.

This is the second time the runner has surfaced `elf_timeout_with_hot_pc`
on qbert (the first was Ch293). The Ch293→Ch294 pattern is repeating: a
mechanical syscall-HLE chapter unlocks a big advance, then a new
wait loop surfaces that requires investigation.

## What landed

### Dispatcher case: `rtl/ee/ee_core_stub.sv`

The 8th narrow `$v0 = 0` case in the Ch273 dispatcher:

```sv
32'h0000_0077: begin
    regfile[2]   <= 32'd0;        // $v0 = 0 (32-bit register file)
    gpr128[2]    <= 128'd0;       // $v0 = 0 (128-bit register file)
    pc           <= pc + 32'd4;   // step past the syscall instruction
    retire_pulse <= 1'b1;         // report the syscall as retired
    state        <= S_IFETCH_REQ; // resume instruction fetch
end
```

### TB extension: `tb_ee_core_syscall_hle.sv`

Standard 4-slot subcase. The TB now covers nine known syscall
numbers plus the unknown-halt path. All assertions pass.

### Runner observer: richer than prior observers

Per Codex's framing, the 0x77 observer captures more than just
the first call: it tracks up to **4 distinct ($a0,$a1,$a2,$a3)
tuples**, each with its own count. Implementation:

```sv
logic        syscall_0x77_tuple_valid [0:3];
logic [31:0] syscall_0x77_tuple_a0    [0:3];
// ... likewise for a1, a2, a3
int          syscall_0x77_tuple_count [0:3];
int          syscall_0x77_distinct_tuples;
```

On every syscall 0x77 retire, the observer:
1. Bumps the total count.
2. Snapshots the first and last args.
3. Looks up the current ($a0,$a1,$a2,$a3) tuple in the table.
   If found, it increments that tuple's count. If not found and a slot is
   free, it records the new tuple.

At end-of-sim, the SUMMARY block therefore shows whether qbert
made the same call repeatedly (count > 1 with `distinct_tuples = 1`)
or iterated over a table (count > 1 with `distinct_tuples > 1`,
with per-tuple counts visible).

**Cost:** ~50 LOC. **Value:** a decisive answer to "is qbert calling
this syscall in a loop?"

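
The three observer steps can be modelled on the host to sanity-check the expected SUMMARY output. The sketch below is a Python behavioural model, not the runner's SystemVerilog. The class and field names are hypothetical, and what happens when a fifth distinct tuple arrives is an assumption, since the text only says a new tuple is recorded when a slot is free.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Args = Tuple[int, int, int, int]
MAX_TUPLES = 4  # table capacity given in the text


@dataclass
class Syscall77Observer:
    """Host-side model of the distinct-tuple observer (all names hypothetical)."""
    total: int = 0
    first_args: Optional[Args] = None
    last_args: Optional[Args] = None
    table: List[list] = field(default_factory=list)  # each entry: [args, count]

    def on_retire(self, a0: int, a1: int, a2: int, a3: int) -> None:
        args = (a0, a1, a2, a3)
        self.total += 1                    # step 1: bump total count
        if self.first_args is None:        # step 2: snapshot first/last args
            self.first_args = args
        self.last_args = args
        for entry in self.table:           # step 3: look up the tuple
            if entry[0] == args:
                entry[1] += 1
                return
        if len(self.table) < MAX_TUPLES:   # free slot: record the new tuple
            self.table.append([args, 1])
        # Assumption: when the table is full, a new tuple only shows up in `total`.

    @property
    def distinct_tuples(self) -> int:
        return len(self.table)


# Replay the two calls reported for the qbert run.
obs = Syscall77Observer()
obs.on_retire(0x001DFD50, 1, 0, 20)
obs.on_retire(0x001DFDB0, 1, 0, 16)
print(f"count={obs.total} distinct_tuples={obs.distinct_tuples}")
```

Keeping `total` separate from the per-tuple counts means a full table still shows up in the summary: the per-tuple counts would then sum to less than `total`.
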
## The qbert run's smoking gun

```
syscall_0x77 = count=2 distinct_tuples=2
tuple[0] = ($a0=0x001dfd50, $a1=1, $a2=0, $a3=20) count=1
tuple[1] = ($a0=0x001dfdb0, $a1=1, $a2=0, $a3=16) count=1
```

Two distinct calls. The arg pattern is striking:
- `$a0` advances by **0x60 = 96 bytes** (0x001dfd50 → 0x001dfdb0).
- `$a3` looks like a count: 20, then 16.
- `$a1 = 1` and `$a2 = 0` are constant across calls.

This shape strongly fits a **registration-iteration** call:
- `$a0` = base address of a registration record (a heap-resident
  buffer at 0x001dfd50, then a second record 96 bytes later).
- `$a1 = 1` = mode flag (constant).
- `$a3` = number of entries in the record (20 first, 16 second).

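
This reading rests on the argument shape alone. `RegisterLibraryEntries`
was considered as a plausible name for syscall 0x77 (= 119), but no
syscall table was checked in this chapter. The ps2sdk EE kernel headers
list 0x77 as `SifSetDma`, which takes a descriptor pointer and a
descriptor count. Under that reading `$a0` is the pointer, `$a1 = 1` is
the count, and `$a2`/`$a3` may be leftover register values rather than
arguments. The name should be confirmed before relying on either reading.
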
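
The argument pattern can be checked mechanically instead of by eye. The following Python sketch takes the two tuples from the SUMMARY output and reports, per argument, whether it stayed constant or by how much it changed. The helper name `describe` is hypothetical and not part of the repository tooling.

```python
# The two tuples reported in the SUMMARY block: (a0, a1, a2, a3).
calls = [
    (0x001DFD50, 1, 0, 20),
    (0x001DFDB0, 1, 0, 16),
]

names = ("a0", "a1", "a2", "a3")


def describe(pair):
    """Map each arg name to ("constant", value) or ("delta", second - first)."""
    first, second = pair
    out = {}
    for name, x, y in zip(names, first, second):
        out[name] = ("constant", x) if x == y else ("delta", y - x)
    return out


report = describe(calls)
for name in names:
    kind, value = report[name]
    print(f"${name}: {kind} {value}")
```

With only two samples this confirms the 0x60 stride on `$a0` and the constant `$a1`/`$a2`, but it cannot say whether `$a3` is a real argument or a stale register value.
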
## qbert progression

| Chapter | retire_count | Verdict | Note |
|---------|--------------|---------|------|
| Post-Ch296 (0x79) | 28,101 | elf_first_unhandled_syscall | $v1=0x77 |
| **Post-Ch297 (0x77)** | **1,469,235** | **elf_timeout_with_hot_pc** | **new wait loop at hot_pc=0x00112554** |

**+1.44M retire jump** (roughly 52×), comparable in scale to the 60× jump
from the Ch293 inflection. qbert is back in steady-state-loop territory at a different
hot_pc. This is **Ch298 investigation territory.**

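
The jump can be recomputed directly from the two `retire_count` values in the table. This is a minimal Python check using only those two numbers:

```python
before = 28_101      # post-Ch296 retire_count
after = 1_469_235    # post-Ch297 retire_count

delta = after - before   # absolute advance in retires
ratio = after / before   # multiplicative advance

print(f"delta = {delta:,} retires")
print(f"ratio = {ratio:.1f}x")
```
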
## Cross-observation: syscall 0x7A traffic changed too

```
syscall_0x7A = count=4 (was 3 in Ch295/Ch296)
syscall_0x7A_split = count_a0_4=1 count_a0_0x80000000=1 count_a0_other=2 (was 1)
last_a0=0x80000002
```

qbert called 0x7A four times this run, versus three times in
Ch295/Ch296. The extra call lands in the "other" bucket
(`$a0 = 0x80000002`, which is neither 4 nor 0x80000000).

So syscall 0x7A is being used with more arg shapes as qbert
progresses further. The Ch295 `$a0`-aware fix does not cover this
value: `$a0 = 0x80000002` takes the `else` path and returns 0,
which may or may not be what qbert expects. Worth keeping in mind
for downstream debugging.

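
The bucket split in the SUMMARY line amounts to a three-way classification of `$a0`. The Python sketch below mirrors it under the assumption that the buckets are exactly the three named in the log; the function name `a0_bucket` is hypothetical.

```python
def a0_bucket(a0):
    """Map a syscall 0x7A $a0 value to the bucket names used in the SUMMARY line."""
    a0 &= 0xFFFFFFFF  # compare as an unsigned 32-bit register value
    if a0 == 0x4:
        return "count_a0_4"
    if a0 == 0x80000000:
        return "count_a0_0x80000000"
    return "count_a0_other"


# The value reported as last_a0 in this run.
print(a0_bucket(0x80000002))
```

If later chapters add a dedicated case for 0x80000002, it needs its own bucket so that "other" keeps flagging unhandled values.
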
## Ch298 framing: investigation of the new wait loop

hot_pc = 0x00112554 with count = 26/256. **This is not 0x0011242C**
(Ch293's hot_pc), so it is a different wait loop. Ch298 should
mirror Ch294's autopsy approach:

1. Disassemble 0x00112540..0x001125A0 (~24 instructions around
   the new hot_pc).
2. Classify reads/writes in that PC window from the trace file.
3. Identify the branch condition.
4. Pick one of Codex's verdicts:
   - `qbert_waiting_on_dmac_handler`
   - `qbert_waiting_on_vblank`
   - `qbert_waiting_on_thread_scheduler`
   - `qbert_waiting_on_memory_flag` (likely, by analogy with Ch294)
   - `qbert_wait_loop_unknown`

The `tuple` machinery from the richer observer is reusable in
any future investigation chapter: it can be retargeted at
whatever syscall or function the new loop polls.

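
Steps 1 to 3 of the autopsy can be prototyped on the host before touching the trace file. The Python sketch below enumerates the instruction slots in the proposed window and sorts raw instruction words into loads, stores, and branches by their primary opcode field. It is a rough first pass, not a disassembler. It assumes the window end is exclusive, and its opcode table covers only a handful of standard MIPS encodings, omitting the EE-specific and branch-likely forms. The two sample words are hand-encoded for illustration and are not taken from qbert.

```python
WINDOW_START = 0x00112540
WINDOW_END = 0x001125A0   # treated as exclusive in this sketch
HOT_PC = 0x00112554

# Primary opcode field (bits 31..26) for a few standard MIPS instructions.
LOADS = {0x20: "lb", 0x21: "lh", 0x23: "lw", 0x24: "lbu", 0x25: "lhu", 0x37: "ld"}
STORES = {0x28: "sb", 0x29: "sh", 0x2B: "sw", 0x3F: "sd"}
BRANCHES = {0x04: "beq", 0x05: "bne", 0x06: "blez", 0x07: "bgtz"}


def window_pcs(start=WINDOW_START, end=WINDOW_END):
    """PCs of the 4-byte instruction slots in [start, end)."""
    return list(range(start, end, 4))


def classify(word):
    """Return (kind, mnemonic) for one 32-bit instruction word."""
    opcode = (word >> 26) & 0x3F
    for kind, table in (("load", LOADS), ("store", STORES), ("branch", BRANCHES)):
        if opcode in table:
            return kind, table[opcode]
    return "other", None


pcs = window_pcs()
print(len(pcs), "slots; hot_pc is slot", pcs.index(HOT_PC))
print(classify(0x8C820000))  # hand-encoded: lw $v0, 0($a0)
print(classify(0x1440FFFD))  # hand-encoded: bne $v0, $zero, backward target
```

A load followed closely by a backward branch on the loaded register is the shape to look for when deciding between the `qbert_waiting_on_memory_flag` verdict and the others.
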
## Pattern review (27 chapters)

| Phase | Chapters and outcome |
|-------|----------------------|
| Opcode-blocker | Ch271..Ch286 |
| MMIO stubs | Ch287..Ch288 |
| Syscall HLE narrow | Ch273/285/289/290/291/293/296/297 |
| Narrow NOP-class | Ch286/292 |
| **Inflection #1** | **Ch293: first wait loop surfaces** |
| **Investigation #1** | **Ch294: bit-17 polled flag identified** |
| **Experimental unblock** | **Ch295: $a0-aware HLE** |
| **Inflection #2** | **Ch297: second wait loop surfaces** |
| **(Investigation #2)** | **Ch298: autopsy required** |

The Ch293→Ch294→Ch295 cycle (inflection → autopsy → unblock) took
3 chapters and resulted in a 60× retire-count jump. Ch297 has
surfaced an inflection of comparable magnitude (+1.44M retires).
Ch298 should be the analogous autopsy.

## Files changed

- `rtl/ee/ee_core_stub.sv`: 1 new HLE case (~25 LOC with comment).
- `sim/tb/integration/tb_ee_core_syscall_hle.sv`: 4 new slots +
  1 latch + 1 assertion + 1 display field.
- `sim/tb/integration/tb_ee_core_elf_runner.sv`: 6 new state
  signals + observer block with distinct-tuple table + SUMMARY
  display lines.

No new TB, no new Makefile target; regression count unchanged at
**176/176**.

## Regression

**176/176 PASS** (unchanged from Ch296; no new TB).