# Ch299 closeout: Strategy B-lite, narrow library-ready gate poke; wait loop collapses

**Status:** Closed. Codex's "Strategy B-lite" (a TB-side poke
triggered by a narrow syscall 0x77 match) worked on the first try.

**Verdict from re-running qbert.elf:**
`elf_first_unsupported_opcode (pc=0x00110BB4 instr=0x70081EE9)`.
qbert exited the Ch298 wait loop on iteration 1 and advanced into
new code, hitting an unimplemented MMI3 sub-op.

## What landed

The TB-side gate-poke pattern: `tb_ee_core_elf_runner` now observes
each syscall 0x77 retirement and, when the args match the qbert-specific
narrow guard, writes 1 to the polled memory location.

### Implementation: `sim/tb/integration/tb_ee_core_elf_runner.sv`

Per Codex's framing ("if direct memory write from syscall FSM is
awkward, then a TB-side poke is acceptable, but trigger it on
observing syscall 0x77, not on an arbitrary cycle"):

```sv
// Byte address that qbert polls in the Ch298 wait loop.
localparam logic [31:0] LIBRARY_READY_GATE_ADDR  = 32'h0013_29C0;
// Index of that word in useg_shadow_mem (GATE_ADDR >> 2).
localparam logic [19:0] LIBRARY_READY_SHADOW_IDX = 20'h4_CA70;
// Non-zero value that satisfies the wait condition.
localparam logic [31:0] LIBRARY_READY_GATE_VALUE = 32'h0000_0001;
```

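The relationship between the gate address and the shadow index can be checked with a few lines of arithmetic. This is a minimal sketch that assumes `useg_shadow_mem` is an array of 32-bit words indexed by byte address divided by 4, with no base offset. The two constants are consistent with that layout, but the source does not state it. `shadow_word_index` is a hypothetical helper name, not something from the testbench.

```python
# Constants copied from the testbench localparams.
LIBRARY_READY_GATE_ADDR = 0x0013_29C0
LIBRARY_READY_SHADOW_IDX = 0x4_CA70


def shadow_word_index(byte_addr: int, word_bytes: int = 4) -> int:
    """Hypothetical helper: map a byte address to a word-array index.

    Assumes a word-addressed shadow array with no base offset.
    """
    if byte_addr % word_bytes != 0:
        raise ValueError(f"address {byte_addr:#010x} is not word-aligned")
    return byte_addr // word_bytes


# The hardcoded index matches the address under the assumed layout.
assert shadow_word_index(LIBRARY_READY_GATE_ADDR) == LIBRARY_READY_SHADOW_IDX
print(hex(shadow_word_index(LIBRARY_READY_GATE_ADDR)))
```

If the shadow array ever gains a base offset or a different word width, the index constant would need to be re-derived rather than edited by hand.
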
Narrow guard:
```sv
// Fire only when $a0 lies in the observed buffer range and $a3 is
// one of the two observed values.
if ((a0 >= 32'h001D_FD50) && (a0 <= 32'h001D_FDB0)
    && ((a3 == 32'h0000_0010) || (a3 == 32'h0000_0014))) begin
  // Open the gate and count the poke for the SUMMARY line.
  u_ee_map.useg_shadow_mem[LIBRARY_READY_SHADOW_IDX] <= LIBRARY_READY_GATE_VALUE;
  library_ready_poke_count <= library_ready_poke_count + 1;
  ...
end
```

The guard covers both arg tuples Ch297 observed:
($a0, $a3) = (0x001DFD50, 0x14) and (0x001DFDB0, 0x10). Note that
$a0 is matched as an inclusive range [0x001DFD50, 0x001DFDB0], not as
two exact values, so any call with $a0 inside that range and $a3 in
{0x10, 0x14} also fires the poke. An RTL-side
direct write from the syscall FSM was rejected as too invasive
(it would require a new state and combinational map-driver changes).
The TB-side poke is Codex's authorized fallback.

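The guard condition can be restated as a small software predicate, which makes its boundaries easy to probe. The sketch below mirrors the comparison in the SystemVerilog snippet. The name `narrow_guard` and the probe values outside the two Ch297 tuples are illustrative assumptions, not values seen in any run.

```python
# Bounds and allowed values copied from the SystemVerilog guard.
A0_LO, A0_HI = 0x001D_FD50, 0x001D_FDB0
A3_ALLOWED = frozenset({0x10, 0x14})


def narrow_guard(a0: int, a3: int) -> bool:
    """Return True when a syscall 0x77 arg pair would fire the poke."""
    return A0_LO <= a0 <= A0_HI and a3 in A3_ALLOWED


# The two tuples observed in Ch297 both match.
assert narrow_guard(0x001DFD50, 0x14)
assert narrow_guard(0x001DFDB0, 0x10)

# Hypothetical probes: one word outside either end of the range fails.
assert not narrow_guard(0x001DFD4C, 0x14)
assert not narrow_guard(0x001DFDB4, 0x10)

# Hypothetical probe: an unlisted $a3 fails even with a matching $a0.
assert not narrow_guard(0x001DFD50, 0x18)

# Hypothetical probe: $a0 is a range check, so an interior address passes.
assert narrow_guard(0x001DFD80, 0x10)
```

The last probe shows the cost of using a range: tightening $a0 to the two exact addresses would narrow the guard further if a later ELF makes that necessary.
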
### SUMMARY line: `library_gate`

```
library_gate = addr=0x001329c0 initial=0x00000000 final=0x00000001
poked=1 poke_count=2 first_poke_cycle=100093
source=syscall_0x77_narrow_match
```

- **initial** (sampled at cycle 100): 0, matching Ch298's
  "starts zero" observation.
- **final** (sampled continuously; latches the last value): 1.
  The gate is now non-zero, so the wait condition is satisfied.
- **poke_count = 2**: both guard-matching 0x77 calls (with
  $a3=0x14 and $a3=0x10) fired the poke.
- **first_poke_cycle = 100,093**: just after qbert's second init
  zero-write at cycle 98,576. The order is correct: the zero-write
  lands first, then the poke, so the poked 1 is not clobbered.
- **source = "syscall_0x77_narrow_match"**: the poke fired from
  the narrow-matched syscall observer, not from a blind
  fixed-cycle poke.

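The checks described in the bullets above can be applied mechanically to the SUMMARY text. The following is a sketch of a `key=value` parser for it. `parse_library_gate` and `gate_opened_in_order` are hypothetical helper names, and the parser assumes the field layout shown above rather than any documented log format.

```python
SUMMARY = (
    "library_gate = addr=0x001329c0 initial=0x00000000 final=0x00000001\n"
    "poked=1 poke_count=2 first_poke_cycle=100093\n"
    "source=syscall_0x77_narrow_match\n"
)

# Cycle of qbert's second init zero-write, as quoted in the text above.
ZERO_WRITE_CYCLE = 98_576


def parse_library_gate(text: str) -> dict:
    """Split the SUMMARY text into fields; numeric values become ints."""
    label, sep, rest = text.partition(" = ")
    if not sep or label.strip() != "library_gate":
        raise ValueError("not a library_gate SUMMARY line")
    fields = {}
    for token in rest.split():
        key, _, value = token.partition("=")
        try:
            fields[key] = int(value, 0)  # base 0 accepts 0x... and decimal
        except ValueError:
            fields[key] = value  # non-numeric, e.g. the source tag
    return fields


def gate_opened_in_order(fields: dict, zero_write_cycle: int) -> bool:
    """True when the gate went 0 -> 1 and the poke came after the zero-write."""
    return (
        fields["initial"] == 0
        and fields["final"] == 1
        and fields["poked"] == 1
        and fields["first_poke_cycle"] > zero_write_cycle
    )


fields = parse_library_gate(SUMMARY)
assert gate_opened_in_order(fields, ZERO_WRITE_CYCLE)
print(fields)
```
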
## The narrow guard's third-tuple falsifier

The qbert run after Ch299 shows a **third** distinct 0x77 tuple:

```
syscall_0x77 = count=3 distinct_tuples=3
tuple[0] = ($a0=0x001dfd50, $a3=0x14) count=1 ← matches guard, fires poke
tuple[1] = ($a0=0x001dfdb0, $a3=0x10) count=1 ← matches guard, fires poke
tuple[2] = ($a0=0x001dfd70, $a3=0x40) count=1 ← $a3 outside guard, NO poke
```

The third call was not visible in Ch297's qbert run because
the wait loop blocked qbert from making it. With the Ch299 gate
open, qbert advanced past the wait loop and made this third
0x77 call before hitting the opcode trap.

**The narrow guard correctly excluded the third tuple.** Its $a0
(0x001DFD70) falls inside the guard's address range, so the $a3
check alone rejected it ($a3=0x40 is not in {0x10, 0x14}).
poke_count=2 (not 3) confirms it. This is exactly the
falsifiability surface Codex asked for: if the guard were too
broad, poke_count would equal the 0x77 call count even when
new arg shapes surface.

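The falsifier can be replayed offline by parsing the tuple dump and applying the guard to each entry. This is a sketch only: the regular expression assumes the exact dump layout shown above (with the trailing annotations removed), and `replay` and `guard` are hypothetical names.

```python
import re

TUPLE_DUMP = (
    "syscall_0x77 = count=3 distinct_tuples=3\n"
    "tuple[0] = ($a0=0x001dfd50, $a3=0x14) count=1\n"
    "tuple[1] = ($a0=0x001dfdb0, $a3=0x10) count=1\n"
    "tuple[2] = ($a0=0x001dfd70, $a3=0x40) count=1\n"
)

TUPLE_RE = re.compile(
    r"tuple\[\d+\] = \(\$a0=(0x[0-9a-f]+), \$a3=(0x[0-9a-f]+)\) count=(\d+)"
)


def guard(a0: int, a3: int) -> bool:
    """Software restatement of the testbench's narrow guard."""
    return 0x001DFD50 <= a0 <= 0x001DFDB0 and a3 in (0x10, 0x14)


def replay(dump: str) -> tuple:
    """Return (total 0x77 calls, calls that would fire the poke)."""
    total = pokes = 0
    for a0, a3, count in TUPLE_RE.findall(dump):
        n = int(count)
        total += n
        if guard(int(a0, 16), int(a3, 16)):
            pokes += n
    return total, pokes


total_calls, expected_pokes = replay(TUPLE_DUMP)
# A guard that is too broad would make these two numbers equal.
assert (total_calls, expected_pokes) == (3, 2)
assert expected_pokes < total_calls
```
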
## qbert progression

| Chapter | Blocker | retire_count | Notes |
|---|---|---|---|
| Post-Ch297 (0x77) | wait loop spinning | 1,469,235 (watchdog) | gate never set |
| **Post-Ch299 (gate poke)** | **MMI3 opcode trap (instr 0x70081EE9)** | **28,655** | gate→1 at cycle 100,093; loop exits iter 1 |

The retire count *appears* smaller (28,655 < 1,469,235), but
that is misleading: Ch297's number included the 1.44M spin. The
meaningful signal is the **verdict-shape change** from
`elf_timeout_with_hot_pc` (stuck) → `elf_first_unsupported_opcode`
(concrete next demand). This is the same shape transition as Ch295.

## Ch300 framing: new MMI3 sub-op at sa=0x1B

The new trap is instruction word `0x70081EE9` at PC 0x00110BB4. Decode:
- opcode [31:26] = `011100` = 0x1C (MMI)
- rs [25:21] = `00000` = $0
- rt [20:16] = `01000` = 8 = $t0
- rd [15:11] = `00011` = 3 = $v1
- sa [10:6] = `11011` = 0x1B (= 27)
- funct [5:0] = `101001` = 0x29 = MMI3

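The field breakdown above follows the standard MIPS R-type layout and can be reproduced with shifts and masks. `decode_r_type` is a hypothetical helper written for this note, not part of the project's decoder.

```python
def decode_r_type(instr: int) -> dict:
    """Split a 32-bit MIPS R-type instruction word into its six fields."""
    return {
        "opcode": (instr >> 26) & 0x3F,  # bits 31:26
        "rs": (instr >> 21) & 0x1F,      # bits 25:21
        "rt": (instr >> 16) & 0x1F,      # bits 20:16
        "rd": (instr >> 11) & 0x1F,      # bits 15:11
        "sa": (instr >> 6) & 0x1F,       # bits 10:6
        "funct": instr & 0x3F,           # bits 5:0
    }


fields = decode_r_type(0x70081EE9)
assert fields == {
    "opcode": 0x1C,  # MMI
    "rs": 0,         # $0
    "rt": 8,         # $t0
    "rd": 3,         # $v1
    "sa": 0x1B,
    "funct": 0x29,   # MMI3
}
print({name: hex(value) for name, value in fields.items()})
```
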
So this is **MMI3 / sa = 0x1B**, an unimplemented MMI3 sub-op.
Our current MMI3 coverage:
- sa 0x0E = PCPYUD (Ch283)
- sa 0x13 = PNOR (Ch281)

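sa 0x1B is **new**. Candidates from R5900 references:
- **PEXEH** (Parallel Exchange Even Halfword): sa 0x1A, usually
  tabulated under MMI2 (funct 0x09) rather than MMI3
- **PREVH** (Parallel Reverse Halfword): sa 0x1B, also usually
  tabulated under MMI2
- **PEXCH** (Parallel Exchange Center Halfword): sa 0x1A under MMI3
- **PEXCW** (Parallel Exchange Center Word): sa 0x1B under MMI3 in
  the standard R5900 opcode tables

Because funct = 0x29 selects MMI3, PEXCW looks like the likelier
match for sa 0x1B than PREVH. The mnemonic should be confirmed
against the reference before Ch300 lands.

If sa 0x1B is PREVH: it reverses the order of 16-bit halfwords
within each 64-bit doubleword. If it is PEXCW: it swaps the two
middle 32-bit words of the 128-bit register.

Mechanical Ch300 chapter: extend MMI3 narrow-decode (Ch278
pattern) with a `5'h1B` sub-op constant (`MMI3_PREVH` under the
PREVH reading), add the matching decode flag (`is_prevh`), and add
the writeback arm that implements the permutation across the
128-bit shadow (similar to PCPYUD's full-128 writeback). Roughly
4-5 RTL edits plus a focused TB.
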
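Under the PREVH reading, the writeback arm is a fixed permutation of halfwords, which is easy to model in software before writing the RTL. The sketch below treats the 128-bit register as a Python integer. `prevh` is a hypothetical reference model intended for generating expected values in a focused TB. It only applies if sa 0x1B does turn out to be PREVH.

```python
MASK128 = (1 << 128) - 1


def prevh(rt: int) -> int:
    """Reverse the four 16-bit halfwords inside each 64-bit half of rt."""
    rt &= MASK128
    result = 0
    for dword in range(2):          # lower then upper 64-bit doubleword
        base = dword * 64
        for h in range(4):          # halfword position within the doubleword
            half = (rt >> (base + h * 16)) & 0xFFFF
            result |= half << (base + (3 - h) * 16)
    return result


src = 0x7777_6666_5555_4444_3333_2222_1111_0000
dst = prevh(src)
assert dst == 0x4444_5555_6666_7777_0000_1111_2222_3333
# Reversal is its own inverse, a cheap sanity check for the RTL as well.
assert prevh(dst) == src
print(f"{dst:032x}")
```
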
This is **back to opcode-era for one chapter**, which fits: Ch299
cleared the wait loop and qbert progressed to executable
code with new MMI demands.

## Pattern milestone

A third clean "inflection → autopsy → unblock" cycle is **not**
needed yet. Ch299 successfully unblocked the second wait loop,
and qbert is back in opcode-trap mode. The pattern can be
sequenced more flexibly than I expected:

| Cycle | Inflection | Autopsy | Unblock |
|-------|------------|---------|---------|
| 1 | Ch293 (1.66M, 0x0011242C) | Ch294 (syscall 0x7A bit-17) | Ch295 ($a0-aware HLE) |
| 2 | Ch297 (1.47M, 0x00112554) | Ch298 (memory poll 0x001329C0) | **Ch299 (narrow 0x77 gate poke)** |

## Documentation status: qbert-specific HLE

Per Codex's instruction: "document this as a qbert-specific
library-ready HLE, not architectural truth."

This is explicitly **not** a faithful model of PS2 kernel
behavior. The real PS2 kernel's RegisterLibraryEntries writes a
"library ready" word based on the registration record layout and
the registered library's status. Our TB-side poke writes 1 to a
hardcoded address that happens to match qbert's specific poll
target.

Risks if another ELF uses syscall 0x77:
- A different ELF with $a0 in the same range and $a3 in {0x10,
  0x14} would also get its 0x001329C0 word poked to 1, which is
  potentially wrong if the ELF expects 0 or a different value.
- An ELF with different registration buffer addresses won't get
  the poke at all (correct, since the guard is narrow).

The risk is **low for qbert** but should be revisited if Ch300+
surfaces another ELF or another syscall pattern in the same area.

## Files changed

- `sim/tb/integration/tb_ee_core_elf_runner.sv`: 6 new state
  signals, an observer arm with the narrow guard, and the SUMMARY display.

No RTL changes. No new TB target. Regression count unchanged at
**176/176**.

## Regression

**176/176 PASS** (unchanged from Ch298; runner-only changes).