# Ch298 closeout: 2nd wait-loop autopsy; verdict `qbert2_waiting_on_registered_library_state`

**Status:** Closed. Observation-only chapter per Codex's framing.

**Named verdict:** `qbert2_waiting_on_registered_library_state` (fallback: `qbert2_waiting_on_memory_flag`). qbert polls memory location `0x001329C0` for a non-zero value; nothing in the model ever sets it.

No RTL changes. Artifacts: the disassembly + runtime-trace analysis below, and the Ch299 framing proposal at the end.

## The wait loop, fully decoded

### Caller (0x00111308..0x00111314)

```
0x00111308: 0x0c044950  jal    0x00112540
0x0011130c: 0x0000202d  daddu  $a0, $zero, $zero     ; $a0 = 0 (delay slot)
0x00111310: 0x1040fffd  beq    $v0, $zero, 0x00111308  ← LOOP BRANCH (TAKEN 144,089×)
0x00111314: 0x3c048000  lui    $a0, 0x8000           ; (exit) post-loop
```

### Leaf (0x00112540..0x00112554), called 144,089 times

```
0x00112540: 0x3c020013  lui    $v0, 0x13         ; $v0 = 0x00130000
0x00112544: 0x00042080  sll    $a0, $a0, 2       ; $a0 <<= 2 (= 0 since $a0_arg=0)
0x00112548: 0x8c43c01c  lw     $v1, -16356($v0)  ; $v1 = *(0x0012C01C)
0x0011254c: 0x00832021  addu   $a0, $a0, $v1     ; $a0 = $v1 (since $a0_arg=0)
0x00112550: 0x03e00008  jr     $ra               ; return
0x00112554: 0x8c820000  lw     $v0, 0($a0)       ; delay slot: $v0 = *($a0) = *(*(0x0012C01C))
```

**Effective gate:** `$v0 = *(*(0x0012C01C))`. Caller's branch: `beq $v0, $zero, top` → loop while `*(*(0x0012C01C)) == 0`.

## Runtime data (from trace files)

### IFETCH counts

| PC | Count | Role |
|----|-------|------|
| 0x00111308 (caller JAL) | 144,089 | wait loop top |
| 0x0011130c (delay $a0=0) | 144,089 | |
| 0x00111310 (caller BEQ) | 144,089 | wait loop branch |
| 0x00111314 (lui, exit slot) | 144,089 | |
| 0x00112540..0x00112554 (leaf) | 144,089 each | leaf body (jr+ds) |

**144,089 iterations** of the wait loop. The leaf is a 6-instruction function reached via JAL from the caller; each iteration is 10 instructions (4 caller + 6 leaf).

(Note: 0x00112540 shows **288,178** in the raw count, 2× the others.
Examined further: this is because 0x00112540 is also reached as part of a *separate* code path elsewhere in qbert, unrelated to this wait loop. Doesn't affect the analysis.)

### Map-event addresses

Top read addresses (matches 144k loop iterations):

| Address | Reads | Meaning |
|---------|-------|---------|
| 0x0012C01C | 144,090 | pointer storage (read each iteration; value = 0x001329C0) |
| 0x001329C0 | 144,089 | **the polled flag** (read each iteration; value always 0) |
| 0x00112540..0x00112554 | 144,089 each | leaf IFETCHes |
| 0x00111308..0x00111314 | 144,089 each | caller IFETCHes |

### Writes to the polled address

```
cycle 39739 MEM WRITE 0x00000000001329c0 0x0000000000000000
...
cycle 98576 MEM WRITE 0x00000000001329c0 0x0000000000000000
...
```

**Two writes total, both writing 0.** Both happened during init, before the wait loop started. After that, the flag is read 144,089 times and never written. **qbert itself zeroed the flag, then entered the loop expecting an external agent to set it.**

### Map-event region breakdown (full run)

| Region | Reads/writes | Notes |
|--------|--------------|-------|
| USEG_SHADOW (0x0B) | 1,773,235 | qbert's own code+data |
| BIOS (0x00) | 4 | early trampoline |
| DMAC_CTRL (0x0D) | 1 | Ch287 stub init |
| DMAC_PASSIVE (0x0E) | 1 | Ch288 stub init |

**Still zero INTC / GS / BIU / general-MMIO traffic.** Same as Ch294's first-loop autopsy: the wait is 100% software-side, no hardware-side polling.

## Syscall 0x7A bucketing (per Codex's instrumentation request)

```
syscall_0x7A_split = count_a0_4=1 count_a0_0x80000000=1 count_a0_other=2 last_a0=0x80000002 first_v0=0 last_v0=0
```

**The wait loop does NOT call syscall 0x7A.** The leaf at 0x00112540 is pure memory reads. The 4 total 0x7A calls (1+1+2) all happened earlier in qbert's init sequence, NOT in this wait loop. The 0x80000002 shape Codex flagged in Ch297 is an init-side call, not a polling-loop call.
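
As a quick cross-check of the "pure memory reads" claim, the ten instruction words from the caller and leaf listings can be scanned for the MIPS `SYSCALL` encoding (SPECIAL opcode 0 with function field 0x0C). The sketch below is illustrative only: the names `LOOP_WORDS`, `is_syscall`, and `sign_extend16` are made up for this note and are not part of the runner or of `/tmp/ch294_disasm.py`. It also recomputes the pointer-slot address and the branch target from the raw immediates.

```python
# Instruction words copied from the caller and leaf listings in this closeout.
LOOP_WORDS = {
    0x00111308: 0x0C044950,  # jal   0x00112540
    0x0011130C: 0x0000202D,  # daddu $a0, $zero, $zero
    0x00111310: 0x1040FFFD,  # beq   $v0, $zero, 0x00111308
    0x00111314: 0x3C048000,  # lui   $a0, 0x8000
    0x00112540: 0x3C020013,  # lui   $v0, 0x13
    0x00112544: 0x00042080,  # sll   $a0, $a0, 2
    0x00112548: 0x8C43C01C,  # lw    $v1, -16356($v0)
    0x0011254C: 0x00832021,  # addu  $a0, $a0, $v1
    0x00112550: 0x03E00008,  # jr    $ra
    0x00112554: 0x8C820000,  # lw    $v0, 0($a0)
}


def is_syscall(word: int) -> bool:
    """True if the word is a MIPS SYSCALL: SPECIAL opcode (0), funct 0x0C."""
    return (word >> 26) == 0 and (word & 0x3F) == 0x0C


def sign_extend16(value: int) -> int:
    """Interpret a 16-bit immediate as a signed integer."""
    return value - 0x10000 if value & 0x8000 else value


# PCs in the loop whose instruction word is a SYSCALL (expected: none).
syscall_pcs = [pc for pc, word in LOOP_WORDS.items() if is_syscall(word)]

# Pointer-slot address built by the leaf: lui $v0, 0x13 then lw $v1, imm($v0).
lui_imm = LOOP_WORDS[0x00112540] & 0xFFFF
lw_imm = sign_extend16(LOOP_WORDS[0x00112548] & 0xFFFF)
pointer_slot = ((lui_imm << 16) + lw_imm) & 0xFFFFFFFF

# Branch target of the caller's beq: (pc + 4) + (signed offset << 2).
beq_pc = 0x00111310
beq_offset = sign_extend16(LOOP_WORDS[beq_pc] & 0xFFFF) << 2
beq_target = (beq_pc + 4 + beq_offset) & 0xFFFFFFFF

print(f"syscalls in loop: {len(syscall_pcs)}")
print(f"pointer slot:     0x{pointer_slot:08X}")
print(f"beq target:       0x{beq_target:08X}")
```

An empty `syscall_pcs` list, a pointer slot of 0x0012C01C, and a branch target of 0x00111308 agree with the hand decode above.
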
So Codex's hypothesis "the wait may be polling 0x7A with $a0=0x80000002 for a different bit" is **falsified**. The Ch295 0x7A unblock doesn't need broadening to fix this wait; that's a separate concern.

## Verdict, per Codex's enum

| Verdict | Fit? |
|---------|------|
| `qbert2_waiting_on_syscall_7a_bit` | **No.** The loop body doesn't issue any syscalls; the wait is pure memory polling. |
| `qbert2_waiting_on_memory_flag` | **Yes.** Generic fit; the gate is a memory location, not MMIO. |
| `qbert2_waiting_on_mmio` | **No.** 0x001329C0 is EE RAM (region 0x0B), not MMIO. |
| `qbert2_waiting_on_registered_library_state` | **Yes, best fit.** The gate sits at qbert's global ctx + 0x100 (0x001328C0 + 0x100 = 0x001329C0); Ch297 just registered two library entries via syscall 0x77; the "library is ready" flag pattern matches what the registration callback would set. |
| `qbert2_wait_loop_unknown` | No, fully decoded. |

**Pick: `qbert2_waiting_on_registered_library_state`.** The gate sits within the registration context that Ch297's syscall 0x77 calls were setting up. qbert expects whatever registers the library to also set the "ready" flag; our HLE returns $v0=0 and writes nothing.

## What the address 0x001329C0 means

- qbert's global ctx pointer (threaded through 0x78/0x12/0x16/0x7A/0x79) is **0x001328C0**.
- The gate is **0x001329C0 = global_ctx + 0x100**, in the same data region.
- Likely an offset into a kernel-context / library-management struct.

## Ch299 framing: name the gate value first

Per Codex's "name the branch mask and expected return value first" discipline:

- **Source:** memory at `*(*(0x0012C01C))` = `*(0x001329C0)`.
- **Mask:** none; full 32-bit `!= 0` test.
- **Expected value:** any non-zero value.
- **Setter:** TBD. Nothing in our model currently writes a non-zero value to 0x001329C0. The setter would be the kernel-callback that syscall 0x77 (RegisterLibraryEntries) registered, OR the library-ready-callback mechanism.

### Three Ch299 strategies

**A.
TB-poke the gate (cheap experiment).** Modify `tb_ee_core_elf_runner.sv` to write 1 to memory address 0x001329C0 at a fixed cycle (e.g., cycle 200,000, after init is done but before the watchdog). Lets qbert progress. Inelegant but falsifiable.

**B. Extend syscall 0x77 HLE to write the status word.** The proper PS2 kernel `RegisterLibraryEntries(buf, ...)` writes a "ready" flag somewhere derived from the buf pointer + library ID. If the layout is `buf->status` at a known offset, the HLE can write a non-zero value there before returning $v0=0. Requires identifying the exact offset that maps to 0x001329C0 from $a0=0x001DFD50 (Ch297's first call). Difference is 0x001329C0 - 0x001DFD50 = ... negative, so 0x001329C0 is **below** 0x001DFD50. Probably points to a kernel-managed status block, not the registration record. Not trivial without SDK semantics.

**C. Architectural: wire interrupt delivery.** If the Ch290/291 DMAC handler at 0x00112AB0 fires and that handler writes to 0x001329C0, the gate opens. Requires modeling DMAC completion → COP0 Cause/Status → handler invocation. Multi-chapter.

**My recommendation: Strategy A** (TB-poke). It's the cheapest falsifiable experiment and matches Ch295's "Strategy A first" pattern that worked. If qbert progresses meaningfully, the gate's semantic role is confirmed and Ch300+ can pursue B or C for architectural correctness. If qbert misbranches or crashes, we roll back and pivot.

Specifically for Ch299: the TB writes `mem[0x001329C0/16] |= (1<<0)` (or any non-zero value at lane 0) at cycle ~200,000. The runner observer can confirm via a new "tb_poked_gate" counter.

## Files

- `/tmp/ch294_disasm.py`: disassembler retargeted to 0x00112520..0x001125A0 then 0x001112E0..0x00111360 to find the caller. Same one-shot diagnostic from Ch294, retargeted by editing LO/HI constants.
- This closeout.
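
The address arithmetic behind Strategies A and B in the Ch299 framing can be checked with a few lines of Python. This is a sketch under stated assumptions: the constant names are invented for this note, and the 16-byte memory-entry width and 32-bit lane width are inferred from the `mem[0x001329C0/16]` expression and the "lane 0" wording, not confirmed against the TB source.

```python
GATE_ADDR = 0x001329C0    # polled flag
GLOBAL_CTX = 0x001328C0   # qbert's global ctx pointer
REG_BUF_A0 = 0x001DFD50   # $a0 of Ch297's first syscall 0x77 call

# Assumptions (not confirmed against tb_ee_core_elf_runner.sv):
ENTRY_BYTES = 16          # one mem[] entry holds 16 bytes
LANE_BYTES = 4            # lanes are 32-bit words

# Gate position relative to the global ctx and to the registration buffer.
ctx_offset = GATE_ADDR - GLOBAL_CTX
buf_offset = GATE_ADDR - REG_BUF_A0

# Where a TB poke would land: mem[] index and lane within that entry.
mem_index = GATE_ADDR // ENTRY_BYTES
lane = (GATE_ADDR % ENTRY_BYTES) // LANE_BYTES

print(f"gate - global_ctx = {ctx_offset:#x}")
print(f"gate - reg buf    = {buf_offset:#x}")
print(f"mem index         = {mem_index:#x}, lane {lane}")
```

The negative `buf_offset` is the reason a simple `buf->status` field at a positive offset cannot explain the gate under Strategy B, and a `lane` of 0 agrees with the "lane 0" wording of the proposed poke.
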
## Pattern review (28 chapters; second autopsy)

The Ch293→Ch294→Ch295 cycle (inflection → autopsy → unblock) is repeating cleanly at Ch297→Ch298→Ch299. Ch298 produces the same artifact format as Ch294: a *named gate* + a *concrete next-step proposal*.

| Inflection | Autopsy | Unblock |
|------------|---------|---------|
| Ch293 (1.66M retires, hot_pc=0x0011242C) | Ch294 (syscall 0x7A bit-17 poll) | Ch295 ($a0-aware HLE) |
| Ch297 (1.47M retires, hot_pc=0x00112554) | **Ch298 (memory poll at 0x001329C0)** | **Ch299 (TB-poke OR HLE write)** |

The cycle's reliability (two clean iterations now) suggests this is the right structure for the "post-opcode-era" phase of qbert. Each cycle adds ~1.5M retires of progress.

## Regression

Unchanged at **176/176**: no RTL or TB changes in Ch298.