RTL (GS rasterizer, EE core stub, platform bridge, LPDDR4B path), sim regression (272 TBs), docs, and tooling. Copyrighted PS2 content (BIOS, game code, GS dumps, and all dump-derived textures/traces) is excluded via .gitignore and stays local. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
9.5 KiB
Ch263 closeout — kernel-data mutation reaches BIOS but treadmill unchanged
Status: Closed exactly per Codex's Ch263 framing. Routine
BIOS-long target unchanged. New opt-in target lands the Ch261/Ch262
responder DMA payload into the BIOS-polled kernel-data scan range,
verifies the write reaches the EE RAM, and confirms BIOS observes
the mutation (then scrubs it). Verdict:
kernel_mutation_observed_no_flow_change.
Codex Ch263 acceptance — line-by-line
| Codex requirement | Status | Where |
|---|---|---|
| No new RTL if avoidable | ✅ | TB-only change; no RTL touched |
| Keep Ch261 responder and Ch262 interrupt pulse | ✅ | All Ch262 wiring intact; Ch263 only retargets DMA destination |
| Change only responder DMA destination/payload | ✅ | DEST_BASE_ADDR 0x00080000 → 0x00030200; no payload change |
| Choose one BIOS-polled kernel-data address | ✅ | 0x80030200 (virt) / 0x00030200 (phys) — mid-range slot in the 16 KiB BIOS scan |
| Log baseline value at address before DMA | ✅ | Ch263 baseline = 0x000…000 (all-zero, as expected) |
| Log responder write value | ✅ | Ch263 responder wrote 0xcafef00d12345678c0ffee00deadbeef to EE-phys 0x00030200 at t=50001285000 |
| Log later BIOS reads of same address | ✅ | Trace shows 17 BIOS reads at 0x80030200 across the test |
| Report whether BIOS observes the mutation | ✅ | YES — BIOS reads + actively clears the slot post-write |
| Report whether treadmill state changes | ✅ | NO — retire count, Ch217 passes, Ch218 INTC summary all byte-identical to Ch260 baseline |
| Avoid Pivot 2 unless this returns clean negative | ✅ | Following the rule; deferring 0x1fa00000 question to Ch264 |
| Full regression green | ✅ | 157 / 157 with Ch263 off by default |
Verdict logic — three-way classification
Codex framed three possible outcomes:
kernel_mutation_unobserved— BIOS never reads the slotkernel_mutation_observed_no_flow_change— BIOS reads + W1Cs, no progress (← THIS RUN)kernel_mutation_perturbed_flow— BIOS reads + path changes (= we found a gate)
The trace evidence + treadmill metrics put this run squarely in the middle bucket.
What the trace actually showed
Step 1 — BIOS scans the 0x80030000–0x80033FF0 range every pass
From ee_bios_smoke_map.trace:
Total MEM READ in 0x80030xxx range: 1,217,848
Total MEM WRITE in 0x80030xxx range: 32,768
That is 4,096 writes per pass × 8 passes — BIOS clears the entire 16 KiB kernel-data table once per pass. Every slot gets zeroed every pass. This pattern was visible in the Ch218 v5 capture but not characterized as a scrub until Ch263.
Step 2 — the responder's write lands at our target slot
cycle 5,000,125 MEM WRITE 0x00030200 data=0xc0ffee00deadbeef region=1 flags=0x01
(arg1 only carries the low 64 bits of the bridge's 128-bit qword
write — schema artifact. The qword is 0xcafef00d12345678c0ffee00deadbeef
per the Ch263 responder wrote diagnostic line.)
Step 3 — BIOS observes the value and clears it
Reads at virt 0x80030200 across the run:
cycle 770,570 — BIOS init read, slot zero
cycle 1,287,787 — BIOS init verify
cycle 5,000,125 — RESPONDER WRITES (between BIOS reads)
cycle 10,671,220 — BIOS read after responder write (likely sees 0xcafef00d…)
cycle 11,186,947 — BIOS writes 0 (clears our value)
cycle 11,188,437 — BIOS reads (sees zero now)
cycle 20,571,870 — next pass read
…
The arg1=0 in the trace for EV_READ events is hardcoded
(documented in Ch258), so we can't directly READ the returned
value from the trace. But the WRITE-ZERO at cycle 11,186,947
immediately followed by a verify read at 11,188,437 is consistent
with BIOS reading non-zero data at cycle 10,671,220, deciding to
scrub, and verifying the clear.
Step 4 — treadmill state did not change
| Metric | Ch260 baseline | Ch262 (responder pulse) | Ch263 (mutation + pulse) |
|---|---|---|---|
| Ch217 caller passes | 8 | 8 | 8 (same) |
| Ch217 verdict | static_state | static_state | static_state (same) |
| Ch218 INTC summary | (filtered set) | (same) | (same) |
| Ch218 INTC verdict | intc_quiet | intc_pending_observed | intc_pending_observed (same) |
| Retire count | 24,029,051 | 24,029,051 | 24,029,051 (byte-identical) |
Interpretation
BIOS sees mutations in the kernel-data table but is structurally defended against them via a periodic-scrub kernel routine. The scrub clears the entire 16 KiB region every Ch217 pass; any value we write into a slot lives only until BIOS's next scrub pass, at which point it's zeroed. Whatever the longjmp callee is gated on, either:
- It isn't in this scanned region — the scrub means BIOS
itself doesn't rely on accumulated state in slots
0x80030000-3FF0. The region might be a fresh-init scratchpad that BIOS expects to recompute each pass, not a kernel state table. - It is in this region but BIOS reads the slot's value DURING the pass, not as latched state across passes — and the pass timing is such that our write doesn't land in the right window.
Either way, single-shot writes into this region are not the gate.
What's next (for Codex's Ch264 call)
Two distinct candidates given the new "BIOS scrubs every pass" finding:
(A) Sustained / re-emitted mutation. If BIOS scrubs every pass, a one-shot write loses to the scrub. The Ch263 responder could be retriggered EVERY PASS (e.g. driven by a Ch217-pass-edge signal) so the slot is re-set after each scrub. This tests whether BIOS reads the value MID-PASS before scrubbing — and if so, whether sustained value-presence eventually perturbs flow. The downside: now we're polluting the very table BIOS is managing, which could mask other behavior.
(B) Pivot to 0x1fa00000 (the deferred Pivot 2 from the Ch263 pre-brief). BIOS writes here 46 times with a sequence of values 0x0..0xF. That's a "progress code" or "handshake state output" port pattern. Maybe BIOS expects to read back what it just wrote — or expects an external observer to see those writes and respond. Lower risk than (A) and qualitatively different (output, not polled input).
(C) Look elsewhere entirely. The Ch218 v7 capture showed
the longjmp callee at 0xBFC52984 makes the same JAL with
identical $a0/$a1/$v0 every pass. The callee's body reads
from somewhere — but not from the 0x80030000+ region (per
Ch263). What does it read? Re-running Ch218 in the Ch263 build
with the scoping filter widened (or scoped to the callee's PC
window) could surface the actual polled location.
My recommendation
(C) first, then (B), then (A) if both negative.
Reasoning: Ch263's null result narrows the search significantly.
BIOS isn't gated on the scrubbed kernel-data table, isn't gated
on INTC pending alone (Ch262), isn't gated on PCR (Ch258), and
isn't gated on SMFLG (Ch263 pre-brief). What HASN'T been ruled
out is whatever the callee's body actually reads to compute
its return value. That's an empirical question Ch264 can
answer with another scoped Ch218-style observer — narrow the
capture to PCs inside the callee's body (0xBFC52984.. + ~16
instructions) and see what addresses it touches.
If (C) returns "callee reads from address X" and X is unmapped or zero, then THAT becomes the next Ch265 target.
If (C) is inconclusive (callee uses only register state), then
(B) — 0x1fa00000 — is the next-best surface to investigate.
(A) is last-resort: throwing the SAME thing at BIOS but harder is unlikely to produce different qualitative behavior.
Files changed
sim/tb/integration/tb_ee_core_bios_smoke.sv— Ch263 sub-\ifdefinside the Ch262 block: gate the localu_ch262_ee_ram, overrideCH262_EE_LANDINGto phys0x00030200, add theee_map_br_*priority mux that routes responder bridge writes into the BIOS-long sharedu_ee_ram, add Ch263 observer (baseline + responder-write event + BIOS reads counter + three-way verdict infinal` block).sim/Makefile— newtb_ee_core_bios_long_kernel_mutatetarget.docs/ch263_pre_impl_brief.md— the recon-first brief that surfaced the SIF-mailbox-unobserved finding and proposed Pivot 3.docs/ch263_closeout.md— this file.
Caveat: the final block summary print didn't fire on this
run (iverilog 12 quirk with final + $finish on
$error-triggered timeout). The data was reconstructed from
the inline $display events + trace-file analysis. A future
chapter could either move the summary into an always_ff on
end-of-test or pre-emptively print at every Ch217 pass.
Standing by for Codex's Ch264 call.