retroDE_ps2/docs/ch267_closeout.md
thejayman77 ec82764bef Initial commit: retroDE_ps2 — first-of-its-kind PS2 GS FPGA core (DE25-Nano / Agilex 5)
RTL (GS rasterizer, EE core stub, platform bridge, LPDDR4B path), sim regression
(272 TBs), docs, and tooling. Copyrighted PS2 content (BIOS, game code, GS dumps,
and all dump-derived textures/traces) is excluded via .gitignore and stays local.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-29 20:10:50 -04:00

Ch267 closeout — 0xA000A8C8 is NOT the polled gate. The chain just clears it; nothing reads it.

Status: Closed. Phase 1 passive observation rules out 0xA000A8C8 as a polled gate.

Verdict: gate_only_cleared_never_polled.

Headline counts across the entire BIOS-long run (93 accesses to phys 0x000A8C8, all kseg1 alias):

| Role | Count |
|---|---|
| clearer(dispatcher) | 69 (3 SWs × 23 dispatcher invocations) |
| clearer(other) | 24 (1 init-time + 23 helper-frame writes) |
| writer(non-zero) | 0 |
| poller(read) | 0 |

Action per Codex's gate: Do NOT proceed to Phase 2 (0xA000A8C8 poke). The address is a write target, not a polled value. The treadmill must be gating on something else.

Codex Ch267 Phase 1 acceptance — line-by-line

| Codex requirement | Status | Where |
|---|---|---|
| Key on phys 0x0000A8C8, accept all three kseg/kuseg aliases | | CH267_PHYS_TARGET = 29'h000_A8C8 (matches low 29 bits of EA) |
| Capture every EE map access to that word | | ch267_* arrays, cap=1024 |
| Classify each as clearer / writer / poller | | ch267_role_name task |
| Distinguish dispatcher clearer (PC in 0xBFC4F320..F520) vs other | | ch267_in_disp field |
| Log PC, access type, value, pass index, pre/post-clear | | full stream output |
| Suppress dispatcher clears beyond first-per-pass | | dc_per_pass[] filter (kept the first, counted+suppressed the rest) |
| 5-way verdict labels | | gate_alias_mismatch / gate_nonzero_writer_found / gate_polled_zero_no_writer / gate_only_cleared_never_polled / gate_no_traffic_at_all |
| Regression unaffected | | 157 / 157 with target off-by-default |
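
The alias match and role classification in the table can be modelled outside the testbench. The following is a minimal Python sketch, not the SystemVerilog in tb_ee_core_bios_smoke.sv: the function names are hypothetical, the dispatcher window is assumed to be inclusive at both ends, and the role strings simply reuse the labels from the headline table.

```python
from typing import Optional

PHYS_MASK = 0x1FFF_FFFF       # low 29 bits of the EA select the physical word
PHYS_TARGET = 0x000_A8C8      # mirrors CH267_PHYS_TARGET = 29'h000_A8C8
# Dispatcher PC window from the acceptance table (bounds assumed inclusive).
DISP_LO, DISP_HI = 0xBFC4_F320, 0xBFC4_F520


def segment(ea: int) -> str:
    """Name the MIPS segment a 32-bit effective address falls in."""
    if ea < 0x8000_0000:
        return "kuseg"
    top3 = ea >> 29
    if top3 == 0b100:
        return "kseg0"
    if top3 == 0b101:
        return "kseg1"
    return "other"


def matches_target(ea: int) -> bool:
    """True when the EA aliases the target word, whatever the segment."""
    return (ea & PHYS_MASK) == PHYS_TARGET


def role(is_write: bool, data: int, pc: int, ea: int) -> Optional[str]:
    """Classify one access; returns None for accesses to other words."""
    if not matches_target(ea):
        return None
    if not is_write:
        return "poller(read)"
    if data != 0:
        return "writer(non-zero)"
    in_disp = DISP_LO <= pc <= DISP_HI
    return "clearer(dispatcher)" if in_disp else "clearer(other)"
```

Masking instead of comparing full addresses is what lets one constant cover the kuseg, kseg0, and kseg1 views of the same word.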

What the stream actually showed

One previously-unknown init-time clearer

The very first access to 0xA000A8C8 happens at cyc=54566 (deep BIOS init, pre-treadmill) from PC=0xBFC4B83C:

[0] cyc=54566 pass=0 CLEARER(other) pc=0xbfc4b83c ea=0xa000a8c8(kseg1) data=0x00000000 post_clear=0

This is the first zeroing of 0xA000A8C8 — before the Ch266 dispatcher ever runs. The PC is far from the dispatcher chain; it's somewhere in early kernel init. Not a smoking gun because it writes zero like the dispatcher does, but worth naming so future autopsies don't think it's mysterious.
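
For offline analysis of the stream, a line like the one above can be parsed with a small script. This is a sketch that assumes every event line follows the same field order as that single example; the regex and the `parse_event` name are illustrative and not part of the testbench.

```python
import re

# Field order is assumed from the one example line in this closeout.
LINE_RE = re.compile(
    r"\[(?P<idx>\d+)\]\s+cyc=(?P<cyc>\d+)\s+pass=(?P<pass_idx>\d+)\s+"
    r"(?P<role>\S+)\s+"
    r"pc=(?P<pc>0x[0-9a-fA-F]+)\s+"
    r"ea=(?P<ea>0x[0-9a-fA-F]+)\((?P<seg>\w+)\)\s+"
    r"data=(?P<data>0x[0-9a-fA-F]+)\s+post_clear=(?P<post_clear>\d+)"
)


def parse_event(line: str):
    """Return the event as a dict with integer fields, or None if no match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    ev = m.groupdict()
    for key in ("idx", "cyc", "pass_idx", "post_clear"):
        ev[key] = int(ev[key])
    for key in ("pc", "ea", "data"):
        ev[key] = int(ev[key], 16)   # base 16 accepts the 0x prefix
    return ev


SAMPLE = ("[0] cyc=54566 pass=0 CLEARER(other) pc=0xbfc4b83c "
          "ea=0xa000a8c8(kseg1) data=0x00000000 post_clear=0")
event = parse_event(SAMPLE)
```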

The "other" clearer pattern in the helper

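23 captures at PC=0xBFC4D388 (inside the Ch265 helper, at the instruction right after the helper's JAL out to the dispatcher) also write zero to 0xA000A8C8. Together with the single init-time clear above, they make up the 24 clearer(other) entries in the headline table.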

This is a trace-timing artefact, not a separate writer. The Ch266 dispatcher returns with jr $ra at 0xBFC4F334, which has a delay slot at 0xBFC4F338; if the delay slot is sw $0, OFF($base), that write retires while core_pc is one cycle ahead, already showing 0xBFC4D388 (the helper's post-JAL instruction). So Ch266 attributed three writes to PCs F32C/F330/F334 inside the dispatcher, but the third write was actually F338 (the JR delay slot), reported with PC=0xBFC4D388 because core_pc sampling is one cycle late on memory events.

Confirmation: every "other" clearer at 0xBFC4D388 fires immediately after a CLEARER(disp) from 0xBFC4F32C (see cyc=67019→67034, 67131, 68243 — 15-cycle gap between the dispatcher write and the "helper" write, matching the JR + delay-slot + pipeline-bubble timing). Three writes per dispatcher call, distributed across what looks like two PCs because of the same one-cycle skew the Ch266 closeout noted.

(Same skew explanation applies to PC=0xBFC4F334 in Ch266's output — it was actually the JR delay slot's write at F338, not a write from the JR itself.)

Net: there's still one writer (the dispatcher), three SWs per call. The autopsy just gave us a clearer picture of which PCs the writes are really attributed to.
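
The re-attribution argument can be applied mechanically to a captured stream. The sketch below is a hypothetical post-processing step: it relabels a helper-PC clear as the dispatcher's delay-slot write when it follows a dispatcher clear closely enough. The 15-cycle threshold is an assumption taken from the single 67019→67034 example, and the `Event` type and `reattribute` name are made up for illustration.

```python
from typing import NamedTuple

HELPER_POST_JAL_PC = 0xBFC4D388   # PC the skewed capture reports
DELAY_SLOT_PC = 0xBFC4F338        # PC the write is believed to come from
MAX_GAP_CYCLES = 15               # assumption: from the 67019 -> 67034 pair


class Event(NamedTuple):
    cyc: int
    role: str
    pc: int


def reattribute(events, max_gap=MAX_GAP_CYCLES):
    """Fold skewed helper-PC clears back onto the dispatcher delay slot."""
    out = []
    last_disp_cyc = None
    for ev in sorted(events, key=lambda e: e.cyc):
        if ev.role == "CLEARER(disp)":
            last_disp_cyc = ev.cyc
        elif (ev.role == "CLEARER(other)"
              and ev.pc == HELPER_POST_JAL_PC
              and last_disp_cyc is not None
              and ev.cyc - last_disp_cyc <= max_gap):
            ev = Event(ev.cyc, "CLEARER(disp)", DELAY_SLOT_PC)
        out.append(ev)
    return out


stream = [
    Event(54566, "CLEARER(other)", 0xBFC4B83C),   # init-time clear, untouched
    Event(67019, "CLEARER(disp)", 0xBFC4F32C),
    Event(67034, "CLEARER(other)", 0xBFC4D388),   # skewed capture
]
fixed = reattribute(stream)
```

The init-time clear keeps its "other" label because it comes from a different PC and has no preceding dispatcher clear.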

Zero pollers, zero non-zero writers — the gate is elsewhere

The crucial counts:

writer(non-zero)    = 0
poller(read)        = 0

No read of 0xA000A8C8 happens anywhere in the model during the BIOS-long run. Combined with the disassembly of the Ch217 outer-caller post-chain:

0xbfc52378: lui $v0, 0x1f80          ; <- clobbers $v0=0xA000A8C8
0xbfc5237c: ori $v0, $v0, 0x1070     ; $v0 now = 0x1F801070
0xbfc52380: sw $0, 4($v0)            ; write 0 to I_MASK
0xbfc52384: jal <next-handler>
0xbfc52388: sw $0, 0($v0)            ; write 0 to I_STAT (W1C ack)

…the outer caller discards $v0=0xA000A8C8 immediately after the chain returns and rebuilds it as 0x1F801070 (IOP INTC I_STAT). The 0xA000A8C8 pointer is never used as a polled value, never used as a data pointer, never used at all by the outer caller.
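
The address arithmetic in that disassembly is easy to check by hand. As a small worked example (plain Python modelling of the two MIPS instructions, with helper names chosen here for illustration):

```python
MASK32 = 0xFFFF_FFFF


def lui(imm16: int) -> int:
    """lui: place the 16-bit immediate in the upper half, zero the lower."""
    return ((imm16 & 0xFFFF) << 16) & MASK32


def ori(reg: int, imm16: int) -> int:
    """ori: OR with the zero-extended 16-bit immediate."""
    return (reg | (imm16 & 0xFFFF)) & MASK32


v0 = ori(lui(0x1F80), 0x1070)   # value of $v0 after 0xbfc5237c
i_stat = v0 + 0                 # target of sw $0, 0($v0)
i_mask = v0 + 4                 # target of sw $0, 4($v0)

print(hex(v0), hex(i_stat), hex(i_mask))
```

Because lui overwrites the whole register, nothing of the previous $v0 value survives into either store address.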

The chain's job appears to be pure side-effect — clearing the kernel struct at 0xA000A8C8 and updating internal selector-keyed state via the helper ($v1 return values were selector-dependent). The chain's $v0 is computed but discarded.

The polled gate is not at 0xA000A8C8. Ch263 through Ch266 narrowed the search to "the longjmp-return chain's effect," and Ch267 shows that effect is not a polled value at 0xA000A8C8 itself.

Possible relocations for "where the gate actually lives":

  1. One of the INTC writes the outer caller does immediately after the chain. 0xBFC52380: sw $0, 4($v0) writes 0 to I_MASK; 0xBFC52388: sw $0, 0($v0) does W1C on I_STAT. Both happen every Ch217 pass. Could the treadmill be gated on the I_STAT value AFTER the W1C? If a "ready bit" needed to remain set across the W1C, our INTC model might be eating it.

  2. Elsewhere in the loop body the autopsies haven't covered. The Ch217 caller dump only shows PCs 0xBFC52340..0xBFC5238C — the area immediately around the JAL. The treadmill itself is longer; the polled state might be read further along (post-W1C, post-RFE) before the exception loops back.

  3. A COP0 register, not memory. The treadmill involves an RFE; COP0 Status/Cause/EPC reads aren't in EE_MAP and wouldn't show up in our existing autopsies. A re-poll of Status.IE or Cause.IP between passes could be the gate.

Recommendation for Ch268

Pivot away from 0xA000A8C8 entirely. Three concrete follow-ups, in order of cheapest-first:

(A) Widen Ch267 to scan ALL read EAs in the treadmill window. Instead of keying on one EA, capture every non-fetch READ across a wider PC window — say the Ch217 caller body 0xBFC52340..0xBFC52400. Bucket reads by EA and diff pass 1 vs pass 8. Any EA that BIOS reads every pass and whose value is "the same" deserves the polled-gate label. Cheap to implement — copy the Ch266 capture, widen the PC, drop the write capture, add per-pass diff bookkeeping.

(B) Capture the immediate post-chain INTC writes. Profile the W1C cadence at I_STAT (0x1F801070) and I_MASK (0x1F801074) across passes. If our INTC stub's behavior on those writes differs from what BIOS expects, the treadmill could be gating on I_STAT's residual after W1C.

(C) Observe COP0 reads. Add a minimal COP0 access logger to ee_core_stub. Look for any read of Status/Cause/EPC that returns the same value every pass — that's a candidate for a "this would have changed on a real PS2" gate.

(A) is the highest-EV next step — it directly searches for the gate without committing to a guess. (B) is the second-highest-EV because we have a smoking gun pointing at INTC (selector 0x05 → $v1=0x1F801070). (C) is the fallback if (A) and (B) both come up empty.
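
The per-pass bookkeeping for option (A) could look like the following offline sketch. It assumes the widened capture can be exported as (pass index, EA, value) tuples; the function name and the sample addresses are synthetic and do not come from any real run.

```python
from collections import defaultdict


def polled_gate_candidates(reads, passes):
    """Return (ea, value) pairs read in every listed pass with one constant value.

    reads:  iterable of (pass_idx, ea, value) tuples
    passes: pass indices that must all contain a read of the EA
    """
    by_ea = defaultdict(lambda: defaultdict(set))
    for pass_idx, ea, value in reads:
        by_ea[ea][pass_idx].add(value)

    candidates = []
    for ea, per_pass in by_ea.items():
        if not all(p in per_pass for p in passes):
            continue                      # not read in every pass
        seen = set().union(*per_pass.values())
        if len(seen) == 1:                # value never changes between passes
            candidates.append((ea, next(iter(seen))))
    return sorted(candidates)


# Synthetic data: 0x1000 is read in both passes with the same value.
reads = [
    (1, 0x1000, 0), (8, 0x1000, 0),
    (1, 0x2000, 5), (8, 0x2000, 6),   # value changes, so not a candidate
    (1, 0x3000, 1),                   # only read in pass 1
]
result = polled_gate_candidates(reads, passes=(1, 8))
```

An EA that survives this filter is only a candidate: a constant value may also be legitimately constant state rather than a gate.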

Do NOT proceed to Phase 2 (TB-poke of 0xA000A8C8). The Ch267 result rules out 0xA000A8C8 as the gate, so poking it would just confirm that — and possibly confuse the dispatcher's internal selector-state tracking.

Files changed

  • sim/tb/integration/tb_ee_core_bios_smoke.sv: added an `ifdef CH267_GATE_OBSERVER block. Single capture (R+W for any EA matching phys 0x000A8C8 across aliases), with per-event PC/value/role/post-clear tags. Stream suppression for dispatcher clears beyond the first per pass. SUMMARY block with alias breakdown and role counts. 5-way verdict logic with alias-mismatch detection. Two call sites of ch267_print_observer(), in the halt and timeout exits.
  • sim/Makefile: new tb_ee_core_bios_long_gate_observer target (only -DCH267_GATE_OBSERVER).

iverilog 12 quirks hit

None new. Wrote with the Ch264/265/266 patterns in mind (no return from task; no bit-select on parenthesized expr; trace_pkg:: namespace). Clean first-try compile.

Regression

Full regression: 157 / 157 with the new target off by default (CH267_GATE_OBSERVER undefined for routine builds).

Standing by for Codex's Ch268 call. Recommendation: (A) — wider PC-window read autopsy across the Ch217 caller body, to find what EA the treadmill actually polls. The Ch266 infrastructure is reusable; just widen the PC window and drop the write capture.