# Ch267 closeout: `0xA000A8C8` is NOT the polled gate. The chain just clears it; nothing reads it.

**Status:** Closed. Phase 1 passive observation **rules out**
`0xA000A8C8` as a polled gate.

**Verdict:** `gate_only_cleared_never_polled`.

**Headline counts** across the entire BIOS-long run (93 accesses
to phys `0x0000A8C8`, all via the kseg1 alias):

| Role                | Count |
|---------------------|-------|
| clearer(dispatcher) | 69 (3 SWs × 23 dispatcher invocations) |
| clearer(other)      | 24 (1 init-time + 23 helper-frame writes) |
| writer(non-zero)    | **0** |
| poller(read)        | **0** |

**Action per Codex's gate:** Do **NOT** proceed to Phase 2
(`0xA000A8C8` poke). The address is a *write target*, not a
polled value. The treadmill must be gating on something else.

## Codex Ch267 Phase 1 acceptance, line by line

| Codex requirement | Status | Where |
|-------------------|--------|-------|
| Key on phys `0x0000A8C8`, accept all three kseg/kuseg aliases | ✅ | `CH267_PHYS_TARGET = 29'h000_A8C8` (matches low 29 bits of EA) |
| Capture every EE map access to that word | ✅ | `ch267_*` arrays, cap=1024 |
| Classify each as clearer / writer / poller | ✅ | `ch267_role_name` task |
| Distinguish dispatcher clearer (PC in `0xBFC4F320..F520`) vs other | ✅ | `ch267_in_disp` field |
| Log PC, access type, value, pass index, pre/post-clear | ✅ | full stream output |
| Suppress dispatcher clears beyond first-per-pass | ✅ | `dc_per_pass[]` filter (kept the first, counted+suppressed the rest) |
| 5-way verdict labels | ✅ | `gate_alias_mismatch` / `gate_nonzero_writer_found` / `gate_polled_zero_no_writer` / `gate_only_cleared_never_polled` / `gate_no_traffic_at_all` |
| Regression unaffected | ✅ | 157 / 157 with target off-by-default |

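The acceptance items above (alias matching on the low 29 bits, role classification, the 5-way verdict) can be sketched as an offline pass over a captured event list. This is a rough Python illustration, not the testbench code: the event dict fields (`ea`, `pc`, `is_write`, `data`), the function names, the precedence between verdict labels, and the reading of `gate_alias_mismatch` as "traffic seen through an alias other than the expected one" are all assumptions.

```python
from collections import Counter

PHYS_MASK = 0x1FFFFFFF                      # low 29 bits of the effective address
PHYS_TARGET = 0x0000A8C8
DISP_LO, DISP_HI = 0xBFC4F320, 0xBFC4F520   # dispatcher PC window (assumed inclusive)

def alias_of(ea):
    """Name the MIPS segment an EA falls in, from its top three bits."""
    top = (ea >> 29) & 0x7
    if top == 0b100:
        return "kseg0"
    if top == 0b101:
        return "kseg1"
    return "kuseg" if top < 0b100 else "other"

def role_of(pc, is_write, data):
    """Classify one access to the target word."""
    if not is_write:
        return "poller(read)"
    if data != 0:
        return "writer(non-zero)"
    if DISP_LO <= pc <= DISP_HI:
        return "clearer(dispatcher)"
    return "clearer(other)"

def verdict(events, expected_alias="kseg1"):
    """Pick one of the five labels; the precedence here is a guess."""
    hits = [e for e in events if (e["ea"] & PHYS_MASK) == PHYS_TARGET]
    if not hits:
        return "gate_no_traffic_at_all"
    if any(alias_of(e["ea"]) != expected_alias for e in hits):
        return "gate_alias_mismatch"
    roles = Counter(role_of(e["pc"], e["is_write"], e["data"]) for e in hits)
    if roles["writer(non-zero)"]:
        return "gate_nonzero_writer_found"
    if roles["poller(read)"]:
        return "gate_polled_zero_no_writer"
    return "gate_only_cleared_never_polled"

# Two zero-writes in the shape the stream reports: one init-time, one dispatcher.
sample = [
    {"ea": 0xA000A8C8, "pc": 0xBFC4B83C, "is_write": True, "data": 0},
    {"ea": 0xA000A8C8, "pc": 0xBFC4F32C, "is_write": True, "data": 0},
]
print(verdict(sample))
```

With only zero-writes in the list, the sketch falls through to `gate_only_cleared_never_polled`, which is the label this run produced.
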
## What the stream actually showed

### One previously-unknown init-time clearer

The very first access to `0xA000A8C8` happens at **cyc=54566**
(deep BIOS init, pre-treadmill) from **PC=0xBFC4B83C**:

```
[0] cyc=54566 pass=0 CLEARER(other) pc=0xbfc4b83c ea=0xa000a8c8(kseg1) data=0x00000000 post_clear=0
```

This is the *first* zeroing of `0xA000A8C8`, before the Ch266
dispatcher ever runs. The PC is far from the dispatcher chain;
it's somewhere in early kernel init. Not a smoking gun,
because it writes zero like the dispatcher does, but worth
naming so future autopsies don't think it's mysterious.

### The "other" clearer pattern in the helper

23 captures at **PC=0xBFC4D388** (inside the Ch265 helper, the
instruction right after the helper's JAL out to the dispatcher)
also write zero to `0xA000A8C8`.

This is a **trace-timing artefact**, not a separate writer.
The Ch266 dispatcher returns via `jr $ra` at `0xBFC4F334`, which
has a delay slot at `0xBFC4F338`; if the delay slot is
`sw $0, OFF($base)`, that write retires while `core_pc` is
*one cycle ahead*, already showing `0xBFC4D388` (the helper's
post-JAL instruction). So Ch266 attributed three writes to PCs
F32C/F330/F334 inside the dispatcher, but the third write was
actually F338 (the JR delay slot), reported with PC=0xBFC4D388
because `core_pc` sampling is one cycle late on memory events.

Confirmation: every "other" clearer at `0xBFC4D388` fires
*immediately after* a `CLEARER(disp)` from `0xBFC4F32C`
(see cyc=67019→67034, 67131, 68243: a 15-cycle gap between
the dispatcher write and the "helper" write, matching the
JR + delay-slot + pipeline-bubble timing). Three writes per
dispatcher call, distributed across what looks like two PCs
because of the same one-cycle skew the Ch266 closeout noted.

(The same skew explanation applies to PC=0xBFC4F334 in Ch266's
output: it was actually the JR delay slot's write at F338,
not a write from the JR itself.)

**Net:** there's still one writer (the dispatcher), three SWs
per call. The autopsy just gave us a clearer picture of which
PCs the writes are really attributed to.

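The pairing argument above (every helper-PC clear trails a dispatcher clear by a short, fixed gap) is easy to check mechanically on a dumped stream. The following is a minimal Python sketch under stated assumptions: events are dicts with hypothetical `cyc` and `pc` fields, the dispatcher window is taken as inclusive, and the 15-cycle bound is simply the gap quoted above.

```python
HELPER_PC = 0xBFC4D388
DISP_LO, DISP_HI = 0xBFC4F320, 0xBFC4F520   # dispatcher PC window (assumed inclusive)

def unpaired_helper_clears(events, max_gap=15):
    """Return helper-PC clears with no dispatcher clear in the preceding max_gap cycles."""
    unpaired = []
    last_disp_cyc = None
    for ev in sorted(events, key=lambda e: e["cyc"]):
        if DISP_LO <= ev["pc"] <= DISP_HI:
            last_disp_cyc = ev["cyc"]
        elif ev["pc"] == HELPER_PC:
            if last_disp_cyc is None or ev["cyc"] - last_disp_cyc > max_gap:
                unpaired.append(ev)
    return unpaired

# Cycle numbers taken from the stream excerpts quoted in this closeout.
sample = [
    {"cyc": 54566, "pc": 0xBFC4B83C},   # init-time clearer, neither dispatcher nor helper
    {"cyc": 67019, "pc": 0xBFC4F32C},   # CLEARER(disp)
    {"cyc": 67034, "pc": HELPER_PC},    # "helper" clear, 15 cycles later
]
print(unpaired_helper_clears(sample))
```

An empty result is consistent with the skew explanation; any entry in the returned list would be a helper-PC write that the delay-slot story does not account for.
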
### Zero pollers, zero non-zero writers: the gate is elsewhere

The crucial counts:

```
writer(non-zero) = 0
poller(read) = 0
```

**No read of `0xA000A8C8` happens anywhere in the model during
the BIOS-long run.** Combined with the disassembly of the
Ch217 outer-caller post-chain:

```
0xbfc52378: lui $v0, 0x1f80 ; <- clobbers $v0=0xA000A8C8
0xbfc5237c: ori $v0, $v0, 0x1070 ; $v0 now = 0x1F801070
0xbfc52380: sw $0, 4($v0) ; write 0 to I_MASK
0xbfc52384: jal <next-handler>
0xbfc52388: sw $0, 0($v0) ; write 0 to I_STAT (W1C ack)
```

…the outer caller **discards** `$v0=0xA000A8C8` immediately
after the chain returns and rebuilds it as `0x1F801070`
(IOP INTC I_STAT). The `0xA000A8C8` pointer is never used as
a polled value, never used as a data pointer, never used at
all by the outer caller.

The chain's job appears to be **pure side-effect**: clearing
the kernel struct at `0xA000A8C8` and updating internal
selector-keyed state via the helper (`$v1` return values were
selector-dependent). The chain's `$v0` is computed but
discarded.

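The register arithmetic in the disassembly can be checked by hand or with a few lines of Python. This sketch only reproduces the `lui`/`ori` pair and the two store offsets shown above; the helper name `lui_ori` is made up for illustration.

```python
def lui_ori(upper, lower):
    """Value left in a register by `lui rt, upper` followed by `ori rt, rt, lower`."""
    return ((upper & 0xFFFF) << 16) | (lower & 0xFFFF)

v0 = lui_ori(0x1F80, 0x1070)
i_mask_ea = (v0 + 4) & 0xFFFFFFFF   # target of `sw $0, 4($v0)`
i_stat_ea = (v0 + 0) & 0xFFFFFFFF   # target of `sw $0, 0($v0)`

print(hex(v0), hex(i_mask_ea), hex(i_stat_ea))
# → 0x1f801070 0x1f801074 0x1f801070
```

Neither store address is `0xA000A8C8`, which is the point: whatever the chain left in `$v0` is overwritten before anything dereferences it.
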
## What this means for the search

**The polled gate is not at `0xA000A8C8`.** Ch263–Ch266 narrowed
the search to "the longjmp-return chain's effect," and Ch267
shows that effect is *not* a polled value at `0xA000A8C8` itself.

Possible relocations for "where the gate actually lives":

1. **One of the INTC writes the outer caller does immediately
   after the chain.** `0xBFC52380: sw $0, 4($v0)` writes 0 to
   I_MASK; `0xBFC52388: sw $0, 0($v0)` does W1C on I_STAT.
   Both happen *every* Ch217 pass. Could the treadmill be
   gated on the I_STAT value AFTER the W1C? If a "ready bit"
   needed to remain set across the W1C, our INTC model might
   be eating it.

2. **Elsewhere in the loop body the autopsies haven't covered.**
   The Ch217 caller dump only shows PCs `0xBFC52340..0xBFC5238C`,
   the area *immediately* around the JAL. The treadmill
   itself is longer; the polled state might be read further
   along (post-W1C, post-RFE) before the exception loops back.

3. **A COP0 register, not memory.** The treadmill involves an
   RFE; COP0 Status/Cause/EPC reads aren't in EE_MAP and
   wouldn't show up in our existing autopsies. A re-poll of
   Status.IE or Cause.IP between passes could be the gate.

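Item 1 hinges on what is left in I_STAT after the store of zero. As a purely illustrative aid, the Python sketch below contrasts two acknowledge conventions a register model could implement. It makes no claim about which one the INTC stub or the real hardware uses; the function names and the sample pending value are invented.

```python
MASK32 = 0xFFFFFFFF

def w1c_write(stat, value):
    """Write-1-to-clear: each 1 bit in `value` clears the matching status bit."""
    return stat & ~value & MASK32

def and_mask_write(stat, value):
    """AND-mask acknowledge: each 0 bit in `value` clears the matching status bit."""
    return stat & value & MASK32

pending = 0b1001            # made-up pending bits, for illustration only
print(bin(w1c_write(pending, 0)))       # store of zero under write-1-to-clear
print(bin(and_mask_write(pending, 0)))  # store of zero under AND-mask acknowledge
```

Under strict write-1-to-clear a store of zero leaves every pending bit set, while under the AND-mask convention the same store wipes them all. A mismatch between the model's convention and the one the BIOS assumes is exactly the kind of "eaten ready bit" that item 1 speculates about.
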
## Recommendation for Ch268

**Pivot away from `0xA000A8C8` entirely.** Three concrete
follow-ups, in order of cheapest-first:

**(A) Widen Ch267 to scan ALL read EAs in the treadmill
window.** Instead of keying on one EA, capture every
non-fetch READ across a wider PC window, say the Ch217
caller body `0xBFC52340..0xBFC52400`. Bucket reads by EA and
diff pass 1 vs pass 8. Any EA that BIOS reads every pass and
whose value never changes between passes is a polled-gate
candidate. Cheap to implement: copy the Ch266 capture, widen
the PC window, drop the write capture, add per-pass diff
bookkeeping.

**(B) Capture the immediate post-chain INTC writes.** Profile
the W1C cadence at I_STAT (`0x1F801070`) and I_MASK
(`0x1F801074`) across passes. If our INTC stub's behavior on
those writes differs from what BIOS expects, the treadmill
could be gating on I_STAT's residual after W1C.

**(C) Observe COP0 reads.** Add a minimal COP0 access logger
to ee_core_stub. Look for any read of Status/Cause/EPC that
returns the same value every pass; that's a candidate for a
"this would have changed on a real PS2" gate.

(A) is the highest-EV next step: it directly searches for
the gate without committing to a guess. (B) is the
second-highest-EV because we have a smoking gun pointing at
INTC (selector 0x05 → `$v1=0x1F801070`). (C) is the
fallback if (A) and (B) both come up empty.

**Do NOT proceed to Phase 2** (TB-poke of `0xA000A8C8`). The
Ch267 result rules out `0xA000A8C8` as the gate, so poking it
would just confirm that, and possibly confuse the
dispatcher's internal selector-state tracking.

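The per-pass bookkeeping proposed in (A) could look roughly like the Python sketch below, run offline over a dumped read stream. Everything here is an assumption made for illustration: the record fields (`pc`, `ea`, `data`, `pass`), the function name, the inclusive PC window, and the sample records, which are synthetic rather than captured data.

```python
from collections import defaultdict

WINDOW_LO, WINDOW_HI = 0xBFC52340, 0xBFC52400   # Ch217 caller body, assumed inclusive

def constant_polled_eas(reads, passes):
    """EAs read inside the PC window in every listed pass that always return one value."""
    by_ea = defaultdict(lambda: defaultdict(set))   # ea -> pass -> {values seen}
    for r in reads:
        if WINDOW_LO <= r["pc"] <= WINDOW_HI:
            by_ea[r["ea"]][r["pass"]].add(r["data"])
    candidates = []
    for ea, per_pass in sorted(by_ea.items()):
        if not all(p in per_pass for p in passes):
            continue                                 # not read in every pass
        values = set().union(*(per_pass[p] for p in passes))
        if len(values) == 1:
            candidates.append((ea, values.pop()))
    return candidates

# Synthetic records, for illustration only.
reads = [
    {"pc": 0xBFC52350, "ea": 0x1000, "data": 0, "pass": 1},
    {"pc": 0xBFC52350, "ea": 0x1000, "data": 0, "pass": 8},
    {"pc": 0xBFC52360, "ea": 0x2000, "data": 1, "pass": 1},
    {"pc": 0xBFC52360, "ea": 0x2000, "data": 2, "pass": 8},
    {"pc": 0xBFC52370, "ea": 0x3000, "data": 5, "pass": 1},
]
print(constant_polled_eas(reads, passes=[1, 8]))
```

Only the first synthetic EA survives: it is read in both passes and never changes. The second changes between passes and the third is read in only one pass, so neither fits the "read every pass, value stuck" profile that (A) is looking for.
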
## Files changed

- `sim/tb/integration/tb_ee_core_bios_smoke.sv`: added
  `` `ifdef CH267_GATE_OBSERVER `` block. Single capture (R+W
  for any EA matching phys `0x0000A8C8` across aliases), with
  per-event PC/value/role/post-clear tags. Stream-suppression
  for dispatcher clears beyond first-per-pass. SUMMARY block
  with alias breakdown + role counts. 5-way verdict logic
  with alias-mismatch detection. Two call sites
  (`ch267_print_observer()`) in halt + timeout exits.
- `sim/Makefile`: new `tb_ee_core_bios_long_gate_observer`
  target (only `-DCH267_GATE_OBSERVER`).

## iverilog 12 quirks hit

None new. Wrote with the Ch264/265/266 patterns in mind
(no `return` from task; no bit-select on parenthesized expr;
`trace_pkg::` namespace). Clean first-try compile.

## Regression

Full regression: 157 / 157 with the new target off by default
(`CH267_GATE_OBSERVER` undefined for routine builds).

Standing by for Codex's Ch268 call. Recommendation: (A), a
wider PC-window read autopsy across the Ch217 caller body,
to find what EA the treadmill actually polls. The Ch266
infrastructure is reusable; just widen the PC window and
drop the write capture.