retroDE_ps2/docs/ch268_closeout.md
thejayman77 ec82764bef Initial commit: retroDE_ps2 — first-of-its-kind PS2 GS FPGA core (DE25-Nano / Agilex 5)
RTL (GS rasterizer, EE core stub, platform bridge, LPDDR4B path), sim regression
(272 TBs), docs, and tooling. Copyrighted PS2 content (BIOS, game code, GS dumps,
and all dump-derived textures/traces) is excluded via .gitignore and stays local.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-29 20:10:50 -04:00

Ch268 closeout — outer caller body emits ZERO non-fetch reads

Status: Closed. The widened read autopsy across the longjmp-return OUTER CALLER body (PC 0xBFC52340..0xBFC52400) captured zero non-fetch reads in the entire BIOS-long run.

Verdict: outer_no_reads.

By inspection of the Ch217 outer-caller dump, this is not a bug — the body really doesn't issue any loads:

0xBFC52350: beq  $v0, $0, +0xC      ; conditional branch  ← THE DECISION
0xBFC52354: nop
0xBFC52358: jal  <Ch264 callee>
0xBFC5235C: addiu $a0, $0, 0x385
0xBFC52360: jal  <helper directly>
0xBFC52364: addiu $a0, $0, 0x07
0xBFC52368: jal  <handler3>
0xBFC5236C: nop
0xBFC52370: jal  <handler4>
0xBFC52374: addiu $a0, $0, 0x08
0xBFC52378: lui  $v0, 0x1F80
0xBFC5237C: ori  $v0, $v0, 0x1070
0xBFC52380: sw   $0, 4($v0)        ; W I_MASK
0xBFC52384: jal  <handler5>
0xBFC52388: sw   $0, 0($v0)        ; W I_STAT
0xBFC5238C: lui  $a0, 0xBFC6

No lw/lb/lh anywhere in the listing. The only mnemonics are beq, nop, jal, addiu, lui, ori, and sw: control flow, immediate compute, and stores. The outer caller body therefore has no memory reads to gate on.
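The by-inspection claim can be cross-checked mechanically. The following is a minimal sketch, assuming the mnemonics are transcribed by hand from the listing above; `LISTING`, `LOAD_MNEMONICS`, and `find_loads` are illustrative names and not part of the testbench.

```python
# Mnemonics transcribed from the listing above (0xBFC52350..0xBFC5238C).
LISTING = [
    (0xBFC52350, "beq"), (0xBFC52354, "nop"), (0xBFC52358, "jal"),
    (0xBFC5235C, "addiu"), (0xBFC52360, "jal"), (0xBFC52364, "addiu"),
    (0xBFC52368, "jal"), (0xBFC5236C, "nop"), (0xBFC52370, "jal"),
    (0xBFC52374, "addiu"), (0xBFC52378, "lui"), (0xBFC5237C, "ori"),
    (0xBFC52380, "sw"), (0xBFC52384, "jal"), (0xBFC52388, "sw"),
    (0xBFC5238C, "lui"),
]

# Common MIPS load mnemonics (wider than lw/lb/lh, to be conservative).
LOAD_MNEMONICS = {
    "lb", "lbu", "lh", "lhu", "lw", "lwu", "lwl", "lwr",
    "ld", "ldl", "ldr", "lq",
}

def find_loads(listing):
    """Return every (pc, mnemonic) pair whose mnemonic is a load."""
    return [(pc, m) for pc, m in listing if m in LOAD_MNEMONICS]

loads = find_loads(LISTING)
mnemonics = sorted({m for _, m in LISTING})
print("loads:", loads)
print("mnemonics:", mnemonics)
```

This only covers the instructions shown in the listing, not the full 0xBFC52340..0xBFC52400 window and not the JAL targets, whose PCs fall outside the window anyway.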

What this means

The BEQ at 0xBFC52350 is testing $v0 == 0. Per Ch217: $v0_pre = 0x00000001 every Ch217 pass — i.e. the condition $v0 != 0 always holds, the branch is never taken, and the JAL chain always runs.

The actual gate is whatever sets $v0 BEFORE PC=0xBFC52350.

Crucially, this means:

  • The gate is outside the autopsy window we just scanned.
  • The gate is the instruction (or sequence) that computes $v0 before the BEQ — almost certainly a load from somewhere, or a function return that propagates a memory read upward.
  • If something set $v0 = 0 between Ch217 passes, the BEQ would be taken, the BIOS would skip the entire JAL chain (and the post-chain INTC clears), and execution would diverge. In other words, the treadmill would break.
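The branch decision can be modelled in a few lines. This is a sketch of standard MIPS BEQ semantics only; `beq_taken` and `jal_chain_runs` are hypothetical helper names, and the branch target is deliberately not computed here.

```python
def beq_taken(rs_value: int, rt_value: int) -> bool:
    """MIPS BEQ is taken when the two source registers hold equal values."""
    return (rs_value & 0xFFFFFFFF) == (rt_value & 0xFFFFFFFF)

def jal_chain_runs(v0: int) -> bool:
    """For `beq $v0, $0, ...` the fall-through path enters the JAL chain."""
    return not beq_taken(v0, 0)

# Value observed on every Ch217 pass: branch not taken, chain runs.
observed = jal_chain_runs(0x00000001)
# Hypothetical perturbation: branch taken, chain skipped.
perturbed = jal_chain_runs(0x00000000)
print(observed, perturbed)
```

The second call is the perturbation case: forcing $v0 to 0 is the only input that flips the outcome.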

Codex Ch268 acceptance — line-by-line

| Codex requirement | Status | Where |
|---|---|---|
| Observe 0xBFC52340..0xBFC52400 | | CH268_OUTER_LO/HI |
| Capture non-fetch data reads only | | EV_READ + !is_fetch predicate |
| Bucket by EA AND alias-normalized phys | | ch268_phys[i] = ee_map_ev_arg0[28:0]; dedup keyed on phys |
| Per-bucket: hits, PCs, per-pass values, data-varies, region | | DISTINCT_PHYS_EAs report (would have fired with non-zero captures) |
| Pass index isolated (pass 0 vs 1..8) | | pass= column + gate logic excludes pass 0 |
| Ignore stack reads + saved-register reloads | | ch268_ea_is_stack() using $sp captured at JAL site |
| 5-way verdict | | outer_static_{ram,mmio}_gate_found / only_stack / no_reads / vary |
| Regression unaffected | | 157 / 157 with target off-by-default |
| Don't jump to INTC semantics yet | | Did not touch INTC stub or jump to assumptions |
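For readers who want the bucketing rules in one place, here is a rough Python model of the observer's filtering and dedup. It is a sketch under stated assumptions, not the testbench code: the event dictionary layout, the inclusive window bounds, the `stack_window` size, and the choice to drop pass 0 at capture time are all illustrative. Only the 29-bit mask mirrors the `[28:0]` slice named in the table.

```python
from collections import defaultdict

PHYS_MASK = 0x1FFFFFFF                      # keeps bits [28:0] of the EA
OUTER_LO, OUTER_HI = 0xBFC52340, 0xBFC52400  # PC window from this chapter

def bucket_reads(events, sp_by_pass, stack_window=0x400):
    """Bucket non-fetch reads by alias-normalized physical address.

    events: iterable of dicts with keys pass, pc, ea, data, is_fetch
            (hypothetical record layout).
    sp_by_pass: {pass_index: $sp value captured for that pass}.
    stack_window: assumed size of the region above $sp treated as stack.
    """
    buckets = defaultdict(lambda: {"hits": 0, "pcs": set(), "values": {}})
    for ev in events:
        if ev["is_fetch"]:
            continue                         # instruction fetches are ignored
        if not (OUTER_LO <= ev["pc"] <= OUTER_HI):
            continue                         # outside the scanned PC window
        if ev["pass"] == 0:
            continue                         # pass 0 is kept out of the gate logic
        sp = sp_by_pass.get(ev["pass"])
        if sp is not None and sp <= ev["ea"] < sp + stack_window:
            continue                         # stack read or saved-register reload
        b = buckets[ev["ea"] & PHYS_MASK]    # dedup keyed on phys, not EA
        b["hits"] += 1
        b["pcs"].add(ev["pc"])
        b["values"][ev["pass"]] = ev["data"]
    return dict(buckets)

def data_varies(bucket):
    """True when a bucket saw different data on different passes."""
    return len(set(bucket["values"].values())) > 1

# Synthetic events: two segment aliases of one physical word, plus one fetch.
demo = bucket_reads(
    [
        {"pass": 1, "pc": 0xBFC52360, "ea": 0x80001000, "data": 7, "is_fetch": False},
        {"pass": 2, "pc": 0xBFC52364, "ea": 0xA0001000, "data": 7, "is_fetch": False},
        {"pass": 1, "pc": 0xBFC52368, "ea": 0xBFC52368, "data": 0, "is_fetch": True},
    ],
    {1: 0x81FFF000, 2: 0x81FFF000},
)
print({hex(k): v["hits"] for k, v in demo.items()})
```

With an empty event list the function returns an empty dict, which corresponds to the no-reads outcome reported for this run.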

Files changed

  • sim/tb/integration/tb_ee_core_bios_smoke.sv: added an `ifdef CH268_OUTER_READ_AUTOPSY block. Captures: per-event ($pass/PC/EA/phys/data/region); per-pass $sp (so the stack filter can be per-pass-accurate). Print task with: stream, alias-normalized bucketing, per-bucket PC tracker (up to 4), per-bucket per-pass value table, alias-mask, 5-way verdict. Two ch268_print_autopsy() call sites (halt + timeout exits).
  • sim/Makefile: new tb_ee_core_bios_long_outer_read_autopsy target (only -DCH268_OUTER_READ_AUTOPSY).

iverilog 12 quirks hit

None new. Used flat 1D arrays (with bucket*SLOTS+k indexing) to avoid 2D-unpacked-array surprises. Same pattern that Ch264/265/266/267 used. Clean first-try compile.
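The flat-array pattern is simple enough to show in miniature. This is a Python illustration of the `bucket*SLOTS+k` indexing only; the sizes are assumptions (SLOTS = 4 matches the "up to 4" PC tracker, BUCKETS = 8 is arbitrary) and `slot_index` is a hypothetical helper.

```python
BUCKETS = 8   # assumed bucket count, for illustration only
SLOTS = 4     # per-bucket slots (the PC tracker keeps up to 4)

flat = [0] * (BUCKETS * SLOTS)   # one 1D array stands in for a [BUCKETS][SLOTS] table

def slot_index(bucket: int, k: int) -> int:
    """Map a (bucket, slot) pair onto the flat 1D array."""
    if not (0 <= bucket < BUCKETS and 0 <= k < SLOTS):
        raise IndexError("bucket or slot out of range")
    return bucket * SLOTS + k

flat[slot_index(2, 3)] = 0xBFC52350   # store a PC in bucket 2, slot 3
print(slot_index(2, 3), hex(flat[11]))
```

The range check matters more in the flat form than in a true 2D array, because an out-of-range slot silently lands in the neighbouring bucket.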

Recommendation for Ch269

Trace back to where $v0 gets set BEFORE the BEQ.

The autopsy framework worked exactly as designed — it correctly reported zero reads, because there genuinely are zero reads in the scanned window. The structural lesson is that the gate is upstream of 0xBFC52350.

Three concrete next steps, cheapest first:

(A) Widen the PC window backwards. Re-run Ch268 with CH268_OUTER_LO = 0xBFC52300 (or 0xBFC52280) to cover the predecessor block of the BEQ. The instruction sequence leading INTO 0xBFC52350 almost certainly includes the load or compute that produces the $v0=1 value. Same observer, zero changes other than the PC window. Cheap.

(B) Track all writes to $v0 (regfile[2]) inside the treadmill. Add a tap on u_core.regfile[2] and log every cycle it changes, with the retiring PC and core_ev_valid. Filter to the treadmill window (post-Ch217-pass-0). The last write to $v0 BEFORE PC=0xBFC52350 is the producer we want to identify. Slightly more surgical than (A) but needs more wiring.

(C) Trace back from the function entry. The function containing 0xBFC52350 has an entry point somewhere earlier — usually preceded by a JR/JALR/J that crossed into it. Reading the BIOS dump near 0xBFC52340 and walking backward to find the prologue (addiu $sp,$sp,-N; sw $ra,...) identifies the function bounds; then Ch269 can autopsy the whole function.
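The backward walk in (C) can be automated with a small heuristic. The sketch below assumes the function opens with `addiu $sp, $sp, -N`, whose encoding has 0x27BD in the upper halfword and a negative 16-bit immediate. `find_prologue` is a hypothetical helper, and the words in the demo are synthetic encodings at a made-up base address, not BIOS content.

```python
ADDIU_SP_SP = 0x27BD0000   # opcode 0x09, rs = rt = $sp (29), immediate zeroed

def find_prologue(words, base, start_pc):
    """Walk backward from start_pc to the nearest `addiu $sp, $sp, -N`.

    words: list of 32-bit instruction words, words[0] located at `base`.
    Returns (pc, frame_size) or None if no stack allocation is found.
    """
    i = (start_pc - base) // 4
    while i >= 0:
        w = words[i]
        is_addiu_sp = (w & 0xFFFF0000) == ADDIU_SP_SP
        is_negative = bool(w & 0x8000)       # sign bit of the 16-bit immediate
        if is_addiu_sp and is_negative:
            frame = 0x10000 - (w & 0xFFFF)   # magnitude of the negative immediate
            return base + 4 * i, frame
        i -= 1
    return None

# Synthetic example: nop, addiu $sp,$sp,-32, sw $ra,28($sp), nop, nop
demo_words = [0x00000000, 0x27BDFFE0, 0xAFBF001C, 0x00000000, 0x00000000]
result = find_prologue(demo_words, base=0x1000, start_pc=0x1010)
print(result)
```

This stops at the first stack allocation it meets, so a leaf function without a frame, or a mid-function stack adjustment, would mislead it. The result is a candidate to confirm by reading the dump.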

(A) has the highest expected value. If the predecessor block contains a load, that load is the prime gate candidate. If it contains only register-to-register moves, we need (B) or (C) to trace back further. Either way, the search has narrowed dramatically: the gate is now a well-bounded "find what set $v0 before 0xBFC52350" question.
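If (B) is chosen, the post-processing of the register tap is straightforward. The sketch below assumes a retirement-ordered trace of `(pc, written_reg, written_value)` tuples, which is a hypothetical log format and not something the testbench emits today. The PCs in the demo are synthetic.

```python
V0 = 2   # $v0 is general-purpose register 2

def last_writer_before(trace, target_pc, reg=V0):
    """For each time target_pc retires, report the most recent write to `reg`.

    trace: iterable of (pc, written_reg, written_value); written_reg is None
           for instructions that write no register.
    Returns a list of (writer_pc, value) tuples, or None where no write preceded.
    """
    last = None
    producers = []
    for pc, wr_reg, wr_val in trace:
        if pc == target_pc:
            producers.append(last)   # snapshot before this instruction retires
        if wr_reg == reg:
            last = (pc, wr_val)
    return producers

# Synthetic trace: a write to $v0 at 0x100, then the branch at 0x108, twice.
demo_trace = [
    (0x100, 2, 1), (0x104, 4, 5), (0x108, None, None),
    (0x100, 2, 1), (0x104, 4, 5), (0x108, None, None),
]
producers = last_writer_before(demo_trace, target_pc=0x108)
print(producers)
```

Run against a real trace with target_pc = 0xBFC52350, a single repeated writer PC across passes would point at the producer directly, while several different writer PCs would suggest the value arrives through a function return.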

Standing by for Codex's Ch269 call.

One subtle note: the BEQ is testing $v0 == 0. If we ever find the producer and want to perturb it, setting $v0 = 0 between passes (e.g. by writing 0 to whatever memory the producer reads) should break the treadmill. That's a clean hypothesis test.

Regression

Full regression: 157 / 157 with the new target off by default (CH268_OUTER_READ_AUTOPSY undefined for routine builds).