# Wave 3: BIOS / IOP / SBUS reconnaissance + probe arc (Ch256–Ch260)

**Status: CLOSED MILESTONE (Ch260).** The Ch256-Ch259 probe arc
exhausted single-register approaches to breaking the Ch215 longjmp
treadmill. The next BIOS attempt does NOT start with another hardcode;
it starts with **IOP-side state-machine modeling**. See the "Milestone
closeout" section below for the full conclusion and the Ch261+ pivot.

The earlier sections of this document are preserved for posterity;
the chapter-by-chapter notes below reflect what was actually learned
during the arc, not what we initially hypothesised.

---

## Milestone closeout (Ch260)

Four chapters of probe-arc work produced four concrete facts:

1. **Ch256 recon → "primary hypothesis: EE timers."** Wrong. The
   `tb_ee_core_bios_long` observer eventually surfaced the real
   pattern (see Ch257).
2. **Ch257: the Ch218 observer landed.** After seven iterations (six more
   than necessary, lesson captured in
   [[feedback-pause-for-codex-on-iteration-loops]]) the observer was
   able to attribute reads to the JAL callee in the longjmp-return
   path. It surfaced the IOP DMAC PCR and the IOP INTC register pair as
   the recurring polled MMIO targets. The timer hypothesis did not
   survive contact.
3. **Ch258 IOP DMAC PCR realism stub (0xBF8010F0 → 0x07654321):
   PCR was not the gate.** The hardcode landed cleanly (verified via
   `sim/traces/rtl/ee_bios_smoke_core.trace` showing
   `lw $14, 0x10f0($14)` retiring with `$14 = 0x07654321`). BIOS
   reads the value and writes it back as part of a save/modify/restore
   pattern, but the treadmill behaviour is byte-identical pre- vs.
   post-Ch258.
4. **Ch259 named IOP INTC behavior at `0x1F801070`/`0x1F801074` +
   sticky source injection: INTC alone is not enough.**
   - Phase 1 (no synthetic source): proper W1C/mask semantics landed.
     BIOS exercises the full 14-instruction INTC dance every pass:
     probes I_MASK, W1Cs I_STAT, polls I_STAT three times, sets up
     mask bits 0 and 3, repeats. Every I_STAT read returns 0. Verdict:
     `intc_quiet`. Treadmill persists.
   - Phase 2 (`+IOP_INTC_BOOT_SRC=0001`): the sticky source IS reaching
     the EE (the verdict flipped to `intc_pending_observed`: at least one
     I_STAT read returned non-zero). BIOS sees the pending bit but
     still loops 8 passes with a byte-identical retire count
     (24,029,051). The pending bit alone is necessary but not
     sufficient: the dispatch through the handler doesn't reach a
     state where the longjmp restoration sees changed inputs.

**Conclusion:** the Ch215 treadmill requires **multi-state
IOP/SBUS/kernel activity**: a real IOP responder that produces
firmware-visible side effects (kernel globals written, SIF mailbox
flags toggled, INTC sources asserted in response to actual events).
Single-register hardcodes and single-bit synthetic sources do not
suffice. Ch260 closes the BIOS-mmio probe arc.

### What landed in the tree (kept)

- `rtl/ee/ee_bootstrap_mmio_stub.sv`: three named MMIO behaviors
  promoted out of the anonymous regfile, all in production-shaped
  form with default-safe semantics:
  - **0x1814** (Ch202): hardcoded `0xFFFFFFFF` (RDRAM-init ready
    polling bit).
  - **0x10F0** (Ch258): hardcoded `0x07654321` (IOP DMAC PCR realism
    stub, real PS1/IOP reset value).
  - **0x1070 / 0x1074** (Ch259): named IOP INTC view with W1C on
    I_STAT, plain-write on I_MASK, and a sticky `iop_intc_inject_src_i`
    diagnostic injection port (default 0).
- `sim/tb/integration/tb_ee_core_bios_smoke.sv`: Ch218 observer
  preserved as a compact INTC transaction log, gated behind
  `` `ifdef CH259_INTC_DIAG `` so routine builds are silent. Re-enable
  via `make tb_ee_core_bios_long_intc_diag`.
- `sim/Makefile`: new `tb_ee_core_bios_long_intc_diag` target.

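The Ch259 INTC view listed above can be sketched behaviourally. This is a minimal Python model, not the RTL: the class and method names are hypothetical, the 16-bit width is an assumption, and so is the way the sticky source interacts with W1C (here it re-asserts its bit after every clear). Only three behaviours come from the notes: W1C on I_STAT, plain-write on I_MASK, and an injection source that defaults to 0.

```python
class IopIntcView:
    """Behavioural sketch of the named I_STAT / I_MASK pair (hypothetical names)."""

    I_STAT = 0x1070  # offsets within the bootstrap MMIO window, per the notes
    I_MASK = 0x1074
    WIDTH_MASK = 0xFFFF  # assumption: 16 interrupt sources

    def __init__(self, inject_src=0):
        self.stat = 0
        self.mask = 0
        # Sticky diagnostic source; 0 means "no synthetic source" (Phase 1).
        self.inject_src = inject_src & self.WIDTH_MASK

    def _latch_sources(self):
        # Assumption: a sticky source keeps re-asserting its status bit.
        self.stat |= self.inject_src

    def read(self, offset):
        self._latch_sources()
        if offset == self.I_STAT:
            return self.stat
        if offset == self.I_MASK:
            return self.mask
        raise ValueError(f"offset {offset:#x} is not part of the INTC view")

    def write(self, offset, value):
        if offset == self.I_STAT:
            # Write-1-to-clear: every 1 bit in `value` clears that status bit.
            self.stat &= ~value & self.WIDTH_MASK
        elif offset == self.I_MASK:
            # Plain write: the new mask is exactly what was written.
            self.mask = value & self.WIDTH_MASK
        else:
            raise ValueError(f"offset {offset:#x} is not part of the INTC view")
```

Under these assumptions, `IopIntcView()` reproduces the Phase 1 picture (every I_STAT read returns 0), while `IopIntcView(inject_src=0x0001)` reproduces Phase 2: the pending bit is visible again immediately after BIOS clears it.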
### Where the next attempt starts (Ch261+)

**Do not** open Ch261 as another `ee_bootstrap_mmio_stub` hardcode.
The probe arc has demonstrated empirically that no single MMIO ready
bit is the gate. The Ch261+ arc opens **IOP-side modeling**:

- **Phase 1 (Ch261 candidate):** a tiny synthetic IOP/SIF responder
  in the TB that produces ONE meaningful kernel-visible side effect
  per Ch215 pass. The simplest plausible "real side effect" is a
  monotonic counter at a fixed kernel-data address (e.g.,
  `0x80030000`, the kdata region BIOS scans every pass per the
  Ch218 v5 capture). If BIOS's longjmp callee polls that counter
  and it advances, the treadmill should break. If it doesn't, we've
  ruled out that side-effect shape and narrowed down what BIOS
  actually needs.
- **Phase 2 (later):** flesh out the IOP-side stubs (which already exist
  in `rtl/iop/`, none currently in production) into a responder
  that can take SIF mailbox commands and emit INTC pulses. This is
  the multi-chapter "real IOP" arc.

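The Phase 1 idea (one kernel-visible word that advances once per pass) can be illustrated with a small Python sketch. Everything here is hypothetical scaffolding for illustration: `SyntheticResponder`, `polled_values`, and `is_static` are invented names, memory is a plain dict, and only the address `0x80030000` comes from the notes.

```python
KDATA_COUNTER_ADDR = 0x80030000  # kdata address quoted in the notes


class SyntheticResponder:
    """Hypothetical TB-side responder: bumps one kernel-visible word per pass."""

    def __init__(self, memory, addr=KDATA_COUNTER_ADDR):
        self.memory = memory
        self.addr = addr
        self.memory.setdefault(addr, 0)

    def on_pass(self):
        # Monotonic 32-bit counter; wraps instead of growing without bound.
        self.memory[self.addr] = (self.memory[self.addr] + 1) & 0xFFFFFFFF


def polled_values(memory, addr, passes, responder=None):
    """Return what a poller would read at `addr` on each pass."""
    seen = []
    for _ in range(passes):
        if responder is not None:
            responder.on_pass()
        seen.append(memory.get(addr, 0))
    return seen


def is_static(values):
    """True when every pass observed the same value (the treadmill signature)."""
    return len(set(values)) <= 1
```

Without a responder the poller sees the same word on every pass; with one, each pass sees a new value. Whether BIOS actually polls this address is exactly the open question Phase 1 is meant to answer.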
Three structural rules captured from the arc, to apply during
Ch261+ (recorded as memory entries):

- **Pause for Codex on the second unexpected result:**
  [[feedback-pause-for-codex-on-iteration-loops]].
- **Agents for breadth, source-read for runtime semantics:**
  [[feedback-agents-for-breadth-not-runtime]].
- **Don't model IOP/SIF activity by hardcoding a single bit; model
  the producer.** A pending-bit hardcode kicks BIOS into a fake
  handler path without progressing it.

---

**Purpose:** before re-opening the BIOS treadmill, lock down what the
EE / IOP / SIF / INTC / DMAC stack actually models today versus what the
real BIOS expects, so the next chapter targets a specific dependency
rather than chasing the BIOS opcode-of-the-week.

---

## TL;DR

The real BIOS, run under `tb_ee_core_bios_smoke` with `+BIOS=`, exhibits
**two distinct failure modes** that both blocked the Ch215-Ch218 arc:

1. **The Ch215 longjmp-return treadmill.** After the Ch215 jmp_buf
   restore FSM rehydrates 12 GPRs from `0xA000B1E0`, BIOS resumes at the
   restored `$ra` and loops 5 times through `0xBFC52340..0xBFC52360`. The
   Ch217 verdict captures the smoking gun: across passes, `$a0`, `$a1`,
   and the JAL callee's `$v0` are **bit-identical**. The kernel is
   asking the same question and getting the same answer on every pass.
   "BIOS has no escape signal from this callee": external state that
   should flip between passes is not flipping because our stack does
   not model it.

2. **The DEADBEEF wedge in EE-RAM code.** Independent of the longjmp
   loop, BIOS code executing from EE RAM at `pc=0x000014C4` issues an
   `LW` with a base pointer derived from an earlier UNMAPPED read.
   The first UNMAPPED read's return value (the EE map's sentinel for
   unmapped regions) ends up in `$6`, the wedge fires `lw $2, 0($6)`
   on a poisoned base, that read is also UNMAPPED, and the resulting
   value keeps the loop alive forever. The TB has a
   `lineage_poison_addr` / `retire_ring` diagnostic that captures
   the first-unmapped read's PC + EA on every run.

Both failures share a root cause: **the EE memory map has decode holes
in regions BIOS reaches.** The treadmill is one symptom; the DEADBEEF
wedge is another. Ch257 picks the most likely missing region and lands
a minimal model.

---

## Layer inventory: what exists today

### SIF (Sub-system Interface): `rtl/sif/`

All 8 SIF modules are functional in simulation, but **none are
instantiated in the production hierarchy** (`top_psmct32_raster_demo`,
`de25_nano_psmct32_raster_demo_top`). They live in TBs only.

| Module | Models | Faked / TODO |
|--------|--------|--------------|
| `sif_mailbox_stub` | 4-reg MSCOM / SMCOM / MSFLG / SMFLG @ EE 0x1000F200 / IOP 0x1D000000 | No directional / W1C / set-clear semantics; no IRQ; plain RW |
| `sif_mailbox_peer_stub` | TB-side IOP responder: command-echo FSM watching MSFLG | TB only; no real IOP execution |
| `sif_dma_stub` | Qword receive endpoint (128-bit ready/valid) | No consume path; buffer fills and stays full |
| `sif_dma_ack_peer_stub` | EE→IOP combined ctrl+data terminator | One-shot S_DONE; no re-arm |
| `sif_dma_iop_ram_bridge_stub` | 128→32 width adapter, qword → 4×32-bit writes | DEST_BASE_ADDR hardcoded; no ack upstream |
| `sif_dma_ee_ram_bridge_stub` | 32→128 width adapter, accumulates 4 beats | DEST_BASE_ADDR hardcoded; Ch239 rewind for single-slot pads |
| `sif_dma_ee_ack_peer_stub` | IOP→EE combined ctrl+data terminator | One-shot |
| `boot_install_agent_stub` | Synthetic exception-handler streamer → EE RAM | NOT BIOS code; canned `MFC0 / ADDIU / JR / RFE` payload only |

**Critical gap:** the EE memory map has **no decode region for
`0x1000F200-0x1000F2FF`**, so `sceSifInit()`'s mailbox accesses go to
UNMAPPED. Even if the stubs were instantiated, the EE core could not
reach them. The IOP-side map *does* have the SIF decode (`0x1D000000`
block), but the IOP CPU is not running any real firmware.

### IOP: `rtl/iop/`

| Module | Models | Faked / TODO |
|--------|--------|--------------|
| `iop_core_stub` | Minimal R3000 (11 opcodes), COP0 Status/Cause/EPC triple-stack, async exception entry, `STRICT_UNSUPPORTED` trap+latch, reset vector `0xBFC00000` (BIOS) | No TLB, cache, HI/LO, R-type ALU, shifts, mul, div, BD bit, kernel/user enforcement |
| `iop_memory_map_stub` | IOP RAM 0-2 MiB, BIOS ROM 0x1FC00000+, IOP DMAC ch9 0x1F801520, IOP INTC 0x1F801070, SIF 0x1D000000, retroDE pad I/O 0x1F808500 | All other IOP I/O unmapped (SPU2, timers, other DMAC channels, real SIO2); UNMAPPED reads return `0xDEADBEEF` |
| `iop_ram_stub` | 2 MiB SRAM (default 16 KiB in TBs) | None |
| `iop_fetch_stub` | Sequential fetcher (trace-only; superseded by `iop_core_stub`) | n/a |
| `iop_exec_stub` | 5-opcode micro-op script engine (HALT/WRITE/READ/WAIT_IRQ/BNE; superseded by `iop_core_stub`) | n/a |
| `iop_dmac_reg_stub` | IOP DMAC ch9 (SIF0 IOP→EE): MADR/BCR/CHCR/DONE_COUNT, 32-bit beats from IOP RAM via memory master | Only ch9 wired; no error reporting |
| `sio2_input_stub` | Sony pad word @ retroDE 0x1F808500; 2-FF CDC from bridge domain | No real SIO2 FIFO at 0x1F808200; no DMAC ch11 |

**Critical gap:** **the IOP has no real BIOS image and never executes
beyond synthetic test programs.** Real BIOS boot expects the IOP to
fetch from `0xBFC00000` (shared BIOS ROM), execute its bootstrap, parse
`IOPBTCONF`, load IRX modules, and signal readiness back to the EE via
SIF. None of that is happening, so the IOP-side mailbox writes that
would unstick the EE never come.

### INTC: `rtl/intc/intc_stub.sv`

Generic 16-source controller, reused on both EE (`0x1000F000`) and IOP
(`0x1F801070`) sides. W1C status, plain-write mask. Aggregate
`cpu_irq = |(STAT & MASK)`.

**Faked:** write-to-toggle (XOR) mask semantics are not implemented. The
real PS2 INTC uses XOR for atomic bit-flips, so BIOS code that XORs a
mask bit will see unexpected behavior.

**Wired sources:** only DMAC completion pulses
(`dmac_reg_stub.irq_completion_o`, `iop_dmac_reg_stub.irq_completion_o`).
**Unwired:** VBLANK_START, VBLANK_END, GS_FINISH, TIMER0..3, SBUS,
VIF/VU, SIF0/SIF1 done, real SIO2, CDVD, USB, FW, IPU. The PCRTC's
`frame_seen` toggle exists in the demo wrapper but is not routed to
INTC.

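The gap between the stub's plain-write mask and the write-to-toggle behaviour described above is easy to see in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the 16-bit width follows the "16-source" description, and the toggle rule is taken from the note above, not from independent hardware documentation.

```python
WIDTH_MASK = 0xFFFF  # 16 sources, per the intc_stub description


def cpu_irq(stat, mask):
    """Aggregate interrupt line: reduction-OR of (STAT & MASK)."""
    return (stat & mask & WIDTH_MASK) != 0


def mask_write_plain(mask, wdata):
    """What intc_stub does today: the written value replaces the mask."""
    return wdata & WIDTH_MASK


def mask_write_toggle(mask, wdata):
    """Write-to-toggle: every 1 bit in wdata flips that mask bit."""
    return (mask ^ wdata) & WIDTH_MASK


# Firmware that wants to flip only bit 3 of a mask that has bits 0 and 3 set:
before = 0b1001
after_plain = mask_write_plain(before, 0b1000)    # bit 0 is lost, bit 3 stays set
after_toggle = mask_write_toggle(before, 0b1000)  # bit 3 cleared, bit 0 kept
```

With a pending source on bit 0, the two results diverge at the CPU: `cpu_irq(0b0001, after_plain)` is false while `cpu_irq(0b0001, after_toggle)` is true, which is the kind of mismatch that would show up once BIOS relies on toggle writes.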
### DMAC: `rtl/dmac/dmac_reg_stub.sv`

EE-side per-channel register shell + state machine. MADR / QWC / CHCR /
TADR / DONE_COUNT at offsets 0/0x10/0x20/0x30/0x40 within a channel
bank. Real 128-bit qword transfers from `ee_ram_stub` via memory
master.

**Faked:** chain mode (TADR recorded but never consulted), all errors
report code 0 (OK), no memory-contention throttling. Only ch2 (GIF)
and ch5 (SIF0) are instantiated in the production demo; channels 0, 1,
3, 4, 6-12 are register-only or absent.

### EE-side MMIO stubs: `rtl/ee/ee_*_mmio_stub.sv`

| Stub | Range | Behavior |
|------|-------|----------|
| `ee_biu_mmio_stub` | `0x1FFE0000-0x1FFE0FFF` (4 KiB) | Latched RW. Real BIOS writes 0x1FFE0130 for cache/BIU config (Ch9). |
| `ee_bootstrap_mmio_stub` | `0x1F800000-0x1F80FFFF` (64 KiB) | Latched RW + **one special-case**: offset `0x1814` returns hardcoded `0xFFFFFFFF` (Ch202 RDRAM-init ready bit). |

`ee_bootstrap_mmio_stub`'s Ch202 special-case is **the canonical
pattern** for "BIOS polls a hardware register expecting a ready bit;
real RDRAM init never happens; hardcode the return."

---

## The Ch215 longjmp-return treadmill: what the TB knows

`tb_ee_core_bios_smoke` runs the real BIOS dump (4 MiB at
`/home/ubuntu/Downloads/bios.hex`, passed via `+BIOS=` plusarg).

### What BIOS gets through

1. EE reset @ `0xBFC00000`, ROM bootstrap.
2. BIU/cache config writes to `0x1FFE0130` (BIU stub absorbs them).
3. `0x1F80xxxx` MCH/SBUS reads (bootstrap MMIO stub absorbs,
   special-case `0x1814` returns `0xFFFFFFFF`).
4. Memory copies from BIOS ROM into EE RAM.
5. Kernel exception-handler install (handled by either
   `boot_install_agent_stub` or direct memory writes).
6. **First SYSCALL #8 (`_ReturnFromException` with `$a0=2`)**:
   triggers the Ch215 jmp_buf restore FSM. 12 GPRs (`$ra`, `$sp`, `$fp`,
   `$s0..$s7`, `$gp`) are loaded from `0xA000B1E0`, `$v0` is set to 1
   (longjmp-return marker), and PC is restored from `$ra`.

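Step 6 above can be written out as a short Python sketch of what the restore FSM does. The register list, the base address `0xA000B1E0`, the `$v0 = 1` marker, and "PC from `$ra`" come from the notes; the slot order within the frame and the 4-byte slot size are assumptions made purely for illustration, as are the function and constant names.

```python
JMP_BUF_BASE = 0xA000B1E0  # frame address quoted in the notes

# The 12 restored GPRs. ASSUMPTION: this ordering within the frame is
# illustrative; the notes list the registers but not their slot layout.
RESTORED_GPRS = ["ra", "sp", "fp"] + [f"s{i}" for i in range(8)] + ["gp"]
SLOT_BYTES = 4  # ASSUMPTION: one 32-bit word per saved register


def longjmp_restore(read_word, regs, base=JMP_BUF_BASE):
    """Reload the saved GPRs, set the longjmp marker, return the new PC."""
    for slot, name in enumerate(RESTORED_GPRS):
        regs[name] = read_word(base + slot * SLOT_BYTES)
    regs["v0"] = 1     # longjmp-return marker
    return regs["ra"]  # execution resumes at the restored $ra


# Usage with a dict standing in for EE RAM (values are made up):
frame = {JMP_BUF_BASE + i * SLOT_BYTES: 0x1000 + i for i in range(12)}
regs = {}
new_pc = longjmp_restore(frame.__getitem__, regs)
```

Because the frame contents never change between passes, every restore produces the same register file and the same resume PC, which is consistent with the treadmill described in the next section.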
### Where BIOS stalls

After the Ch215 restore, BIOS resumes at the restored `$ra` and loops
through `0xBFC52340..0xBFC52360`. The TB's Ch217 task captures every
pass through the JAL at `0xBFC52358` (the longjmp-return-handler call).
**Across 5 observed passes**:

- `$a0` and `$a1` to the callee are **bit-identical**.
- The callee's returned `$v0` is **bit-identical**.

Ch217 verdict (from the TB itself):

> `longjmp_return_repeats_due_to_static_state`: BIOS has no escape
> signal from this callee. The callee must be modifying something we
> are missing, OR the BIOS expects some external state (MMIO ready,
> timer, or kernel global) to flip between passes, but our model is
> not providing it.

The JAL target is computed dynamically by the TB
(`{4'hB, instr[25:0], 2'b00}`) and the callee's first 16 instructions
are dumped to the log via Ch217's `CALLED_FUNCTION dump`. **The
specific addresses the callee READS inside its body have not been
captured by the existing TB diagnostic.**

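The TB's target decode, `{4'hB, instr[25:0], 2'b00}`, is the standard MIPS J-type computation with the top nibble pinned to `0xB`. A small Python sketch makes the bit layout explicit. The function names and the example address `0xBFC01234` are hypothetical; the architectural variant is included only for comparison.

```python
JAL_OPCODE = 0b000011       # MIPS J-type opcode for JAL
INDEX_MASK = 0x03FFFFFF     # instr[25:0]


def encode_jal(target):
    """Build a JAL word for `target` (helper for the example only)."""
    return (JAL_OPCODE << 26) | ((target >> 2) & INDEX_MASK)


def decode_jal_target_tb(instr, top_nibble=0xB):
    """Mirror of the TB expression {4'hB, instr[25:0], 2'b00}."""
    if (instr >> 26) != JAL_OPCODE:
        raise ValueError(f"{instr:#010x} is not a JAL")
    return (top_nibble << 28) | ((instr & INDEX_MASK) << 2)


def decode_jal_target_arch(instr, pc):
    """Architectural rule: top 4 bits come from the delay-slot address."""
    return ((pc + 4) & 0xF0000000) | ((instr & INDEX_MASK) << 2)


# Hypothetical callee address in the BIOS ROM window:
word = encode_jal(0xBFC01234)
tb_target = decode_jal_target_tb(word)
arch_target = decode_jal_target_arch(word, pc=0xBFC52358)
```

Pinning the nibble to `0xB` is safe as long as the JAL itself executes from the `0xBxxxxxxx` segment, which holds for the JAL at `0xBFC52358`; the two decodes only disagree for code running from another 256 MiB segment, such as EE RAM.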
### The DEADBEEF wedge (separate, related)

Parallel to the longjmp loop, BIOS code running in EE RAM at
`pc=0x000014C4` issues an `LW` with base = `$6 = 0xDEADBEEF` (or a
derivative). The effective address `0x3084_FFFF` lands in an unmapped
region; the EE map returns the unmapped-sentinel; the value re-poisons
`$2`; the loop self-perpetuates millions of times.

The TB has full diagnostics for this: `lineage_poison_addr`,
`lineage_poison_data`, `lineage_pc`, `lineage_instr`, plus a 32-deep
retire ring (`retire_ring_pc[*]`, `retire_ring_r2[*]`,
`retire_ring_r6[*]`) that captures the 32 instructions preceding the
first UNMAPPED read. The lineage capture identifies where `$6`'s
poison originated, but the corresponding fix has not landed,
presumably because the ORIGINATING unmapped read's address class has
not yet been claimed by any stub.

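The retire-ring idea described above (keep the last 32 retired instructions, then freeze at the first UNMAPPED read) can be sketched in Python. This is not the TB code: the class, its method names, and the freeze-on-first-hit policy are assumptions for illustration, and the PC and address used in the usage lines are made up.

```python
from collections import deque

UNMAPPED_SENTINEL = 0xDEADBEEF  # sentinel value named in the notes


class LineageCapture:
    """Hypothetical model of a retire ring plus first-unmapped-read latch."""

    def __init__(self, depth=32):
        self.ring = deque(maxlen=depth)  # oldest entries fall off the front
        self.poison = None               # details of the first unmapped read
        self.frozen = False

    def on_retire(self, pc, r2, r6):
        # Stop recording once the first unmapped read has been latched, so
        # the ring holds the instructions that led up to it.
        if not self.frozen:
            self.ring.append((pc, r2, r6))

    def on_read(self, pc, addr, data, mapped):
        if not mapped and not self.frozen:
            self.poison = {"pc": pc, "addr": addr, "data": data}
            self.frozen = True


# Usage: 40 retires, then an unmapped read at a made-up PC/address.
cap = LineageCapture()
for i in range(40):
    cap.on_retire(pc=0x1000 + 4 * i, r2=i, r6=0)
cap.on_read(pc=0x10A0, addr=0x12345678, data=UNMAPPED_SENTINEL, mapped=False)
cap.on_retire(pc=0x10A4, r2=UNMAPPED_SENTINEL, r6=0)  # ignored: ring is frozen
```

Freezing matters because the wedge loops millions of times; without it the ring would be overwritten by the loop body and the originating read would be lost.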
---

## What's documented vs. what the kernel actually needs

Existing contracts under `docs/contracts/` are **all Draft**:

- `sif.md`: mailbox/flag exchange + SIF DMA shape; no detailed boot
  sequence.
- `iop.md`: IOP as separate peer subsystem; notes IOPBOOT /
  module-load NOT modeled.
- `memory.md`: emphasizes BIOS ROM mapping; explicitly does not own
  BIOS boot sequencing.
- `intc.md`, `dmac.md`, `ee.md`, `gif_gs.md`, `peripherals.md`,
  `platform.md`, `vif_vu.md`, `sio2_pad.md`, `spu2.md`: system
  contracts, no BIOS/kernel detail.

The `wave2*_plan.md` documents are all DMA/GIF focused and explicitly
defer SIF/IOP/BIOS. **No existing doc covers the kernel's setjmp /
longjmp / `_ReturnFromException` path.** Ch214-Ch217
reverse-engineered the layout (12-GPR frame at `0xA000B1E0`, setjmp at
`0xBFC4DB50`, post-setjmp checkpoint at `0xBFC52358`) but only
embedded the findings in the TB itself.

---

## Where the missing signal most likely lives

Cross-referencing the Ch217 verdict's three candidate categories
("MMIO ready, timer, or kernel global") against the layer inventory:

**Kernel global (RAM at `0xA000xxxx`).** Unlikely. The callee reached
from the JAL at `0xBFC52358` is in BIOS ROM and reads `$a0`/`$a1` from
the restored GPRs. If the callee polls a kernel global, that global
would be in EE RAM. EE RAM is bidirectionally accessible; if the callee
both READS and WRITES the global, the value would change pass-to-pass.
The Ch217 verdict says it does not, so either the callee writes
nothing, or the global lives in an unmapped region.

**Timer (`0x10000000-0x10001FFF`, T0/T1/T2/T3 COUNT/MODE/COMP/HOLD).**
Plausible. The PS2 kernel uses one of T0-T3 for the scheduler tick. The
counter is read at fixed addresses; the count value advances with
hardware clocks regardless of CPU activity. **The EE memory map has
NO decode for `0x10000000-0x10001FFF`**, so all four timers are
unmapped. A read returns the unmapped sentinel, the kernel reads the
same sentinel every pass, sees no time elapsed, and loops.

**MMIO ready bit (EE INTC, SIF mailbox, or bootstrap MMIO).** Also
plausible.

- *EE INTC `0x1000F000`* is mapped (status/mask), but no sources fire
  periodically. If the callee polls `INTC_STAT[VBLANK_START]` waiting
  for a frame tick, the bit never sets.
- *SIF mailbox `0x1000F200`* is **not mapped at all** on the EE side.
  A read returns the unmapped sentinel; if the callee polls
  `SMFLG[IOP_READY]` waiting for IOP boot completion, that read is
  pure UNMAPPED, and would also trigger the DEADBEEF wedge if the
  sentinel is `0xDEADBEEF`.
- *Bootstrap MMIO `0x1F80xxxx`* is mapped (latched + Ch202
  special-case at `0x1814`). If the callee polls a different offset
  expecting a ready bit, a hardcoded special-case is the proven fix
  pattern.

---

## Ch257: scoped callee-body memory-read observer (landed)

**Implementation:**
[`sim/tb/integration/tb_ee_core_bios_smoke.sv`](../sim/tb/integration/tb_ee_core_bios_smoke.sv)
gains a Ch218 observer + verdict-emitter:

- **`ch218_jal_target`**: dynamically decoded from
  `peek_instr(0xBFC52358)` (the JAL whose callee Ch217 already
  characterized as static-state). The pattern matches Ch217's own decode.
- **`ch218_pc_in_body`**: combinational gate that fires when the live
  `core_pc` is in `[jal_target, jal_target + 0x80)`.
- **Capture array** of depth 64. Each entry records `{pass_idx (from
  ch217_count), pc, instr, ea, data, rt}` per qualifying EE memory
  READ event (`ee_map_ev_valid && EV_READ && SUBSYS_MEM &&
  ch218_pc_in_body`). Sampled on every clock so reads within a single
  pass through the callee are all captured.
- **Print task `ch218_print_callee_reads`** dumps every captured
  read with its instruction mnemonic, then sweeps the array for an
  EA that appears across **multiple distinct passes** and returns
  **identical data**. That EA is the static-poll candidate.
- **Verdict classifier** with eight labels. The first six name the
  Ch258 target region directly; the last two cover runs with no
  usable data:

| Verdict label | EA range | Ch258 action |
|---------------|----------|--------------|
| `timer_poll_static` | `0x10000000 – 0x10001FFF` | Land `ee_timer_stub` (PRIMARY) |
| `sif_mailbox_static` | `0x1000F200 – 0x1000F2FF` | Route `sif_mailbox_stub` ee-side into EE map decode |
| `ee_intc_static` | `0x1000F000 – 0x1000F1FF` | Wire VBLANK / TIMER / SIF source(s) into `intc_stub.irq_src` |
| `bootstrap_mmio_static` | `0x1F800000 – 0x1F80FFFF` | Add Ch202-style hardcoded ready at the polled offset |
| `ee_ram_static` | `0x00000000 – 0x01FFFFFF` | Preload kernel global via `boot_install_agent_stub` |
| `named_region_static` | (any other range) | Report verbatim; pick the next Ch258 hypothesis |
| `no_repeated_read_across_passes` | n/a | Not enough passes captured, or callee uses scratch only |
| `no_callee_reads` | n/a | Synthetic-CI mode (callee never reached); not applicable |

- **Wiring:** the print task is called from both the long-run halt
  path (after `ch217_print_longjmp_path`) and the timeout path (the
  one real-BIOS mode exits through). Synthetic CI mode skips both
  call sites because they sit inside `ch213_sc8_seen`-gated blocks:
  no regression noise, no behavior change to existing TBs.

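The sweep-and-classify logic described above is compact enough to sketch in Python. The labels and address ranges are copied from the verdict table; everything else is an assumption for illustration, including the function names, the `(pass_idx, ea, data)` tuple shape, the use of physical addresses, and the tie-break of classifying the lowest qualifying EA when several qualify.

```python
# (label, first address, last address), copied from the verdict table.
REGIONS = [
    ("timer_poll_static",     0x10000000, 0x10001FFF),
    ("sif_mailbox_static",    0x1000F200, 0x1000F2FF),
    ("ee_intc_static",        0x1000F000, 0x1000F1FF),
    ("bootstrap_mmio_static", 0x1F800000, 0x1F80FFFF),
    ("ee_ram_static",         0x00000000, 0x01FFFFFF),
]


def static_poll_candidates(reads):
    """EAs read in >= 2 distinct passes that always returned the same data."""
    by_ea = {}
    for pass_idx, ea, data in reads:
        passes, values = by_ea.setdefault(ea, (set(), set()))
        passes.add(pass_idx)
        values.add(data)
    return sorted(
        ea for ea, (passes, values) in by_ea.items()
        if len(passes) >= 2 and len(values) == 1
    )


def classify(reads):
    """Map a capture of (pass_idx, ea, data) tuples to a verdict label."""
    reads = list(reads)
    if not reads:
        return "no_callee_reads"
    candidates = static_poll_candidates(reads)
    if not candidates:
        return "no_repeated_read_across_passes"
    ea = candidates[0]  # ASSUMPTION: lowest EA wins when several qualify
    for label, lo, hi in REGIONS:
        if lo <= ea <= hi:
            return label
    return "named_region_static"
```

For example, a capture in which physical address `0x1F8010F0` returns `0x07654321` on two different passes classifies as `bootstrap_mmio_static`, matching the PCR read that the arc eventually surfaced.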
**No RTL change**, production or otherwise, and no new TB. This is a
pure TB-side diagnostic added to an existing TB. The full sim
regression stays at **155 PASS / 0 FAIL** with the observer dormant in
every TB except the one real-BIOS run that the operator triggers
manually.

## How to use Ch218: operator command

```
cd sim
make tb_ee_core_bios_smoke BIOS=/home/ubuntu/Downloads/bios.hex
```

This runs `tb_ee_core_bios_smoke` with `+BIOS=...` so the real
4 MiB BIOS dump is loaded. The TB will eventually hit the Ch215
longjmp treadmill, loop the Ch217 callee multiple times, and
time out. The timeout path prints:

```
[ch217] LONGJMP_PATH_DECODE 0xBFC52350..0xBFC52390:
[ch217] ...
[ch217] verdict=longjmp_return_repeats_due_to_static_state ...
[ch218] CALLEE_BODY_READS jal_target=0x?????.??..0x?????.?? captured=N (cap=64)
[ch218] [0] pass=0 pc=0x... instr=0x... ea=0x... data=0x... rt=$. <mnemonic>
[ch218] [1] pass=0 pc=0x... ...
[ch218] ...
[ch218] verdict=<label> (...)
```

Report the `[ch218] verdict=` line. Ch258 picks the table row.

## Ch258 proposal: one narrow next state to model

**Driven by the Ch218 verdict; no decision required before observer
data lands.**

**Primary hypothesis: EE Timer block `0x10000000-0x10001FFF`.** PS2 BIOS
kernels rely heavily on T0-T3 for scheduler ticks; the entire region
is completely unmapped on the EE side (no decode entry in
`ee_memory_map_stub`). Both the longjmp treadmill AND the DEADBEEF
wedge are explained by a single root cause if the kernel polls a timer
count and the unmapped read returns the sentinel that becomes the
DEADBEEF poison.

**If Ch218 verdict = `timer_poll_static`:** land
`rtl/ee/ee_timer_stub.sv` with 4 channels × {`T_COUNT`, `T_MODE`,
`T_COMP`, `T_HOLD`} at base addresses `0x10000000`, `0x10000800`,
`0x10001000`, `0x10001800`. `T_COUNT` is RO and ticks at a parameterized
rate (default: design_clk / 16 ≈ 3 MHz at 50 MHz, matching PS2 BUSCLK
/ 16). Writes latch `T_MODE`/`T_COMP`/`T_HOLD`. No IRQ wired yet
(deferred to Ch259 if needed). Wire into `ee_memory_map_stub` between
the BIU and bootstrap MMIO decode regions.

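The proposed timer stub is small enough to model in a few lines. The Python below is a behavioural sketch of the proposal, not of any landed RTL: the four base addresses and the divide-by-16 default come from the paragraph above, while the class and function names, the `0x800` channel stride used for decode, and the 16-bit `T_COUNT` width are assumptions.

```python
TIMER_BASES = (0x10000000, 0x10000800, 0x10001000, 0x10001800)
CHANNEL_STRIDE = 0x800   # ASSUMPTION: derived from the spacing of the bases
COUNT_MASK = 0xFFFF      # ASSUMPTION: 16-bit T_COUNT


def channel_index(addr):
    """Return which timer channel (0..3) an address decodes to."""
    offset = addr - TIMER_BASES[0]
    if not 0 <= offset < CHANNEL_STRIDE * len(TIMER_BASES):
        raise ValueError(f"{addr:#x} is outside the timer block")
    return offset // CHANNEL_STRIDE


class TimerChannel:
    """Free-running T_COUNT that advances once every `divider` clocks."""

    def __init__(self, divider=16):
        self.divider = divider
        self.prescale = 0   # clocks accumulated toward the next count
        self.count = 0      # read-only T_COUNT value

    def tick(self, cycles=1):
        total = self.prescale + cycles
        self.count = (self.count + total // self.divider) & COUNT_MASK
        self.prescale = total % self.divider


# Default rate at a 50 MHz design clock:
tick_rate_hz = 50_000_000 / 16
```

The property that matters for the treadmill is visible directly: two reads of `count` separated by at least `divider` clocks return different values, so a poller would no longer see the same answer on every pass.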
**If Ch218 verdict is anything else:** follow the table in the Ch257
section above. The verdict label names the Ch258 action directly.

**Why not chase SIF/IOP first:** the SIF gap is real and known but
it's a multi-chapter project (EE decode → mailbox in production
hierarchy → IOP BIOS → IOP CPU exercises → mailbox writes land). The
timer gap is a single self-contained stub with one parameter (tick
rate) and four registers per channel. If observation shows the callee
is a timer poll, this unsticks the treadmill in one chapter. If
observation shows otherwise, the named region is the Ch258 target.

**Acceptance for Ch258 (whichever path):**

1. The Ch218 verdict line drives the chapter choice: no judgment
   call without data.
2. The chosen stub lands cleanly and the full regression stays
   155 PASS / 0 FAIL.
3. Running `tb_ee_core_bios_smoke +BIOS=...` produces observable
   forward motion: either Ch217's caller-passes count diverges (BIOS
   exited the loop), or a new UNMAPPED region surfaces (BIOS made
   progress but hit the next gap).

---

## Open questions deferred to Ch258+

- IOP BIOS execution: does the IOP need a real BIOS dump, or is a
  synthetic IOP bootloader plus `IOPBTCONF` parsing enough?
- Real PS2 INTC XOR mask semantics: BIOS code that XORs a mask bit
  will misbehave on plain-write `intc_stub`. Lower priority; wait
  until BIOS demonstrably hits it.
- VBLANK source wiring (PCRTC `frame_seen` toggle → `intc_stub.irq_src`):
  near-term if the treadmill turns out to be VBLANK-driven, but
  defer until the observer confirms.
- SIF mailbox in production hierarchy: instantiate
  `sif_mailbox_stub` alongside `intc_stub` at the EE-side
  `0x1000F200` decode region; ALSO wire the IOP-side EE-write port
  through to the IOP's memory map for a bidirectional handshake.
- Kernel global modeling: if the callee turns out to poll a RAM
  address that some IOP-side write should mutate, that's the
  trigger to land the IOP in the production hierarchy (also a
  multi-chapter project).

---

## Files referenced

**RTL:**
- `rtl/sif/sif_mailbox_stub.sv` (all 8 SIF stubs in `rtl/sif/`)
- `rtl/iop/iop_core_stub.sv`, `iop_memory_map_stub.sv`,
  `iop_dmac_reg_stub.sv`, `sio2_input_stub.sv`
- `rtl/intc/intc_stub.sv`
- `rtl/dmac/dmac_reg_stub.sv`
- `rtl/ee/ee_biu_mmio_stub.sv`, `ee_bootstrap_mmio_stub.sv`,
  `ee_memory_map_stub.sv`, `ee_core_stub.sv` (Ch215 jmp_buf FSM,
  STRICT_UNSUPPORTED latch)

**TBs:**
- `sim/tb/integration/tb_ee_core_bios_smoke.sv` (Ch214/216/217
  diagnostics, lineage capture, DEADBEEF wedge characterization)
- `sim/tb/integration/tb_iop_core_bios_smoke.sv` (IOP-side mirror,
  STRICT_UNSUPPORTED)

**Real BIOS dump:**
- `/home/ubuntu/Downloads/bios.hex` (9 MiB ASCII, ~4 MiB binary,
  `$readmemh`-ready)