SIO2 / pad input contract
Status: Draft / partial impl (Ch233 recon + Ch234 Option-A implementation
landed). RTL: rtl/iop/sio2_input_stub.sv.
Successor chapters (Ch235+) extend to analog / SIF mailbox / faithful SIO2.
Ch234 implementation (landed)
sio2_input_stub.sv is the Option-A surface from the recon below. It
sits inside iop_memory_map_stub and translates the bridge-domain
INPUT_P1 / INPUT_P2 bitmaps into a Sony-format 16-bit digital pad
word readable from the IOP-side MMIO bus.
IOP MMIO surface (retroDE-local, not Sony-compatible):
| Offset | Reg | Layout |
|---|---|---|
| 0x1F80_8500 | PAD_P1_STATE | [7:0]=byte3 (D-pad/start/select/sticks), [15:8]=byte4 (face/shoulder), [31:16]=0 |
| 0x1F80_8504 | PAD_P2_STATE | Same shape, sourced from INPUT_P2 |
| 0x1F80_8508 | PAD_STATUS | [0]=present/valid=1, [31:1]=0 |
| other | reserved | Read 0; write accepted-and-ignored |
CDC: 2-FF synchronizer per bit on each of the 32-bit INPUT_P1
and INPUT_P2 inputs. Bridge writes at retrodesd's ≤ 1 kHz rate are
at least tens of thousands of design-clock cycles apart (1 ms is
about 33,000 cycles at the 33 MHz IOP clock used in sim), so
partial-bit tearing during the sync settling window is theoretically
possible but vanishingly rare in practice. A future chapter can
promote to "snapshot CDC" (latch + 2-sample coherency) if tearing
ever becomes observable.
Active-high → active-low: each INPUT_P1 bit equal to 1 (pressed)
maps to the corresponding Sony bit equal to 0 (pressed). Two
combinational ~{...} assigns do the per-bit permutation +
inversion in one cycle each.
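
As a reading aid, the sketch below decodes a PAD_P1_STATE / PAD_P2_STATE read back into button names. It is a software model only, not part of the RTL: the function name and the button-name strings are illustrative assumptions, and the bit positions are taken from the Sony pad-state table in the original recon section of this document.

```python
# Sony bit positions, listed bit 7 down to bit 0, for the two digital pad bytes.
BYTE3_NAMES = ["LEFT", "DOWN", "RIGHT", "UP", "START", "R3", "L3", "SELECT"]
BYTE4_NAMES = ["SQUARE", "CROSS", "CIRCLE", "TRIANGLE", "R1", "L1", "R2", "L2"]


def pressed_buttons(pad_state_word):
    """Return the set of pressed buttons for a 32-bit PAD_Px_STATE read.

    Hypothetical helper: [7:0] is byte3, [15:8] is byte4, upper bits ignored.
    """
    byte3 = pad_state_word & 0xFF
    byte4 = (pad_state_word >> 8) & 0xFF
    pressed = set()
    for byte, names in ((byte3, BYTE3_NAMES), (byte4, BYTE4_NAMES)):
        for index, name in enumerate(names):
            bit = 7 - index                 # names are listed MSB first
            if not (byte >> bit) & 1:       # active-low: 0 means pressed
                pressed.add(name)
    return pressed
```

For example, a read of 0xFFFF decodes to no buttons, and 0xFFDF decodes to RIGHT only.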
Coverage:
sim/tb/iop/tb_sio2_input_stub.sv
exercises the new module directly (without going through the IOP
map): reset state (all reads 0 except PAD_STATUS); no-buttons →
Sony word 0xFFFF; single-bit pressed across all 16 retroDE bits;
JOY_OSD (bit 16) deliberately not forwarded; combos (START+SELECT,
face+D-pad); P1/P2 independence with distinct patterns; writes
accepted-and-ignored; out-of-range word offsets read 0; clearing
returns to 0xFFFF. 152 PASS sim regression intact (151 baseline + new TB).
The iop_memory_map_stub now also routes the new region in its
read-response mux and trace; CPU reads to addresses in
0x1F80_8500..0x1F80_85FF route to the stub, others fall through
unchanged. Sixteen existing IOP-map-consuming TBs gained a
.input_p1(32'd0), .input_p2(32'd0) tie-off since the map signature
gained two new input ports.
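
A minimal Python sketch of the read routing described above, useful as a golden model when writing TB expectations. The function name is hypothetical, word-aligned decoding is an assumption (the doc does not state how sub-word addresses are handled), and reset behaviour is not modelled.

```python
PAD_WINDOW_BASE = 0x1F80_8500
PAD_WINDOW_LAST = 0x1F80_85FF


def pad_window_read(addr, p1_word, p2_word):
    """Model a CPU read of the retroDE-local pad window.

    Returns None when the address is outside the window, meaning the read
    falls through to the rest of the IOP map unchanged.
    """
    if not (PAD_WINDOW_BASE <= addr <= PAD_WINDOW_LAST):
        return None
    offset = (addr - PAD_WINDOW_BASE) & ~0x3   # assumed word-aligned decode
    if offset == 0x0:
        return p1_word & 0xFFFF                # PAD_P1_STATE, [31:16]=0
    if offset == 0x4:
        return p2_word & 0xFFFF                # PAD_P2_STATE
    if offset == 0x8:
        return 0x1                             # PAD_STATUS: present/valid
    return 0                                   # reserved offsets read 0
```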
Bridge-side output ports landed in Ch235. ps2_hps_bridge now
exposes input_p1_o / input_p2_o as bridge-clock-domain
broadcasts of the Ch222 latches; iop_memory_map_stub.input_p1 /
input_p2 consume them directly. The board top wires the bridge's
new outputs to a pair of local bridge_input_p1 / bridge_input_p2
nets (unconnected for now — the synth top doesn't yet instantiate
the IOP core, but the wires are placed for future hookup).
The full HPS → bridge → IOP path is sim-validated end-to-end by
sim/tb/integration/tb_bridge_iop_pad_input.sv:
two distinct clocks (100 MHz bclk for the bridge, 33 MHz iclk for
the IOP map) so the bridge-clk → IOP-clk CDC inside the
sio2_input_stub is genuinely exercised. The TB drives AXI writes
into INPUT_P1/P2 at the standard 0x040/0x044 offsets and reads
PAD_P1_STATE/PAD_P2_STATE at 0x1F80_8500/0x1F80_8504 — exactly the
operator-visible end-to-end flow.
Ch237 — EE-visible pad-state buffer (recon)
Status: Recon (no RTL). Defines how the IOP-local Sony pad word
(Ch234) becomes an EE-readable 16-byte buffer that libpad-shaped
code can consume.
Why this recon exists
Ch234 gave PS2-side IOP code access to a Sony-format pad word. Ch235 wired the HPS→IOP half on real (sim) silicon. But the EE half — how EE-side software (eventually libpad, or hand-rolled homebrew) sees pad state — is still undefined. Ch237 picks a shape before Ch238 starts soldering RTL.
Survey: SIF infrastructure that already exists
The SIF seam is feature-complete for staged bring-up per
rtl/sif/README.md. Relevant
already-landed pieces for the pad-state path:
| Module | What it does |
|---|---|
| sif_mailbox_stub | 4-register mailbox: MSCOM / SMCOM / MSFLG / SMFLG. Both EE-side and IOP-side ports. |
| sif_dma_iop_ram_bridge_stub | EE→IOP DMA: 128-bit qword → 4×32-bit IOP RAM writes (with DEST_BASE_ADDR). |
| sif_dma_ee_ram_bridge_stub | IOP→EE DMA: 4×32-bit IOP beats → 1×128-bit EE-RAM write at DEST_BASE_ADDR. Has last_seen_o. |
| sif_dma_ack_peer_stub | Mailbox doorbell + payload-complete combiner (EE side waits). |
| sif_dma_ee_ack_peer_stub | IOP-driven equivalent (mirror polarity). |
| boot_install_agent_stub | EE-driven boot-image landing through SIF (different traffic shape but same primitives). |
The IOP→EE data path already exists in RTL form. A 16-byte pad-state buffer arriving at a fixed EE-RAM address is one sif_dma_ee_ram_bridge transaction — exactly four 32-bit beats. The protocol-combiner peers handle the "payload landed, notify the other side via mailbox flag" sequence both ways.
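
The "four 32-bit beats make one qword" arithmetic can be sketched in Python as below. This is an illustrative model, not the bridge RTL: the function names are hypothetical, and placing beat 0 in the least-significant 32 bits is an assumption based on the little-endian beat packing described later in this document.

```python
def beats_to_qword(beats):
    """Assemble four 32-bit beats into one 128-bit integer.

    Assumption: beat 0 occupies bits [31:0], beat 3 occupies bits [127:96].
    """
    if len(beats) != 4:
        raise ValueError("expected exactly four 32-bit beats")
    qword = 0
    for index, beat in enumerate(beats):
        qword |= (beat & 0xFFFF_FFFF) << (32 * index)
    return qword


def qword_to_bytes(qword):
    """View the 128-bit value as the 16 bytes that would land in EE RAM."""
    return qword.to_bytes(16, "little")
```

With this lane ordering, byte 0 of the EE-RAM buffer is the low byte of the first beat, so there is no partial-qword case to handle for a 16-byte payload.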
What does NOT exist today
- EE-side SIF register decode in ee_memory_map_stub. Real PS2 has SIF MSCOM/SMCOM/MSFLG/SMFLG visible to the EE at 0x1000_F200..0x1000_F2FF; the EE map doesn't yet decode that range. sif_mailbox_stub has an EE-side port, but no EE map routes CPU reads/writes there yet. (The IOP-side map decodes its own SIF window at 0x1D00_0000+.)
- No EE-side execution primitive in the synth top. Same silicon-truth caveat as the IOP side from Ch236: tb_* TBs exercise EE↔IOP coordination in sim with real EE/IOP CPU stubs, but the synth top doesn't instantiate either. The path can land in sim and stay sim-only until a future top-integration chapter wires both CPUs in.
- No libpad / padman RPC layer. Real PS2: padman.irx on IOP receives RPC calls from EE-side libpad, services them with SIF DMAs back to EE buffers. The RPC layer is software on both sides, not RTL. Ch237 scope is the RTL-level buffer-delivery path; the RPC protocol on top can come later.
Three options for the EE-visible surface
Option A — IOP→EE DMA into a fixed EE-RAM buffer (recommended)
Shape: IOP code reads PAD_P1_STATE / PAD_P2_STATE
(Ch234), constructs a 16-byte Sony pad-state struct in IOP RAM,
DMAs it via sif_dma_ee_ram_bridge_stub to a fixed address in
EE RAM (e.g., EE_PAD_BUFFER_BASE = 0x0008_0000). EE-side code
reads from that address.
Pros:
- Uses the existing sif_dma_ee_ram_bridge_stub as-is.
- Matches the shape libpad expects: pad state lands in EE-allocated memory, EE reads bytes directly.
- The fixed address is a stub convention; a future libpad layer can carry the real per-port allocation address.
- 16 bytes = exactly four 32-bit SIF DMA beats = exactly one qword write at the EE-RAM bridge. No partial-quad edge cases.
Cons:
- Requires an IOP-side execution context that reads PAD_P1_STATE and drives the DMA, but Ch235's tb_bridge_iop_pad_input is the template; we already have small synthetic-IOP-code patterns in tb_iop_* TBs.
- The DMA path has ack/handshake latency (mailbox doorbell + 4-beat DMA + completion flag). For Ch238's first stub this is fine; for real-time pad polling at 60 Hz it's also more than fine (each transaction is microseconds at typical clock rates).
Option B — Mailbox register packing (smallest possible)
Shape: IOP packs the 16-byte pad state into the 4×32-bit
mailbox registers (MSCOM / SMCOM / MSFLG / SMFLG).
EE reads them via the (not-yet-decoded) EE-side SIF window.
Pros:
- No DMA, no payload completion. Just register writes.
- Even smaller scope than Option A — could be one TB chapter.
- Mailbox storage already exists.
Cons:
- Overloads mailbox semantics: real PS2 uses MSFLG/SMFLG as flag/doorbell registers, not data carriers. A naive stub here breaks any future mailbox-based RPC protocol.
- Not libpad-compatible at all. Real libpad never reads pad state from SIF mailbox registers; it reads from a DMA-populated EE-RAM buffer. Option B would require all EE-side code to use a retroDE-local (non-Sony) convention.
- Still requires EE-side SIF window decode, so the "small" advantage shrinks once the EE map work is needed anyway.
Option C — retroDE-local EE MMIO (mirror IOP-side stub)
Shape: Add a pad_input_ee_stub in the EE map at a
retroDE-local address (e.g., 0x1B00_8500 deliberately
outside any real PS2 region). Combinationally surface the
same Sony pad words the IOP-side stub exposes.
Pros:
- Zero protocol overhead — combinational mirror, single register read.
- No SIF involvement, no DMA, no handshake.
- Symmetric with Ch234's IOP-side pattern.
Cons:
- Doubles the platform-local surface — two non-Sony register windows (IOP + EE) doing the same thing.
- Bypasses SIF entirely, so it doesn't exercise the EE↔IOP path that libpad / real games actually use.
- Doesn't help with eventual SIF/RPC compatibility — when Option A lands, Option C becomes dead code.
Recommendation
Option A for the substantive next chapter. Reasoning:
- The existing sif_dma_ee_ram_bridge_stub already implements "IOP-side 4 beats → 1 qword EE-RAM write at a known address". Reusing it costs zero new RTL on the data path.
- The shape matches libpad's expected dataflow, so future RPC work composes cleanly without semantic refactoring.
- The fixed-address convention is a single parameter; a real libpad layer can override it per port without changing the RTL surface.
Option B is tempting for "fastest visible EE-side proof" but breaks libpad-shape; Option C is tempting for symmetry but creates dead code once Option A lands.
Where the path lights up in existing stubs
For a sim-only Ch238 (Option A), the data flow is:
sio2_input_stub.PAD_P1_STATE // Ch234 — IOP reads here
│
▼ (IOP-side test code: read, copy to IOP RAM)
iop_ram (16 bytes at iop_pad_buffer_addr)
│
▼ IOP DMAC → sif_dma_iop_ram_bridge_stub egress // EXISTS
sif_dma_stub (EE-side ingress buffer) // EXISTS
│
▼ sif_dma_ee_ram_bridge_stub → ee_memory_map.bridge_wr // EXISTS
ee_ram (16 bytes at EE_PAD_BUFFER_BASE) // EXISTS
│
▼ EE-side test code: cpu_rd from EE_PAD_BUFFER_BASE
EE-readable pad state ← target
The only new pieces needed are:
- A small IOP-side test harness that drives the read → DMA sequence (TB-level glue or a tiny synthetic-IOP-code fragment loaded into IOP RAM).
- A new integration TB that wires all the existing stubs end-to-end and asserts an EE-side read of EE_PAD_BUFFER_BASE matches the Sony pad word from PAD_P1_STATE within some bounded latency.
No new RTL module is strictly required for Ch238 — the path composes from existing primitives. If the integration TB turns up a missing piece (e.g., a more convenient pad-state packing helper), that's a candidate for new RTL; otherwise Ch238 lands as one new TB plus possibly one tiny helper.
Proposed chapter sequence
| Ch | Scope |
|---|---|
| Ch238 | Integration TB. Wires the existing IOP map (with Ch234 sio2_input_stub) + IOP DMAC + SIF mailbox + SIF DMA primitives + EE map → IOP-side test sequencer reads PAD_P1_STATE, packs a 16-byte Sony struct into IOP RAM, kicks an IOP→EE SIF DMA, signals via mailbox flag, then EE-side TB code reads the buffer at EE_PAD_BUFFER_BASE and verifies the bytes. End-to-end latency expected: ≤ a few microseconds at the existing clock rates. |
| Ch239 | EE-side read surface polish: decode the SIF MSCOM/SMCOM window in ee_memory_map_stub (it currently doesn't decode SIF — fixing that lets the EE CPU stub poll the mailbox pad-ready flag without TB intervention). Optionally a tiny EE-side test program loaded into EE RAM that does lw $v0, EE_PAD_BUFFER_BASE and traces the result. |
| Ch240+ | Real padman/libpad RPC compatibility: define the RPC frame format, build the EE-side request/IOP-side response pair, support multi-port + connected/disconnected state changes. Largest single chapter in the input arc — defer until Ch238+Ch239 are green and there's a real game/BIOS workflow demanding it. |
Out of scope for Ch237 / Ch238 / Ch239
- Analog stick fidelity (still digital-mode-only at all three Ch222 / Ch234 / Ch238 levels).
- DualShock 2 pressure-sensitive buttons.
- Multitap support.
- Vibration / actuator feedback (output direction).
- Faithful SIO2 protocol emulation at 0x1F80_8200..0x1F80_82FF (deferred per Ch233 / Ch234 reasoning).
- Top-level synth integration of the IOP and EE cores. Until that lands, Ch238+ are sim-only chapters; the silicon-side story stays the Ch236 disclaimer ("non-zero INPUT_P1 values mean the bridge latch landed, NOT that PS2 code saw it").
Boundary call
The existing SIF DMA + mailbox infrastructure already implements the IOP→EE data delivery path; Ch238 only needs to compose those primitives with a small IOP-side test sequencer and define EE_PAD_BUFFER_BASE. Real libpad/padman compatibility is a software layer on top of that path, not a separate RTL surface; it is Ch240+ work, post-MVP for the input arc.
Ch238 implementation (landed)
Option A is proven end-to-end in sim with no new production RTL — the path composes entirely from existing primitives.
New integration TB
sim/tb/integration/tb_pad_state_via_sif_to_ee.sv:
| Stage | Module |
|---|---|
| HPS AXI write | TB drives bridge's AXI4 slave |
| Bridge latch | ps2_hps_bridge (Ch222 INPUT_P1) |
| Bridge→IOP CDC | sio2_input_stub (Ch234 inside IOP map) |
| IOP read of pad word | TB-side IOP read at 0x1F80_8500 |
| 16-byte pad packet | TB packs Sony struct (status/type/token/byte3/byte4 + analog centers 0x80) |
| 4-beat SIF DMA | TB drives sif_dma_ee_ram_bridge_stub.in_* |
| EE-RAM landing | ee_memory_map_stub.bridge_wr_* → ee_ram_stub |
| EE-side verification | TB issues DMAC qword read at landing addr |
Two clocks (100 MHz bridge, 33 MHz IOP/EE/SIF) so the
bridge-clk → IOP-clk CDC inside sio2_input_stub is genuinely
exercised end-to-end.
Pad packet layout (16 bytes, packed into 4 little-endian 32-bit beats):
byte 0 : 0x00 success status
byte 1 : 0x41 response type (digital mode)
byte 2 : 0x5A success token
byte 3 : Sony byte3 D-pad/start/select/sticks (active-low)
byte 4 : Sony byte4 face/shoulder (active-low)
bytes 5–8 : 0x80 RX/RY/LX/LY analog centers (digital mode)
bytes 9–15: 0x00 reserved (DualShock 2 pressure)
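
A short Python sketch of the packing step, assuming the layout listed above. The function names are illustrative (they are not the TB's task names); only the byte values and the little-endian beat packing come from this document.

```python
import struct


def build_pad_packet(byte3, byte4):
    """Return the 16-byte digital-mode pad packet for the given pad bytes."""
    packet = bytes([0x00, 0x41, 0x5A, byte3 & 0xFF, byte4 & 0xFF])
    packet += bytes([0x80] * 4)    # bytes 5-8: analog centers
    packet += bytes(7)             # bytes 9-15: reserved, zero-filled
    return packet


def packet_to_beats(packet):
    """Split a 16-byte packet into four little-endian 32-bit beats."""
    if len(packet) != 16:
        raise ValueError("pad packet must be exactly 16 bytes")
    return list(struct.unpack("<4I", packet))
```

For byte3=0xDF, byte4=0xFF (JOY_RIGHT only), the first beat is 0xDF5A4100: status in the low byte, pad byte3 in the high byte.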
Verified scenarios:
| § | INPUT_P1 (AXI write to 0x040) | Expected Sony bytes 3/4 |
|---|---|---|
| §1 | 0x00000000 (no buttons) | byte3=0xFF, byte4=0xFF |
| §2 | 0x00000001 (JOY_RIGHT only) | byte3=0xDF (bit 5 cleared), byte4=0xFF |
| §3 | `0x00000031 \| (1<<6)` (RIGHT+START+SEL+△) | byte3=0xD6, byte4=0xEF |
| §4 | 0x00000000 (re-clear) | byte3=0xFF, byte4=0xFF |
The TB also confirms last_seen_o rises after each 4-beat
burst (proves the in_last semantics propagate cleanly through
the egress bridge's state machine).
Streaming-bridge note (timing artifact, not a bug): the
existing sif_dma_ee_ram_bridge_stub advances wr_offset by
16 after every emit (streaming semantics — designed for
multi-qword DMAs). Successive scenarios in this TB therefore
land at successive 16-byte slots; the TB tracks the per-scenario
landing address (EE_PAD_BUFFER_BASE + scenario_idx * 16) and
verifies the byte layout at each. A real libpad/padman
implementation will need either (a) a bridge-reset between
transfers so every padRead() overwrites the same buffer, or
(b) an SPS2-side counter so EE knows which slot holds the
latest sample. That decision belongs to Ch239+, not Ch238.
P2 is deliberately left out of the first slice per Codex Ch238 framing. The next chapter can either reuse the same 16-byte slot (overwriting P1 each emit) or move to a multi-port layout (P1 at +0, P2 at +16, etc.).
Sim regression bumps from 153 → 154 PASS (new TB only, zero RTL change).
Ch239 — single-slot buffer contract (landed)
Ch238 exposed the streaming offset of
sif_dma_ee_ram_bridge_stub (each emit advances wr_offset by
16). For a libpad-style consumer that wants padRead(port, &buf)
to return a stable snapshot at a single buffer address, that's
the wrong default. Ch239 adds a narrow rewind input that lets a
producer reset the streaming offset between transfers — no other
SIF semantics change.
RTL change
One new input on
rtl/sif/sif_dma_ee_ram_bridge_stub.sv:
input logic rewind_i = 1'b0 // default — keeps existing consumers untouched
Behavior:
- When rewind_i pulses HIGH (typically one iclk), wr_offset returns to 32'd0 on the next clock edge. The very next emit lands at DEST_BASE_ADDR + 0.
- The accumulator (acc_data, acc_be, pos) is already zeroed at every emit's tail, so rewind doesn't need to touch them.
- Rewind is intended to fire between transfers, when the bridge is idle (state == S_ACCUM && pos == 0). Misuse is caught by a sim-only $error assertion; the RTL still applies the rewind so the bug is loud, not silent.
The port has a 1'b0 default so existing instantiations (5 TBs,
zero RTL parents) keep their streaming behaviour without code
changes. Compile-checked against tb_sif_ee_landing_via_dmac —
passes with no modification.
Single-slot buffer contract (new convention)
A producer using rewind gets these guarantees:
| Property | Value / meaning |
|---|---|
| Buffer base | DEST_BASE_ADDR (parameter; pad-state path uses 0x0008_0000) |
| Buffer length | One 16-byte qword |
| Rewind cadence | One rewind_i pulse BEFORE each 4-beat transfer (between scenarios) |
| Stale-byte safety | Each transfer's bridge_wr_be = 16'hFFFF (all 16 bytes enabled), so a fresh full-length transfer overwrites every byte — no leftover content from a prior transfer can survive |
| Mid-transfer rewind | Illegal. Sim $error. Producer must wait for last_seen_o (or just a few clocks after the in_last beat) before pulsing rewind again |
For libpad-style single-slot semantics (padRead(port, &buf)
returning the same &buf every call), a producer pulses rewind
between each pad packet. The consumer reads from the fixed
address; the producer overwrites the slot in place.
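
The difference between the streaming default and the single-slot contract can be illustrated with a small behavioural model. This is a sketch of the landing-address logic only: the class and method names are hypothetical, and beat accumulation, byte enables, last_seen_o, and the mid-transfer $error check are deliberately not modelled.

```python
class EeRamBridgeModel:
    """Behavioural sketch of where each 16-byte emit lands in EE RAM."""

    def __init__(self, dest_base_addr=0x0008_0000):
        self.dest_base_addr = dest_base_addr
        self.wr_offset = 0
        self.ram = {}                  # landing address -> 16-byte payload

    def rewind(self):
        """Model a rewind_i pulse issued while the bridge is idle."""
        self.wr_offset = 0

    def emit(self, payload):
        """Land one full qword, then advance the streaming offset by 16."""
        if len(payload) != 16:
            raise ValueError("payload must be exactly 16 bytes")
        addr = self.dest_base_addr + self.wr_offset
        self.ram[addr] = bytes(payload)   # all 16 bytes overwritten
        self.wr_offset += 16
        return addr
```

Without a rewind, two emits land at base+0 and base+16 (the Ch238 behaviour); calling rewind() before each emit keeps every packet at base+0, which is the single-slot contract.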
Coverage
tb_pad_state_via_sif_to_ee updated to exercise the contract:
- Every scenario pulses rewind_i BEFORE driving its 4 beats.
- All four scenarios read from the same EE_PAD_BUFFER_BASE address (no per-scenario indexing, unlike the Ch238 streaming-offset workaround).
- Per-scenario check_eq128 against the expected qword implicitly proves no stale bytes from prior scenarios survived: if any byte leaked through, the 128-bit equality check would fire.
- §3's combo pattern (0xD6/0xEF) differs from §1/§2/§4 in multiple bit positions across both pad bytes, so a partial-write bug would surface here even if simpler patterns happened to alias.
Existing tb_sif_ee_landing_via_dmac (which tests the bridge's
streaming behavior) passes unchanged with the rewind port at
its default 1'b0.
What last_seen_o means with rewind
last_seen_o is a level-held latch that rises on the in_last
beat's accept. The Ch239 rewind does NOT clear this latch — it
only touches wr_offset. A consumer can still gate on
last_seen_o to detect "any payload has landed since reset."
A future chapter that wants a per-transfer "fresh data" signal
(for libpad's padRead to know there's a new sample) will
likely add an emit_done_pulse_o strobe; that's distinct from
the rewind path and belongs with Ch240+ work.
Boundary call
Ch239 makes the single-slot buffer contract explicit and tested. A libpad-style consumer can now read a stable 16-byte pad packet at EE_PAD_BUFFER_BASE regardless of how many pad packets the producer has emitted. The next chapter (Ch240) can either decode the EE-side SIF register window in ee_memory_map_stub so EE CPU code can poll a "new sample" flag, or move on to a tiny EE-side test program that just reads from the fixed address.
Ch240 — EE-side consumer reads + branches (landed)
Ch239 stabilised the producer; Ch240 closes the consumer half with an actual EE-core program reading the buffer and branching on its contents. Per Codex framing, no EE-side SIF register decode yet — the EE program polls the fixed RAM-resident buffer directly.
EE test program
; Initialization
slot 0 LUI $1, 0x8008 ; $1 = EE_PAD_BUFFER_KSEG0 (0x80080000)
slot 1 LUI $5, 0x8000 ; $5 = EE_MARKER_KSEG0 base
slot 2 ORI $5, $5, 0x1000 ; $5 = 0x80001000
; Polling loop
LOOP: LBU $2, 3($1) ; $2 = pad byte3 (D-pad/start/select/sticks)
ORI $3, $0, 0xFF
BEQ $2, $3, MARK_A ; byte3 = 0xFF → no buttons
NOP
ORI $3, $0, 0xDF
BEQ $2, $3, MARK_B ; byte3 = 0xDF → JOY_RIGHT only
NOP
; fall-through → COMBO
COMBO: ORI $6, $0, 0xCC
SW $6, 0($5) ; marker C
J LOOP
NOP
MARK_A: ORI $6, $0, 0xAA
SW $6, 0($5) ; marker A
J LOOP
NOP
MARK_B: ORI $6, $0, 0xBB
SW $6, 0($5) ; marker B
J LOOP
NOP
22 instructions including delay slots; each loop iteration is roughly 10 instructions. The program runs continuously — every scenario the TB drives, the loop sees a new buffer value and writes a fresh marker within ~500 design-clock cycles (well inside the per-scenario wait).
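
The branch structure of the program above reduces to a three-way compare on pad byte3. The Python function below mirrors it as a reference for expected markers; the function name is illustrative, and only the compare values and marker constants come from the listing.

```python
MARKER_A = 0xAA   # byte3 == 0xFF: no buttons pressed
MARKER_B = 0xBB   # byte3 == 0xDF: JOY_RIGHT only
MARKER_C = 0xCC   # anything else: fall-through COMBO path


def marker_for_byte3(byte3):
    """Return the marker the EE loop would store for a given pad byte3."""
    byte3 &= 0xFF          # LBU zero-extends a single byte
    if byte3 == 0xFF:
        return MARKER_A
    if byte3 == 0xDF:
        return MARKER_B
    return MARKER_C
```

Note that the program compares the whole byte, so any byte3 value other than 0xFF or 0xDF, including other single-button presses, lands on marker C.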
Kseg0 vs useg routing (important detail)
ee_memory_map_stub routes EE-CPU writes to useg addresses
(addr[31] == 0) into an internal useg_shadow_mem array,
NOT the external ee_ram_stub. The TB's DMAC-side reader goes
through ee_ram_stub — different backing store. To make EE
writes round-trip through the same RAM the TB samples, the EE
program targets kseg0 addresses (0x80000000+):
- EE_PAD_BUFFER_KSEG0 = 0x8008_0000 (EE reads via LBU at this address; phys = 0x0008_0000 after kseg0 strip; routes to ee_ram_stub)
- EE_MARKER_KSEG0 = 0x8000_1000 (EE writes via SW at this address; same kseg0-strip routing)
The TB's DMAC-side reads use the matching physical
addresses (0x0008_0000 and 0x0000_1000) — same backing
RAM, different access port.
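
The routing rule can be summarised in a few lines of Python. This is a sketch under stated assumptions: the function name is hypothetical, the kseg0 strip is modelled as the standard MIPS mask of the top three address bits, and addresses above kseg0 are returned as "other" because this document does not describe how the stub routes them.

```python
KSEG0_BASE = 0x8000_0000
KSEG0_LAST = 0x9FFF_FFFF


def route_ee_cpu_access(vaddr):
    """Return (backing_store, address) for an EE-CPU access."""
    if (vaddr & 0x8000_0000) == 0:
        # useg: addr[31] == 0, kept in the map's internal shadow array
        return ("useg_shadow_mem", vaddr)
    if KSEG0_BASE <= vaddr <= KSEG0_LAST:
        # kseg0: strip the segment bits to get the physical address
        return ("ee_ram_stub", vaddr & 0x1FFF_FFFF)
    return ("other", vaddr)    # not covered by this document
```

Applied to the two constants above, 0x8008_0000 and 0x8000_1000 resolve to ee_ram_stub at 0x0008_0000 and 0x0000_1000, the same physical addresses the TB's DMAC-side reads use.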
Verified scenarios
| § | AXI write to INPUT_P1 | Pad byte3 the EE sees | Marker written |
|---|---|---|---|
| §1 | 0x0000_0000 (no buttons) | 0xFF | 0xAA |
| §2 | 0x0000_0001 (RIGHT only) | 0xDF (bit 5 cleared) | 0xBB |
| §3 | 0x0000_0021 (RIGHT + SELECT) | 0xDE (bits 0 and 5 cleared) | 0xCC |
| §4 | 0x0000_0000 (re-clear) | 0xFF | 0xAA |
Each scenario: AXI write → 20-iclk CDC settle → IOP-side read
of PAD_P1_STATE to confirm bridge latch arrived → pulse
rewind_i → drive 4 SIF beats → wait 500 iclk for the EE
program to consume the buffer and write the marker → TB DMAC
read of marker byte → assert.
Sim regression
154 → 155 PASS (one new TB only; no production-RTL changes).
What Ch240 explicitly does NOT do
- No EE-side SIF register decode. The ee_memory_map_stub still doesn't decode the SIF mailbox/flag window at 0x1000_F200..0x1000_F2FF. The EE program polls the RAM buffer directly instead of waiting on a doorbell.
- No libpad RPC. The marker convention is TB-internal; real libpad would marshal pad state through padman.irx via SIF RPC and into a libpad-allocated buffer with a known per-port address.
- No buffer-fresh signal. The EE loop doesn't know if it's reading the latest snapshot or the same one twice — it just reads every iteration. Adding an "emit counter" the consumer can compare against is a Ch241+ option.
Audit responses (per Codex)
Loop freshness: does each scenario's marker come from the NEW packet, not stale state? Yes. Three layers of evidence:
- Each scenario has a distinct expected marker (0xAA/0xBB/0xCC/0xAA). If the EE loop missed a buffer update and read the prior packet, the wrong marker would land and the per-scenario check_eq32 would fire.
- §4 is the "clear and observe marker returns" case: after §3's combo write left the marker at 0xCC, §4 re-clears INPUT_P1 → byte3 returns to 0xFF → the loop branches to MARK_A → marker overwritten back to 0xAA. That specifically proves the EE loop is consuming live buffer state, not caching the first read.
- Per-scenario wait is 500 design-clock cycles. Each EE loop iteration is ~10 instructions × ~5 cycles each ≈ 50 cycles, so the wait covers ~10 loop iterations, which is plenty of slack.
Branch semantics: are markers keyed to cleared bits (active-low), not set bits? Yes:
- 0xFF (all bits SET) = no buttons pressed → MARK_A. Set bits = released. ✓
- 0xDF (bit 5 CLEARED) = JOY_RIGHT pressed → MARK_B. The cleared bit is what indicates "pressed." ✓
- 0xDE (bits 5 AND 0 CLEARED) = JOY_RIGHT + JOY_SELECT pressed → falls through to COMBO (marker C). ✓

A polarity inversion would be visible: e.g. if the program
treated 0xFF as "all pressed" and branched to COMBO, §1
would land 0xCC instead of 0xAA and the test would fire.
The fact that §1 + §4 both successfully match MARK_A on the
"no buttons" stimulus proves the active-low semantics are
honored end-to-end (sio2_input_stub's per-bit inversion +
the EE program's branch direction).
Boundary call
The full input arc is sim-validated end-to-end: HPS writes INPUT_P1 → bridge latches → IOP-side sio2_input_stub translates to Sony pad bytes → producer packs a 16-byte Sony struct → SIF DMA drops it into EE RAM at a fixed slot (Ch239 rewind keeping the slot stable) → EE-side MIPS code branches on a button bit → writes a per-scenario marker the consumer-side TB samples. Active-low, freshness, and clear-and-restore behaviors are all covered by the existing tb_ee_pad_buffer_branch §1–§4 scenarios. Next options: EE-side SIF mailbox/flag decode (Ch242+), per-emit "new sample" gating, or pivot back to a different arc; input is done as far as platform RTL is concerned.
Original recon (Ch233)
Why this doc exists
Ch222–Ch232 made the retroDE platform shell live on PS2: HPS writes
controller bitmaps into ps2_hps_bridge.INPUT_P1/P2/P1_RAW (offsets
0x040/0x044/0x048), the OSD compositor renders text over PS2 video, and
the supervisor menu round-trip is silicon-validated. The next bridge to
build is between HPS-visible input latches and PS2-side software
that wants to read controller state (eventually a real BIOS / game).
This doc maps that gap so the next code chapter has a small, named target instead of an open question.
Scope (Codex Ch233 framing)
- Survey existing PS2-side stubs touching SIO2 / pad / controller paths.
- Document what the real PS2 BIOS/game touches first for controller input.
- Map Ch222 INPUT_P1/INPUT_P1_RAW bits into a proposed internal pad state format.
- Identify the minimal MMIO surface to expose pad status to EE/IOP-side code.
- No RTL — the implementation chapter follows.
What exists today
HPS side (Ch222 — landed, silicon-validated by Ch226 DS2 stub)
- ps2_hps_bridge.INPUT_P1 @ 0x040 (32-bit RW latch, retroDE SNES-style bitmap from input_common.h).
- ps2_hps_bridge.INPUT_P2 @ 0x044 (player 2 latch).
- ps2_hps_bridge.INPUT_P1_RAW @ 0x048 (un-remapped mirror used by retrodesd's OSD nav FSM in other cores).
- ps2_hps_bridge.DS2_BUTTONS @ 0x0F4 (Ch226 read-only mirror of INPUT_P1; sibling-ABI DS2 path for retrodesd).
- retrodesd/software/input_thread.c is the producer: evdev → remap → 32-bit AXI write into these offsets.
PS2 side
- No SIO2 stub. docs/stub_module_plan.md:317 reserves rtl/peripherals/sio2_input_stub.sv as "Wave 2 #12", explicitly the last stub before "Wave 3 promotions"; it was never written.
- No pad MMIO decode in iop_memory_map_stub.sv for the SIO2 region (0x1F80_8200..0x1F80_82FF on real hardware).
- No EE-side libpad path: ee_memory_map_stub.sv has no RPC/SIF awareness of controller state.
- The IOP map's "Future regions" comment block (in rtl/iop/README.md:149) lists "Other IOP DMAC channels (CDVD / SPU2 / DEV9 / SIF1-2 / SIO2)" as deferred.
The platform shell talks to itself — HPS writes a latch, HPS reads it back (via Ch226 DS2_BUTTONS mirror). Nothing on the PS2 fabric side consumes the bits, which is the gap Ch233+ will close.
Real PS2 controller path (for orientation)
A real game running on a stock PS2 sees controller input through this chain (top → bottom in time):
Physical DualShock 2
│ (custom serial protocol, ~250 kHz)
▼
SIO2 controller block @ IOP 0x1F80_8200..0x1F80_82FF
│ (FIFO + command/response + DMA channel 11)
▼
IOP RAM (padman.irx — Sony's pad daemon)
│ - issues SIO2 transactions every vsync
│ - parses the response into a 16-byte pad state struct
│ - publishes the struct to a known IOP RAM address
▼
SIF (RPC channel)
│ - EE-side libpad opens an RPC channel
│ - calls padRead(port, &state) → marshals 16 bytes
│ of pad state over SIF DMA to EE-side buffer
▼
EE RAM (libpad-allocated buffer)
│ - game / BIOS reads the 16 bytes directly
▼
Game logic
Where the bytes live in the 16-byte pad state (the format
libpad/padman use, Sony's "digital mode" / type 0x4 response):
| Byte | Bit | Function | Active-low? |
|---|---|---|---|
| 0 | - | success status | usually 0x00 / 0xFF |
| 1 | - | report type / pad-state-machine | 0x41 = digital, 0x73 = analog |
| 2 | - | success token | |
| 3 | 7 | LEFT | 0 = pressed |
| 3 | 6 | DOWN | 0 = pressed |
| 3 | 5 | RIGHT | 0 = pressed |
| 3 | 4 | UP | 0 = pressed |
| 3 | 3 | START | 0 = pressed |
| 3 | 2 | R3 | 0 = pressed |
| 3 | 1 | L3 | 0 = pressed |
| 3 | 0 | SELECT | 0 = pressed |
| 4 | 7 | □ (square) | 0 = pressed |
| 4 | 6 | × (cross) | 0 = pressed |
| 4 | 5 | ○ (circle) | 0 = pressed |
| 4 | 4 | △ (triangle) | 0 = pressed |
| 4 | 3 | R1 | 0 = pressed |
| 4 | 2 | L1 | 0 = pressed |
| 4 | 1 | R2 | 0 = pressed |
| 4 | 0 | L2 | 0 = pressed |
| 5–8 | - | RX, RY, LX, LY | analog (0x80 centered, digital mode reports 0x80) |
| 9–15 | - | pressure / reserved (DualShock 2 only) | |
Active-low semantics: every bit is 0 when the button is pressed.
retroDE's INPUT_P1 from input_common.h is active-high.
The translation layer must invert per-bit.
What software reads first. The Sony BIOS doesn't poll controllers
during its own boot — the first pad transactions come from
OSDSYS (the in-BIOS browser) and game executables linking
libpad. So:
- For a BIOS-bring-up smoke test, no pad surface is required.
- For an OSDSYS-driven boot path, OSDSYS expects the SIF RPC server RPCID 0x80000100 (padman) to answer with a 16-byte pad state on every padRead call.
- For homebrew or game code, libpad's standard API is the observable surface; the implementation strategy (faithful SIO2 vs simplified RPC vs simplified MMIO) is opaque to the caller.
Proposed mapping (Ch222 → Sony pad state)
Following the peripherals.md:30 open question ("simplified
abstraction vs SIO2-faithful transactions?") the recon answer is:
start with a simplified abstraction. SIO2-faithful transactions
require IOP code that runs the protocol — fine for late-Wave-2 work
but not the smallest useful first step.
INPUT_P1 bit assignments (from input_common.h) map to Sony pad
state per the following table. SNES-style face buttons fold onto
DualShock face buttons by spatial layout (Y top, B bottom,
X left, A right — same as the standard SNES → PSX mapping retroDE
already uses on coco2 / a2600):
| INPUT_P1 bit | retroDE name | PS2 button (Sony name) | Pad-state byte.bit |
|---|---|---|---|
| 0 | JOY_RIGHT | RIGHT (D-pad) | 3.5 |
| 1 | JOY_LEFT | LEFT (D-pad) | 3.7 |
| 2 | JOY_DOWN | DOWN (D-pad) | 3.6 |
| 3 | JOY_UP | UP (D-pad) | 3.4 |
| 4 | JOY_START | START | 3.3 |
| 5 | JOY_SELECT | SELECT | 3.0 |
| 6 | JOY_Y | △ (triangle, top) | 4.4 |
| 7 | JOY_B | × (cross, bottom) | 4.6 |
| 8 | JOY_X | □ (square, left) | 4.7 |
| 9 | JOY_A | ○ (circle, right) | 4.5 |
| 10 | JOY_L | L1 | 4.2 |
| 11 | JOY_R | R1 | 4.3 |
| 12 | JOY_L2 | L2 | 4.0 |
| 13 | JOY_R2 | R2 | 4.1 |
| 14 | JOY_L3 | L3 | 3.1 |
| 15 | JOY_R3 | R3 | 3.2 |
| 16 | JOY_OSD | — (consumed by retrodesd, not forwarded) | — |
Inversion rule: each PS2 byte starts at 0xFF (all released);
each INPUT_P1 bit that's 1 clears the corresponding pad-state
bit to 0. Two assigns of 8-bit pad bytes do the whole thing
combinationally:
pad_state[3] = ~{INPUT_P1[1], INPUT_P1[2], INPUT_P1[0], INPUT_P1[3],
INPUT_P1[4], INPUT_P1[15], INPUT_P1[14], INPUT_P1[5]};
pad_state[4] = ~{INPUT_P1[8], INPUT_P1[7], INPUT_P1[9], INPUT_P1[6],
INPUT_P1[11], INPUT_P1[10], INPUT_P1[13], INPUT_P1[12]};
(Order inside {} is MSB→LSB to match the Sony bit numbering.)
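
A table-driven Python reference model of the same mapping can be handy for generating expected values in testbenches. The sketch below is illustrative only: the names BIT_MAP and input_to_pad_bytes are hypothetical, while the bit pairs are copied from the mapping table above.

```python
# (INPUT_P1 bit, pad-state byte, pad-state bit), one entry per table row.
BIT_MAP = [
    (0, 3, 5), (1, 3, 7), (2, 3, 6), (3, 3, 4),      # RIGHT, LEFT, DOWN, UP
    (4, 3, 3), (5, 3, 0),                            # START, SELECT
    (6, 4, 4), (7, 4, 6), (8, 4, 7), (9, 4, 5),      # triangle, cross, square, circle
    (10, 4, 2), (11, 4, 3), (12, 4, 0), (13, 4, 1),  # L1, R1, L2, R2
    (14, 3, 1), (15, 3, 2),                          # L3, R3
]


def input_to_pad_bytes(input_p1):
    """Translate an active-high INPUT_P1 bitmap to active-low (byte3, byte4)."""
    pad = {3: 0xFF, 4: 0xFF}              # all released
    for src_bit, byte, bit in BIT_MAP:
        if (input_p1 >> src_bit) & 1:     # pressed on the retroDE side
            pad[byte] &= ~(1 << bit) & 0xFF
    return pad[3], pad[4]                 # bit 16 (JOY_OSD) is never forwarded
```

For instance, input_to_pad_bytes(0x1) returns (0xDF, 0xFF), the JOY_RIGHT-only case.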
Proposed minimum MMIO surface
For the smallest possible useful "PS2 code can read controller state" path:
Option A — IOP-readable PS2-local register (recommended).
Add one 32-bit read-only register per port on the IOP MMIO bus, each packing the two pad-state bytes plus presence/status bits:
| IOP phys offset | Name | Layout (32-bit) |
|---|---|---|
| 0x1F80_8500 | PAD_P1_STATE | [7:0]=byte3 (D-pad/SEL/START), [15:8]=byte4 (face/shoulder), [16]=connected=1, [17]=error=0, [31:18]=0 |
| 0x1F80_8504 | PAD_P2_STATE | Same layout, sourced from INPUT_P2 |
0x1F80_8500..0x1F80_85FF is a retroDE-local I/O range, not
Sony-compatible. It deliberately sits outside the real SIO2 range
(0x1F80_8200..0x1F80_82FF) so that landing real SIO2 emulation later
doesn't collide. Bit fields are little-endian to match the IOP's
native byte ordering.
IOP-side code (a small "fake padman" routine loaded at known address,
or a future BIOS-replacement RPC server) reads PAD_P1_STATE, writes
the 16-byte Sony pad state into the agreed EE-visible memory location,
and signals via SIF.
Option B — SIF mailbox pad state.
Skip IOP code entirely. Add a mailbox in sif_mailbox_stub that
the EE can read directly without any IOP cooperation. Faster to
demo but breaks libpad's RPC contract — homebrew built against
libpad won't work without a shim.
Option C — faithful SIO2 emulation.
Real 0x1F80_8200..0x1F80_82FF register surface, real FIFO,
real DMA channel 11, real command/response protocol. padman.irx
runs unchanged. Largest scope by far — defers to a later
chapter once Option A is proven.
Recommendation: A → B → C as separate chapters. Most game/BIOS code talks to libpad, which talks to padman over SIF — Option A gives the smallest fabric surface that lets a stub padman work.
Proposed Ch234+ implementation chapters
| Chapter | Scope |
|---|---|
| Ch234 | rtl/peripherals/sio2_input_stub.sv (Option A): single module, two read-only 32-bit registers; combinationally maps Ch222 INPUT_P1/P2 latches into PS2 pad-state bytes with the inversion rule above; IOP map decode added at 0x1F80_8500..0x1F80_85FF. Bridge gets a new output port carrying INPUT_P1/P2 into the IOP domain (single-bit register-stable signals, no CDC needed beyond the existing reset-sync because they update at retrodesd's 1 kHz rate). New focused TB: write INPUT_P1, read PAD_P1_STATE through the IOP map, verify the inversion + bit order. |
| Ch235 | Either ramp Ch234 into Option B (SIF mailbox), or extend Ch234 to expose pad analog stick values (currently libpad reports 0x80 centered in digital mode — match that). Decision deferred per the BIOS-bringup observations. |
| Ch236+ | Real SIO2 emulation (Option C) once a known BIOS or homebrew demands it. |
Out of scope for this contract
- Analog stick fidelity beyond "report 0x80 centered" (the INPUT_P1 bitmap is digital-only; full DualShock 2 analog requires a separate retrodesd-side path).
- Pressure-sensitive buttons (DualShock 2 only).
- Multitap support (most PS2 software doesn't require it for bringup).
- Real SIO2 timing fidelity (the simplified register is combinational; real SIO2 has a multi-cycle command/response protocol).
- Vibration / actuator feedback (output direction; needs EE → HPS path, not relevant for input recon).
Boundary call
The HPS-to-bridge half of the input path landed in Ch222 and is silicon-validated; the bridge-to-PS2-fabric half is open. Ch234 adds a small IOP-readable sio2_input_stub at the retroDE-local I/O range 0x1F80_8500..0x1F80_85FF that combinationally translates INPUT_P1/INPUT_P2 into Sony pad bytes; IOP code (eventually a stub padman) reads the registers and publishes the 16-byte pad state via SIF for EE-side libpad. Faithful SIO2 emulation is deferred until a real BIOS or homebrew needs it.