# DE25-Nano bring-up runbook (PSMCT32 raster demo)

This is the operator-side checklist for taking a fresh build and
getting it to drive a real HDMI monitor on the Terasic DE25-Nano
(Agilex 5 `A5EB013BB23BE4SCS`). It pairs with the RTL bring-up
history captured in
`rtl/top/de25_nano_psmct32_raster_demo_top.sv` (Ch149 → Ch170).

## Quick-start (operator's three steps)

1. **Build** — `./synth/de25_nano/top_psmct32_raster_demo/build_quartus.sh`
   produces `retroDE_ps2.core.rbf` in `output_files/`.
2. **Load** — `scp` the RBF to the board, then
   `sudo /home/terasic/core_loader.sh load /home/terasic/cores/retroDE_ps2.core.rbf`
   on the HPS.
3. **Verify** — `./ps2_status.sh --delta` on the HPS prints the full status
   block + counter deltas; exit status is 0 if the ps2 fabric is healthy.
   See "Status block" below for the canonical output.

Full detail and triage for each step follows.

## Build

There are two equivalent ways to compile.

### Option A — Quartus GUI (matches the rest of the retroDE family)

1. Open `synth/de25_nano/top_psmct32_raster_demo/de25_nano_psmct32_raster_demo_top.qpf`
   in Quartus Prime Pro 25.3.1 SP1.02.
2. *Processing → Start Compilation* (or the green ▶ button).
3. After the Assembler step, the `post_flow.tcl` hook runs
   automatically and emits the boot artifacts described below.

The QSF references sources via `rtl/...`, `sim/...`, and (Ch170)
`qsys/...` paths relative to its own directory, matching
`build_quartus.sh`'s convention of running from a `work/`
subdirectory with those trees symlinked in. The same pattern
works for the GUI because the QSF directory contains permanent
symlinks:

```
synth/de25_nano/top_psmct32_raster_demo/rtl -> ../../../rtl
synth/de25_nano/top_psmct32_raster_demo/sim -> ../../../sim
synth/de25_nano/top_psmct32_raster_demo/qsys -> ../../../qsys
```

If those symlinks are missing, the GUI compile fails with
"Can't analyze file rtl/... - no such file exists" or
"Error opening qsys/qsys_top.qsys" errors. Recreate them from the
QSF directory:

```
ln -sfn ../../../rtl rtl
ln -sfn ../../../sim sim
ln -sfn ../../../qsys qsys
```

### Option B — command line

```bash
./synth/de25_nano/top_psmct32_raster_demo/build_quartus.sh
```

(or `make -C sim quartus_compile`). Runs `quartus_syn → fit → sta
→ asm`. The assembler step is gated on a clean STA so the `.sof`
is produced only when timing closes.

### Build artifacts

All paths under
`synth/de25_nano/top_psmct32_raster_demo/output_files/`:

| Artifact | What it is |
|------------------------------------------------|-------------------------------------------------------------|
| `de25_nano_psmct32_raster_demo_top.sof` | bare fabric SOF (JTAG / debug fallback) |
| `de25_nano_psmct32_raster_demo_top.rbf` | plain configuration RBF from `quartus_pfg` |
| **`retroDE_ps2.core.rbf`** | **stable deploy artifact — what `core_loader.sh` consumes** |

Per-step logs (`*.flow.rpt`, `*.fit.rpt`, `*.sta.summary`) land
in the same directory.

> **Ch168→Ch170 evolution (resolved)**: Ch168's plain RBF loaded
> through `core_loader.sh` (fpga0 reached `operating`) but the
> resulting system was hosed — SSH died, board required power-cycle.
> Diagnosis: the fabric was running but had no HPS bridge endpoints,
> so HPS-side AXI transactions stalled into the void. Ch170 fixed
> this by copying the canonical retroDE qsys subsystem verbatim
> from `retroDE_Atari2600/qsys/`, adding the HPS + LPDDR4A pin
> blocks, instantiating `qsys_top soc_inst (...)` with a minimal
> `ps2_hps_bridge_null` AXI4 slave on the hps2fpga bridge, and
> adding the `HPS_INITIALIZATION "HPS FIRST"` global + the
> `PRESERVE_REGISTER` / `PRESERVE_FANOUT_FREE_NODE` instance
> assignments on the Sundancemesa MPFE f2sdram path (without
> those, the optimizer fails the build with "F2SDRAM_RDATA not
> legally connected"). Also fixed: the QSF was referencing
> `QIP_FILE ip/pll/pll.qip` (cached output) instead of
> `IP_FILE ip/pll.ip` (source), so the .ip frequency edit at
> Ch169 silently didn't propagate — the PLL stayed at 50 MHz and
> the monitor reported "No support" because the raster was
> 640×480 at 119 Hz. With `IP_FILE` and the PLL retuned to
> **25.0 MHz** (cleanly achievable from the 50 MHz refclk via M=1
> / N=1 / C=2), the design now drives 640×480 @ ~59.52 Hz and
> monitors lock cleanly.
>
> **First hardware-real bring-up**: the `retroDE_ps2.core.rbf`
> produced by the Ch170 build is the first that:
> 1. Loads through `core_loader.sh` and keeps the HPS-side Linux
>    running normally (SSH stays up, board does not need a
>    power-cycle).
> 2. Locks a standard HDMI 640×480@60 mode on the monitor.
> 3. Paints recognisable pixels from the PCRTC (the Ch123 16×8
>    test sprite is visible as 4 colored quadrants in the
>    upper-left of the otherwise-black 640×480 frame).

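The two refresh rates quoted in the note above follow from dividing the pixel clock by the total raster size. A minimal sketch, assuming the standard VGA 640×480 raster total of 800 clocks per line and 525 lines per frame (the runbook only says "standard VGA timing"; the totals and the `refresh_hz` helper name are illustrative assumptions, not values read from the RTL):

```bash
#!/usr/bin/env bash
# refresh_hz PIXEL_CLOCK_HZ H_TOTAL V_TOTAL
# Prints the vertical refresh rate in Hz, rounded to two decimals.
# Hypothetical helper for illustration; not part of the repo tooling.
refresh_hz() {
    awk -v clk="$1" -v h="$2" -v v="$3" \
        'BEGIN { printf "%.2f\n", clk / (h * v) }'
}

# Assumed standard VGA totals: 800 clocks per line, 525 lines per frame.
refresh_hz 25000000 800 525   # PLL at 25.0 MHz (the Ch170 result)
refresh_hz 50000000 800 525   # PLL stuck at the 50 MHz refclk
```

Under those assumed totals the first call reproduces the ~59.52 Hz mode that monitors accept, and the second the ~119 Hz mode that produced "No support".
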
## Programming

Two paths. The standard retroDE runtime-reload flow is the primary
path; JTAG is the debug/fallback.

### Path 1 — runtime fabric reload (primary)

This is the standard retroDE workflow: HPS is up and Linux is
running (booted from the QSPI-resident retroDE_splash image), and
you swap the FPGA fabric at runtime via the on-board loader. The
device-tree overlay points the Linux fpga-manager at the new RBF
in `/lib/firmware/`; the previous fabric is unloaded automatically.

Copy the deploy artifact to the board, then:

```bash
sudo /home/terasic/core_loader.sh load \
/home/terasic/cores/retroDE_ps2.core.rbf
```

The loader script is in
[`retroDE_splash/software/core_loader.sh`](../../../retroDE_splash/software/core_loader.sh).
It is filename-agnostic — it copies whatever RBF you point it at
to `/lib/firmware/` and applies the `/fpga-region` DT overlay.

If `fpga-manager` rejects the RBF (overlay status != applied,
`dmesg` shows configuration errors), see "What's deferred" below.

### Path 2 — bare-metal JTAG (debug / fallback)

Useful when HPS isn't running (cold board) or when you want to
program without disturbing the loader pipeline:

```bash
quartus_pgm -m jtag -o "p;synth/de25_nano/top_psmct32_raster_demo/output_files/de25_nano_psmct32_raster_demo_top.sof"
```

`p` = program (volatile — bitstream is lost on power-cycle).
Note that JTAG programming will tear down the HPS-running splash
fabric; a power-cycle restores it from QSPI.

### retrodesd manifest (Ch219)

When the ps2 core is registered with retrodesd (the userspace
supervisor on HPS), the manifest stanza is:

```ini
[ps2]
name=PS2 Graphics Demo
rbf=/home/terasic/cores/retroDE_ps2.core.rbf
backend=splash
core_id=0x50533200
min_abi=0x00000100
```

`backend=splash` is correct for the current demo (no
session/ROM/save semantics). The stanza follows the same shape as
the other retroDE cores (nes, coco2, atari2600) and matches the
ABI v1.0 contract exposed by `ps2_hps_bridge` (CORE_ID =
`0x50533200`, ABI_VERSION = `0x00000100`). The direct
`core_loader.sh load` path remains useful for bring-up and
JTAG-replacement workflows; retrodesd is for production deploys.

## LED ledger

The DE25-Nano has 8 user LEDs (`LED[7:0]`), all **active-LOW** (lit =
asserted internally). In this top:

| LED | Source | Polarity meaning |
|----------|---------------------|------------------------------|
| LED[0] | `core_halt` | **Ch251: unlit normal** (EE running the animated paint loop). Lit = EE SYSCALL'd unexpectedly, demo stuck. Pre-Ch251 it was lit = bootlet completed. |
| LED[1] | `dma_done_seen` | lit = DMA finish observed (good) |
| LED[2] | `frame_seen` | lit = PCRTC produced a frame (good) |
| LED[3] | `hdmi_init_done` | lit = ADV7513 I²C config done (good) |
| LED[4] | `hdmi_i2c_error` | lit = NACK watchdog latched (**bad**) |
| LED[5] | `p1_sony_word[3]` (START) | lit = START held on the wired pad (Ch250) |
| LED[6] | `p1_sony_word[14]` (CROSS ×) | lit = ×/B held on the wired pad (Ch250) |
| LED[7] | `p1_sony_word[4]` (D-pad UP) | lit = D-pad UP held on the wired pad (Ch250) |

In steady state on a healthy boot (Ch251 animated demo), `LED[1..3]`
are lit and `LED[4]` stays dark; `LED[0]` **stays unlit** because the
animated bootlet loops forever instead of SYSCALL-halting. The
order is *roughly* the one below but is not strictly deterministic —
`LED[1..2]` are pure-fabric and light within a few clock cycles of
`core_go`, while `LED[3]` depends on the ~125 ms I²C walk (see step
5) so it always lights *last* of the three. Pre-Ch251 the success
indicator was "LED[0..3] all lit"; the Ch251 indicator is "frame
heartbeat visible + FRAME_COUNT advancing + RASTER_OVERFLOW_COUNT = 0."

## Expected behavior on first boot

1. FPGA configures (`Programming complete` on JTAG, or `fpga0:
   operating` after `core_loader.sh load`).
2. LED[0]: on pre-Ch251 builds it eventually lights (EE core halts
   after running the bootlet, typically within a few ms of reset
   deassertion). On Ch251 and later it stays unlit, because the
   animated bootlet loops forever instead of SYSCALL-halting (see
   the LED ledger above).
3. LED[1] eventually lights as the DMAC reports `done`.
4. LED[2] eventually lights once the PCRTC delivers its first
   frame.
5. LED[3] eventually lights — the ADV7513 I²C config FSM walks
   the 38-entry LUT. Walk time is **~125 ms at the production
   divider** (controller-clock period ~100 µs × 33 phases per
   register write × 38 LUT entries = ~125 ms), so LED[3] will be
   the slowest of the status LEDs to light. The `CLK_Freq / I2C_Freq`
   divider is `50e6/20e3 = 2500`; one register write is ≈ 3.3 ms.
6. LED[4] **stays dark** (no NACKs on the I²C bus).
7. The HDMI monitor reports a **VGA 640×480 @ ~60 Hz** signal (Ch169
   retuned the IOPLL to 25.0 MHz with standard VGA timing). Ch171
   bumped the PCRTC paint area to a **320×240 four-quadrant test
   card** in the upper-left of the 640×480 frame:
   - **TL** Q0 = **RED** (255, 0, 0)
   - **TR** Q1 = **GREEN** (0, 255, 0)
   - **BL** Q2 = **BLUE** (0, 0, 255)
   - **BR** Q3 = **WHITE** (255, 255, 255)

   The remaining ~75% of the screen (x ≥ 320 or y ≥ 240) is black
   because PCRTC outside the DISPLAY1 window emits 0. This is the
   "obvious from across the room, photographable" hardware
   heartbeat — easy to confirm the whole DMAC/GIF/GS/raster/VRAM/
   PCRTC/HDMI path is alive without squinting at a 16×8 patch.

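The ~125 ms walk time in step 5 can be reproduced with integer arithmetic. This is a sketch of the estimate only: it assumes the controller clock toggles once per divider count, so one full controller-clock period is two divider counts (inferred from the ~100 µs period quoted against the 2500 divider, not confirmed from `I2C_HDMI_Config.v`). The variable names are illustrative.

```bash
#!/usr/bin/env bash
# Integer estimate of the ADV7513 LUT walk time, in microseconds.
# Assumption: the controller clock toggles once per DIVIDER system
# clocks, so one full controller-clock period is 2 * DIVIDER cycles.
CLK_HZ=50000000        # CLK_Freq: fabric clock feeding the I2C config block
I2C_HZ=20000           # I2C_Freq parameter
PHASES_PER_WRITE=33    # controller-clock phases per register write
LUT_ENTRIES=38         # register writes in the LUT

divider=$(( CLK_HZ / I2C_HZ ))
period_us=$(( 2 * divider * 1000000 / CLK_HZ ))
write_us=$(( period_us * PHASES_PER_WRITE ))
walk_us=$(( write_us * LUT_ENTRIES ))

echo "divider=${divider} period=${period_us}us write=${write_us}us walk=${walk_us}us"
```

With the figures from step 5 this gives a 100 µs controller-clock period, 3300 µs per register write, and 125400 µs for the full walk.
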
## Triage

### No LEDs light at all

- Check JTAG: `Programming complete` was reported by `quartus_pgm`?
- Check `ninit_done` polarity — if the FPGA never deasserts init, the
  whole design stays in reset and no LED ever lights.
- Re-program; some Agilex 5 dev boards need a power-cycle between
  bitstream loads.

### LED[0..2] light but LED[3] never lights (HDMI never inits)

- Is LED[4] lit? → see next section.
- HDMI cable plugged in? The chip drives I²C even with no HDMI cable,
  so HDMI cable absence does NOT block LED[3]. But HPD/INT lines may
  oscillate, retriggering the FSM forever.
- Is `HDMI_I2C_SCL` toggling? Probe the test point or the ADV7513
  side. No SCL = the bit-bang master isn't running (clock domain or
  reset issue).

### LED[4] lit (NACK watchdog latched)

The watchdog asserts after 16 consecutive NACKs on the same LUT
entry. Likely causes (in priority order):

- HDMI cable not plugged in **and** the ADV7513 power rail isn't
  bringing the chip out of standby. The DE25-Nano feeds chip power
  unconditionally, so this should not happen on a healthy board.
- I²C address mismatch. Slave address in `I2C_HDMI_Config.v` is
  `8'h72` (`mI2C_DATA <= {8'h72, LUT_DATA}`). DE25-Nano boards wire
  the ADV7513's `PD#` and address-select pins so this is the chip
  address — but if a board variant differs, every byte will NACK.
- SDA short to ground or VCC.
- ADV7513 reset (`HDMI_TX_RST_N` if exposed) held asserted.

LED[4] is sticky — once latched it stays lit until FPGA is
reconfigured. Re-program the bitstream to retry.

### LED[3] lights but no HDMI image

- Try a different monitor / HDMI cable. The demo emits 24-bit RGB
  full-range with separate H/V sync; some monitors are picky about
  edge timing.
- Ch164's `set_false_path -to HDMI_TX_*` is in place — proper
  output-delay constraints (ADV7513 setup/hold) are deferred to a
  later chapter. Output-pin timing on real silicon may be marginal
  on long cables, but typical bench cables are short enough that the
  PSMCT32 demo pixel rate (25 MHz pixel clock since the Ch170 PLL
  fix) doesn't push it.
- Check for HDMI-only HDCP requirements on the monitor — the demo
  doesn't drive HDCP.

### Image flickers / unstable

- Most likely cause: the `HDMI_TX_INT` line is bouncing, retriggering
  the LUT walk. LED[3] should drop briefly each time. Confirm by
  watching LED[3] — if it blinks, the HPD pin is unstable.
- Workaround for monitor compatibility: the `0xD6 = 0xC0` HPD
  override in the LUT (entry 1) is supposed to ignore HPD; if your
  monitor still misbehaves, the override may need adjustment.

## HPS↔fabric status registers (Ch173 / Ch174 / Ch176)

After `core_loader.sh load` brings the ps2 fabric up, HPS userspace
can read live status from the `ps2_hps_bridge` AXI slave on the
hps2fpga bridge. Registers are read-only unless the table below says
otherwise; each is a 32-bit value addressable by its byte offset.

The first 32 bytes (0x000–0x01F) mirror the shared retroDE ABI v1.0
layout (matches splash/nes/coco2 — see
`retroDE_splash/software/input_common.h`). PS2-specific diagnostics
start at 0x020.

| Offset | Register | Meaning |
|--------|-----------------------|------------------------------------------------------|
| 0x000 | CORE_ID | `0x50533200` (ASCII `"PS2\0"`) |
| 0x004 | ABI_VERSION | `0x00000100` (v1.0) |
| 0x008 | CORE_STATUS | bit 0=loaded, 1=core_halt, 2=dma_done_seen, 3=frame_seen, 4=hdmi_init_done, 5=hdmi_i2c_error, 6=raster_overflow |
| 0x00C | CORE_CAPS | `0x00000000` (no caps advertised; ROM/savestate bits go here later) |
| 0x010 | CORE_CTRL | Ch176 writable + readable. `[0]=RESET` is functional (held high → PS2 design in reset; counters keep ticking from the bridge side). `[1]=ROM_LOADED` and `[2]=PAUSE` are pure latches (ABI shape only, no functional effect today). `[31:3]` ignored on write |
| 0x014 | CORE_PULSE | Ch176 self-clearing pulse register (always reads 0). `[3]=HDMI_CLR` zeros FRAME_COUNT / DMA_DONE_COUNT / RASTER_OVERFLOW_COUNT synchronously. `[0..2]` (DIRTY_CLR / SS_DONE_CLR / VIDEO_REC) ACK'd-and-ignored |
| 0x018 | reserved | 0 |
| 0x01C | reserved | 0 |
| 0x020 | FRAME_COUNT | Ch174: increments on each *edge* of the design-domain `frame_toggle` (flips on every PCRTC end-of-frame). 25→50 MHz CDC done via event-toggle, not raw pulse — see `ps2_hps_bridge` header for the CDC contract |
| 0x024 | DMA_DONE_COUNT | Ch174: increments on each edge of `dma_done_toggle` (flips on every DMAC `EV_DMA_DONE`) |
| 0x028 | RASTER_OVERFLOW_COUNT | cycles `raster_overflow` stayed high; **0 under Ch172 backpressure** |
| 0x040 | INPUT_P1 | Ch222 write/read latch (HPS-visible). retrodesd's `input_thread` writes the remapped player-1 gamepad bitmap here. Reset = 0. No SIO2/DualShock wiring on the PS2 side yet — the latch is read back for software introspection only |
| 0x044 | INPUT_P2 | Ch222 write/read latch (HPS-visible). Player-2 bitmap. Reset = 0 |
| 0x048 | INPUT_P1_RAW | Ch222 write/read latch (HPS-visible). Un-remapped mirror of the player-1 buttons (the OSD-navigation source on other cores). Reset = 0 |
| 0x060 | VIDEO_STATUS | Ch225 read-only diagnostic. `[0]=frame_seen` (live sticky), `[1]=scanout_alive` (= `FRAME_COUNT != 0`, clears with HDMI_CLR), `[2]=raster_overflow` (live, drops if input dies), `[3]=dma_done_seen` (live sticky); `[31:4]` reserved-zero. Writes ack-and-ignored |
| 0x064 | HDMI_DIAG | Ch225 read-only diagnostic. `[0]=hdmi_init_done`, `[1]=hdmi_i2c_error`; `[31:2]` reserved-zero. Writes ack-and-ignored |
| 0x0F0 | DS2_STATUS | Ch226 read-only stub. Sibling-ABI: `[0]=connected=0` (no DS2 path), `[1]=error=0`. PS2-local: `[2]=input_latches_valid=1` (Ch222 OK). Reads = `0x00000004`. Writes ack-and-ignored. retrodesd's `ds2_poll_thread.c` sees `connected=0` and never invokes `osd_input_update_ds2` |
| 0x0F4 | DS2_BUTTONS | Ch226 read-only mirror of INPUT_P1 (`0x040`). Lets operator tooling confirm the Ch222 store via the same offset sibling cores expose. With `DS2_STATUS[0]=0` retrodesd never reads it; the mirror is for HPS diagnostics |
| 0x100 | OSD_CTRL | Ch223 write/read latch (HPS-visible). Splash backend writes ENABLE / FORCE_OPEN / INPUT_LOCK here; **no overlay rendering yet** — the bridge stores the value for software introspection only. Reset = 0 |
| 0x104 | OSD_STATUS | Ch224 always-zero source. No FSM drives `[0]=active` / `[12:8]=cursor_row` on PS2 yet, so reads always return 0; writes accepted with BRESP=OKAY and ignored |
| 0x108 | OSD_TRIGGER | Ch224 W1C-shape sink. retrodesd polls and writes-1-to-clear `OSD_TRIG_ACTION/BACK/SCROLL_*` bits; PS2 has no FSM source so the bits stay 0 and W1C is a no-op against an already-zero register. Reads always 0 |
| 0x10C | OSD_INPUT | reserved — reads 0; writes accepted-and-ignored. (Sibling cores use this as a debug-override; PS2 has no use until an overlay engine exists) |
| 0x110 | OSD_CFG0 | Ch223 write/read latch (HPS-visible). cols / rows / x_chars / y_chars layout per `input_common.h`. Reset = 0 |
| 0x114 | OSD_CFG1 | Ch223 write/read latch (HPS-visible). first_row / last_row + highlight + normal attrs. Reset = 0 |
| 0x1000–0x1FFF | Tile RAM | Ch227 write/read 1024 × 32-bit storage (4 KB). HPS-visible only; no PCRTC overlay path yet (Ch228 wires that). Software view per `input_common.h` is 2048 × 16-bit cells packed 2/word; at the bus level it's plain 32-bit words. Retained across warm reset (sim-initialized to 0; hardware power-up undefined) |

The hps2fpga bridge base on this board is **`0x40000000`** (per
`retroDE_splash/software/input_common.h: #define HPS2FPGA_BASE
0x40000000`). After loading `retroDE_ps2.core.rbf`:

```bash
sudo devmem2 0x40000000 w # CORE_ID — expect 0x50533200
sudo devmem2 0x40000004 w # ABI_VERSION — expect 0x00000100
sudo devmem2 0x40000008 w # CORE_STATUS — bits 0..5 (and 6 if FIFO ever overflowed)
sudo devmem2 0x4000000C w # CORE_CAPS — 0
sudo devmem2 0x40000020 w # FRAME_COUNT — Ch174: ticks @ ~60 Hz; repeated reads INCREASE
sudo devmem2 0x40000024 w # DMA_DONE_COUNT — Ch174: 1 after the boot DMA (no second DMA unless retriggered)
sudo devmem2 0x40000028 w # RASTER_OVERFLOW_COUNT — 0 under Ch172 backpressure
```

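A raw CORE_STATUS word from the read above is easier to interpret once it is split into named bits. The sketch below is a hypothetical helper (`decode_core_status` is not part of the repo tooling); the bit names and positions are the ones listed for offset 0x008 in the register table.

```bash
#!/usr/bin/env bash
# decode_core_status VALUE
# Prints the names of the CORE_STATUS bits that are set, space-separated.
# Bit order follows the register table (bit 0 = loaded ... bit 6 =
# raster_overflow). Hypothetical helper for illustration only.
decode_core_status() {
    local value=$(( $1 ))
    local names=(loaded core_halt dma_done_seen frame_seen
                 hdmi_init_done hdmi_i2c_error raster_overflow)
    local set_bits=()
    local bit
    for bit in "${!names[@]}"; do
        if (( (value >> bit) & 1 )); then
            set_bits+=("${names[bit]}")
        fi
    done
    echo "${set_bits[*]}"
}

decode_core_status 0x0000001F
```

For `0x0000001F` this lists bits 0 through 4; a word with bit 5 set would additionally list `hdmi_i2c_error`.
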
### Ch176 — exercising the writable control surface

```bash
# Verify CORE_CTRL latches without resetting the core.
sudo devmem2 0x40000010 w 0x06 # PAUSE | ROM_LOADED, RESET clear
sudo devmem2 0x40000010 w # expect readback 0x00000006
sudo devmem2 0x40000010 w 0x00 # back to clean

# Hold the PS2 design in reset, observe the monitor blank.
# FRAME_COUNT stops advancing during reset; release brings the
# test card back and the counter resumes counting from its
# pre-reset value (RESET does NOT zero counters — see HDMI_CLR
# below for that).
sudo devmem2 0x40000020 w # note current FRAME_COUNT
sudo devmem2 0x40000010 w 0x01 # assert RESET — monitor goes black
sleep 1
sudo devmem2 0x40000020 w # FRAME_COUNT frozen
sudo devmem2 0x40000010 w 0x00 # release RESET — monitor repaints
sudo devmem2 0x40000020 w # FRAME_COUNT resumes counting up

# Clear the diagnostic counters via CORE_PULSE.HDMI_CLR (bit 3).
sudo devmem2 0x40000014 w 0x08
sudo devmem2 0x40000020 w # FRAME_COUNT — back to 0 (or 1-2 since it's already ticking)
sudo devmem2 0x40000024 w # DMA_DONE_COUNT — 0
sudo devmem2 0x40000028 w # RASTER_OVERFLOW_COUNT — 0
sudo devmem2 0x40000014 w # CORE_PULSE always reads 0 (self-clearing)
```

The CORE_ID read is the "is this actually the ps2 core" handshake;
mismatch means either the wrong RBF loaded or the bridge isn't
mapped. The frame counter advancing between two reads (at a real
~60 frames per second once Ch174 is loaded) proves the PCRTC is
producing frames even if you can't see the HDMI monitor.

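Turning two FRAME_COUNT reads into a frame rate is a one-line calculation. A minimal sketch, assuming the two values were captured by hand a known number of milliseconds apart; `frame_rate` is a hypothetical helper, and the 32-bit mask is there so a counter wrap between the two reads still yields a sensible delta.

```bash
#!/usr/bin/env bash
# frame_rate BEFORE AFTER INTERVAL_MS
# Prints the integer frames-per-second implied by two FRAME_COUNT reads
# taken INTERVAL_MS apart. Hypothetical helper for illustration only.
frame_rate() {
    # Mask to 32 bits so a counter wrap between reads is tolerated.
    local delta=$(( ($2 - $1) & 0xFFFFFFFF ))
    echo $(( delta * 1000 / $3 ))
}

frame_rate 12345 12375 500
```

A delta of 30 over 500 ms corresponds to 60 frames per second, which is what a live PCRTC should show.
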
### Ch219 — consolidated status block (`ps2_status.sh`)

For runtime logs / automation, the operator helper
[`ps2_status.sh`](ps2_status.sh) reads every bridge register in
one go and prints a decoded one-screen status block. Copy it to
the board alongside the RBF and run it after `core_loader.sh
load` to confirm the fabric is healthy:

```bash
scp docs/hardware/ps2_status.sh terasic@de25:/home/terasic/
ssh terasic@de25 ./ps2_status.sh # one-shot
ssh terasic@de25 ./ps2_status.sh --delta # snapshot + 500ms counter Δ
```

Expected output on a healthy boot:

```
retroDE_ps2 core status [snapshot] @ <date>
================================================
CORE_ID : 0x50533200 "PS2\0" ✓ ps2 fabric loaded
ABI_VERSION : 0x00000100 (v1.0) ✓ retroDE ABI v1.0
CORE_CAPS : 0x00000000 (no caps advertised)

CORE_STATUS : 0x0000001F
[0] loaded : 1
[1] core_halt : 1 (lit = EE bootlet complete)
[2] dma_done_seen : 1
[3] frame_seen : 1 (lit = PCRTC delivered ≥1 frame)
[4] hdmi_init_done : 1 (lit = ADV7513 LUT walk complete)
[5] hdmi_i2c_error : 0 ✓ no I²C NACKs
[6] raster_overflow : 0 ✓ raster healthy

CORE_CTRL : 0x00000000
[0] reset : 0
[1] rom_loaded : 0
[2] pause : 0

Counters:
FRAME_COUNT : 12345 (advances at ~60Hz once PCRTC is alive)
DMA_DONE_COUNT : 1
RASTER_OVERFLOW_COUNT : 0

Counter Δ over 500 ms:
FRAME_COUNT : 12345 → 12375 Δ=30 (≈ 30 if PCRTC is alive)
DMA_DONE_COUNT : 1 → 1 Δ=0
RASTER_OVERFLOW_COUNT : 0 → 0 Δ=0
```

The script's exit status is **0** when CORE_ID matches `0x50533200`
*and* `hdmi_i2c_error` is clear — suitable for `&&` chaining in
deployment scripts or CI smoke tests. It uses `busybox devmem`
internally to side-step the devmem2 access-size quirk on
`0x?4`-suffixed offsets (see the caveat below).

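The exit-status rule described above can be expressed as a predicate over the two register values. This is an illustration of the stated rule, not the code inside `ps2_status.sh`; the `ps2_healthy` name and its calling convention are hypothetical.

```bash
#!/usr/bin/env bash
# ps2_healthy CORE_ID CORE_STATUS
# Returns 0 when CORE_ID is 0x50533200 and CORE_STATUS[5]
# (hdmi_i2c_error) is clear; returns 1 otherwise.
# Hypothetical helper mirroring the documented exit-status rule.
ps2_healthy() {
    local core_id=$(( $1 )) status=$(( $2 ))
    (( core_id == 0x50533200 )) || return 1   # wrong core, or bridge not mapped
    (( (status >> 5) & 1 )) && return 1       # NACK watchdog latched
    return 0
}

if ps2_healthy 0x50533200 0x0000001F; then
    echo "healthy"
else
    echo "unhealthy"
fi
```

Feeding it values captured from the bridge gives the same pass/fail shape that `&&` chaining on the real script relies on.
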
Override the bridge base via env if it's not at the default:

```bash
PS2_BRIDGE_BASE=0x40000000 DEVMEM='busybox devmem' ./ps2_status.sh
```

**Note on devmem2 + `0x?4`-suffixed offsets:** on Linux with this
particular HPS-side mapping, register reads at offsets ending in
`0x4` (ABI_VERSION @ 0x40000004, DMA_DONE_COUNT @ 0x40000024)
throw "Bus error" under `devmem2`. Confirmed quirk in the
devmem2 access-size handling, not a bridge defect — the bridge
itself happily decodes `araddr[3:2]=01` lane reads in simulation,
and the matching `0x?0` / `0x?8` offsets read cleanly on the
same hardware. Helpers like `busybox devmem` or a small
`ps2_regs` reader bypass this, so substitute one of them wherever
the `devmem2` examples above touch an affected offset
(`0x40000004`, `0x40000014`, `0x40000024`).

### Ch220 — retrodesd integration

Ch219 covers the standalone `core_loader.sh` + `ps2_status.sh` path.
The "production" path is retrodesd
(`retroDE_splash/software/retrodesd.c`), the userspace supervisor
that owns fabric load, identity validation, the supervisor OSD
menu, and core-to-core switching. The manifest stanza in the
"Programming" section above (line 156) is the integration point;
this sub-section verifies what's sufficient and what isn't.

#### `backend=splash` — what's sufficient, what's not

With the stanza as written (`backend=splash`,
`core_id=0x50533200`, `min_abi=0x00000100`), retrodesd will:

1. `fabric_load()` the ps2 RBF.
2. `bridge_wait_ready(2000)` — poll `CORE_STATUS[0] = loaded` (=
   `CORE_STATUS_READY` in `input_common.h`) for up to 2 seconds.
   The Ch173 bridge asserts this bit immediately on fabric
   power-up, so the wait returns within one poll.
3. `bridge_read_identity()` + `bridge_validate(0x50533200,
   0x00000100)` — reject the load if CORE_ID or ABI don't match
   the stanza.
4. Call `splash_backend.start()`, which writes OSD `CFG0/CFG1/
   CTRL` at offsets `0x100/0x104/0x108/0x110/0x114` and the
   tile RAM at `0x1000+`.

**Steps 1–3 are sufficient for Ch220 bringup** — the ps2 fabric
exposes CORE_ID + ABI + the Ch174 counters, retrodesd will accept
the load, HDMI output + the Ch171 quadrant test card will appear,
and `ps2_status.sh` continues to work over the same bridge.

**Step 4 is a silent no-op on the Ch220 ps2 fabric.** At Ch220 the
`ps2_hps_bridge` slave decodes only offsets `0x000–0x028`; the
`0x040+` rows in the table above were added in later chapters
(Ch222 onward). Writes to `0x100+` therefore fall outside the
decoded range and are dropped on the bridge side. retrodesd never
errors (`BRESP=OKAY` is returned on AXI4-Lite for any in-window
write). Consequences:

- The "RetroDE / Core Select…" splash OSD does not render over
  the ps2 quadrant image.
- The supervisor menu (B-button overlay during gameplay) cannot
  be drawn on top of ps2 video. Core-switching itself still
  works, but the operator path is: switch *back to the splash
  core first*, then pick a different core from the supervisor
  there.

Adding the OSD canvas to the ps2 fabric (the `0x100+` register
family plus tile RAM + an overlay path) is a separate effort and
**not** a Ch220 prerequisite. When it lands, this stanza needs
no change — `splash_backend` will just start working over ps2
video automatically.

No `retroDE_splash` patch is required for Ch220. retrodesd
already (a) logs CORE_ID/ABI/STATUS/CAPS for every backend, (b)
validates against the manifest, and (c) accepts `backend=splash`
on any ABI v1.0 core. The "splash works for everything that
exposes the common ABI prefix" contract was the whole point of
hoisting the identity registers in Ch173.

#### Expected retrodesd log lines

On a healthy boot of the ps2 core via retrodesd (taken straight
from `retrodesd.c` + `bridge_common.h` + `splash_backend.c`):

```
retrodesd: loaded N cores from /home/terasic/cores/manifest.cfg
retrodesd: loading core 'ps2' (PS2 Graphics Demo)
retrodesd: bridge identity:
CORE_ID: 0x50533200
ABI_VERSION: 0x00000100
CORE_STATUS: 0x0000001F # see Ch219 decode; may print earlier
CORE_CAPS: 0x00000000
splash: session initialized
splash: core started (OSD force-open, supervisor-first)
retrodesd: service thread started
```

`CORE_STATUS` is timing-dependent — bits 1..3 (`core_halt`,
`dma_done_seen`, `frame_seen`) usually latch within a few ms of
fabric load and are typically set by the time retrodesd logs the
identity, but bit 4 (`hdmi_init_done`) takes ~125 ms for the I²C
LUT walk and may still be 0 when this line prints. Re-read with
`ps2_status.sh` after ≥ 200 ms to see the steady-state value
(`0x0000001F`).

Failure-mode log lines to recognize:

| Line | Cause |
|----------------------------------------------------------------------------|----------------------------------------------------------------|
| `bridge_validate: CORE_ID mismatch: expected 0x50533200, got 0x00000000` | RBF did not load (fabric still at reset) |
| `bridge_validate: CORE_ID mismatch: expected 0x50533200, got 0x????????` | Wrong RBF loaded — check `rbf=` path in stanza |
| `bridge_validate: ABI_VERSION too old: need >= 0x00000100, got 0x????????` | ps2 RBF predates the Ch173 ABI v1.0 bridge |
| `bridge_wait_ready: timeout after 2000 ms` | Fabric load partially failed; `loaded` bit never asserted |
| `retrodesd: core 'ps2' not found, using first: '...'` | Stanza missing from manifest; check `[ps2]` section header |

`journalctl -u retrodesd -f` captures all of the above during
`core_loader.sh load` *or* during a live core switch.

#### Known-good operator checklist

After adding the stanza to `/home/terasic/cores/manifest.cfg`
and restarting `retrodesd.service` (or rebooting), a healthy
ps2 bringup ticks every box below:

| # | Check | How to verify |
|----|--------------------------------------------------------|----------------------------------------------------------------|
| 1 | Fabric load returned, SSH session survives | SSH prompt responds; no `dmesg` AXI/SMMU errors |
| 2 | retrodesd selected the ps2 core | `journalctl -u retrodesd \| grep "loading core 'ps2'"` |
| 3 | Identity matches manifest | Journal shows `CORE_ID: 0x50533200`, `ABI_VERSION: 0x00000100` |
| 4 | HDMI locks 640×480 @ ~60 Hz | Monitor reports VGA timing, no "no signal" overlay |
| 5 | Quadrant test card visible | RED TL, GREEN TR, BLUE BL, WHITE BR |
| 6 | `ps2_status.sh` exits 0 | `ssh terasic@de25 ./ps2_status.sh && echo OK` |
| 7 | `FRAME_COUNT` increments at ~60 Hz | `./ps2_status.sh --delta` shows Δ ≈ 30 over 500 ms |
| 8 | `RASTER_OVERFLOW_COUNT` stays at 0 | `./ps2_status.sh --delta` shows Δ = 0 |
| 9 | `hdmi_i2c_error` stays clear | `CORE_STATUS[5] = 0` (LED[4] dark, exit status 0) |
| 10 | No spurious DMA activity | `DMA_DONE_COUNT` increments once on boot, then stays put |

A run that ticks all ten boxes is the canonical "Ch220
known-good". If 1–3 fail, the manifest/stanza is wrong; if 4–5
fail, suspect the ADV7513 path (Ch219 triage applies); if 6–10
fail, the bridge or the ps2 design state machines are off —
`ps2_status.sh` is designed to isolate which.

### Ch221 — input/OSD ABI reconnaissance

Ch220 established that `backend=splash` is sufficient for fabric
load + identity. Ch221 surveys what the other retroDE cores
(NES, GB, A2600, CoCo2) actually decode in their bridges and
maps the gap to future implementation chapters — **no RTL this
chapter**, just the table that gives the next code chapter a
clean target.

#### What retrodesd userspace writes/reads

From `retroDE_splash/software/{retrodesd.c, splash_backend.c,
osd_input.c, bridge_common.h, input_common.h}` and the per-core
backends:

| Offset block | Direction | Producer | Purpose |
|-----------------|-----------|-----------------------------------------|-----------------------------------------------------------|
| 0x000–0x00F | R | `bridge_read_identity` | CORE_ID / ABI_VERSION / CORE_STATUS / CORE_CAPS |
| 0x010 | RW | `bridge_set_reset` / `…rom_loaded` | CORE_CTRL: RESET / ROM_LOADED / PAUSE |
| 0x014 | W | `core_pulse` (`HDMI_CLR`, etc.) | CORE_PULSE: self-clearing pulses |
| 0x040 / 0x044 | W | `input_thread.c` (gamepad → bridge) | INPUT_P1 / INPUT_P2 — remapped joypad bits for the core |
| 0x048 | W | `input_thread.c` (raw mirror) | INPUT_P1_RAW — un-remapped buttons for OSD navigation |
| 0x060 / 0x064 | R | (per-core, optional) | VIDEO_STATUS / HDMI_DIAG |
| 0x0F0 / 0x0F4 | R | `ds2_poll_thread.c` | DS2 wired controller state |
| 0x100 | RW | `splash_backend.c` (start) | OSD_CTRL: ENABLE / FORCE_OPEN / INPUT_LOCK |
| 0x104 | R (poll) | `osd_input.c` | OSD_STATUS: [0]=active, [12:8]=cursor_row |
| 0x108 | R + W1C | `retrodesd.c` (`trig & ACTION` path) | OSD_TRIGGER: action / back / scroll pending |
| 0x110 / 0x114 | W | `splash_backend.c` / `osd_setup` | OSD_CFG0 / OSD_CFG1: layout + colors |
| 0x1000–0x1FFF | W | `osd_draw_*` (supervisor + backend) | Tile RAM: 2048 × 16-bit cells, packed 2/word |
| 0x100000+ | W | per-core ROM load | ROM staging window (cart, BIOS, OS — not used by ps2 yet) |

#### What sibling bridges actually decode

The four sibling cores converge on the same map (cross-checked
against `retroDE_{nes,gb,Atari2600,coco2}/rtl/.../*_hps_bridge.sv`):

| Offset block    | NES | GB  | A2600 | CoCo2 | Notes                                                          |
|-----------------|-----|-----|-------|-------|----------------------------------------------------------------|
| 0x000–0x01F     | ✓   | ✓   | ✓     | ✓     | Identity + CORE_CTRL/PULSE (shared ABI v1.0 prefix)            |
| 0x020–0x03F     | ✓   | ✓   | ✓     | ✓     | Core-specific config (mapper flags, GB cfg, CONSOLE_ACTION, …) |
| 0x040–0x04F     | ✓   | ✓   | ✓     | ✓     | INPUT_P1 / INPUT_P2 / INPUT_P1_RAW latches                     |
| 0x060–0x07F     | ✓   | ✓   | ✓     | ✓     | Video/HDMI diagnostics (read-only)                             |
| 0x080–0x09F     | ✓   | ✓ † | —     | ✓ ‡   | Save status (NES/GB/CoCo2); † GB also serial capture           |
| 0x0F0 / 0x0F4   | ✓   | ✓   | ✓     | ✓     | DS2 wired controller (read-only platform regs)                 |
| 0x100–0x11F     | ✓   | ✓   | ✓     | ✓     | OSD_CTRL / STATUS / TRIGGER / CFG0 / CFG1                      |
| 0x200–0x21F     | ✓   | —   | ✓     | —     | Savestate (NES=FIFO transport, A2600=BRAM transport)           |
| 0x300–0x33F     | ✓   | —   | —     | ✓     | Diagnostics (read-only); CoCo2 surfaces live keyboard scan     |
| 0x1000–0x1FFF   | ✓   | ✓   | ✓     | ✓     | Tile RAM (2048 × 16-bit, packed 2/word)                        |
| 0x100000+       | ✓   | ✓   | ✓     | ✓     | ROM staging window                                             |

#### What ps2_hps_bridge decodes today

`ps2_hps_bridge.sv` accepts AXI4 writes/reads across its full
window (write FSM returns `BRESP=OKAY` for any address,
read FSM returns `RRESP=OKAY` with `rdata=0` for any unmapped
address). The set of addresses with side effects is narrow:

| Offset block    | Behavior on ps2 today                                                          |
|-----------------|--------------------------------------------------------------------------------|
| 0x000–0x00F     | Read-back: CORE_ID=`0x50533200`, ABI=`0x00000100`, STATUS live, CAPS=0         |
| 0x010 (CTRL)    | Read-back live (Ch176); writes latch RESET/ROM_LOADED/PAUSE bits               |
| 0x014 (PULSE)   | Reads 0; writes pulse HDMI_CLR (bit 3); other bits acked-and-ignored           |
| 0x020–0x028     | Read-back: FRAME_COUNT / DMA_DONE_COUNT / RASTER_OVERFLOW_COUNT                |
| 0x02C–0x07F     | Reads 0; writes acked-and-discarded (`write_in_window` covers the full 128 B)  |
| 0x080–0x07FFFFF | Reads 0; writes acked-and-discarded (outside the side-effect window)           |

**Bridge behavior consequence**: every retrodesd userspace write
to OSD registers, tile RAM, or input registers already succeeds
on the AXI side — nothing crashes. The cost is only behavioral:
INPUT_P1/P2 latches don't reach a CPU yet, OSD_STATUS reads
always-0 so retrodesd never sees a pending OSD action, and tile
RAM writes evaporate. Ch220's "OSD writes are a silent no-op" is
a property of the *write-decode map*, not an AXI failure.

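The "accepted but discarded" behavior can be pictured with a small host-side model. This is a hedged Python sketch of the Ch221-era decode map, not a transcription of the RTL. The class name, the per-register offsets inside 0x000–0x00F and 0x020–0x028 (taken as sequential in the order the table lists them), and `CTRL_MASK` are all assumptions made for illustration.

```python
CORE_ID = 0x50533200
ABI_VERSION = 0x00000100
CTRL_MASK = 0x7  # hypothetical: assumes RESET/ROM_LOADED/PAUSE sit in bits [2:0]


class BridgeModel:
    """Host-side model of the pre-Ch222 decode map (illustrative only)."""

    def __init__(self):
        self.ctrl = 0
        self.status = 0            # live CORE_STATUS, driven by the fabric
        self.counters = [0, 0, 0]  # FRAME / DMA_DONE / RASTER_OVERFLOW

    def write(self, addr, data):
        if addr == 0x010:
            self.ctrl = data & CTRL_MASK
        # 0x014 bit 3 would pulse HDMI_CLR; every other write is discarded.
        return "OKAY"              # BRESP is OKAY for any address

    def read(self, addr):
        fixed = {0x000: CORE_ID, 0x004: ABI_VERSION,
                 0x008: self.status, 0x00C: 0, 0x010: self.ctrl}
        if addr in fixed:
            return fixed[addr]
        if addr in (0x020, 0x024, 0x028):
            return self.counters[(addr - 0x020) // 4]
        return 0                   # unmapped: RRESP=OKAY, rdata=0


bridge = BridgeModel()
for offset in (0x040, 0x100, 0x1000):   # INPUT_P1, OSD_CTRL, first tile word
    assert bridge.write(offset, 0xDEADBEEF) == "OKAY"
    assert bridge.read(offset) == 0     # the write evaporated
```

Under these assumptions the model reproduces the consequence stated above: every write is acknowledged, and only the identity, CTRL, and counter offsets ever read back non-zero.
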
#### Proposed bridge expansion (Ch222+)

These are the next decode-side steps, ordered by which one
unblocks the most runtime functionality per chapter. **No RTL
this chapter** — Ch221 is just the target spec.

| Future Ch | Offset block          | Add what                                                         | Unblocks                                                                                          |
|-----------|-----------------------|------------------------------------------------------------------|---------------------------------------------------------------------------------------------------|
| Ch222     | 0x040 / 0x044 / 0x048 | INPUT_P1 / INPUT_P2 / INPUT_P1_RAW write latches (read-back too) | retrodesd's `input_thread` writes land in addressable regs; later wire to SIO2/DualShock          |
| Ch223     | 0x100 / 0x110 / 0x114 | OSD_CTRL / OSD_CFG0 / OSD_CFG1 latches (compatibility sink)      | `splash_backend.start()` writes land in real regs; readable for software introspection            |
| Ch224     | 0x104 / 0x108         | OSD_STATUS (always-0 today) + OSD_TRIGGER (W1C, all-zero source) | retrodesd's OSD-poll loop stops being a no-op contract; future Ch can drive bits from a real FSM  |
| Ch225     | 0x060 / 0x064         | VIDEO_STATUS (HDMI locked, pixel-clock present) read-only        | retrodesd journalctl can confirm HDMI lock without `ps2_status.sh`                                |
| Ch226     | 0x0F0 / 0x0F4         | DS2_STATUS / DS2_BUTTONS read-only (returns 0 until SIO2 lands)  | DS2 poll thread stops reading garbage; future Ch can route into the PS2 SIO2 emulation            |
| Ch227+    | 0x1000–0x1FFF         | Tile RAM (BRAM-backed, sink-only at first)                       | OSD canvas reaches the bridge; rendering path (overlay onto PCRTC scanout) is a separate chapter  |

The Ch222–Ch224 group is the **compatibility sink** Codex
flagged in the Ch221 framing. Implementing all three together is
~30 lines of decode + ~6 32-bit registers + ~3 read-mux arms —
small enough to land in one chapter if scope allows.
**Out of scope for now**: tile RAM (4 KB BRAM + overlay path is
its own chapter), savestate (no PS2 savestate format yet), ROM
staging at 0x100000+ (BIOS load is currently TB-side via
`+BIOS=` plusarg; on-board staging is post-MVP).

#### Boundary call

The honest interface contract after Ch221 is:

> **retroDE runtime writes the full ABI v1.0 surface; ps2_hps_bridge
> currently decodes 0x000–0x028 with side effects and silently
> accepts the rest; the next implementation chapter (Ch222) adds
> INPUT_P1/P2 latches, then OSD_CTRL/CFG (Ch223), then OSD_STATUS/
> TRIGGER (Ch224). Tile RAM and ROM staging are deferred.**

That gives the next code chapter a 30-line target instead of a
guess.

### Ch222 — input-latch block (landed)

`ps2_hps_bridge` now decodes the shared-ABI input block at
0x040/0x044/0x048. Each is a 32-bit HPS-visible write/read
latch; reset clears unconditionally; reads outside this trio in
the input window still return 0 (e.g. 0x04C).

> **⚠ Readback is NOT controller functionality.** The PS2 design
> has no SIO2 / DualShock pipeline yet, and **nothing on the PS2
> side consumes these registers** — they are pure HPS-visible
> latches. Reading INPUT_P1 back gets you the value retrodesd
> wrote there, *not* a confirmation that the PS2 core observed
> the gamepad state. Wiring this latch to a future SIO2 emulator
> is a separate chapter; until then, treat these registers as
> a documented compatibility surface for retrodesd's
> `input_thread`, nothing more.

Write contract: the bridge accepts AXI writes as full 32-bit
words at the lane selected by `awaddr[3:2]`; `wstrb` is
ignored. Partial-byte writes are not supported (this matches the
sibling cores and the way retroDE userspace actually writes —
native 32-bit stores from the bridge mmap).

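A lane select on `awaddr[3:2]` implies four 32-bit lanes per data beat. The Python sketch below models that choice on the host side. The 128-bit beat width and the "lane 0 in bits [31:0]" ordering are assumptions inferred from the two-bit selector, and `select_lane` is a hypothetical name, not a function in the bridge.

```python
def select_lane(awaddr, wdata_beat, wstrb=None):
    """Return the 32-bit word a write would latch.

    wdata_beat is modelled as a 128-bit integer (assumption); wstrb is
    accepted only to show that it plays no part in the result.
    """
    lane = (awaddr >> 2) & 0x3          # awaddr[3:2]
    return (wdata_beat >> (32 * lane)) & 0xFFFFFFFF


beat = (0xDDDDDDDD << 96) | (0xCCCCCCCC << 64) | (0xBBBBBBBB << 32) | 0xAAAAAAAA
for addr in (0x040, 0x044, 0x048):
    print(hex(addr), hex(select_lane(addr, beat)))
```

Because `wstrb` never enters the calculation, a byte-wide store would still latch the whole lane in this model, which is why the contract calls partial-byte writes unsupported.
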
Coverage lives in
[`sim/tb/platform/tb_ps2_hps_bridge.sv`](../../sim/tb/platform/tb_ps2_hps_bridge.sv)
§11: round-trips with three distinct patterns
(`0xDEADBEEF` / `0xCAFEBABE` / `0x12345678`), independent-write
checks (writing one doesn't touch the other two), a tight-decode
check that writes to 0x04C don't bleed into the adjacent
INPUT_P1_RAW latch (and that 0x04C itself still reads 0),
confirmation that CTRL/STATUS/PULSE/counter registers are
unchanged across the input writes, and a final reset that
re-clears all three to 0 (other regs already covered by §1–§10).

### Ch223 — OSD compatibility sink (landed)

`ps2_hps_bridge` now decodes the shared-ABI OSD register block
at 0x100 / 0x110 / 0x114. Each is a 32-bit HPS-visible write/read
latch; reset clears unconditionally; reads at the unmapped slots
inside the OSD window (0x104 / 0x108 / 0x10C / 0x118 / 0x11C)
return 0, and writes there don't bleed into the mapped latches.

> **⚠ Readback is NOT overlay functionality.** The PS2 design
> has no OSD canvas / overlay engine / tile RAM yet, and
> **nothing on the PS2 side consumes these registers** — they
> are pure HPS-visible latches. Reading OSD_CTRL back gets you
> whatever the splash backend wrote, *not* a confirmation that
> a "RetroDE / Core Select…" overlay reached the HDMI output.
> The supervisor menu still cannot render over PS2 video
> (Ch220's known limitation). What Ch223 *does* fix is the
> side-effect map: retrodesd's `splash_backend.start()` writes
> now land in real registers instead of being silently dropped
> at the bridge.

Coverage lives in
[`sim/tb/platform/tb_ps2_hps_bridge.sv`](../../sim/tb/platform/tb_ps2_hps_bridge.sv)
§12: reset readback, three round-trips with distinct patterns
(`0xA5A5A5A5` / `0x11223344` / `0x55667788`), independent-write
checks across the trio, tight-decode validation that
0x104 / 0x108 / 0x10C / 0x118 / 0x11C still read 0 after being
written and that those writes don't modify the CTRL/CFG0/CFG1
latches, an HDMI_CLR pulse to absorb
the post-§11-reset CDC sync-refill artifact, then full
input-latch + CTRL + STATUS + PULSE + counter readbacks
confirming Ch223 writes don't disturb prior chapters, and a
final reset that clears all three OSD latches to 0.

Updated boundary call:

> **retroDE runtime writes the full ABI v1.0 surface;
> ps2_hps_bridge decodes 0x000–0x028 + 0x040–0x048 + 0x100 +
> 0x110 + 0x114 with side effects and silently accepts the
> rest; next implementation chapter is Ch224 (OSD_STATUS @
> 0x104, OSD_TRIGGER @ 0x108 with W1C semantics, plus the FSM
> source that drives them).**

### Ch224 — OSD_STATUS / OSD_TRIGGER contracts (landed)

Ch224 formalizes the remaining OSD handshake offsets — 0x104,
0x108, 0x10C — as named ABI contracts rather than "reserved
placeholders". No new register state: the addresses behave the
same as in Ch223 (reads return 0, writes accepted-and-ignored),
but `reg_read()` now explicitly enumerates them and the header
documents the W1C-sink semantics.

| Offset | Contract                                                                                  |
|--------|-------------------------------------------------------------------------------------------|
| 0x104  | **OSD_STATUS** — always-zero source. No FSM drives `[0]=active` / `[12:8]=cursor_row`     |
| 0x108  | **OSD_TRIGGER** — W1C-shape sink. retrodesd's poll loop sees `trig=0` and never fires     |
| 0x10C  | **OSD_INPUT** — reserved. Sibling cores expose a debug-override; PS2 keeps it dormant     |

> **⚠ The W1C semantics here are degenerate** because there's no
> FSM source on the PS2 side: bits never go HIGH from internal
> state, so any write-to-clear pattern (broad `0xFFFFFFFF`,
> single-bit `OSD_TRIG_ACTION|row<<8`, or `OSD_TRIG_BACK`) is a
> no-op against an already-zero register. This is sufficient
> for retrodesd's poll loop — it reads `trig=0`, never invokes
> `osd_action`, and stays out of the way. Once a real overlay
> FSM lands, the W1C clear-mask logic and the FSM SET path will
> need an actual register; for now the sink contract is the
> documented Ch224 surface.

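The "degenerate W1C" point can be checked with two tiny host-side models: a real write-1-to-clear register, and the stateless sink that Ch224 documents. This is an illustrative Python sketch only. The class names are hypothetical, and the bit positions chosen for `OSD_TRIG_ACTION` and `OSD_TRIG_BACK` are placeholders, since the runbook does not state them.

```python
OSD_TRIG_ACTION = 1 << 0   # hypothetical bit position
OSD_TRIG_BACK = 1 << 1     # hypothetical bit position


class W1CRegister:
    """A real W1C register: hardware sets bits, software clears them."""

    def __init__(self):
        self.value = 0

    def hw_set(self, mask):            # the FSM SET path PS2 does not have yet
        self.value |= mask

    def write(self, data):             # write-1-to-clear
        self.value &= ~data & 0xFFFFFFFF

    def read(self):
        return self.value


class W1CSink:
    """The Ch224 contract: reads 0, writes accepted and ignored."""

    def write(self, data):
        pass

    def read(self):
        return 0


real, sink = W1CRegister(), W1CSink()
for pattern in (0xFFFFFFFF, OSD_TRIG_ACTION | (3 << 8), OSD_TRIG_BACK):
    real.write(pattern)
    sink.write(pattern)
    assert real.read() == sink.read() == 0   # indistinguishable without hw_set

real.hw_set(OSD_TRIG_ACTION)                 # only a SET source breaks the tie
assert real.read() != sink.read()
```

As long as `hw_set` is never called, the two models are indistinguishable from software, which is the sense in which the sink is sufficient for retrodesd's poll loop.
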
Coverage lives in
[`sim/tb/platform/tb_ps2_hps_bridge.sv`](../../sim/tb/platform/tb_ps2_hps_bridge.sv)
§13: OSD_CTRL/CFG0/CFG1 are first written to known non-zero
values; then OSD_STATUS is poked with 0xFFFFFFFF and 0x1
(`active` bit shape), OSD_TRIGGER is poked with three W1C
shapes (broad `0xFFFFFFFF`, `ACTION|row<<8`, `BACK`), and
OSD_INPUT is poked with 0xDEADBEEF. After every poke, all three
target slots still read 0 and OSD_CTRL/CFG0/CFG1 are unchanged.
§7 also gains an audit sentinel at 0x120 (first byte past the
OSD window) to prove the `addr[37:5] == 33'h08` decode boundary
is tight.

Updated boundary call:

> **retroDE runtime writes the full ABI v1.0 surface;
> ps2_hps_bridge decodes 0x000–0x028 + 0x040–0x048 + 0x100 +
> 0x110 + 0x114 with side effects and explicitly contracts
> 0x104 / 0x108 / 0x10C as zero-source / W1C-sink / reserved;
> writes outside the decoded set are accepted-and-ignored.
> The compatibility-sink trilogy Codex flagged in Ch221 is
> complete. Remaining work is Ch225+ (VIDEO_STATUS reads, DS2
> stub, tile RAM + overlay).**

### Ch225 — VIDEO_STATUS / HDMI_DIAG read surface (landed)

After the input + OSD write compatibility, the next most useful
runtime surface is read-only video/HDMI diagnostics. Ch225 adds
two pure-read registers at the shared-ABI offsets retrodesd
already polls on other cores:

| Offset | Register      | Bit layout                                                                                             |
|--------|---------------|--------------------------------------------------------------------------------------------------------|
| 0x060  | VIDEO_STATUS  | `[0]=frame_seen` · `[1]=scanout_alive (FRAME_COUNT!=0)` · `[2]=raster_overflow (live)` · `[3]=dma_done_seen` · `[31:4]` reserved |
| 0x064  | HDMI_DIAG     | `[0]=hdmi_init_done` · `[1]=hdmi_i2c_error` · `[31:2]` reserved                                        |

Both registers are pure-read views into the synchronized status
signals the bridge already tracks (the same sources that drive
CORE_STATUS bits 1..6); writes are accepted with BRESP=OKAY and
ignored. `scanout_alive` is the only derived bit — it's the
live `FRAME_COUNT != 32'd0` comparison, so HDMI_CLR drops it
back to 0 and the next `frame_toggle` edge re-raises it. This
gives retrodesd or `journalctl`-driven operator tooling a
one-line "is the display path alive?" answer without depending
on LEDs or `ps2_status.sh`.

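For operator tooling, the two bit layouts above decode as follows. This is a small Python sketch meant for a host-side script; the function names are hypothetical, while the bit positions come straight from the table.

```python
def decode_video_status(value):
    """Split VIDEO_STATUS (0x060) into its four documented bits."""
    return {
        "frame_seen": bool(value & 0x1),
        "scanout_alive": bool(value & 0x2),     # FRAME_COUNT != 0
        "raster_overflow": bool(value & 0x4),   # live
        "dma_done_seen": bool(value & 0x8),
    }


def decode_hdmi_diag(value):
    """Split HDMI_DIAG (0x064) into its two documented bits."""
    return {
        "hdmi_init_done": bool(value & 0x1),
        "hdmi_i2c_error": bool(value & 0x2),
    }


# 0x0B: frame seen, scanout alive, DMA done, no overflow.
print(decode_video_status(0x0B))
# 0x09: the same state right after HDMI_CLR zeroed FRAME_COUNT.
print(decode_video_status(0x09))
```
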
> **Note:** Video timing fields (width/height/mode/pixel-clock
> ID) are deliberately deferred. The sibling ABI doesn't require
> them and the 640×480 fixed mode is documented in the LED
> ledger and the Ch219 status block. When a multi-mode design
> lands, those fields belong in the `[31:4]` reserved slots
> here.

Coverage lives in
[`sim/tb/platform/tb_ps2_hps_bridge.sv`](../../sim/tb/platform/tb_ps2_hps_bridge.sv)
§14: entry-state read confirms VIDEO_STATUS = 0x0B and
HDMI_DIAG = 0x03 (frame_seen + scanout_alive + dma_done; init +
err); HDMI_CLR drops `scanout_alive` to 0 → 0x09; a single
`frame_toggle` flip re-raises it → 0x0B; pulsing
`raster_overflow` HIGH/LOW exercises bit [2]'s live tracking
(0x0B → 0x0F → 0x0B); broad write patterns (`0xFFFFFFFF`,
`0x00000000`, `0xDEADBEEF`) at 0x060 and 0x064 confirm the
read-only contract; Ch222 input latches + Ch223 OSD_CTRL/CFG0/
CFG1 remain unchanged across the Ch225 writes.

Updated boundary call:

> **ps2_hps_bridge now decodes 0x000–0x028 + 0x040–0x048 +
> 0x060 / 0x064 + 0x100/0x110/0x114 with side effects, and
> explicitly contracts 0x104 / 0x108 / 0x10C as zero-source /
> W1C-sink / reserved. Compatibility-sink trilogy plus the
> diagnostic read surface (Ch222–Ch225) are complete. Remaining
> ABI v1.0 surfaces are Ch226 (DS2 stub at 0x0F0/0x0F4) and
> Ch227+ (tile RAM + overlay path — the only large RTL item
> left in the shared map).**

### Ch226 — DS2 wired-controller stub (landed)

`ps2_hps_bridge` now exposes the shared-ABI DS2 platform offsets
at 0x0F0 / 0x0F4 as a read-only stub. PS2 has no physical DS2
wired-controller path today (DE25-Nano routes the DS2 port
elsewhere on other cores), so the stub honors the sibling-ABI
contract retrodesd's `ds2_poll_thread.c` consumes:

| Offset | Register     | Layout (sibling-ABI compatible)                                                                                  |
|--------|--------------|------------------------------------------------------------------------------------------------------------------|
| 0x0F0  | DS2_STATUS   | `[0]=connected=0` (no platform path) · `[1]=error=0` · `[2]=input_latches_valid=1` (PS2-local) · `[31:3]=0` → `0x00000004` |
| 0x0F4  | DS2_BUTTONS  | Read-only **mirror of INPUT_P1** (`0x040`). Tracks Ch222 latch updates in real time                              |

> **⚠ Sibling-ABI deviation note.** Codex's Ch226 framing
> originally proposed `[1]=input_latches_valid`. That conflicts
> with the shared `[1]=error` bit consumed by retrodesd's poll
> thread (`error = (status >> 1) & 0x1`) — returning 1 there
> would cause retrodesd to call `osd_input_disconnect_ds2()`
> every cycle. Following Codex's own recon-then-document hedge
> ("If sibling precedent exists, follow it"), the PS2-local
> diagnostic was moved to bit `[2]`, keeping `[0]/[1]` aligned
> with the sibling contract. The result: retrodesd sees a
> sibling-shaped "no DS2 connected" answer and bows out of the
> DS2 path cleanly, while operator tooling can still read
> `[2]` for the PS2-local "latches present" hint.

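The bit conflict described in the note is easy to see by decoding both candidate layouts the way the poll thread does. The Python sketch below is a host-side illustration; `decode_ds2_status` is a hypothetical helper, and only the `error = (status >> 1) & 0x1` extraction is taken from the text.

```python
def decode_ds2_status(status):
    """Decode DS2_STATUS (0x0F0) using the sibling-shaped bit positions."""
    return {
        "connected": status & 0x1,
        "error": (status >> 1) & 0x1,              # what retrodesd's poll thread reads
        "input_latches_valid": (status >> 2) & 0x1,  # PS2-local hint
    }


landed = decode_ds2_status(0x00000004)    # Ch226 as landed: hint in bit [2]
proposed = decode_ds2_status(0x00000002)  # original proposal: hint in bit [1]

print("landed:  ", landed)
print("proposed:", proposed)
```

With the landed value the `error` field is 0, so the poll thread sees "not connected, no error". With the originally proposed value the same extraction yields `error = 1`, which is the case the note says would trigger the disconnect path on every poll.
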
DS2_BUTTONS mirrors `INPUT_P1` so HPS can confirm a Ch222 store
landed by reading at either 0x040 or 0x0F4 — useful when
diagnosing the controller-state path before any SIO2 wiring
exists. Note this is read-only: writes to 0x0F4 do **not**
modify INPUT_P1.

Coverage lives in
[`sim/tb/platform/tb_ps2_hps_bridge.sv`](../../sim/tb/platform/tb_ps2_hps_bridge.sv)
§15: DS2_STATUS reset/entry = `0x00000004`; INPUT_P1 → DS2_BUTTONS
mirror with two distinct write patterns; verification that
INPUT_P2 / INPUT_P1_RAW writes do NOT touch DS2_BUTTONS (mirror
is INPUT_P1-only); broad write patterns at 0x0F0 / 0x0F4 confirm
read-only contract — critically including a check that writing
0x0F4 does NOT modify the underlying INPUT_P1 latch; Ch222–Ch224
state unchanged across Ch226 writes; reset clears INPUT_P1 →
DS2_BUTTONS reads 0 while DS2_STATUS stays constant at `0x4`.

§7 also picks up audit sentinels at 0x0E0 (just before DS2) and
0x0F8 (the CoCo2-specific DS2_ANALOG slot, unmapped on PS2) to
confirm the Ch226-widened 256-byte window's decode is tight.

**RTL note:** Ch226 widened the first decode window from 128 B
(`addr[37:7]=='0`) to 256 B (`addr[37:8]=='0`) and switched the
case selector from `addr[6:2]` (5-bit) to `addr[7:2]` (6-bit).
Existing slot indices (e.g. CORE_CTRL @ 6'h04 = 0x010) keep
their values since the new high bit is implicitly 0 for offsets
≤ 0x07F. The OSD window at 0x100–0x11F continues to use its
own `addr[37:5]==33'h08` decode in the `else if` branch.

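The window arithmetic in the RTL note can be sanity-checked on the host. The Python sketch below mirrors the two guards (`addr[37:8]=='0` and `addr[37:5]==33'h08`) with shifts. `classify` is a hypothetical helper, and the `addr[4:2]` slot index inside the OSD window is an assumption, since the note only gives that window's guard.

```python
def classify(addr):
    """Return (window, slot) for a byte offset, mirroring the Ch226 guards."""
    addr &= (1 << 38) - 1                    # 38-bit bridge address
    if addr >> 8 == 0:                       # addr[37:8] == 0 -> 256 B prefix
        return "prefix", (addr >> 2) & 0x3F  # addr[7:2], 6-bit selector
    if addr >> 5 == 0x08:                    # addr[37:5] == 33'h08 -> 0x100-0x11F
        return "osd", (addr >> 2) & 0x7      # addr[4:2] (assumed slot index)
    return "unmapped", None


for offset in (0x010, 0x0F0, 0x0F4, 0x100, 0x114, 0x120):
    print(hex(offset), classify(offset))

# Widening the selector does not move any pre-Ch226 slot:
assert all(((a >> 2) & 0x1F) == ((a >> 2) & 0x3F) for a in range(0x00, 0x80, 4))
```

The final assertion is the "existing slot indices keep their values" claim: for every word offset up to 0x07C the 5-bit and 6-bit selectors agree.
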
Updated boundary call:

> **ps2_hps_bridge now decodes 0x000–0x028 + 0x040–0x048 +
> 0x060 / 0x064 + 0x0F0 / 0x0F4 + 0x100 / 0x110 / 0x114 with
> side effects (or read-only data); explicitly contracts
> 0x104 / 0x108 / 0x10C as zero-source / W1C-sink / reserved.
> The entire ABI v1.0 register-shape surface that sibling
> cores expose is now present on PS2, either functionally or
> as a stub. The only major remaining shared-ABI surface is
> Ch227+ (tile RAM at 0x1000–0x1FFF + the overlay rendering
> path), which is a separate RTL/PCRTC effort, not a bridge
> decode chapter.**

### Ch227 — tile RAM storage surface (landed)

Per Codex's "split it in two stages" framing, Ch227 implements
**only the tile RAM ABI/storage surface** — no overlay rendering,
no PCRTC composition. The 4 KB window at 0x1000–0x1FFF becomes a
plain 1024 × 32-bit RAM owned by the bridge; HPS writes land in
the memory, reads return the last value written. Ch228 (a future
chapter) will wire this storage into a PCRTC overlay composition
path.

**Storage shape choice.** Sibling bridges (NES / A2600 / CoCo2)
do NOT own their tile RAM — they forward writes to an external
overlay engine via `tile_wr_addr[10:0] / tile_wr_data[15:0] /
tile_wr_en` output ports, splitting each 32-bit AXI write into
two 16-bit cell writes (matching `input_common.h`'s "2048 ×
16-bit cells packed 2/word" software view). PS2 has no overlay
engine yet, so Ch227 instead owns the storage internally as a
flat 1024 × 32-bit memory. Software still sees the same bytewise
contents — at the bus level it's just transparent
write-then-readback. The 16-bit-cell software view is a
software-side convention preserved by `osd_putchar` etc.,
unaffected by whether the bridge backs it as 16-bit cells or
32-bit words.

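To make the "2048 × 16-bit cells packed 2/word" view concrete, here is a hedged Python sketch of how a cell index could map onto the flat 32-bit storage. The half-word ordering (even cell in bits [15:0]) is an assumption, since the runbook does not state it, and the helper names are hypothetical rather than anything from `input_common.h`.

```python
TILE_BASE = 0x1000
NUM_CELLS = 2048


def cell_location(cell):
    """Return (byte offset of the 32-bit word, bit shift of the cell in it)."""
    if not 0 <= cell < NUM_CELLS:
        raise ValueError("cell index out of range")
    return TILE_BASE + (cell // 2) * 4, 16 * (cell % 2)  # assumed: even cell low


def write_cell(mem, cell, value):
    """Read-modify-write one 16-bit cell through full-word accesses."""
    addr, shift = cell_location(cell)
    word = mem.get(addr, 0)
    word = (word & ~(0xFFFF << shift) & 0xFFFFFFFF) | ((value & 0xFFFF) << shift)
    mem[addr] = word


def read_cell(mem, cell):
    addr, shift = cell_location(cell)
    return (mem.get(addr, 0) >> shift) & 0xFFFF


mem = {}                      # stand-in for the bridge's 1024 x 32-bit RAM
write_cell(mem, 0, 0x1234)
write_cell(mem, 1, 0xABCD)
print(hex(mem[TILE_BASE]), hex(read_cell(mem, 1)))
```

Whatever the real ordering is, the point made above still holds in this model: the storage only ever sees whole 32-bit words, and the cell structure lives entirely on the software side.
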
**Reset semantics.** There is no sibling precedent for "tile RAM
clears on reset" (siblings don't own the storage), so Ch227 makes it
**retained** across warm reset. Sim deterministically pre-zeros
the memory in an `initial` block; on hardware the power-up
value is undefined until the overlay chapter pins a reset
contract (BRAM on Agilex 5 doesn't naturally sync-reset every
word; we'll let Ch228 decide whether a write-1-to-clear or an
explicit clear-pulse register is the right fit).

**Decode shape.** The tile window uses a third `else if` branch
in `reg_read()`
(beside the 0x000–0x0FF and 0x100–0x11F branches) guarded by
`addr[37:12] == 26'h1`; word index = `addr[11:2]`. WSTRB is
ignored (full-word writes only, same contract as the rest of
the bridge). The two existing decode windows (Ch222–Ch226
shared-ABI prefix, Ch223 OSD sink) are unchanged.

Coverage lives in
[`sim/tb/platform/tb_ps2_hps_bridge.sv`](../../sim/tb/platform/tb_ps2_hps_bridge.sv)
§16: reset reads at the base (0x1000), middle (0x1800), and
last word (0x1FFC) all return 0; round-trip with three distinct
patterns (`0xCAFEF00D` / `0x12345678` / `0xDEADBEEF`);
independent-slot checks (each write hit only its target);
adjacent-slot tightness (slots at +4 / -4 from each probed
target remain 0); **boundary sentinels at 0x0FFC and 0x2000**
read 0, and writes to those boundary addresses don't bleed into
the nearest tile slots; existing CTRL/STATUS/INPUT/OSD/DS2 regs
are unchanged across the Ch227 writes; and a reset-retention
check writes the three patterns, pulses `reset_n`, and verifies
all three values survive (proving the "retained across warm
reset" contract).

Updated boundary call:

> **ps2_hps_bridge ABI v1.0 register decode is now complete:
> 0x000–0x028 + 0x040–0x048 + 0x060 / 0x064 + 0x0F0 / 0x0F4 +
> 0x100 / 0x110 / 0x114 with side effects; 0x104 / 0x108 /
> 0x10C as zero-source / W1C-sink / reserved; 0x1000–0x1FFF as
> 4 KB tile-RAM storage. The remaining piece of the shared ABI
> v1.0 surface is *behavior*, not decode: Ch228 wires tile RAM
> into a PCRTC overlay composition path so the supervisor menu
> can actually render over PS2 video. Beyond that, retroDE_ps2
> is feature-parity with the sibling cores on bridge ABI shape.**

### Ch228 — overlay engine skeleton (landed)

Per Codex's "split it in two stages" guidance, Ch228 is the
**first behavior chapter** past the ABI shape work, and stays
intentionally narrow: it builds the video-domain overlay
compositor + a built-in test-pattern source, but **does not yet
consume the Ch227 tile RAM**. The bridge↔video CDC question is
deferred to Ch229.

**New module:**
[`rtl/platform/osd_overlay_stub.sv`](../../rtl/platform/osd_overlay_stub.sv).
Inputs: 8-bit RGB + DE + HS + VS at `design_clk`. Outputs: same
shape, with the overlay composited over the incoming pixels when
the region check fires. Sync signals (DE/HS/VS) pass through
unchanged. The composite priority is **OSD over PS2 video** — a
pixel inside the 160×48 top-left region replaces the PS2 RGB;
elsewhere the input flows through.

**Test pattern.** 8×8 black/white checker via `x_cnt[3] ^
y_cnt[3]`. Picked because every (x, y) inside the box has a
deterministic expected color (the TB samples seven specific
positions including the four corners of the box), and the
boundary at x=160 / y=48 is visually distinct from any
PCRTC-generated PS2 pixel.

**Enable is parameter-only this chapter.** `OSD_ENABLE_DEFAULT`
is a compile-time bit. The TB instantiates two DUTs — one with
`OSD_ENABLE_DEFAULT=1'b1`, one with `=1'b0` — to exercise both
sides of the muxing decision. Wiring a real, safely-synchronized
`OSD_CTRL[0]` from the bridge clock into design_clk is its own
decision (the bridge writes `OSD_CTRL` on `clk` per Ch223; the
overlay reads on `design_clk`) and lands with Ch229 alongside
the tile-RAM CDC.

**x / y derivation from sync signals.** `x_cnt` increments each
clock during active video (`in_de=1`) and clamps to 0 during
blanking; `y_cnt` resets on the vsync de-assert edge (end of
sync pulse → start of vertical back porch) and increments on
each DE falling edge. With this scheme `x_cnt == hcnt` during
every active line and `y_cnt == active-line index within frame`
starting from 0 — which makes region detection a simple range
compare on signed (or, here, unsigned-wide-enough) counters.

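The counter scheme can be exercised with a cycle-level model. The Python sketch below is an illustration, not the RTL: it uses a toy raster similar in size to the TB's (200×60 visible), but the blanking lengths, the active-high VS polarity, and all function names are assumptions. It confirms that the derived counters match the true pixel position across two consecutive frames, then applies the region and checker expressions.

```python
from itertools import chain


def raster(h_active=200, v_active=60, h_blank=16, v_sync=2, v_porch=1):
    """Yield (de, vs, true_x, true_y) once per clock for one frame."""
    kinds = (["sync"] * v_sync + ["porch"] * v_porch
             + ["active"] * v_active + ["porch"] * v_porch)
    line = 0
    for kind in kinds:
        for c in range(h_blank + h_active):
            de = kind == "active" and c >= h_blank
            yield de, kind == "sync", c - h_blank, line
        if kind == "active":
            line += 1


def track(samples):
    """Derive x_cnt / y_cnt from DE and VS only; yield them at active pixels."""
    x_cnt = y_cnt = 0
    prev_de = prev_vs = False
    for de, vs, true_x, true_y in samples:
        if prev_vs and not vs:        # vsync de-assert edge: new frame
            y_cnt = 0
        elif prev_de and not de:      # DE falling edge: end of an active line
            y_cnt += 1
        if de:
            yield x_cnt, y_cnt, true_x, true_y
        x_cnt = x_cnt + 1 if de else 0
        prev_de, prev_vs = de, vs


def in_osd_region(x, y):
    return x < 160 and y < 48         # 160x48 top-left box


def checker(x, y):
    return ((x >> 3) ^ (y >> 3)) & 1  # x_cnt[3] ^ y_cnt[3]


pixels = list(track(chain(raster(), raster())))
assert all((xc, yc) == (x, y) for xc, yc, x, y in pixels)
print(len(pixels), in_osd_region(159, 47), in_osd_region(160, 0), checker(8, 0))
```

Running two frames back to back is what shows the `y_cnt` reset doing its job; without the vsync edge the second frame's line index would start at 60 in this model.
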
> **⚠ This chapter does not connect to the bridge.** The
> `ps2_hps_bridge.tile_mem` Ch227 storage is untouched; the
> bridge runs on the bridge clock and `osd_overlay_stub` lives
> in `design_clk`. Crossing those two domains safely — dual-port
> RAM, snapshot buffer, or a write-FIFO with a domain-crossing
> read window — is a Ch229 design decision. Ch228's job is only
> to prove the video-side composition path works: z-order, RGB
> muxing, scanout timing, and "supervisor menu *can* physically
> appear over PS2 video" with no entanglement of CDC.

**Top-level integration deferred.** The board top
[`de25_nano_psmct32_raster_demo_top.sv`](../../rtl/top/de25_nano_psmct32_raster_demo_top.sv)
is **not** modified this chapter — the overlay module exists as
a standalone unit verified by its own TB. Wiring it in between
the inner wrapper's `VIDEO_R/G/B/DE/HS/VS` and `HDMI_TX_*`
naturally pairs with Ch229's CDC decision (since "do we use the
test pattern or the real tile RAM?" only has a meaningful answer
once tile RAM is reachable from `design_clk`).

**TB**
[`sim/tb/platform/tb_osd_overlay_stub.sv`](../../sim/tb/platform/tb_osd_overlay_stub.sv)
(new, registered in both Makefile lists per the regression
contract). Drives a tiny VGA-shape raster (200×60 visible, 8-cycle
HSYNC, 2-line VSYNC) directly into two parallel DUT instances.
Verifies:

1. DE / HS / VS pass through both DUTs unchanged every cycle.
2. With overlay disabled, every active pixel matches the PS2
   background (sampled at four corners + the would-be OSD
   center).
3. With overlay enabled, eight points across the OSD region
   match the checker pattern (`x[3]^y[3]`), including the
   tight-boundary samples at (159, 0), (0, 47), and (159, 47).
4. The region boundary is tight: (160, 0) and (0, 48) — just
   past the region — return PS2 background.
5. Across a second consecutive frame, samples at (0, 0) and
   (8, 0) still match the checker pattern, proving y_cnt
   correctly resets on the vsync de-assert edge of every frame.

Updated boundary call:

> **Ch228 lands the video-domain overlay engine skeleton +
> test-pattern source as a standalone module + focused TB. The
> compatibility-sink + diagnostic + DS2 + tile-RAM-storage
> chapters (Ch222–Ch227) gave HPS-visible state; Ch228 adds
> the first piece of design-side behavior — z-order and RGB
> muxing on the video-clock domain. Remaining: Ch229 ties the
> two together — bridge tile RAM read port + CDC into
> `design_clk` + board-top integration so HPS-written glyphs
> can finally render over PS2 video.**

### Ch229 — HPS-written tiles render on video (landed)

Ch229 closes the loop: bridge tile RAM writes from HPS now reach
the `osd_overlay_stub` through a toggle-based CDC into the
design clock domain, where a 1024×32-bit **shadow RAM** mirrors
the bridge-side `tile_mem`. The overlay reads the shadow at
pixel rate, draws an 8×8 checker block where a tile word is
non-zero, and stays transparent where it is zero. The board top
is rewired so this composition path sits between `u_demo` and
both `VIDEO_*` (test inspection) and `HDMI_TX_*` (HDMI pin out).

#### Pipeline

```
HPS (mmap'd /dev/mem at 0x40001000+)
   │  axi_write32(0x1000 + idx*4, tile_word)
   ▼
ps2_hps_bridge (CLOCK2_50 domain)
   ├─ tile_mem[idx]   <= wdata_lane          // Ch227 AXI-readback storage
   ├─ tile_wr_index   <= aw_addr_q[11:2]     // Ch229 broadcast
   ├─ tile_wr_data    <= wdata_lane
   └─ tile_wr_toggle  <= ~tile_wr_toggle     // event edge
   │
   ▼  (across clk domains)
tile_ram_cdc (design_clk domain)
   ├─ 2-FF synchronizer on bclk_wr_toggle, plus one edge-detect stage (3-stage toggle_sync)
   ├─ wr_pulse = toggle_sync[2] ^ toggle_sync[1]       // 1-cycle pulse
   ├─ shadow_mem[bclk_wr_index] <= bclk_wr_data        // gated by pulse
   └─ dclk_rd_data = shadow_mem[dclk_rd_index]         // combinational read
   │
   ▼
osd_overlay_stub (design_clk domain)
   ├─ tile_x = x_cnt[8:3], tile_y = y_cnt[6:3]
   ├─ tile_rd_index = {tile_y[3:0], tile_x[5:0]}       // tile_y*64 + tile_x
   ├─ tile_opaque   = (tile_rd_data != 0)
   ├─ osd_pixel_on  = x_cnt[3] ^ y_cnt[3]              // 8x8 checker
   └─ draw_osd      = in_de && in_osd_region && tile_opaque
   │
   ▼  (between u_demo and VIDEO_*/HDMI_TX_*)
VIDEO_R/G/B (top-level test pins) + HDMI_TX_D (ADV7513)
```

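The last stage of the pipeline, from pixel position to tile word to draw decision, can be modelled on the host. The Python sketch below follows the expressions in the diagram; the function names are hypothetical, and the 160×48 region bound is taken from the Ch228 description.

```python
TILE_BASE = 0x1000


def tile_index(x, y):
    """{tile_y[3:0], tile_x[5:0]}, i.e. tile_y * 64 + tile_x."""
    tile_x = (x >> 3) & 0x3F     # x_cnt[8:3]
    tile_y = (y >> 3) & 0xF      # y_cnt[6:3]
    return tile_y * 64 + tile_x


def hps_offset(index):
    """Bridge offset the HPS writes to reach a given tile word."""
    return TILE_BASE + index * 4


def draw_osd(x, y, de, shadow):
    """in_de && in_osd_region && tile_opaque, for one pixel."""
    in_region = x < 160 and y < 48
    return bool(de and in_region and shadow[tile_index(x, y)] != 0)


shadow = [0] * 1024                       # design-side shadow RAM, all transparent
corner = tile_index(159, 47)              # bottom-right tile of the OSD box
shadow[corner] = 0x0001                   # any non-zero word makes the tile opaque

print(corner, hex(hps_offset(corner)))
print(draw_osd(159, 47, True, shadow), draw_osd(0, 0, True, shadow))
```

Note that the row stride is 64 tiles even though the visible box is only 20 tiles wide, so consecutive tile rows sit 256 bytes apart in the HPS address space.
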
#### CDC contract (read this before touching the CDC)

The bridge updates `tile_wr_toggle`, `tile_wr_index`, and
`tile_wr_data` on the same `clk` (CLOCK2_50) edge per AXI
write. `tile_ram_cdc` runs a 3-stage shift register on the
toggle, detects edges via XOR of stages [2] and [1], and uses
that edge as a 1-cycle write enable on the design-domain
shadow RAM. By the time the edge fires, `bclk_wr_index` and
`bclk_wr_data` have been stable for ≥ 2 dclk cycles — more
than enough at the production clock ratio (50 MHz bridge,
25 MHz design = 2:1) and any reasonable HPS write rate.

The contract is "writes must be spaced ≥ 6 dclk cycles apart",
a bound that retrodesd's ~1 kHz update rate meets by a wide
margin (the slowest dclk is ~25 MHz = 40 ns, so 6 dclk =
240 ns ≪ 1 ms). If a future
chapter wants a fast-cycling write source (e.g. streaming
animation from the design side), it must replace this toggle
CDC with an async FIFO. This is documented in the
[`rtl/platform/tile_ram_cdc.sv`](../../rtl/platform/tile_ram_cdc.sv)
header.

#### Reset behavior
|
||
|
||
Both domains share the FPGA configure reset path. The bridge's
`tile_wr_toggle` clears to 0 while `reset_n` is low (reset
asserted); the receiver's `toggle_sync` clears to 0 while
`dreset_n` is low. When both go through reset together (the
normal case), no spurious post-reset edge fires. The bridge-side
`tile_mem` and the design-side `shadow_mem` retain their
contents across warm reset. Sim's `initial` blocks zero both for
deterministic testing; on hardware, power-up contents are
undefined for both until a future chapter pins a contract (e.g.
by adding a "clear tile RAM" pulse register or a power-on
bring-up sequence in retrodesd).
|
||
|
||
#### Top-level wiring
|
||
|
||
[`rtl/top/de25_nano_psmct32_raster_demo_top.sv`](../../rtl/top/de25_nano_psmct32_raster_demo_top.sv)
gains:

1. Internal wires `demo_video_r/g/b/de/hsync/vsync` for
   `u_demo`'s raw scanout (renamed from direct VIDEO_*
   connections).
2. `u_tile_cdc` instance bridging CLOCK2_50 → design_clk.
3. `u_osd_overlay` instance between `demo_video_*` and the
   `VIDEO_*` top-level outputs. The HDMI_TX_* path picks up
   the overlay composite for free (it was already wired to
   `VIDEO_*`).
4. `ps2_hps_bridge` instantiation extended with the new
   `tile_wr_toggle/index/data` ports inside `USE_QSYS_TOP`;
   safe tie-offs (`toggle=0`, `index=0`, `data=0`) for the
   sim path that lacks the qsys instantiation.

The overlay uses `OSD_ENABLE_DEFAULT=1'b1` in the top instance:
the overlay path is *active*, but transparent until HPS
writes a non-zero tile. Startup behavior is identical to
pre-Ch229 as long as `tile_mem` and `shadow_mem` start at zero,
which sim guarantees; see "Reset behavior" above for the
hardware power-up caveat. Ch230 will replace the parameter with
a properly-synchronized `OSD_CTRL[0]` bit so retrodesd can gate
the overlay explicitly.
|
||
|
||
#### Coverage
|
||
|
||
Two TBs now exercise the new path:

| TB | Scope |
|----|-------|
| [`sim/tb/platform/tb_tile_ram_cdc.sv`](../../sim/tb/platform/tb_tile_ram_cdc.sv) (new) | Two distinct clocks (100 MHz bclk, 33 MHz dclk). Single-write propagation at base / mid / end; multi-write same-index "last wins"; eight distinct-index writes; static-toggle-no-write check (writes are edge-triggered, not level-triggered). |
| [`sim/tb/platform/tb_osd_overlay_stub.sv`](../../sim/tb/platform/tb_osd_overlay_stub.sv) (updated for Ch229 behavior) | Mock shadow RAM driven by TB. All-zero tiles → enabled DUT shows PS2 video (Ch229 default). Setting tile (0, 0) lights only that 8x8 block. Setting tile (1, 0) lights the next block; (0, 0) still holds. Clearing tile (0, 0) makes it transparent again while (1, 0) stays. Tight boundary at x=160/y=48 stays transparent regardless of tile data. Frame re-sync confirms y_cnt resets correctly on the next vsync. Disabled DUT ignores tile content entirely. |

Top-level
[`tb_de25_nano_psmct32_raster_demo_top.sv`](../../sim/tb/top/tb_de25_nano_psmct32_raster_demo_top.sv)
still passes with the overlay wired in: VIDEO_DE is the only
signal it samples directly, and the overlay forwards DE
unchanged. The full regression grows from the 149-TB baseline
to 151 PASS, with both new TBs registered in the Makefile
(per-target rule + `run:` master list + build-aggregation list,
per the two-lists contract).
|
||
|
||
Updated boundary call:
|
||
|
||
> **The Ch229 close-the-loop chain is live: HPS writes tile
|
||
> RAM → bridge broadcasts the write → toggle-CDC propagates
|
||
> into design_clk → shadow RAM stores → overlay reads + draws
|
||
> over PS2 video. Production behavior is unchanged at startup
|
||
> (all tiles zero → overlay transparent), and the path is
|
||
> ready for retrodesd to start sending real OSD glyphs.
|
||
> Remaining work: Ch230 — synchronize `OSD_CTRL[0]` from
|
||
> bridge clk → design_clk so HPS can gate the overlay
|
||
> explicitly, plus Ch231+ font / glyph rendering work for
|
||
> "real menu" visuals.**
|
||
|
||
#### Audit follow-ups (per Codex Ch229 audit)
|
||
|
||
**Tile-write rate watchdog (sim-only).** Codex flagged that the
CDC contract ("writes ≥ 6 dclk apart") was documented but not
enforced in RTL: a sufficiently fast back-to-back AXI write
could merge two toggle edges into one and silently lose a
write. The actual safe minimum is **≥ 3 dclk** between
consecutive receiver `wr_pulse` events (1 dclk for sync chain
settling + 1 dclk for the first pulse to fire + 1 dclk of
jitter margin). A `` `ifndef SYNTHESIS ``-guarded watchdog in
[`tile_ram_cdc.sv`](../../rtl/platform/tile_ram_cdc.sv) tracks
the dclk gap between successive `wr_pulse` events and prints
a one-line warning if the gap is below threshold. Real-world
retrodesd OSD updates at ≤ 1 kHz are at least 25,000 dclk apart
at 25 MHz, roughly four orders of magnitude of slack.
**Hardware visibility (an AXI-readable diagnostic counter that
retrodesd could poll to detect lost-write events) is deferred
to Ch230**; this turn just makes pre-silicon violations loud in
the sim log.

**Reset-retention asymmetry caveat.** The Ch227 bridge
`tile_mem` and the Ch229 design-domain `shadow_mem` are *both*
retained across warm reset (no clear logic, sim `initial`
zero-fills for determinism, hardware power-up undefined for
both). The two RAMs **stay in sync** as long as both go
through reset together, the normal case for FPGA configure.
The risk Codex flagged is a partial reset path. If the design
domain resets independently (e.g., HPS asserts `CORE_CTRL[0]`
via Ch176, which routes through the design-side reset chain),
the bridge's `tile_mem` survives and so do the shadow's
contents (no clear), so they stay synchronized. The only
mismatch scenario is if HPS *clears* tile_mem during a design
reset and the design side misses the propagation: the bridge's
view is then "this tile is zero" while the shadow still holds
the old non-zero value, and the overlay would draw a stale tile
until HPS rewrites it. Mitigation: retrodesd should treat any
core-reset path as "must rewrite the OSD", the same contract
sibling cores honor for tile RAM. A formal **resync strategy**
(e.g., a bridge-issued bulk-clear pulse after `core_reset_req`
deassert) is post-MVP polish.

**Sim regression count math.** 149 (Ch227 baseline) + 1
(`tb_osd_overlay_stub` from Ch228, first new TB) + 1
(`tb_tile_ram_cdc` from Ch229) = **151**. The updated
`tb_osd_overlay_stub` (now drives mock tile RAM instead of
checker assertions) is the *same* TB target: its internal
asserts changed, but it still emits one `PASS` line. ✓
|
||
|
||
### Ch230 — OSD_CTRL[0] CDC + diagnostic counter (landed)
|
||
|
||
Ch230 finishes the close-the-loop chain by wiring the
|
||
`OSD_CTRL[0]` enable bit from the bridge clock domain into the
|
||
design clock domain, and promotes Ch229's sim-only watchdog
|
||
into a real saturating counter. retrodesd can now turn the
|
||
overlay on/off through the ABI bit it already writes.
|
||
|
||
#### Pipeline (new pieces in **bold**)
|
||
|
||
```
HPS axi_write32(0x100, OSD_CTRL_ENABLE | …)
│
▼
ps2_hps_bridge (CLOCK2_50 domain)
├─ osd_ctrl_q[0] // Ch223 store
└─ **osd_ctrl_enable = osd_ctrl_q[0]** // Ch230 output
│
▼ (cross-domain)
**3-FF synchronizer in the board top (design_clk)**
│
▼
**osd_overlay_stub.enable_i** // Ch230 input
│
▼
draw_osd = in_de && in_osd_region && tile_opaque
&& (enable_i || OSD_ENABLE_DEFAULT)
```
|
||
|
||
#### Why a 3-FF synchronizer
|
||
|
||
The OSD_CTRL[0] bit changes at HPS speeds (microseconds at the
very fastest). A 3-FF level synchronizer in the destination
domain is the standard high-MTBF answer for a single-bit
control signal: the first two stages absorb metastability;
the third stage gives the consumer a fully-settled value to
gate combinational logic with. We're not crossing a multi-bit
bus (the tile CDC is the only multi-bit crossing) so no
handshake or async-FIFO is needed here.

Reset behavior: both `bridge_osd_ctrl_enable` (`osd_ctrl_q[0]`)
and the `overlay_enable_sync` shift register clear to 0 on
their respective resets. When both reset together (the normal
FPGA-configure case), the synced enable starts at 0 and stays
there until HPS writes a value with bit 0 set to 0x100.
**Production default with Ch230 is overlay OFF**, which differs
from Ch229's "always-on, transparent-until-tile-set" parameter
default: retrodesd must explicitly assert the bit before its
OSD shows.
|
||
|
||
#### Diagnostic counter
|
||
|
||
Ch229's sim-only `$display` watchdog is now backed by a real
**16-bit saturating counter** in `tile_ram_cdc`:
`tile_wr_too_close_count`. It increments on every dclk cycle in
which `wr_pulse` fires within `MIN_DCLK_GAP=3` cycles of the
previous one, and saturates at `0xFFFF` (never wraps). The
`$display` line still fires under `` `ifndef SYNTHESIS `` for
log-grep visibility.
|
||
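The counting rule can be modeled outside the RTL to sanity-check expectations for a given write pattern. The sketch below is a behavioral model only, not the `tile_ram_cdc` implementation: the function name `count_too_close` is hypothetical, and treating "within `MIN_DCLK_GAP`" as `gap < MIN_DCLK_GAP` is an assumption.

```bash
#!/bin/sh
# Behavioral model of a saturating "too close" counter.
# Inputs are the dclk cycle numbers at which wr_pulse fired, in order.
# Assumption: a violation is a gap strictly below MIN_DCLK_GAP.
MIN_DCLK_GAP=3
COUNT_MAX=65535   # 0xFFFF, the saturation value quoted in the text

count_too_close() {
    count=0
    prev=""
    for cycle in "$@"; do
        if [ -n "$prev" ] && [ $(( cycle - prev )) -lt "$MIN_DCLK_GAP" ]; then
            # Saturate instead of wrapping.
            if [ "$count" -lt "$COUNT_MAX" ]; then
                count=$(( count + 1 ))
            fi
        fi
        prev=$cycle
    done
    echo "$count"
}

count_too_close 10 100 200        # safely spaced pulses  -> 0
count_too_close 10 11 12 50 51    # three sub-threshold gaps -> 3
```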
|
||
For Ch230, the counter is **internal-only** at the top
(`tile_wr_too_close_count` is connected to a local wire but
not routed to a bridge-readable register). Exposing it for
HPS readback requires a reverse CDC (design_clk → CLOCK2_50)
plus a new diagnostic register slot. Codex's framing
explicitly deferred that to **Ch231+**: "If exposing it to HPS
requires a reverse CDC, defer readback to Ch231 and just keep
the internal counter plus sim/RTL comment in Ch230." The
counter is positioned for that future hookup.
|
||
|
||
#### Top-level wiring
|
||
|
||
[`de25_nano_psmct32_raster_demo_top.sv`](../../rtl/top/de25_nano_psmct32_raster_demo_top.sv)
|
||
adds:
|
||
|
||
1. `bridge_osd_ctrl_enable` wire (from `ps2_hps_bridge.osd_ctrl_enable`).
|
||
2. `overlay_enable_sync[2:0]` 3-FF shift register on
|
||
`design_clk`; `overlay_enable = overlay_enable_sync[2]`.
|
||
3. `u_osd_overlay` re-instantiated with `.OSD_ENABLE_DEFAULT(1'b0)`
|
||
and `.enable_i(overlay_enable)` — runtime path replaces the
|
||
Ch229 parameter-always-on default.
|
||
4. `tile_wr_too_close_count[15:0]` wire surfaced from
|
||
`u_tile_cdc` (currently unconnected sink, ready for Ch231+
|
||
reverse CDC).
|
||
5. Sim-path tie-off: `assign bridge_osd_ctrl_enable = 1'b0;`
|
||
alongside the existing Ch229 tile-broadcast tie-offs in the
|
||
non-`USE_QSYS_TOP` branch.
|
||
|
||
#### Coverage
|
||
|
||
| TB | Scope |
|
||
|-----------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|
||
| [`tb_osd_overlay_stub.sv`](../../sim/tb/platform/tb_osd_overlay_stub.sv) (§10 added) | Third DUT with `OSD_ENABLE_DEFAULT=0` + TB-driven `runtime_enable`. Verifies: tiles present + enable=0 → PS2 background; enable=1 → checker lights up; zero tiles stay transparent even when enabled; enable=0 again → back to PS2. Existing §1–§9 still pass with the new enable input wired to constants. |
|
||
| [`tb_tile_ram_cdc.sv`](../../sim/tb/platform/tb_tile_ram_cdc.sv) (§9 added) | New `too_close_count` output port consumed. Confirms counter stays at 0 across the eight safely-spaced writes in §2–§7, then **deliberately violates** the rate with 10 toggle flips on consecutive bclk edges and verifies the counter is non-zero. A second 20-flip burst proves monotonic non-decreasing (saturation). |
|
||
| [`tb_ps2_hps_bridge.sv`](../../sim/tb/platform/tb_ps2_hps_bridge.sv) (unused-output declaration added) | Bridge TB gains a `logic osd_ctrl_enable` declaration to satisfy the `.*` wildcard wiring with the new output port. No behavioral changes — the Ch223 OSD_CTRL store and Ch230 surface-output share the same `osd_ctrl_q[0]` source verified by §12. |
|
||
|
||
Top-level
|
||
[`tb_de25_nano_psmct32_raster_demo_top.sv`](../../sim/tb/top/tb_de25_nano_psmct32_raster_demo_top.sv)
|
||
still passes — DE forwards through the overlay regardless of
|
||
enable, and the sim path tie-off ensures `bridge_osd_ctrl_enable=0`
|
||
so the overlay stays disabled (matching the production default).
|
||
|
||
Updated boundary call:
|
||
|
||
> **retrodesd can now turn the overlay on/off through the real
|
||
> ABI bit. Ch222–Ch230 collectively gave: full ABI v1.0 register
|
||
> decode (Ch222–Ch227), diagnostic read surface (Ch225), DS2
|
||
> stub (Ch226), tile RAM storage (Ch227), video-domain overlay
|
||
> engine skeleton (Ch228), HPS-tile-RAM-on-video close-the-loop
|
||
> (Ch229), and the ABI-bit-driven overlay gate + diagnostic
|
||
> counter (Ch230). The architecture for "supervisor menu over
|
||
> PS2 video" is now complete. Remaining work is content
|
||
> (Ch231+): glyph rendering from real tile data, plus
|
||
> optional design→bridge reverse CDC for HPS-readable diagnostic
|
||
> counter exposure.**
|
||
|
||
### Ch231 — minimal real glyph rendering (landed)
|
||
|
||
The Ch228–Ch230 visual was "8×8 checker block where the tile
|
||
word is non-zero" — useful for proving the path but not real
|
||
text. Ch231 swaps the checker for an actual glyph renderer:
|
||
each tile entry is now interpreted as a 16-bit text cell with
|
||
char code + foreground + background color, decoded against a
|
||
built-in 8×8 font ROM, and composited with a 16-color CGA
|
||
palette. retrodesd writes the same bytes sibling cores already
|
||
expect, and the visible result is legible text.
|
||
|
||
#### Cell layout (sibling-ABI compatible)
|
||
|
||
The Ch229 PS2-local "one 32-bit cell per tile word" view is
replaced by the shared-ABI `input_common.h` layout:
**2048 × 16-bit cells, packed 2/word, 64-cell row stride**.

```
Byte address = 0x1000 + row * 128 + col * 2
Word index = (row * 32) + (col >> 1) ← addr[11:2]
Cell-within-word = col[0] (0 = low 16 bits, 1 = high 16 bits)
```

The overlay's `tile_rd_index` now computes
`{1'b0, tile_row[3:0], tile_col[5:1]}` (= row*32 + col/2);
the read returns a 32-bit word from `shadow_mem`; the renderer
selects the high or low half based on `tile_col[0]`. The bridge
side is unchanged: the AXI write decode still stores at
`tile_mem[addr[11:2]]` and the Ch229 broadcast still emits
`{toggle, addr[11:2], wdata_lane}`. **HPS-side software now
writes the same byte offsets sibling cores use**: same
`osd_putchar` helper, same `osd_draw_*` calls in
`input_common.h`.

| Cell bit field | Width | Meaning                                       |
|----------------|-------|-----------------------------------------------|
| `[7:0]`        | 8     | ASCII char code (font ROM index)              |
| `[11:8]`       | 4     | Foreground color (16-color palette index)     |
| `[15:12]`      | 4     | Background color (16-color palette index)     |

A cell value of `16'd0` is treated as **transparent** (PS2
video passes through), regardless of how those zero bits would
otherwise resolve in the palette. This matches the Ch229
"zero = transparent" contract that retrodesd-side code already
depends on.
|
||
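The address and packing rules above reduce to a few integer expressions. The following is a minimal sketch of that arithmetic in shell; the function names are hypothetical (they are not the `input_common.h` helpers), and the only inputs are the formulas and bit fields given in this section.

```bash
#!/bin/sh
# Tile-RAM window offset, taken from the byte-address formula above.
TILE_RAM_OFFSET=0x1000

# Byte offset of cell (col, row) relative to the bridge base.
osd_cell_offset() {
    col=$1; row=$2
    printf '0x%04X\n' $(( $TILE_RAM_OFFSET + $row * 128 + $col * 2 ))
}

# Index of the 32-bit word (addr[11:2]) that holds the cell.
osd_cell_word_index() {
    col=$1; row=$2
    echo $(( $row * 32 + ($col >> 1) ))
}

# Pack {bg[15:12], fg[11:8], char[7:0]} into one 16-bit cell.
osd_cell_value() {
    char_code=$1; fg=$2; bg=$3
    printf '0x%04X\n' \
        $(( (($bg & 0xF) << 12) | (($fg & 0xF) << 8) | ($char_code & 0xFF) ))
}

osd_cell_offset 19 5        # 0x12A6
osd_cell_word_index 19 5    # 169
osd_cell_value 0x30 15 1    # '0', white (15) on blue (1) -> 0x1F30
```

Because `col[0]` selects the half-word, cell (19, 5) lands in the high 16 bits of word 169.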
|
||
#### Font ROM
|
||
|
||
256 glyphs × 8 rows × 8 bits (= 16 Kibit, i.e. 2 KiB) of glyph
data, indexed by char code, row within cell, and column within
cell. Stored as `logic [7:0] font_rom [0:255][0:7]` and
zero-filled in an `initial` block; specific glyphs are then
populated. The Ch231 subset:

- Space `0x20` (zero-filled, so it renders as solid bg)
- Digits `0`–`9` (`0x30..0x39`)
- A subset of uppercase: `A`, `B`, `C`, `O`
- Punctuation: `.`, `:`, `-`, `/`

Other ASCII codes resolve to all-zero rows → the cell shows
solid `bg` color, which is the right "missing glyph" fallback.
Adding more letters is a mechanical edit to the `initial` block.

MSB-left convention: bit 7 of each glyph byte is the **leftmost**
pixel of that row. Glyphs are nominal 5×7 inside the 8×8 cell
with 1-pixel margin on the right and bottom, the same alignment
the sibling-ABI font ROMs use.
|
||
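To make the MSB-left convention concrete, the sketch below prints a glyph row byte as eight characters, bit 7 first. The helper name `render_glyph_row` is hypothetical, and the row bytes in the loop are an illustrative shape only; they are not taken from the shipped font ROM.

```bash
#!/bin/sh
# Render one glyph-row byte as 8 characters, bit 7 (MSB) leftmost:
# '#' for a set bit (foreground), '.' for a clear bit (background).
render_glyph_row() {
    byte=$(( $1 ))
    line=""
    bit=7
    while [ "$bit" -ge 0 ]; do
        if [ $(( (byte >> bit) & 1 )) -eq 1 ]; then
            line="${line}#"
        else
            line="${line}."
        fi
        bit=$(( bit - 1 ))
    done
    echo "$line"
}

# Illustrative 8-row pattern (NOT the real '0' bitmap from the ROM).
for row in 0x70 0x88 0x88 0x88 0x88 0x88 0x70 0x00; do
    render_glyph_row "$row"
done
```

The first row, `0x70`, prints `.###....`, which is the same pixel order the renderer gets from indexing bit `7 - x` of the row byte.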
|
||
#### 16-color palette
|
||
|
||
CGA-style, matching the `BLACK..WHITE` constants in
|
||
`input_common.h:346-361`. Implemented as a combinational
|
||
`palette_lookup(idx)` function returning a 24-bit RGB triple.
|
||
Fixed at synthesis time; a runtime-configurable palette is
|
||
post-MVP polish.
|
||
|
||
| idx | name      | RGB         | idx | name        | RGB         |
|-----|-----------|-------------|-----|-------------|-------------|
| 0   | black     | `00_00_00`  | 8   | dgray       | `55_55_55`  |
| 1   | blue      | `00_00_AA`  | 9   | lblue       | `55_55_FF`  |
| 2   | green     | `00_AA_00`  | 10  | lgreen      | `55_FF_55`  |
| 3   | cyan      | `00_AA_AA`  | 11  | lcyan       | `55_FF_FF`  |
| 4   | red       | `AA_00_00`  | 12  | lred        | `FF_55_55`  |
| 5   | magenta   | `AA_00_AA`  | 13  | lmagenta    | `FF_55_FF`  |
| 6   | brown     | `AA_55_00`  | 14  | yellow      | `FF_FF_55`  |
| 7   | lgray     | `AA_AA_AA`  | 15  | white       | `FF_FF_FF`  |
|
||
|
||
#### Pixel decision (rewritten)
|
||
|
||
```sv
draw_osd = in_de && in_osd_region && cell_opaque;
pixel_on = font_rom[char_code][y_cnt[2:0]][3'd7 - x_cnt[2:0]];
osd_rgb = pixel_on ? palette_lookup(fg_idx) : palette_lookup(bg_idx);
out_{r,g,b} = draw_osd ? osd_rgb : in_{r,g,b};
```

The Ch229 checker fallback (`x[3] ^ y[3]`) is **removed
entirely**. Codex's framing offered two options ("keep checker
fallback out of production path or behind a debug parameter"),
and Ch231 chose *remove*. The Ch228 standalone-overlay TB and
the Ch229 HPS-driven TB both have full coverage of the glyph
path; the checker is no longer needed for visual confirmation.
|
||
|
||
#### Coverage
|
||
|
||
`tb_osd_overlay_stub.sv` rewritten for Ch231:
|
||
|
||
| § | Check |
|
||
|----|------------------------------------------------------------------------------------------------------------------------------|
|
||
| 1 | Sync passthrough (DE/HS/VS forwarded every cycle, unchanged) |
|
||
| 2 | Disabled DUT passthrough — PS2 video at 4 sample points |
|
||
| 3 | All-zero shadow RAM → enabled DUT still shows PS2 video (Ch229 "zero = transparent" preserved) |
|
||
| 4 | Cell (0, 0) = `' '` on white/blue → entire cell shows solid blue (space glyph is all-zero, cell is still opaque) |
|
||
| 5 | Cell (1, 0) = `'0'` on white/blue → specific pixels match the '0' glyph mask (rows 0–1 verified at multiple x positions) |
|
||
| 6 | Adjacent cells (low half vs high half of same word) don't bleed |
|
||
| 7 | Zero cell (5, 2) → transparent |
|
||
| 8 | Cell (19, 5) = `'1'` red/black at the bottom-right — verifies the packed-layout indexing at the visible-region boundary |
|
||
| 9 | Region boundary at x=160 / y=48 stays transparent regardless of cell content |
|
||
| 10 | Clear cell (0, 0) → block goes back to PS2; cell (1, 0) glyph remains visible |
|
||
| 11 | Frame-boundary re-sync — same glyph still rendered at `(10, 0)` in the next frame |
|
||
| 12 | Disabled DUT ignores cell content |
|
||
| 13 | Ch230 runtime enable: `enable_i=0` → PS2 even with glyphs; `enable_i=1` → glyph rendered; back to PS2 when cleared |
|
||
|
||
`tb_tile_ram_cdc.sv` is unchanged — CDC propagation semantics
|
||
don't care about how the receiver interprets the bytes, so the
|
||
Ch229 round-trip tests still validate the path with whatever
|
||
synthetic patterns the TB drives.
|
||
|
||
Top-level `tb_de25_nano_psmct32_raster_demo_top.sv` still
|
||
passes: it only checks `VIDEO_DE` (which the overlay forwards
|
||
unchanged), and the sim-path tie-off keeps the overlay
|
||
disabled by default.
|
||
|
||
#### What's deferred to Ch232+
|
||
|
||
Per Codex's framing, Ch231 explicitly excludes:
|
||
|
||
- Proportional fonts / variable-width glyphs.
|
||
- Full CP437 / Unicode glyph set.
|
||
- Alpha blending (current cells are fully opaque on nonzero).
|
||
- Cursor overlay (`OSD_STATUS[12:8]` cursor_row is wired but
|
||
not consumed by the renderer).
|
||
- Tile scrolling / window panning.
|
||
- HPS-readable `tile_wr_too_close_count` (reverse-CDC
|
||
exposure from design_clk → CLOCK2_50 → AXI register).
|
||
|
||
Updated boundary call:
|
||
|
||
> **The OSD can now draw legible text over PS2 video. Cell
|
||
> writes from HPS land in the shadow RAM through Ch229's
|
||
> CDC, get decoded as `{bg, fg, char}` triples, and are
|
||
> rasterized via a 256-glyph 8×8 font ROM + 16-color CGA
|
||
> palette. Ch231 ships a small but functional glyph subset;
|
||
> populating the rest of the font is a mechanical edit.
|
||
> Remaining work is polish: extended glyph set, cursor,
|
||
> reverse-CDC counter exposure, optional alpha — none of
|
||
> which are blockers for retrodesd to render an actual
|
||
> supervisor menu over PS2 video.**
|
||
|
||
### Ch232 — hardware bring-up validation (OSD over PS2 video)
|
||
|
||
Ch222–Ch231 built the OSD path entirely in sim. Ch232 validates
|
||
it on the DE25-Nano: write a known glyph sequence into the
|
||
Ch229 shadow RAM, assert `OSD_CTRL[0]`, and confirm the text
|
||
appears in the top-left of the HDMI output, layered over the
|
||
Ch171 quadrant test card.
|
||
|
||
#### Test helper:
|
||
[`ps2_osd_test.sh`](ps2_osd_test.sh)
|
||
|
||
Writes the test message `"01234 ABC"` (9 chars at cells
|
||
(0,0)..(8,0), white on blue) into the bridge tile RAM and
|
||
asserts `OSD_CTRL[0]=1`. The chars are chosen from the
|
||
Ch231-populated font subset (digits + space + A/B/C) so the
|
||
test exercises **real glyph rendering** rather than the
|
||
"missing glyph → solid bg" fallback. Run modes:
|
||
|
||
| Invocation                         | Effect                                                |
|------------------------------------|-------------------------------------------------------|
| `./ps2_osd_test.sh`                | Writes the 9 cells + enables the overlay              |
| `./ps2_osd_test.sh --off`          | Disables the overlay (`OSD_CTRL[0]=0`); message stays |
| `./ps2_osd_test.sh --clear`        | Zeros the 9 cells + leaves the overlay enabled        |
| `./ps2_osd_test.sh --status`       | Dumps `OSD_CTRL` + the first 20 tile-RAM words        |
|
||
|
||
Uses `busybox devmem` per the long-standing devmem2 quirk note.
|
||
Environment overrides: `PS2_BRIDGE_BASE` (default `0x40000000`)
|
||
and `DEVMEM` (default `busybox devmem`).
|
||
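For orientation, here is a dry-run sketch of how a row-0 message could be packed into 32-bit tile-RAM words. It is not the contents of `ps2_osd_test.sh`: the function names are hypothetical, the tile-RAM offset `0x1000` and the cell layout come from the Ch231 section, and the script only prints `busybox devmem ADDRESS WIDTH VALUE` command lines rather than running them. Enabling the overlay through `OSD_CTRL` is not covered here.

```bash
#!/bin/sh
# Dry run: prints the word writes, never touches /dev/mem.
PS2_BRIDGE_BASE=${PS2_BRIDGE_BASE:-0x40000000}
DEVMEM=${DEVMEM:-busybox devmem}
TILE_RAM_OFFSET=0x1000   # from the Ch231 byte-address formula
FG=15                    # white
BG=1                     # blue

# 16-bit cell for one character; an empty argument yields 0 (transparent).
cell_value() {
    if [ -z "$1" ]; then echo 0; return; fi
    code=$(printf '%d' "'$1")        # numeric value of the character
    echo $(( (BG << 12) | (FG << 8) | (code & 0xFF) ))
}

# Emit one devmem command per 32-bit word: even column in the low
# half-word, odd column in the high half-word.
osd_row0_cmds() {
    rest=$1
    word=0
    while [ -n "$rest" ]; do
        lo=${rest%"${rest#?}"}; rest=${rest#?}
        hi=${rest%"${rest#?}"}; rest=${rest#?}
        value=$(( ($(cell_value "$hi") << 16) | $(cell_value "$lo") ))
        addr=$(( $PS2_BRIDGE_BASE + $TILE_RAM_OFFSET + word * 4 ))
        printf '%s 0x%08X 32 0x%08X\n' "$DEVMEM" "$addr" "$value"
        word=$(( word + 1 ))
    done
}

osd_row0_cmds "01234 ABC"
```

With the defaults, the first printed line is `busybox devmem 0x40001000 32 0x1F311F30` (cells `'0'` and `'1'`). The odd-length message leaves the high half of the fifth word at zero, so cell 9 stays transparent.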
|
||
#### Procedure
|
||
|
||
1. **Build** the .core.rbf via the normal flow (Quartus GUI or
   `quartus_sh --flow compile`). Output:
   `synth/de25_nano/top_psmct32_raster_demo/output_files/*.core.rbf`.
2. **Copy** the RBF + the two helper scripts to the board:
   ```bash
   scp output_files/*.core.rbf terasic@de25:/home/terasic/cores/retroDE_ps2.core.rbf
   scp docs/hardware/ps2_status.sh terasic@de25:/home/terasic/
   scp docs/hardware/ps2_osd_test.sh terasic@de25:/home/terasic/
   ssh terasic@de25 chmod +x /home/terasic/ps2_*.sh
   ```
3. **Load** the core (on the board):
   ```bash
   ./core_loader.sh load /home/terasic/cores/retroDE_ps2.core.rbf
   ```
4. **Pre-flight** with the Ch219 status block, which confirms
   identity, HDMI lock, and counter health BEFORE touching the OSD:
   ```bash
   ./ps2_status.sh --delta
   ```
   Expect: `CORE_ID=0x50533200`, `ABI=0x100`, `[0..4]=1`, `[5]=0`
   (no I²C error), `[6]=0` (no raster overflow), `FRAME_COUNT
   Δ ≈ 30`, `RASTER_OVERFLOW Δ = 0`.
5. **Visual**: look at the HDMI monitor. Expect the 320×240
   quadrant card in the upper-left (RED top-left, GREEN
   top-right, BLUE bottom-left, WHITE bottom-right). No OSD yet.
6. **Write the OSD test message + enable** the overlay:
   ```bash
   ./ps2_osd_test.sh
   ```
7. **Observe**: the top-left corner of the HDMI image (inside the
   160×48 OSD strip) now shows white-on-blue `"01234 ABC"` text.
   The text overlays the RED quadrant of the test card (which
   extends from x∈[0,160] in the test card's local space, so the
   OSD lands inside the red region with text on top).
8. **Toggle off → on**:
   ```bash
   ./ps2_osd_test.sh --off # text vanishes, RED quadrant fully visible
   ./ps2_osd_test.sh # text reappears
   ```
9. **Diagnostics still healthy** after the OSD writes:
   ```bash
   ./ps2_status.sh --delta
   ```
   Expect the same numbers as in step 4. **No new
   `hdmi_i2c_error` or `raster_overflow`.**
|
||
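The `FRAME_COUNT Δ ≈ 30` expectation in steps 4 and 9 follows from a ~60 Hz frame rate sampled over a 500 ms window. The sketch below shows that arithmetic and a simple tolerance check; the function names and the ±2 tolerance are assumptions for illustration, and nothing here reads the real counters.

```bash
#!/bin/sh
# Frames per second implied by a counter delta over a window (in ms).
frame_rate_hz() {
    delta=$1; window_ms=$2
    echo $(( delta * 1000 / window_ms ))
}

# Succeeds (exit status 0) if delta is within +/- tol of expected.
delta_ok() {
    delta=$1; expected=$2; tol=$3
    diff=$(( delta - expected ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( 0 - diff ))
    fi
    [ "$diff" -le "$tol" ]
}

frame_rate_hz 30 500    # 60

# Example: a measured delta of 31 against an expected 30, tolerance 2.
if delta_ok 31 30 2; then
    echo "FRAME_COUNT healthy"
else
    echo "FRAME_COUNT suspect"
fi
```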
|
||
#### Expected screen (ASCII sketch)
|
||
|
||
After step 6, the HDMI image (640×480 visible area) should
|
||
look approximately like:
|
||
|
||
```
|
||
0 160 320 640
|
||
0 ┌────────┬───────┬─────────────────┐
|
||
│WWWWWWWW│ │ │ ← W = white text on
|
||
│WW WWWW │ │ │ blue bg (OSD)
|
||
│WW WWWW │ RED │ │ overlays the
|
||
│WW WWWW │ │ │ RED quadrant
|
||
│WW WWWW │ │ │
|
||
48 │WWWWWWWW├───────┤ │
|
||
│ │ │ │
|
||
│ RED │ GREEN │ │
|
||
│ │ │ │
|
||
│ │ │ │
|
||
240 ├────────┼───────┤ │
|
||
│ │ │ │
|
||
│ BLUE │ WHITE │ (black, │
|
||
│ │ │ unchanged) │
|
||
│ │ │ │
|
||
480 └────────┴───────┴─────────────────┘
|
||
```
|
||
|
||
The OSD strip (top-left 160×48) covers the upper part of the RED
quadrant. Glyphs are 8 pixels wide, so the 9-char message takes
72 of the strip's 160 pixels. Cells 9..19 are unwritten
(`16'd0` → transparent), so the remaining 88 pixels on the right
show **PS2 RED** (the test card), not blue. The blue region is
only the 9×8 = 72-pixel-wide text strip in the corner, which
matches the Ch229 transparency contract.
|
||
|
||
A more accurate sketch of just the top-left:
|
||
|
||
```
|
||
0 72 160 320
|
||
0 ┌─────┬──┬────────────┐
|
||
│OSD │ │ │ 72px-wide white-on-blue
|
||
│text │RR│ RED │ "01234 ABC" + remainder
|
||
48 ├─────┴──┤ │ of red quadrant
|
||
│ │ │
|
||
│ RED │ │
|
||
240 ├────────┴────────────┤
|
||
```
|
||
|
||
#### Acceptance criteria
|
||
|
||
| #  | Check                                                                                | How to verify                                         |
|----|--------------------------------------------------------------------------------------|-------------------------------------------------------|
| 1  | RBF builds without timing failures                                                   | `quartus_sh --flow compile` exits 0                   |
| 2  | Core loads, SSH survives                                                             | `core_loader.sh load` returns; SSH prompt responsive  |
| 3  | Identity matches manifest                                                            | `ps2_status.sh` shows `CORE_ID=0x50533200`            |
| 4  | Quadrant card visible after load                                                     | Visual                                                |
| 5  | OSD text appears after `./ps2_osd_test.sh`                                           | Visual: white "01234 ABC" in top-left                 |
| 6  | OSD does not occlude the rest of the quadrant card                                   | Visual: GREEN/BLUE/WHITE quadrants unchanged          |
| 7  | `OSD_CTRL[0]` toggle works                                                           | `./ps2_osd_test.sh --off` then `./ps2_osd_test.sh`    |
| 8  | `FRAME_COUNT` still ticks at ~60 Hz                                                  | `./ps2_status.sh --delta` shows Δ ≈ 30                |
| 9  | `RASTER_OVERFLOW_COUNT` stays at 0                                                   | `./ps2_status.sh --delta`                             |
| 10 | `hdmi_i2c_error` stays clear                                                         | `CORE_STATUS[5] = 0`                                  |
| 11 | OSD text is **legible** (specific glyphs visible, not blocks)                        | Visual: digits 0–4 + A/B/C distinguishable            |
| 12 | Tile-write CDC counter stays at 0 (no rate violations from operator-pace HPS writes) | Future: needs the Ch232+ reverse-CDC exposure work    |

Checks 1–11 (everything that does not depend on the Ch232+
reverse-CDC work) should pass on the first try.
|
||
|
||
#### Observed behavior (silicon-validated 2026-05-20)
|
||
|
||
First bench run hit two Quartus-side issues that the
|
||
sim-only Ch222–Ch231 work didn't surface, then validated
|
||
cleanly on the third compile attempt:
|
||
|
||
1. **Missing QSF entries.** `tile_ram_cdc.sv` (Ch229) and
|
||
`osd_overlay_stub.sv` (Ch228) were registered in the sim
|
||
Makefile's `RTL_SRCS` but never added to the Quartus
|
||
project's `.qsf`. Elaboration failed with "undefined
|
||
entity". Fixed by adding `set_global_assignment -name
|
||
SYSTEMVERILOG_FILE` lines for both files to
|
||
[`synth/de25_nano/top_psmct32_raster_demo/de25_nano_psmct32_raster_demo_top.qsf`](../../synth/de25_nano/top_psmct32_raster_demo/de25_nano_psmct32_raster_demo_top.qsf).
|
||
2. **BRAM inference failure → LAB budget blow-out.** First
|
||
successful elaboration hit the Fitter with "requires 6491
|
||
LABs / device has 4680". The 2D-unpacked font ROM
|
||
(`logic [7:0] font_rom [0:255][0:7]`) didn't infer M20K
|
||
and spilled ~2000 LABs of distributed logic. Two-stage
|
||
fix: (a) flatten to 1D `logic [7:0] font_rom [0:2047]`
|
||
indexed by `{char_code, row_in_cell}` — matches the
|
||
shape of `tile_mem`, generally inferable; (b) add
|
||
explicit `(* romstyle = "M20K" *)` attribute on
|
||
`font_rom`, plus `(* ramstyle = "M20K" *)` on `tile_mem`
|
||
and `tile_ram_cdc.shadow_mem` to defensively force the
|
||
inferencer's hand on all three memories.
|
||
3. **Compile success at attempt 3.** Quartus reports:
|
||
- ALMs: 32,822 / 46,800 (70 %)
|
||
- RAM Blocks: 260 / 358 (73 %)
|
||
- Block memory bits: 4,259,840 / 7,331,840 (58 %)
|
||
- DSP: 6 / 376 (2 %), HSSI HPS: 1 / 1 (100 %), PLLs: 2 / 11 (18 %)
|
||
- Timing models: Final
|
||
4. **Operator-checklist results (all 11 boxes ✓):**
|
||
- RBF builds cleanly (third attempt, after fixes above)
|
||
- Core loads via `core_loader.sh`, SSH session survives
|
||
- Identity matches manifest (CORE_ID = `0x50533200`,
|
||
ABI = `0x00000100`)
|
||
- Quadrant card visible after load
|
||
- `./ps2_osd_test.sh` brings up the OSD text
|
||
- Quadrant card is otherwise unchanged (no occlusion
|
||
bleed outside the 9-character strip)
|
||
- `OSD_CTRL[0]` toggle works — `--off` removes the OSD,
|
||
re-invoking re-displays it
|
||
- `FRAME_COUNT` Δ = 31 over 500 ms (expected ≈ 30; 3 %
|
||
variance is well within the sample-window margin)
|
||
- `RASTER_OVERFLOW_COUNT` Δ = 0 across both snapshots
|
||
- `hdmi_i2c_error` stays clear (`CORE_STATUS[5] = 0`,
|
||
LED[4] dark) throughout
|
||
- All glyphs in the `01234 ABC` test message are
|
||
legible — not missing-glyph blocks
|
||
|
||
The Ch232 acceptance criteria 1–11 from the table above are
all confirmed on real silicon. Item 12 (HPS-readable
tile-write CDC counter) remains deferred to a future chapter:
it needs the reverse-CDC exposure, and rate violations are not
expected on the bench because retrodesd-rate writes (≤ 1 kHz)
are at least 25,000 dclk cycles apart at 25 MHz.
|
||
|
||
#### Boundary call (post-validation)
|
||
|
||
> **Ch222–Ch232 close the "supervisor menu over PS2 video"
|
||
> arc.** The full path — HPS userspace writes through the
|
||
> bridge's tile RAM + OSD_CTRL register → toggle-CDC and 3-FF
|
||
> sync into design_clk → shadow RAM + synced enable → 16-bit
|
||
> cell decode + 8×8 glyph render + 16-color palette →
|
||
> composite over PS2 video → HDMI — is now silicon-validated
|
||
> on the Agilex 5. retrodesd writes the same bytes it writes
|
||
> to sibling cores and a supervisor menu can render over PS2
|
||
> video. Remaining work is purely content polish (full glyph
|
||
> set, cursor, palette runtime, reverse-CDC counter
|
||
> exposure) — none of which are blockers.
|
||
|
||
### Ch236 — input-path hardware truth + operator visibility

Ch222 / Ch226 / Ch234 / Ch235 collectively built the HPS → bridge
→ IOP-fabric input pipeline. Ch232 silicon-validated the OSD path,
**but the input path's silicon coverage stops at the bridge** —
the synth top has no IOP core, so the bridge's `input_p1_o` /
`input_p2_o` outputs land at unconnected nets in
[`de25_nano_psmct32_raster_demo_top.sv`](../../rtl/top/de25_nano_psmct32_raster_demo_top.sv).
Ch236 makes this explicit so nobody overreads the milestone, and
extends `ps2_status.sh` so an operator can verify the
bridge-half of the path on real hardware.

#### What works on silicon today (HPS-side half)

- `ps2_hps_bridge.INPUT_P1` / `INPUT_P2` / `INPUT_P1_RAW`
  latches at 0x040 / 0x044 / 0x048 — Ch222, silicon-validated
  on the DE25-Nano. retrodesd's `input_thread.c` writes here;
  any process with `/dev/mem` can read or write.
- `ps2_hps_bridge.DS2_STATUS` / `DS2_BUTTONS` at 0x0F0 / 0x0F4
  — Ch226. DS2_BUTTONS is a real-time mirror of INPUT_P1
  (the same register the sibling DS2 poll thread reads).
- Ch232's silicon checklist covered Ch222 indirectly through
  the bridge identity readbacks; Ch236 adds *direct* readback
  of the input latches to `ps2_status.sh`.

#### What does NOT yet work on silicon (PS2-fabric-side half)

- `iop_memory_map_stub` + `sio2_input_stub` (Ch234) are
  **sim-only** for now. The synth top
  ([`de25_nano_psmct32_raster_demo_top.sv`](../../rtl/top/de25_nano_psmct32_raster_demo_top.sv))
  doesn't instantiate the IOP core, so there is no PS2-fabric
  consumer of the bridge's `input_p1_o` / `input_p2_o` outputs.
  The wires land at unconnected nets named
  `bridge_input_p1` / `bridge_input_p2`; Quartus elides them
  during synthesis as dead logic.
- `tb_bridge_iop_pad_input` proves the wiring + CDC + Sony
  format translation works correctly *in simulation* with
  both bridge and IOP map instantiated together. That gives
  high confidence that **when the IOP core lands on the
  synth top in a future chapter, the bridge → IOP path will
  light up without further RTL work** — only `.input_p1` /
  `.input_p2` need to be connected at the integration site.

#### `ps2_status.sh` extension

The Ch219 status block now includes:

```
Input latches (Ch222) + DS2 mirror (Ch226):
  INPUT_P1     : 0x00000000
  INPUT_P2     : 0x00000000
  INPUT_P1_RAW : 0x00000000
  DS2_STATUS   : 0x00000004  (Ch226 stub: [0]=connected=0,
                              [2]=input_latches_valid=1)
  DS2_BUTTONS  : 0x00000000  ✓ tracks INPUT_P1
  → PS2-fabric consumer (Ch234 sio2_input_stub) is sim-only at the moment;
    non-zero values above mean the bridge latch landed, NOT that PS2 code saw it.
```

The footer explicitly disclaims silicon consumption — non-zero
values prove the bridge latch landed, **not** that PS2-fabric
code observed them.

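
The `DS2_STATUS` annotation in the status block names two bit fields, `[0]=connected` and `[2]=input_latches_valid`. A minimal sketch of that decode follows; `decode_ds2_status` is a hypothetical helper, not a function from `ps2_status.sh`, and it covers only the two bits the block above documents.

```bash
# Hypothetical helper: decode the two DS2_STATUS bits named in the status
# block. Any other bits of the register are not described in this runbook.
decode_ds2_status() {
  local v=$(( $1 ))
  echo "connected=$(( v & 1 )) input_latches_valid=$(( (v >> 2) & 1 ))"
}

decode_ds2_status 0x00000004
```

With the baseline value `0x00000004` this prints `connected=0 input_latches_valid=1`, matching the annotation in the status block.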
#### Operator test (input-path silicon smoke)

After `core_loader.sh load` brings the ps2 fabric up:

1. Confirm baseline — all input latches read 0:
   ```bash
   ./ps2_status.sh
   # Expect: INPUT_P1 = INPUT_P2 = INPUT_P1_RAW = DS2_BUTTONS = 0x0
   ```

2. Synthetic AXI write — set INPUT_P1 to a known pattern
   (e.g. `JOY_START | JOY_SELECT | JOY_RIGHT` =
   `(1<<4) | (1<<5) | (1<<0)` = `0x31`):
   ```bash
   sudo busybox devmem 0x40000040 w 0x00000031
   ```

3. Re-read — both the direct latch AND the DS2 mirror reflect
   the write:
   ```bash
   ./ps2_status.sh
   # Expect: INPUT_P1 = 0x00000031, DS2_BUTTONS = 0x00000031 (✓ tracks)
   ```

4. Independent P2 write:
   ```bash
   sudo busybox devmem 0x40000044 w 0x000003C0
   ./ps2_status.sh
   # Expect: INPUT_P2 = 0x000003C0, INPUT_P1 unchanged
   ```

5. Clear:
   ```bash
   sudo busybox devmem 0x40000040 w 0x00000000
   sudo busybox devmem 0x40000044 w 0x00000000
   ```

6. (Optional) live retrodesd test — `retrodesd.service` running
   with the ps2 manifest stanza will write button bitmaps
   continuously from the connected gamepad. Press a button on
   the controller and re-run `./ps2_status.sh` — INPUT_P1
   should briefly show the corresponding `JOY_*` bit set
   (active-high SNES-style). The supervisor OSD path doesn't
   render anything new (Ch222 / Ch226 are pure bridge regs;
   the OSD path is independent), but the bridge latch is alive.

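
The bit pattern and addresses used in the steps above can be derived rather than typed by hand. This is a sketch under stated assumptions: `joy_mask` and `bridge_addr` are hypothetical helpers, the bit positions are the three quoted in step 2, and the `0x40000000` base is inferred from `0x40000040` being the address of the latch at offset 0x040.

```bash
# Assumed bridge base, inferred from 0x40000040 = base + 0x040.
BRIDGE_BASE=0x40000000

# Hypothetical helper: OR together active-high bit positions.
joy_mask() {
  local mask=0 bit
  for bit in "$@"; do
    mask=$(( mask | (1 << bit) ))
  done
  printf '0x%08X\n' "$mask"
}

# Hypothetical helper: absolute address of a bridge register offset.
bridge_addr() {
  printf '0x%08X\n' $(( BRIDGE_BASE + $1 ))
}

joy_mask 4 5 0        # JOY_START=4, JOY_SELECT=5, JOY_RIGHT=0
bridge_addr 0x040     # INPUT_P1
bridge_addr 0x044     # INPUT_P2
```

The three calls print `0x00000031`, `0x40000040`, and `0x40000044`, which are the value and addresses passed to `busybox devmem` in steps 2 and 4.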
#### Acceptance criteria

| # | Check                                       | How to verify                                                       |
|---|---------------------------------------------|---------------------------------------------------------------------|
| 1 | `ps2_status.sh` shows the new input section | Visual: "Input latches (Ch222) + DS2 mirror (Ch226):" block present |
| 2 | All five fields read 0 / 0x4 at baseline    | `INPUT_P*` = 0, `DS2_STATUS` = 0x4, `DS2_BUTTONS` = 0               |
| 3 | `devmem` write to 0x40000040 round-trips    | Re-read shows the written value                                     |
| 4 | DS2_BUTTONS mirrors INPUT_P1 in real time   | The `✓ tracks INPUT_P1` marker appears                              |
| 5 | INPUT_P2 write is independent of INPUT_P1   | Step 4 doesn't disturb INPUT_P1                                     |
| 6 | The footer disclaimer is visible            | "PS2-fabric consumer (Ch234 sio2_input_stub) is sim-only…"          |

All six are operator-visible on the existing Ch232 build — no
new RBF or fabric change is needed. The bridge's input
register surface was already on silicon as of Ch222 / Ch226;
Ch236 only adds the diagnostic visibility.

#### Updated boundary call

> **Ch222 / Ch226 input-latch surface is silicon-confirmed
> via the Ch236 `ps2_status.sh` extension; HPS↔bridge half of
> the input path is alive on real hardware. The bridge → IOP
> half (Ch234/Ch235) remains sim-only until a future chapter
> instantiates an IOP core on the synth top. That's the next
> meaningful RTL bridge to cross for the input arc; in the
> meantime, the next code chapter could be SIF/libpad-buffer
> stub reconnaissance to define the EE-visible pad-state
> path before the IOP-core integration lands.**

### Ch241 — input-arc hardware truth after Ch240

The Ch237–Ch240 work closed the input arc in **simulation**:
HPS → bridge latches → IOP-side `sio2_input_stub` → SIF DMA
into a fixed EE-RAM buffer → EE-CPU code branches on a button
bit. The full path is covered by
[`tb_ee_pad_buffer_branch.sv`](../../sim/tb/integration/tb_ee_pad_buffer_branch.sv)
across four scenarios including a clear-and-restore case
(Ch240's §4). Cross-references:
[`docs/contracts/sio2_pad.md`](../contracts/sio2_pad.md)
sections Ch234 / Ch238 / Ch239 / Ch240.

#### What's silicon today

Same as Ch236 — only the **HPS↔bridge half**:

- `INPUT_P1` / `INPUT_P2` / `INPUT_P1_RAW` latches at
  0x040 / 0x044 / 0x048 land in the bridge on every retrodesd
  write, observable via `ps2_status.sh`.
- `DS2_STATUS` / `DS2_BUTTONS` mirror at 0x0F0 / 0x0F4 tracks
  `INPUT_P1` in real time (Ch226).
- The bridge's new `input_p1_o` / `input_p2_o` outputs (Ch235)
  exist on silicon and drive `bridge_input_p1` /
  `bridge_input_p2` wires in the board top —
  **but those wires terminate at unconnected nets**, because
  the synth top doesn't instantiate the IOP core that would
  consume them. Quartus elides them as dead logic.

#### What's NOT silicon

- `iop_memory_map_stub` (Ch234 `sio2_input_stub` inside) — sim-only.
- `sif_dma_ee_ram_bridge_stub` driven by IOP-side software —
  sim-only.
- `ee_memory_map_stub` / `ee_ram_stub` / `bios_rom_stub` /
  `ee_core_stub` — sim-only.
- The Ch239 single-slot `EE_PAD_BUFFER_BASE` buffer is a
  sim-only RAM location; on real hardware there is no
  corresponding EE-RAM-resident pad packet because there is
  no producer or consumer in the fabric.

#### Why no new `ps2_status.sh` work this chapter

The operator-visible HPS-side surface (`ps2_status.sh`) already
shows `INPUT_P1` / `INPUT_P2` / `INPUT_P1_RAW` / `DS2_*`
via Ch236. The next data points (PAD_P1_STATE, the fixed pad
buffer, the EE-program marker) are **not addressable from
HPS userspace on the current hardware top** — they live behind
modules that aren't instantiated. Adding read paths to
`ps2_status.sh` for those would print zeros or garbage and
mislead the operator. Better to wait until the IOP/EE chain
lands in a top-level chapter.

#### What a "full input path on silicon" chapter looks like

A future hardware-integration chapter (likely **Ch300+** given
its scope) would need to:

1. Instantiate the IOP map + `sio2_input_stub` in the synth
   top, fed by the existing `bridge_input_p1` / `bridge_input_p2`
   wires.
2. Instantiate an IOP execution primitive that drives the
   producer side — either `iop_core_stub` running a small
   pad-packing program from BIOS, or a TB-style FSM that
   composes the existing primitives in hardware.
3. Instantiate the SIF egress bridge (with rewind) writing
   to an EE-side RAM block on the FPGA fabric.
4. Either (a) instantiate `ee_core_stub` + bios_rom + RAM and
   run a small EE program that consumes the buffer, or
   (b) skip EE-side consumption and just expose the
   sif_dma_ee_ram_bridge's `last_seen_o` + the buffer contents
   to HPS via a new bridge read port for operator inspection.

That's a substantial architecture decision — three RTL modules
land on the synth top, plus the IOP/EE program ROM contents
need a build-time pipeline. The work is naturally deferred
until either a real game/BIOS workflow demands it or a future
chapter explicitly chooses to ship the input path on silicon
as a demo.

#### Updated boundary call

> **Input arc is closed in simulation; HPS↔bridge half is
> closed on silicon. Closing the bridge→IOP→SIF→EE half on
> silicon is a multi-module top-integration chapter (Ch300+
> bracket), not a follow-on of Ch240. Until then,
> `ps2_status.sh` keeps reporting only the bridge-side
> latches that are physically addressable from HPS, and the
> "Ch234 sio2_input_stub is sim-only" disclaimer the script
> already prints stays in effect.**

### Ch242 — OSD glyph coverage for retrodesd menu text (landed)

Ch231 brought up real glyph rendering with a deliberately tiny
seed font (digits `0-9`, four uppercase letters `A B C O`, space,
and a handful of punctuation: `: . - /`). That set was enough to
prove the renderer wiring but not enough for any real menu text
out of `retroDE_splash` to be readable — every other character
would draw as the implicit zero-fill glyph (blank), producing
boxes-with-holes-where-letters-should-be.

Ch242 expands the font ROM in
[`rtl/platform/osd_overlay_stub.sv`](../../rtl/platform/osd_overlay_stub.sv)
to cover the union of characters actually used by retrodesd's
common OSD strings, keeping the existing 5×7-in-8×8 retro style.

#### Character set added

Audit of `retroDE_splash/software/*.c` for menu/UI strings
("Core Select…", "Load Cart…", "RetroDE Input Test", "Save
States", "P1:", "P2:", etc.) produced a unique-character set
that, minus what was already in the seed font, breaks down as:

| Category | Characters added |
|---|---|
| Punctuation (7) | `! ( ) + , < >` |
| Uppercase letters (22) | `D E F G H I J K L M N P Q R S T U V W X Y Z` |
| Lowercase letters (26) | `a b c d e f g h i j k l m n o p q r s t u v w x y z` |

Pre-existing (unchanged): `0-9`, `A B C O`, space, `: . - /`.

The 5×7 patterns use the same MSB-left byte-per-row convention
as Ch231. Lowercase glyphs put x-height body in rows 2–6 and
reserve row 7 for descenders (`g j p q y`) and rows 0–1 for
ascenders (`b d f h k l t`). Cell format
(`{bg[3:0], fg[3:0], char[7:0]}`) is unchanged, palette is
unchanged, render path is unchanged — only the font ROM
contents grew.

#### TB coverage

[`sim/tb/platform/tb_osd_overlay_stub.sv`](../../sim/tb/platform/tb_osd_overlay_stub.sv)
gains a §14 "Ch242 — representative-string render" section
that paints `"Core"` into cells (0..3, 0) with white-on-blue
and samples a small set of pixels per glyph to prove:

- `C` (0x43, pre-existing) draws its row-0 pattern `00111100`
  correctly — cols 0 = bg, 2 = fg, 5 = fg.
- `o` (0x6F, **new**) row 0 is all-zero (lowercase x-height
  starts at row 2), row 2 is `00111000` — proving the new
  glyph rows are read at the right ROM index.
- `r` (0x72, **new**) row 2 is `01011000` — col 1 fg,
  col 0 bg distinguishes `r` from `o`.
- `e` (0x65, **new**) row 2 is `00111000`, row 4 is the
  crossbar `01111100` — verifies multiple rows of the same
  new glyph.
- An unwritten cell `(4, 0)` still shows PS2 video — the
  Ch229 transparency contract is unaffected by the larger
  font ROM.

Per Codex's Ch242 framing, the TB checks a representative
string, not every pixel of every glyph; per-pixel correctness
of every newly-added letter is a property of the font ROM
contents themselves and is reviewable in
`osd_overlay_stub.sv` directly.

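
The pixel sampling in the bullets above follows from the MSB-left byte-per-row convention: column 0 is bit 7 of the row byte. A minimal sketch of that lookup, using only the row patterns quoted above; `glyph_col_is_fg` and `render_glyph_row` are hypothetical helpers for illustration, not part of the TB.

```bash
# Hypothetical helper: succeeds when the given column of a font row byte
# is a foreground pixel. MSB-left, so column 0 is bit 7.
glyph_col_is_fg() {
  local row_byte=$(( $1 )) col=$2
  (( (row_byte >> (7 - col)) & 1 ))
}

# Hypothetical helper: draw one 8-pixel row, '#' = fg and '.' = bg.
render_glyph_row() {
  local row_byte=$(( $1 )) col out=""
  for col in 0 1 2 3 4 5 6 7; do
    if glyph_col_is_fg "$row_byte" "$col"; then out+="#"; else out+="."; fi
  done
  echo "$out"
}

render_glyph_row 0x3C   # 'C' row 0: 00111100
render_glyph_row 0x58   # 'r' row 2: 01011000
render_glyph_row 0x38   # 'o' row 2: 00111000
```

Column 1 is the only sampled position where the `r` and `o` row-2 patterns disagree on foreground versus background, which is why the TB samples there.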
#### Expected operator-visible behavior

When retrodesd next renders any menu text through the OSD
canvas, characters previously drawn as blanks will now appear
as their proper 5×7 glyphs. Mixed-case strings ("Core Select",
"Load Cart", "P1:") and the punctuation in parens / commas /
slashes used by version strings and timestamps render legibly.

Style note: the glyphs deliberately stay inside a 5×7 box
within the 8×8 cell so adjacent cells have a one-pixel gutter
on the right and bottom — the retro-monitor look the seed
font established in Ch231 is preserved at the higher coverage.

#### Tile RAM / CDC / bridge unchanged

Ch242 is **font-ROM-content-only** — no change to
`tile_ram_cdc.sv`, no change to `ps2_hps_bridge.sv` decode,
no change to the OSD register family, no change to the
overlay engine's renderer logic. Synthesis attributes
(`(* romstyle = "M20K" *)` on `font_rom`) carry over from
Ch232 so the larger ROM still infers a BRAM and the
LAB-budget headroom from Ch232 is preserved (a 2048×8 ROM
fits in a single M20K block either way).

#### Regression

Test count unchanged at 155 (no new TB files; Ch234–Ch240 had
already bumped it from 149). `tb_osd_overlay_stub` errors=0.

### Ch243 — hardware OSD text validation with Ch242 glyph set (landed)

Built the .core.rbf with Ch242 font ROM on disk, loaded
through the normal retrodesd path, and observed the real
supervisor/OSD menu on the monitor.

#### What works

- **All Ch242 lowercase glyphs render legibly** — `etroDE`,
  `ores`, `ain`, `raphics`, `em`, `pc` etc. all appear as
  proper 5×7 shapes. No blank-cell gaps anywhere in visible
  text.
- **Punctuation works** — parens `(` `)` and dash `-` render
  correctly (visible in `(Main Sy…` and `RetroDE - Co…`).
- **Color split honored** — header line yellow, body lines
  white, matching retrodesd's palette choice.
- **PS2 quadrant visible behind overlay** — Ch229 transparency
  contract intact; zero-cells outside the menu text show the
  red/green/blue/white quadrant pattern through.
- **OSD_CTRL[0] show/hide** — `osd_test --off` cleanly
  removes the entire overlay; the Ch230 path is unaffected by
  the larger font ROM.
- **No missing glyphs in the strings retrodesd currently
  renders** — the Ch242 character-set audit covered the
  actual menu vocabulary; no hunt-and-fill follow-up
  needed for glyph coverage itself.

#### What surfaced: region overflow (not a Ch242 regression)

The OSD region is 160×48 px = **20 cells × 6 rows** (see
`osd_overlay_stub.sv` geometry constants), and retrodesd's
menu strings are wider than 20 cells:

| Intended string         | Length | What renders on screen                      |
|-------------------------|--------|---------------------------------------------|
| `RetroDE - Cores`       | 15     | `RetroDE - Co` (12 visible — truncated)     |
| `retroDE (Main System)` | 21     | `retroDE (Main Sy` (16 visible — truncated) |
| `PS2 Graphics Demo`     | 17     | `PS2 Graphics Dem` (16 visible — truncated) |
| `ao486 (x86 pc)`        | 14     | `ao486 (x86 pc)` (fits)                     |

Truncation happens at the rightmost cell column (cell_x = 19);
the overlay's region clamp drops any character past that
cleanly — there is no wrap, no scroll, no horizontal-marquee
behavior. The portion of the menu the operator actually sees
is in the **upper-left red quadrant** because the OSD region
anchors at (0, 0) of active video and the test pattern's red
quadrant happens to overlap it.

This is **not** a Ch242 regression — the region geometry has
been unchanged since Ch228/Ch229 (when it was sized to a
"deliberately small" sub-screen overlay to validate the
transparency contract). It only became visibly limiting in
Ch243 because the font expansion finally made the truncated
characters legible enough to notice that they were
truncated.

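
The clamp at cell_x = 19 can be modelled as "characters that fit between the string's start column and the right edge". The sketch below is an illustration only: `visible_cells` is a hypothetical helper, and the start columns (8 for the header, 4 for the body lines) are assumptions picked so the model reproduces the visible counts in the table; the source does not state where retrodesd places each string.

```bash
# Hypothetical model of the region clamp: no wrap, no scroll, characters
# past the last column are dropped.
visible_cells() {
  local text=$1 start_col=$2 cols=$3
  local room=$(( cols - start_col ))
  (( room < 0 )) && room=0
  local len=${#text}
  echo $(( len < room ? len : room ))
}

# Start columns below are assumed, not taken from retrodesd.
visible_cells 'RetroDE - Cores'       8 20
visible_cells 'retroDE (Main System)' 4 20
visible_cells 'PS2 Graphics Demo'     4 20
visible_cells 'ao486 (x86 pc)'        4 20
```

Under those assumed offsets the four calls print 12, 16, 16, and 14, in line with the "visible" column of the table.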
#### Boundary call after Ch243

> **Ch242's glyph expansion shipped on silicon, ABI v1.0 OSD
> register path proven end-to-end with real retrodesd menu
> text, OSD show/hide unaffected. The next menu-usability
> gap is geometric (160-px-wide region too narrow for
> retrodesd's actual menu vocabulary), not typographic.
> That's a separate-chapter decision: expand region vs.
> shorten strings vs. horizontal scroll vs. marquee
> auto-scroll — each with its own tradeoffs in BRAM area,
> overlay opacity over PS2 image, and CDC plumbing.**

Candidate Ch244+ directions (deferred for Codex framing):

- **Wider OSD region.** Bump the width from 160 px to e.g.
  256 or 320 px — easy to flip in `osd_overlay_stub.sv`
  constants but costs more tile-RAM BRAM (and the Ch232
  LAB-budget headroom should be re-checked at the new size).
  Trade-off: more PS2 image obscured behind menu.
- **Shorter strings in retrodesd.** Cheapest path — abbreviate
  to ≤ 20 chars/line ("RetroDE - Cores" → "Cores", etc.).
  No RTL or CDC work. Trade-off: less menu information
  density, requires `retroDE_splash` source-side changes
  outside this repo.
- **Per-line horizontal scroll on select.** Renders only
  the highlighted line wider (auto-marquee). New CDC + new
  control register; substantial scope.
- **Cursor / selection highlight** — Codex's backup-Ch243
  pick. Doesn't address the truncation, but is the next
  menu-usability gap after readability. Could land in
  parallel.

### Ch244 — widen OSD region from 160×48 to 256×64 (landed)

Ch243's hardware validation confirmed that the only menu-text
ergonomic gap left was geometric: real retrodesd menu strings
exceeded the 20-cell-wide region. Ch244 widens the OSD overlay
from **160×48 px (20×6 cells)** to **256×64 px (32×8 cells)**
so the typical strings fit cleanly.

Per Codex's Ch244 framing, scope is region constants + TB
boundaries only — no tile-RAM resize, no CDC change, no
ABI register touch, no shorter-strings push back into
`retroDE_splash`.

#### Why these dimensions, not bigger

| Aspect | 160×48 (Ch228–Ch243) | 256×64 (Ch244) | Comment |
|---|---|---|---|
| Cells (cols × rows) | 20 × 6 = 120 | 32 × 8 = 256 | Both fit easily in the 2048-cell ABI window. |
| Storage (32-bit words, sibling stride 32/row) | up to row 5 × 32 + 15 = 175 | up to row 7 × 32 + 15 = 239 | Both fit in the 1024-word tile RAM. |
| Fraction of typical 640×480 active picture | ~2.5% | ~5.3% | Still leaves majority of PS2 image visible. |
| Pre-truncation visible width | 20 chars | 32 chars | Covers "retroDE (Main System)" (21), "PS2 Graphics Demo" (17), "RetroDE - Cores" (15). |

Going wider (e.g., 320×80 = 40×10) was considered but rejected:
each extra row eats more of the PS2 image, and 32 columns
already covers retrodesd's current vocabulary with a few cells
of headroom. Cursor/selection highlight (Codex's Ch245 backup)
can revisit width later if it turns out to need padding.

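
The cell counts and picture fractions in the table follow directly from the 8×8 cell size. A minimal sketch of that arithmetic, assuming the 640×480 active picture the table uses; `osd_geometry` is a hypothetical helper name.

```bash
# Hypothetical helper: derive cell grid and share of the active picture
# for an OSD region of w x h pixels (8x8 cells, picture defaults 640x480).
osd_geometry() {
  local w=$1 h=$2 pic_w=${3:-640} pic_h=${4:-480}
  local cols=$(( w / 8 )) rows=$(( h / 8 ))
  # Per-mille keeps one decimal place without floating point (truncated).
  local permille=$(( w * h * 1000 / (pic_w * pic_h) ))
  echo "cols=$cols rows=$rows cells=$(( cols * rows )) picture_share=$(( permille / 10 )).$(( permille % 10 ))%"
}

osd_geometry 160 48
osd_geometry 256 64
```

The same helper can be pointed at a candidate such as `osd_geometry 320 80` to compare how much more of the picture a wider region would cover.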
#### What changed (RTL)

- [rtl/platform/osd_overlay_stub.sv](../../rtl/platform/osd_overlay_stub.sv) — `parameter int OSD_W` default
  bumped from 160 to 256, `OSD_H` from 48 to 64. The
  `tile_rd_index = {1'b0, tile_row[3:0], tile_col[5:1]}`
  generation is unchanged — already 10-bit / 1024-word
  capable since Ch229. Region-clamp inequality math
  (`x_cnt < OSD_X + OSD_W`, `y_cnt < OSD_Y + OSD_H`) carries
  the new constants directly.
- [rtl/top/de25_nano_psmct32_raster_demo_top.sv](../../rtl/top/de25_nano_psmct32_raster_demo_top.sv) — the silicon top's
  `osd_overlay_stub` instantiation override updated from
  `.OSD_W(160), .OSD_H(48)` to `.OSD_W(256), .OSD_H(64)`.
  Everything else in the top stays the same:
  `overlay_tile_rd_index` (10-bit), `overlay_tile_rd_data`
  (32-bit), `overlay_enable` (Ch230 OSD_CTRL[0] sync) all
  unchanged.

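
The `tile_rd_index` concatenation quoted above packs two 16-bit cells per 32-bit word with a 32-word row stride. The sketch below mirrors that bit slicing in shell arithmetic; `tile_rd_index` here is a hypothetical shell function named after the RTL signal, not the RTL itself.

```bash
# Shell model of {1'b0, tile_row[3:0], tile_col[5:1]}:
# word index = row * 32 + col / 2, with bit 0 of col selecting the half.
tile_rd_index() {
  local col=$1 row=$2
  echo $(( ((row & 0xF) << 5) | ((col >> 1) & 0x1F) ))
}

tile_rd_index 0 0     # first cell
tile_rd_index 31 7    # bottom-right cell of the 32x8 region
```

For cell (31, 7) this gives word 239, comfortably inside the 1024-word tile RAM.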
#### What did NOT change

- Tile RAM size — bridge still owns 1024 × 32-bit
  (4 KB at 0x1000..0x1FFF). 64-cell sibling-ABI row stride
  preserved.
- `tile_ram_cdc` size + CDC plumbing — 1024-word shadow,
  same write-event toggle path.
- HPS-visible ABI register family (OSD_CTRL/CFG/STATUS/TRIGGER) —
  identical.
- retrodesd software side — already iterates over its source
  string length; on the wider region it now fills more cells
  per row instead of truncating at cell_x=19. No
  `retroDE_splash` changes required.
- Font ROM — Ch242 character set untouched.
- Renderer (8×8 cells, fg/bg/char fields, transparency) —
  bit-identical.

#### TB updates

[sim/tb/platform/tb_osd_overlay_stub.sv](../../sim/tb/platform/tb_osd_overlay_stub.sv) sees:

- Active picture sized up from 200×60 to 272×72 — wide
  enough to put (256, 0) and (0, 64) inside active video
  for the new outside-of-region checks.
- All three DUT instantiations now use `.OSD_W(256), .OSD_H(64)`.
- §3 gains a new inside-corner check at (255, 63) — must
  read BG_R when all tiles zero (transparent).
- §8 comment updated — cell (19, 5) is no longer the bottom-right
  corner, it's now an interior cell. The rendering
  invariant being asserted is unchanged.
- §9 outside-edge checks moved: (160, 0) → (256, 0); (0, 48)
  → (0, 64). The old edges are now mid-region and tested by
  §3 / §15 alongside the new behavior.
- New §15 "Ch244 corner cell" — writes 'X' (0x58) at cell
  (31, 7), occupying pixels x∈[248,255] y∈[56,63]. Samples
  `X` row 0 col 0 (bg), col 1 (fg), row 3 col 3 (fg), and
  row 7 col 7 (bg) to prove the new corner cell decodes,
  renders, and respects the row-7 blank convention. Also
  re-verifies §9-style outside checks at (256, 56) and
  (248, 64) with an *opaque* corner cell present (the
  old §9 only verified outside-behavior against zero cells).
- Watchdog bumped from 5 ms to 10 ms — the larger frame
  (288×78 vs 216×66) and additional §15 frame-wrap waits
  pushed the run from ~4.3 ms to ~7.6 ms.

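
The pixel coordinates used by the §3, §9, and §15 checks above can be derived from the cell position and the region size. This is a sketch assuming the region is anchored at (0, 0); `cell_pixel_bounds` and `in_region` are hypothetical helpers, not TB tasks.

```bash
# Hypothetical helper: pixel span of an 8x8 cell, region anchored at (0, 0).
cell_pixel_bounds() {
  local col=$1 row=$2
  local x0=$(( col * 8 )) y0=$(( row * 8 ))
  echo "x=[$x0,$(( x0 + 7 ))] y=[$y0,$(( y0 + 7 ))]"
}

# Hypothetical helper: same strict-less-than clamp as x_cnt < OSD_X + OSD_W.
in_region() {
  local x=$1 y=$2 w=$3 h=$4
  (( x >= 0 && x < w && y >= 0 && y < h ))
}

cell_pixel_bounds 31 7
in_region 255 63 256 64 && echo "(255,63) inside"
in_region 256 56 256 64 || echo "(256,56) outside"
in_region 248 64 256 64 || echo "(248,64) outside"
```

The strict inequality is what makes (255, 63) the last inside pixel and (256, 56) / (248, 64) the first outside ones.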
#### Expected operator-visible behavior

After loading the new .core.rbf:

- The OSD region in the upper-left of the test pattern grows
  from a small box covering part of the red quadrant to a
  larger box that **may extend into the green quadrant**
  (~256 px wide vs ~320 px per quadrant in a 640×480
  picture).
- Previously-truncated menu strings render fully:
  - `RetroDE - Cores` (15 chars) — was truncated at "Co",
    now fully visible
  - `retroDE (Main System)` (21 chars) — was truncated at
    "Sy", now fully visible
  - `PS2 Graphics Demo` (17 chars) — was truncated at "Dem",
    now fully visible
- Menus may use up to 2 more vertical lines (6 → 8 rows)
  if retrodesd populates them.
- OSD show/hide via `osd_test --off` still works (Ch230 path
  untouched).
- PS2 quadrant pattern still visible where the overlay is
  transparent (Ch229 contract untouched).

#### Regression

155 PASS / 0 FAIL / 0 errors. `tb_osd_overlay_stub` PASS at
~7.6 ms sim time. Top-level `tb_de25_nano_psmct32_raster_demo_top`
unaffected (it doesn't sample inside the overlay region).

### Ch245 — adopt shared platform OSD (landed)

After the Ch243/Ch244 screenshot comparison surfaced that
retroDE_ps2 was running a divergent PS2-local OSD stack
(custom overlay, hand-coded ASCII font, incompatible cell
attribute layout) instead of the shared
`retroDE_splash/rtl/platform/` OSD that every working sibling
core uses, Ch245 migrates retroDE_ps2 back onto the canonical
platform OSD with **no feature expansion**.

#### What's referenced now (not copied)

Both `retroDE_ps2.qsf` and `sim/Makefile` now reference the
platform files. Path note: Quartus resolves QSF file paths
relative to the **QSF's own directory** (not the project
root); the existing `rtl/...` refs only work because there's
a `rtl → ../../../rtl` symlink in the synth tree. So the
shared-platform refs use four-up:
`../../../../retroDE_splash/rtl/platform/`. The sim Makefile
uses an absolute path via the `SPLASH_PLATFORM_RTL` variable,
unaffected by the synth-dir nesting.

- `osd_overlay.sv` — the real compositor (configurable
  position/size/scale, CP437 font, per-cell `transp_bg`,
  cursor-row highlight, 3-cycle pipeline)
- `osd_menu_fsm.sv` — Select+Start hold detect + D-pad
  navigation + A/B action pulses (CLK_FREQ_HZ=50_000_000 to
  match our sys_clk = CLOCK2_50)
- `osd_font_rom.sv` + `cp437_8x8.mem` — 256-glyph CP437 font
  with line-drawing chars for the menu border

The sim Makefile creates a `sim/traces/rtl/rtl/platform`
symlink in its `dirs:` target so the platform font ROM's
CWD-relative `$readmemh("rtl/platform/cp437_8x8.mem")`
succeeds at simulation time. For Quartus synthesis, a
companion symlink lives at
`rtl/platform/cp437_8x8.mem → ../../../retroDE_splash/rtl/platform/cp437_8x8.mem`
in this repo (mirrors how NES does it at
`retroDE_nes/rtl/platform/cp437_8x8.mem`). Quartus's CWD at
elaboration is the QSF directory, the `rtl/` synth-tree
symlink resolves to `retroDE_ps2/rtl/`, and then the per-file
symlink takes the `$readmemh` to the real splash file. The
QSF's `MIF_FILE` line is a no-op for `$readmemh` but keeps
the file in Quartus's project view for programmer/IP flows.

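
The two-hop symlink resolution described above can be checked in a scratch directory before touching the real tree. This is a sketch under stated assumptions: `check_font_symlink_chain` is a hypothetical helper, the two checkouts are assumed to be siblings named `retroDE_ps2` and `retroDE_splash`, and the QSF directory is assumed to be `synth/de25_nano/top_psmct32_raster_demo`.

```bash
# Hypothetical helper: rebuild the symlink layout in a temp dir and check
# that the CWD-relative font path resolves from the QSF directory.
check_font_symlink_chain() {
  local root
  root=$(mktemp -d) || return 1
  local ps2="$root/retroDE_ps2" splash="$root/retroDE_splash"
  local qsf_dir="$ps2/synth/de25_nano/top_psmct32_raster_demo"
  mkdir -p "$splash/rtl/platform" "$ps2/rtl/platform" "$qsf_dir"
  : > "$splash/rtl/platform/cp437_8x8.mem"   # stand-in for the real font

  # Per-file symlink in this repo: three levels up to the sibling checkout.
  ln -s ../../../retroDE_splash/rtl/platform/cp437_8x8.mem \
        "$ps2/rtl/platform/cp437_8x8.mem"
  # Synth-tree symlink so QSF-relative rtl/... paths resolve.
  ln -s ../../../rtl "$qsf_dir/rtl"

  # Same relative path that $readmemh uses, evaluated from the QSF dir.
  if ( cd "$qsf_dir" && [ -f rtl/platform/cp437_8x8.mem ] ); then
    echo "resolved"
  else
    echo "broken"
  fi
  rm -rf "$root"
}

check_font_symlink_chain
```

If the sibling checkout lives somewhere else, the per-file symlink target is the piece that needs adjusting; the synth-tree `rtl` symlink is independent of it.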
#### Top-level wiring ([rtl/top/de25_nano_psmct32_raster_demo_top.sv](../../rtl/top/de25_nano_psmct32_raster_demo_top.sv))

Inserted between the existing demo wrapper and the HDMI pins:

1. **Pixel-coord counter** derived from `demo_video_de` /
   `demo_video_hsync` / `demo_video_vsync` — feeds the
   platform overlay's `pixel_x` / `pixel_y` inputs.
2. **Char-BRAM read adapter** — translates the platform's
   11-bit `char_rd_addr` `{row[4:0], col[5:0]}` into our
   existing `tile_ram_cdc` 10-bit 32-bit-word index +
   low/high half mux on bit 0 of the cell address.
   Registered once to match the platform's 1-cycle BRAM
   latency expectation (same as NES at `retroDE_nes.sv:1267`).
3. **CDC stage** — 2-FF synchronizers carry
   `bridge_osd_cfg0`/`cfg1`, `menu_osd_active`, and
   `menu_cursor_row` from CLOCK2_50 into design_clk.
4. **Field decode** — sibling-ABI layout: `cols=cfg0[5:0]`,
   `rows=cfg0[12:8]`, `osd_x=cfg0[23:16]<<4`,
   `osd_y=cfg0[31:24]<<4`, `cursor_attr=cfg1[23:16]`.
5. **`osd_menu_fsm`** instantiated on CLOCK2_50, fed by
   `bridge_input_p1` (the joypad bitmap retrodesd writes to
   `INPUT_P1` at 0x040) and the OSD enable / force-open /
   force-close bits from OSD_CTRL.
6. **Platform `osd_overlay`** drives `VIDEO_R/G/B/DE/HSYNC/VSYNC`
   directly. `osd_global_transparent_bg=1'b0` and
   `osd_scale=3'd2` hardwired per sibling-core convention.

Old `osd_overlay_stub` instantiation stays in the top but is
unwired from `VIDEO_*` (drives throwaway `stub_out_*` and
reads a tied-zero shadow). Module stays in the QSF and
Makefile so retroDE_ps2-local TBs keep linking. Ch246 will
remove it once on-monitor parity is proven.

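
The field decode in step 4 can be mirrored on the HPS side to sanity-check a value read back from `OSD_CFG0`. The sketch below applies the bit slices listed above; `decode_osd_cfg0` is a hypothetical helper and `0x02040820` is an invented example value, not one retrodesd is known to write.

```bash
# Hypothetical helper: apply the sibling-ABI OSD_CFG0 slices from step 4.
decode_osd_cfg0() {
  local cfg0=$(( $1 ))
  local cols=$(( cfg0 & 0x3F ))                    # cfg0[5:0]
  local rows=$(( (cfg0 >> 8) & 0x1F ))             # cfg0[12:8]
  local osd_x=$(( ((cfg0 >> 16) & 0xFF) << 4 ))    # cfg0[23:16] << 4
  local osd_y=$(( ((cfg0 >> 24) & 0xFF) << 4 ))    # cfg0[31:24] << 4
  echo "cols=$cols rows=$rows osd_x=$osd_x osd_y=$osd_y"
}

# Invented example value for illustration only.
decode_osd_cfg0 0x02040820
```

The `<<4` on the position fields means the overlay origin can only be placed on 16-pixel boundaries.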
#### Bridge reshape ([rtl/platform/ps2_hps_bridge.sv](../../rtl/platform/ps2_hps_bridge.sv))

Sibling-ABI semantics added in-place. No address-map changes;
existing offsets (0x100/04/08/10/14) are preserved.

- New 32-bit outputs `osd_ctrl_o` / `osd_cfg0_o` / `osd_cfg1_o`
  so the top extracts fields per the sibling-shared layout
  (matching `nes_hps_bridge.sv`).
- New menu-FSM-side inputs:
  `osd_active_i`, `osd_cursor_row_i`,
  `osd_set_trigger_i` (A button pulse),
  `osd_back_trigger_i` (B button pulse),
  `osd_scroll_down_trigger_i`, `osd_scroll_up_trigger_i`,
  `osd_open_trigger_i`, `osd_trigger_row_i`.
- `OSD_STATUS` (0x104) read returns
  `{19'd0, osd_cursor_row, 7'd0, osd_active}` exactly like
  `nes_hps_bridge.sv:1122` instead of the prior always-zero
  Ch224 sink.
- `OSD_TRIGGER` (0x108) becomes a real R/W register: bits
  set by menu-FSM pulses, cleared by HPS via W1C. Sets win
  over clears on common bits.
- `OSD_CTRL[3]` is a self-clearing "request" bit (matches
  `nes_hps_bridge.sv:741`).
- Bridge clock IS sys_clk here (both ride CLOCK2_50), so the
  FSM-side pulses are same-domain combinational inputs — no
  synchronizer needed on the bridge end.
- The existing `osd_ctrl_enable` output is kept for back-compat
  with the stub path until Ch246 cleanup.

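
The `OSD_STATUS` packing and the `OSD_TRIGGER` set-wins W1C rule described above can be modelled with plain bit arithmetic. This is a sketch, not the bridge RTL: `pack_osd_status` and `osd_trigger_next` are hypothetical helpers, the 5-bit cursor-row width is inferred from the `{19'd0, ..., 7'd0, ...}` padding, and the trigger bit assignments are not specified here, so generic masks are used.

```bash
# Hypothetical model of {19'd0, osd_cursor_row[4:0], 7'd0, osd_active}.
pack_osd_status() {
  local cursor_row=$1 active=$2
  printf '0x%08X\n' $(( ((cursor_row & 0x1F) << 8) | (active & 1) ))
}

# Hypothetical model of one OSD_TRIGGER update: HPS write-1-to-clear is
# applied first, then FSM set pulses are ORed in, so sets win on a tie.
osd_trigger_next() {
  local cur=$(( $1 )) set_bits=$(( $2 )) w1c=$(( $3 ))
  printf '0x%08X\n' $(( ((cur & ~w1c) | set_bits) & 0xFFFFFFFF ))
}

pack_osd_status 3 1             # cursor on row 3, menu open
osd_trigger_next 0x3 0x0 0x1    # HPS clears bit 0, no new pulse
osd_trigger_next 0x3 0x1 0x1    # clear and set collide on bit 0
```

The last call returns `0x00000003`: the colliding bit stays set, so a pulse that arrives in the same cycle as the HPS acknowledge is not lost.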
#### What survived from Ch222..Ch244

- Bridge **register file infrastructure** (decode, AXI lane
  alignment, write-accept FSM, the OSD-window guard at
  `aw_addr_q[37:5] == 33'h08`) — still the right shape.
- Bridge **tile RAM 0x1000..0x1FFF** + the `tile_wr_*`
  broadcast — the platform's char BRAM uses a different
  storage shape but our 32-bit-word writes still land
  correctly; the Ch245 adapter handles the unpack on read.
- `tile_ram_cdc.sv` — kept; the design-domain shadow it
  produces is still the source for the platform overlay's
  char reads (after the 32→16 adapter).
- All Ch234..Ch241 input-arc work (IOP pad-state, SIF DMA,
  EE-side buffer + branch program) — orthogonal to OSD,
  fully intact.

#### What was scaffolding (now bypassed in the synth path)

- `osd_overlay_stub.sv` — unwired from VIDEO_* but still
  instantiated for TB linkage.
- Hand-coded ASCII font ROM that grew through Ch232/Ch242 —
  replaced by `cp437_8x8.mem`.
- Ch244 region widening — superseded by the platform's
  runtime-configurable `osd_x/y/cols/rows` (retrodesd writes
  the position/size at boot).
- Custom cell-attribute layout (`{bg[3:0], fg[3:0], char}`) —
  replaced by sibling-ABI layout
  (`{transp_bg, bg[2:0], fg[3:0], char}`).

#### Verification

- New focused TB
  [`sim/tb/platform/tb_osd_platform_cell_adapter.sv`](../../sim/tb/platform/tb_osd_platform_cell_adapter.sv)
  exercises the 32→16 char-BRAM read adapter — the unique
  integration glue we added. Pre-populates the shadow with
  known cells in low half (col[0]=0), high half (col[0]=1),
  mid-row, bottom-right corner (col=31, row=7), and reads
  them back through the adapter mux + 1-cycle register.
  Also verifies neighbors stay zero.
- Sibling-ABI bridge changes verified by the existing
  `tb_ps2_hps_bridge` focused TB (extended with declarations
  for the new ports; semantic verification of the new
  OSD_STATUS/OSD_TRIGGER paths is deferred to Ch246 since
  the read/write shape isn't on the Ch245 critical path).
- Full regression: **156 PASS / 0 FAIL** (155 prior + 1 new
  adapter TB). The three input-arc integration TBs were
  patched to declare the new bridge ports (`.*` wildcard
  bindings now find them).

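The 32→16 read adapter and the sibling-ABI cell layout described above can be sketched as a small behavioural model. This is an illustration in Python, not the RTL: the function names are hypothetical, the 8-bit `char` width is an assumption (it makes the four fields total 16 bits), and the 1-cycle output register is not modelled.

```python
def pack_cell(transp_bg: int, bg: int, fg: int, char: int) -> int:
    """Pack one 16-bit cell as {transp_bg, bg[2:0], fg[3:0], char}.

    Assumption: char is 8 bits wide, so the concatenation is 1+3+4+8 = 16 bits.
    """
    return ((transp_bg & 0x1) << 15) | ((bg & 0x7) << 12) | ((fg & 0xF) << 8) | (char & 0xFF)


def unpack_cell(cell: int) -> dict:
    """Split a 16-bit cell back into its four fields."""
    return {
        "transp_bg": (cell >> 15) & 0x1,
        "bg": (cell >> 12) & 0x7,
        "fg": (cell >> 8) & 0xF,
        "char": cell & 0xFF,
    }


def select_cell(word32: int, col: int) -> int:
    """Cell-select mux: col[0] == 0 reads the low half, col[0] == 1 the high half."""
    if col & 1:
        return (word32 >> 16) & 0xFFFF
    return word32 & 0xFFFF


if __name__ == "__main__":
    low = pack_cell(transp_bg=1, bg=1, fg=15, char=ord("A"))
    high = pack_cell(transp_bg=0, bg=4, fg=7, char=ord("B"))
    word = (high << 16) | low  # one 32-bit shadow word holds two cells
    print(unpack_cell(select_cell(word, col=0)))
    print(unpack_cell(select_cell(word, col=1)))
```

How a (col, row) pair maps to a shadow word index is deliberately left out, since the text above does not state it.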
#### Expected operator-visible behavior on hardware

- The PS2 core's OSD should now **look identical** to every
  sibling core's OSD: centered on screen (position controlled
  by retrodesd via OSD_CFG0), white double-line border,
  solid blue panel background, cyan/green cursor highlight
  on the active row, full menu strings rendered with no
  truncation (overlay can grow up to 63×31 chars), `A: select`
  action prompt, `(active)` annotation on the loaded core.
- Behind the OSD where retrodesd writes `transp_bg=1` cells
  (typically the title row's background or specific
  highlight regions), the PS2 video quadrant test pattern
  shows through.
- `osd_test --off` and the Select+Start hold combination
  should both still hide/show the overlay.
- `ps2_status.sh --delta` should report identical INPUT_*
  values to before.

#### What's left for Ch246

- Remove `osd_overlay_stub.sv` instantiation from the top
  (and its .qsf line; Makefile reference stays only if
  the focused stub TB stays).
- Decide the fate of the focused Ch228..Ch244 TBs targeting
  the stub — keep as historical regression for the
  deprecated module, or retire.
- Add focused-TB coverage for the new bridge OSD_STATUS /
  OSD_TRIGGER R/W semantics.
- Drop the `osd_ctrl_enable` back-compat output once nothing
  else reads it.

### Ch246 — CORE_CAPS advertises OSD geometry (landed)

After Ch245 brought the shared platform OSD online, the first
on-monitor look revealed that retrodesd is configuring the OSD
for a 1280×720 active picture (`OSD_CFG0 = 0x0E141068`:
cols=40 rows=16 origin=(320, 224) px at 2× scale → menu region
spans x∈[320, 960], y∈[224, 480]). The PS2 demo currently
outputs 640×480, so the right half of the menu lies past the
active picture and clips. retrodesd derives that origin from a
hardcoded 1280×720 assumption in its `rom_simple_backend.c`,
not from anything the core advertises.

Codex's Ch246 framing was explicit: **the actual geometry fix
lives in retrodesd's backend, not in this repo**. This chapter
contributes the metadata side — advertising the OSD geometry
in CORE_CAPS following the sibling-ABI bit layout so a
CORE_CAPS-aware retrodesd can use it directly.

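The clipping arithmetic above can be checked with a few lines of Python. This is a sketch under stated assumptions: the helper names are hypothetical, and the 16 px cell size is taken as 8 px glyphs at the 2× scale quoted in the text.

```python
CELL_PX = 16  # assumption: 8x8 glyph cells drawn at 2x scale


def osd_region(origin_x: int, origin_y: int, cols: int, rows: int):
    """Return (x0, x1, y0, y1) of the menu region in pixels."""
    return (origin_x, origin_x + cols * CELL_PX,
            origin_y, origin_y + rows * CELL_PX)


def visible_width_fraction(x0: int, x1: int, active_w: int) -> float:
    """Fraction of the region's width that falls inside the active picture."""
    visible = max(0, min(x1, active_w) - max(x0, 0))
    return visible / (x1 - x0)


if __name__ == "__main__":
    x0, x1, y0, y1 = osd_region(320, 224, cols=40, rows=16)
    print((x0, x1, y0, y1))
    print(visible_width_fraction(x0, x1, active_w=1280))  # whole menu on a 1280-wide picture
    print(visible_width_fraction(x0, x1, active_w=640))   # half the menu on the 640-wide demo
```

With the 1280×720 origin the region ends at x=960, so on a 640-pixel-wide picture only the span from 320 to 640 is visible, which is the left half of the menu.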
#### CORE_CAPS bit layout (matches `nes_hps_bridge.sv:1024-1031`)

| Bits   | Field                | PS2 value | Notes |
|--------|----------------------|-----------|-------|
| 0      | has_save_ram         | 0         | PS2 memory card not yet wired |
| 1      | has_savestates       | 0         | Not yet implemented |
| 2      | two_player           | 0         | P2 wiring is bringup-only |
| 3      | analog_input         | 0         | DS2 analog sticks not plumbed |
| 7:4    | max_savestate_slots  | 0         | — |
| 15:8   | osd_columns          | 40        | Matches NES advertisement |
| 20:16  | osd_rows             | 16        | Matches NES advertisement |

`CORE_CAPS = (16 << 16) | (40 << 8) = 0x00102800`.

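As a cross-check of the table and the resulting constant, here is a minimal Python sketch that packs and decodes the fields. The function names are hypothetical; the bit positions are the ones in the table, and the summary string mimics the `ps2_status.sh` line format only loosely.

```python
def pack_core_caps(osd_columns: int, osd_rows: int, has_save_ram: int = 0,
                   has_savestates: int = 0, two_player: int = 0,
                   analog_input: int = 0, max_savestate_slots: int = 0) -> int:
    """Pack CORE_CAPS using the bit positions from the table above."""
    return ((has_save_ram & 0x1)
            | ((has_savestates & 0x1) << 1)
            | ((two_player & 0x1) << 2)
            | ((analog_input & 0x1) << 3)
            | ((max_savestate_slots & 0xF) << 4)
            | ((osd_columns & 0xFF) << 8)
            | ((osd_rows & 0x1F) << 16))


def decode_core_caps(value: int) -> dict:
    """Inverse of pack_core_caps."""
    return {
        "has_save_ram": value & 0x1,
        "has_savestates": (value >> 1) & 0x1,
        "two_player": (value >> 2) & 0x1,
        "analog_input": (value >> 3) & 0x1,
        "max_savestate_slots": (value >> 4) & 0xF,
        "osd_columns": (value >> 8) & 0xFF,
        "osd_rows": (value >> 16) & 0x1F,
    }


def summarize(value: int) -> str:
    """One-line summary in the spirit of the status tool's inline decode."""
    f = decode_core_caps(value)
    return (f"osd={f['osd_columns']}x{f['osd_rows']} save={f['has_save_ram']} "
            f"ss={f['has_savestates']} 2p={f['two_player']} analog={f['analog_input']}")


if __name__ == "__main__":
    caps = pack_core_caps(osd_columns=40, osd_rows=16)
    print(f"0x{caps:08X}", summarize(caps))
```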
#### Runtime fix lives in retrodesd

This advertisement is forward-looking metadata. The actual
runtime bug — retrodesd writing `origin=(320, 224)` even when
the core can't accommodate it — needs a fix in
`retroDE_splash`'s board-side code (probably
`rom_simple_backend.c` or a PS2-specific backend), independent
of this RTL change. Two clean paths there:

1. **CORE_CAPS-aware OSD config**: read cols/rows from
   CORE_CAPS bits [20:8] and derive origin from active-picture
   dimensions retrodesd already knows (the same source it uses
   for the HDMI mode).
2. **PS2-specific backend** that overrides the
   `rom_simple_backend` defaults for the PS2 core's current
   640×480 picture. Expected OSD_CFG0 for 640×480:
   - cols=40, rows=16, scale=2
   - x_chars = (640 - 40*16) / 32 = 0
   - y_chars = (480 - 16*16) / 32 = 7
   - → `OSD_CFG0 = 0x07001028` (scale bits in upper byte stay
     zero since the bridge field isn't used for scale).

Neither belongs in retroDE_ps2; both belong in retroDE_splash.

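The centering arithmetic in option 2 can be sketched as follows. Treat this as an assumption-laden illustration rather than retrodesd's code: the function name is hypothetical, and the byte placement (cols in [7:0], rows in [15:8], x_chars in [23:16], y_chars in [31:24]) is inferred only from the expected value `0x07001028`. The same formula gives `0x0E141028` for 1280×720, which differs from the observed `0x0E141068` in bit 6 only; this sketch does not model that bit.

```python
CELL_PX = 16  # assumption: 8 px glyph cell at 2x scale


def centered_osd_cfg0(active_w: int, active_h: int, cols: int, rows: int) -> int:
    """Centre a cols x rows OSD on the active picture and pack OSD_CFG0.

    The origin is expressed in character cells; dividing the leftover pixels
    by 2 * CELL_PX (= 32) matches the x_chars / y_chars formulas above.
    """
    x_chars = max(0, (active_w - cols * CELL_PX) // (2 * CELL_PX))
    y_chars = max(0, (active_h - rows * CELL_PX) // (2 * CELL_PX))
    # Inferred field placement; see the lead-in for the caveat.
    return (y_chars << 24) | (x_chars << 16) | (rows << 8) | cols


if __name__ == "__main__":
    print(f"0x{centered_osd_cfg0(640, 480, 40, 16):08X}")
    print(f"0x{centered_osd_cfg0(1280, 720, 40, 16):08X}")
```

Deriving the origin from the active-picture size rather than a constant is the point of both options: the same helper serves the current 640×480 demo and any later scanout size.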
#### ps2_status.sh

Updated to decode the new CORE_CAPS fields inline:

```
CORE_CAPS : 0x00102800 (osd=40x16 save=0 ss=0 2p=0 analog=0)
```

#### What was NOT done in Ch246

- **OSD origin clamp/override in the FPGA**: Codex's "quick
  fallback" option of mutating retrodesd's runtime config in
  the bridge or top was deliberately rejected — it silently
  hides the upstream bug and breaks the moment retrodesd
  ships a real fix.
- **Active-picture resolution bump**: PS2 emulation will
  eventually need proper PCRTC scanout dimensions; treating
  the OSD-clip symptom by widening the demo's video output
  would just hide that planning. Tracked separately.
- **Gamepad input plumbing**: separate Ch247 per Codex.
  Keyboard nav already works (Ch245 input path proven via
  `INPUT_P1_RAW`); gamepad failing to update either INPUT_P1
  or INPUT_P1_RAW is a retrodesd input-thread issue, not RTL.

#### Regression

156 PASS / 0 FAIL. `tb_ps2_hps_bridge` updated to expect
`CORE_CAPS = 0x00102800`.

### Ch248 — replace Ch226 DS2 stub with shared platform controller (landed)

Ch245 brought the platform OSD online; Ch246 advertised matching
geometry metadata; Ch247 fixed the on-monitor placement via a
PS2-specific backend in retroDE_splash. The Ch247 on-monitor run
exposed the last remaining input-path divergence: pressing the wired
DS2 controller didn't navigate the menu, because retroDE_ps2's Ch226
`DS2_STATUS` was a hardcoded stub that always reported "no controller
plugged in." retrodesd's `ds2_poll_thread` polls that register at
1 kHz; seeing `bit 0 = 0` it calls `osd_input_disconnect_ds2()` every
iteration and the DS2 source never contributes to `INPUT_P1_RAW`.

Ch248 replaces the stub with the same `ds2_controller` RTL that NES /
Atari2600 / splash use, and routes its outputs through the bridge.

#### Files added / changed

| File | Change |
|---|---|
| `synth/de25_nano/top_psmct32_raster_demo/de25_nano_psmct32_raster_demo_top.qsf` | New `SYSTEMVERILOG_FILE` ref to `../../../../retroDE_splash/rtl/platform/ds2_controller.sv` (four-up — see [QSF path resolution memory](../../../.claude/projects/-home-ubuntu-FPGA-Projects-retroDE-ps2/memory/reference_qsf_path_resolution.md)); pin assignments `PIN_H16/Y1/C2/P1` for `GPIO_0_DS2_CLK/CMD/DATA/ATTN` with `3.3-V LVCMOS` I/O and `WEAK_PULL_UP_RESISTOR ON` on DATA; entity is `de25_nano_psmct32_raster_demo_top` |
| `sim/Makefile` | Added `$(SPLASH_PLATFORM_RTL)/ds2_controller.sv` to `SHARED_RTL` |
| [rtl/top/de25_nano_psmct32_raster_demo_top.sv](../../rtl/top/de25_nano_psmct32_raster_demo_top.sv) | New top ports `GPIO_0_DS2_CLK/CMD/DATA/ATTN`; instantiated `ds2_controller #(.CLK_HZ(50_000_000))` on `CLOCK2_50`; wired `ds2_buttons_w / ds2_connected_w / ds2_error_w` into the bridge's new input ports |
| [rtl/platform/ps2_hps_bridge.sv](../../rtl/platform/ps2_hps_bridge.sv) | New input ports `ds2_buttons_i[31:0]`, `ds2_connected_i`, `ds2_error_i`. `DS2_STATUS` (0x0F0) now reads `{29'd0, 1'b1, ds2_error_i, ds2_connected_i}`. `DS2_BUTTONS` (0x0F4) now reads `ds2_buttons_i` — the Ch226 `INPUT_P1` mirror is gone (it was actively blocking the real shared-runtime path) |
| [sim/tb/platform/tb_ps2_hps_bridge.sv](../../sim/tb/platform/tb_ps2_hps_bridge.sv) | §15 rewritten: drives `ds2_buttons_i/connected_i/error_i` from the TB and verifies live readback for plugged/error/unplugged states. The "DS2_BUTTONS mirrors INPUT_P1" tests were inverted — they now confirm INPUT_P1/P2/P1_RAW writes do **NOT** disturb DS2_BUTTONS |
| Three integration TBs using `.*` wildcard binding | Added `ds2_buttons_i/connected_i/error_i = 0` declarations so the wildcard finds them |
| [docs/hardware/ps2_status.sh](ps2_status.sh) | Removed "expect 0x00000004" comment; now decodes `[0]=connected`, `[1]=error`, and prints a "plug a controller in" hint when bit 0 is clear |

#### DS2_STATUS bit layout (new)

| Bit | Field | Source |
|---|---|---|
| 0 | connected | `ds2_controller.ds2_connected` |
| 1 | error | `ds2_controller.ds2_error` |
| 2 | reserved | hardcoded `1` (PS2-local legacy bit retained for operator-tool compatibility) |
| 31:3 | reserved | `0` |

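For quick reference, the bit layout above decodes as in this short Python sketch. The function names are hypothetical; the bit positions and the hardwired bit 2 come from the table and from the bridge concatenation `{29'd0, 1'b1, ds2_error_i, ds2_connected_i}`.

```python
def encode_ds2_status(connected: bool, error: bool) -> int:
    """Model of the bridge readback: bit 2 is hardwired to 1, bits 31:3 are 0."""
    return (1 << 2) | (int(error) << 1) | int(connected)


def decode_ds2_status(value: int) -> dict:
    """Split a DS2_STATUS readback into its named bits."""
    return {
        "connected": bool(value & 0x1),
        "error": bool(value & 0x2),
        "legacy_bit2": bool(value & 0x4),
    }


if __name__ == "__main__":
    for raw in (0x4, 0x5, 0x7):
        print(f"0x{raw:08X}", decode_ds2_status(raw))
```

Under this layout an unplugged pad reads `0x00000004` (the value the old stub always returned) and a healthy plugged pad reads `0x00000005`.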
#### DS2_BUTTONS

`32-bit live readback from ds2_controller.ds2_buttons`. The Ch226
mirror of INPUT_P1 is removed — that was useful as a stub
"proof-of-bridge-write-landing" but blocked the real path. INPUT_P1
remains the HPS-written output of retrodesd's input normalization;
DS2_BUTTONS is now the raw wired-pad readback that
`ds2_poll_thread` consumes upstream of that.

#### Acceptance criteria (Codex Ch248 framing)

| Criterion | Result |
|---|---|
| Controller unplugged: `DS2_STATUS[0]=0`, no runtime regression | ✓ TB §15 verifies; default state |
| Controller plugged: `DS2_STATUS[0]=1`, `DS2_STATUS[1]=0` | ✓ TB §15 verifies (`@connected` check) |
| Pressing D-pad/A/B changes `DS2_BUTTONS` | ✓ TB §15 verifies via `ds2_buttons_i` updates; hardware proof on next compile |
| retrodesd writes INPUT_P1_RAW; gamepad navigation works | Hardware verification deferred to user's compile + on-monitor test |
| Existing keyboard nav still works | ✓ Unchanged (keyboard path is independent) |
| Sim bridge TB updated from Ch226 expectations | ✓ TB §15 fully rewritten |

#### Pin map (matches NES / Atari2600 / splash exactly)

| Signal | DE25-Nano pin | I/O standard | Notes |
|---|---|---|---|
| `GPIO_0_DS2_CLK` | PIN_H16 | 3.3-V LVCMOS | output |
| `GPIO_0_DS2_CMD` | PIN_Y1 | 3.3-V LVCMOS | output |
| `GPIO_0_DS2_DATA` | PIN_C2 | 3.3-V LVCMOS | input, weak pull-up (open-drain controller line) |
| `GPIO_0_DS2_ATTN` | PIN_P1 | 3.3-V LVCMOS | output |

#### Regression

156 PASS / 0 FAIL. `tb_ps2_hps_bridge` §15 fully covers the new live
readback path. Three integration TBs touched only for wildcard
binding; their semantics unchanged.

#### What's NOT in Ch248

- The Ch234 sio2_input_stub (sim-only IOP-side controller decoder
  for the PS2 SIF/IOP gameplay path) is independent of the OSD
  navigation path Ch248 fixes. Both will eventually be relevant
  when real games run on the EE/IOP, but they handle different
  problems.
- `ds2_analog[31:0]` and the controller's debug outputs are tied to
  unused wires in the top. retrodesd doesn't read analog yet for
  PS2; once it does, surfacing them through the bridge would be a
  small follow-up.

### Ch249 — canonical PS2 plug-in integration (landed)

Ch245→Ch248 converged the OSD / DS2 / backend stack onto the
shared platform path that every working retroDE core uses. This
section is the **canonical state of the integration as of Ch249**;
the Ch228–Ch244 sections above are kept as historical record but
no longer describe what's on silicon.

**Document-reading guide:** anything in this Ch249 section is
load-bearing for current silicon. Ch228–Ch244 OSD content (the
PS2-local `osd_overlay_stub.sv`, the Ch232/Ch242/Ch244 stub
geometry/glyph chapters, the Ch226 hardcoded DS2 stub, etc.) is
historical context only — the artifacts those chapters built were
retired in Ch249 as scaffolding.

#### Stack diagram

```
┌─────────────────────────────────────────────────────────────────┐
│ HPS (retrodesd)                                                 │
│ ├─ ps2_backend.c       ← backend=ps2 in manifest                │
│ │   └─ writes OSD_CFG0=0x07001028 + CTRL/CFG1 at start()        │
│ ├─ input_thread.c      ← evdev (keyboard) → osd_input           │
│ ├─ ds2_poll_thread.c   ← polls DS2_STATUS @ 1 kHz               │
│ └─ osd_input.c         ← merges all sources, writes             │
│                          INPUT_P1 / INPUT_P1_RAW                │
├─────────────────────────────────────────────────────────────────┤
│ AXI bridge (`ps2_hps_bridge.sv`)                                │
│ ├─ 0x040/0x044 INPUT_P1/P2     (HPS write → bridge_input_p1/p2) │
│ ├─ 0x048 INPUT_P1_RAW          (HPS write → menu FSM joypad)    │
│ ├─ 0x0F0 DS2_STATUS            ← {ds2_error, ds2_connected}     │
│ ├─ 0x0F4 DS2_BUTTONS           ← live ds2_buttons               │
│ ├─ 0x100/04/08/10/14 OSD_*     (full 32-bit values out)         │
│ └─ 0x1000–0x1FFF tile RAM      (HPS write → tile_wr_* broadcast)│
├─────────────────────────────────────────────────────────────────┤
│ Top                                                             │
│ ├─ `ds2_controller`    ← GPIO_0_DS2_{CLK,CMD,DATA,ATTN}         │
│ │                        (shared from retroDE_splash, Ch248)    │
│ ├─ `tile_ram_cdc`      ← bridge tile writes → design-clk shadow │
│ ├─ Ch245 char-BRAM adapter (32→16 cell-select mux on read)      │
│ ├─ `osd_menu_fsm`      ← bridge_input_p1_raw, OSD_CTRL[0/2/3]   │
│ │                        (shared, Ch245)                        │
│ ├─ `osd_font_rom`      ← cp437_8x8.mem (shared, Ch245)          │
│ └─ `osd_overlay`       ← compositor onto demo_video_*           │
│                          (shared, Ch245; drives HDMI_TX_*)      │
└─────────────────────────────────────────────────────────────────┘
```

#### Sources of truth (where to read instead of re-deriving)

| Topic | Canonical file | Notes |
|---|---|---|
| Bridge register map + DS2 + OSD ports | [`rtl/platform/ps2_hps_bridge.sv`](../../rtl/platform/ps2_hps_bridge.sv) | Ch245 + Ch248 ports; older Ch222–Ch227 + Ch230 history visible in comments |
| Top wiring | [`rtl/top/de25_nano_psmct32_raster_demo_top.sv`](../../rtl/top/de25_nano_psmct32_raster_demo_top.sv) | Ch245 platform-OSD instantiation + Ch248 `ds2_controller` instantiation |
| PS2 backend (HPS-side) | `retroDE_splash/software/ps2_backend.c` | Ch247; manifest `backend=ps2` |
| Pin assignments | [`...top_psmct32_raster_demo_top.qsf`](../../synth/de25_nano/top_psmct32_raster_demo/de25_nano_psmct32_raster_demo_top.qsf) | DS2 GPIO PIN_H16/Y1/C2/P1 with weak pull-up on DATA (Ch248) |
| OSD geometry advertisement | `CORE_CAPS = 0x00102800` | Ch246; `[15:8]=40` (cols), `[20:16]=16` (rows) |
| Status dump tool | [`docs/hardware/ps2_status.sh`](ps2_status.sh) | Decodes CORE_CAPS, OSD_*, DS2_* live values |

#### Operator quick-check (running on the board)

```bash
# Snapshot every register the canonical Ch245–Ch248 path touches:
sudo ./ps2_status.sh | grep -E "CORE_CAPS|OSD_|DS2_|INPUT_"
```

Healthy state (PS2 backend loaded, DS2 plugged in):

```
CORE_CAPS   : 0x00102800 (osd=40x16 save=0 ss=0 2p=0 analog=0)
INPUT_P1    : 0x???????? (depends on what retrodesd's just merged)
INPUT_P1_RAW: 0x???????? (DS2 + keyboard merged)
DS2_STATUS  : 0x00000005 ([0]=connected=1, [1]=error=0, [2]=reserved=1)
DS2_BUTTONS : 0x???????? (live decoded DS2 bitmap)
OSD_CTRL    : 0x00000015 (enable + input_lock + force_open when no ROM)
OSD_STATUS  : 0x00000?01 ([0]=osd_active, [12:8]=cursor_row)
OSD_CFG0    : 0x07001028 (cols=40 rows=16 origin=(0,7) chars at 2× scale)
OSD_CFG1    : 0x1F3F0303 (first/last_row=3, cursor_attr=0x3F)
```

#### Ch249 cleanup deltas (what changed in this chapter)

- **Deleted** `rtl/platform/osd_overlay_stub.sv` and its TB.
- **Removed** the stub instantiation + dead `stub_out_*` /
  `stub_tile_rd_index` wires from the top.
- **Retired** the Ch230 single-bit `osd_ctrl_enable` bridge output
  (its only consumer was the stub's `enable_i`; the platform OSD
  reads `osd_ctrl_o[0]` out of the Ch245 32-bit register exports).
- **Removed** the `bridge_osd_ctrl_enable` wire + 3-FF sync block
  + sim-path tie-off from the top.
- **Updated** QSF: no more `osd_overlay_stub.sv` reference.
- **Updated** sim Makefile: dropped `tb_osd_overlay_stub` from
  per-target rule, `.PHONY` list, and `run:` master list.
- **Cleaned** stale comments in `tile_ram_cdc.sv`, the top, and the
  `tb_osd_platform_cell_adapter.sv` header to point at the Ch245
  adapter as the canonical reader.
- **Updated** three integration TBs (`tb_bridge_iop_pad_input`,
  `tb_ee_pad_buffer_branch`, `tb_pad_state_via_sif_to_ee`) to drop
  the now-removed `osd_ctrl_enable` port from their wildcard
  bindings.

#### Things deliberately kept (NOT scaffolding)

- `tile_ram_cdc.sv` — still in the live path. The Ch245 adapter
  reads from its design-clock shadow. Could in principle be
  replaced by a true dual-clock dual-port BRAM matching NES's
  pattern, but the current toggle-CDC shadow works and needs no
  change.
- `tb_osd_platform_cell_adapter.sv` — focused TB for the Ch245
  32→16 cell-select mux, which is the one piece of integration
  logic that retroDE_ps2 added on top of the platform OSD stack.
- The Ch234–Ch241 input arc (sio2_input_stub via SIF DMA into
  EE-side buffer + branch TB chain) — independent of OSD nav.
  Lives on in `tb_*_pad_*.sv` as the sim-only IOP→EE input path
  that the eventual gameplay flow will need; not in the synth top
  yet.

#### Regression

155 PASS / 0 FAIL (was 156 — `tb_osd_overlay_stub` deleted, no
coverage loss because `tb_osd_platform_cell_adapter` already
covers the live integration glue).

### Ch250 — sio2_input_stub reaches a fabric consumer on silicon (landed)

Ch234 built `sio2_input_stub` (the IOP-readable PS2 pad-state
register at retroDE-local MMIO 0x1F80_8500); Ch235 wired bridge
`INPUT_P1/P2/P1_RAW` outputs out into the top; Ch241 then
documented that those wires terminated at unconnected nets that
Quartus elided, because the synth top never instantiated any
fabric consumer. Ch250 ends that elision with the smallest
possible fabric proof: instantiate the stub, tap its
Sony-translated 16-bit pad word, and light three LEDs from three
chosen bits.

Per Codex's Ch250 framing, this is **fabric-consumer proof, not
real input architecture** — there's still no IOP execution path,
no libpad/SIF on silicon, and no new HPS-visible ABI. The point
is: the bitmap that retrodesd writes through the bridge now
reaches a live consumer that Quartus retains.

#### Files touched

| File | Change |
|---|---|
| [`rtl/iop/sio2_input_stub.sv`](../../rtl/iop/sio2_input_stub.sv) | Added `p1_sony_word_o[15:0]` and `p2_sony_word_o[15:0]` output ports — parallel taps of the existing internal `p1_word`/`p2_word` wires (the same source that feeds the 0x500/0x504 IOP read responses). No functional change to the read map. |
| [`rtl/top/de25_nano_psmct32_raster_demo_top.sv`](../../rtl/top/de25_nano_psmct32_raster_demo_top.sv) | Instantiated `sio2_input_stub` with `clk=CLOCK2_50`, `input_p1=bridge_input_p1_raw`, `input_p2=bridge_input_p2`. IOP-side read/write ports tied to zero (no IOP on silicon yet). Replaced the `LED[7:5] = 3'b111` tie-off with three Sony-word taps. Added sim-path tie-offs for `bridge_input_p1/p2/p1_raw` (the `else` branch when qsys isn't instantiated). |
| [`.qsf`](../../synth/de25_nano/top_psmct32_raster_demo/de25_nano_psmct32_raster_demo_top.qsf) | New `SYSTEMVERILOG_FILE rtl/iop/sio2_input_stub.sv` entry — pre-Ch250 the file was sim-only via the Makefile. |

#### LED → pad-bit mapping

Sony wire-format `p1_sony_word` layout (active-LOW, pressed = 0):

| Sony bit | Field | retroDE input bit | DE25 LED |
|---|---|---|---|
| `[3]` | START | retroDE bit 4 (START) | LED[5] |
| `[14]` | CROSS (×) | retroDE bit 7 (B-spatial) | LED[6] |
| `[4]` | D-pad UP | retroDE bit 3 (UP) | LED[7] |

Polarity chain works out by pass-through: Sony bit = 0 when pressed,
DE25 LED pin = 0 when lit, so `LED[N] = p1_sony_word[bit]` lights
the LED on press without an explicit invert.

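The pass-through polarity argument can be sanity-checked with a small Python model. The names below are hypothetical; only the three tapped bit positions and the active-low conventions come from the table above, and the retroDE-to-Sony translation inside `sio2_input_stub` is not modelled.

```python
# Sony pad-word bit positions tapped by the LEDs (from the table above).
SONY_START = 3
SONY_UP = 4
SONY_CROSS = 14

# LED index -> Sony bit that drives it.
LED_TAPS = {5: SONY_START, 6: SONY_CROSS, 7: SONY_UP}


def sony_word(pressed_bits) -> int:
    """Build an active-LOW 16-bit pad word: every pressed bit reads 0."""
    word = 0xFFFF
    for bit in pressed_bits:
        word &= ~(1 << bit)
    return word & 0xFFFF


def led_pins(word: int) -> dict:
    """LED pin levels; a pin level of 0 means the LED is lit."""
    return {led: (word >> bit) & 1 for led, bit in LED_TAPS.items()}


if __name__ == "__main__":
    print(led_pins(sony_word([])))             # nothing held: all pins high, all LEDs dark
    print(led_pins(sony_word([SONY_START])))   # START held: LED[5] pin low, so LED[5] lit
```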
#### Acceptance (Codex Ch250 framing)

| Criterion | Expected on hardware |
|---|---|
| Keyboard/gamepad through retrodesd changes `INPUT_P1_RAW` | Already confirmed in Ch248 |
| `sio2_input_stub` consumes that in fabric | New instantiation; verified by LED ledger |
| Holding selected buttons lights LEDs | Hold **START** → LED[5] lights; hold **×** → LED[6]; hold **D-pad UP** → LED[7]. Release → unlit |
| Existing OSD/gamepad nav still works | No video-path changes, no OSD-register changes, gamepad → menu nav path unchanged |

#### What's NOT in Ch250

- **No IOP execution.** sio2_input_stub's read port is tied to
  zero — nothing on silicon is exercising the 0x1F80_8500 read
  response. The Sony-word tap is a parallel observation, not the
  IOP path.
- **No SIF / libpad / EE-side game consumption.** The Ch238–Ch240
  sim-only chain stays sim-only.
- **No HPS-visible diagnostic register.** Codex explicitly
  declined Option C ("creates another diagnostic register path
  when the real destination is eventually IOP/libpad, not HPS").
- **No P2 support.** `p2_sony_word_o` is wired but
  `bridge_input_p2` is whatever retrodesd writes (currently
  always 0). Tied to no LEDs.

#### Regression

155 PASS / 0 FAIL. The `sio2_input_stub` port addition is by-name
binding-friendly, so existing TBs link without changes. Sim-path
tie-offs for `bridge_input_p1/p2/p1_raw` were needed because
pre-Ch250 those wires went to unconnected nets in the sim top.
Without the tie-offs, X would have propagated through the newly
instantiated stub and `tb_de25_nano_psmct32_raster_demo_top`
would have seen X on LED[7:5].

### Ch251 — animated PS2 demo (color bars + border + heartbeat) (landed)

Ch250 closed the input-on-silicon arc; Ch251 polishes the
hardware-visible side. The old Ch171 320×240 four-quadrant card was
a static one-shot — bootlet wrote four SPRITEs, SYSCALL'd, and the
screen never changed. Codex's Ch251 framing replaces it with a
richer animated demo that exercises the same GIF/raster path:

- **8 vertical color bars** (white / yellow / cyan / green / magenta /
  red / blue / black), 40 px wide each, full screen height.
- **Grey 4-px border** around the 320×240 region.
- **Orange 8×8 corner-alignment markers** at all four corners.
- **One cyan/red heartbeat SPRITE** (16×16 at the center) whose RGBAQ
  qword is rewritten by the EE bootlet's main loop, alternating colors.

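To make the layout above concrete, here is a rough Python sketch of which element should own a given pixel. It is an illustration under assumptions, not the TB's `expected_*_at()` code: the function name is hypothetical, the paint order (bars, then border, then corners, then heartbeat, with later sprites on top) is assumed from the order of the list, and the heartbeat is assumed to occupy x in [152, 168), y in [112, 128).

```python
WIDTH, HEIGHT = 320, 240
BARS = ["white", "yellow", "cyan", "green", "magenta", "red", "blue", "black"]
BAR_W = WIDTH // len(BARS)  # 40 px per bar


def expected_element_at(x: int, y: int) -> str:
    """Name the topmost element at (x, y), assuming later sprites overwrite earlier ones."""
    # Heartbeat: 16x16 at the centre (exact placement is an assumption).
    if 152 <= x < 168 and 112 <= y < 128:
        return "heartbeat"
    # Corner markers: 8x8 at each corner.
    if (x < 8 or x >= WIDTH - 8) and (y < 8 or y >= HEIGHT - 8):
        return "corner"
    # Border: 4 px on every edge.
    if x < 4 or x >= WIDTH - 4 or y < 4 or y >= HEIGHT - 4:
        return "border"
    # Otherwise one of the eight vertical bars.
    return BARS[x // BAR_W]


if __name__ == "__main__":
    for probe in [(20, 120), (100, 50), (300, 120), (160, 2), (0, 0), (160, 120)]:
        print(probe, expected_element_at(*probe))
```

Probing only points that are not inside the heartbeat square keeps any check independent of which colour the loop has most recently written.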
#### Bake.py changes

All in [`sim/data/top_psmct32_raster_demo/bake.py`](../../sim/data/top_psmct32_raster_demo/bake.py):

- New MIPS opcode encoders: `enc_addiu`, `enc_bne`, `enc_j`,
  `enc_xor`, `enc_andi`, `enc_lw`, `enc_nop`.
- New `build_ch251_sprites()` produces the 17-SPRITE list (8 bars + 4
  border + 4 corners + 1 heartbeat).
- New `build_ch251_animated_bootlet()` replaces the one-shot
  `bootlet_for_display1_hi()` for the production fixtures:
  1. Initial setup (DISPFB1 / DISPLAY1 / PMODE / DMAC MADR+QWC) —
     same shape as Ch171.
  2. First DMAC kick.
  3. Loop forever:
     - Delay (~17M-iter busy counter ≈ 1 s at 50 MHz → 1 Hz blink).
     - Poll DMAC `CHCR.start` until clear (safe re-arm).
     - XOR heartbeat RGBAQ qword at `kseg0 + 0x730` between cyan and red.
     - Re-arm MADR + QWC + CHCR.start.
     - Jump back to loop head.

The Ch146 legacy 16×8 sim-only path is **unchanged** — it still uses
the one-shot SYSCALL bootlet because the smaller TBs (`tb_gs_*_e2e`
family) rely on that contract.

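For readers who want to see what such encoders look like, here is a minimal sketch using the standard MIPS I instruction formats. The function names match the list above, but the argument order and internals are assumptions; the real `bake.py` signatures are not shown in this document.

```python
def _i_type(opcode: int, rs: int, rt: int, imm: int) -> int:
    """I-type word: opcode[31:26] rs[25:21] rt[20:16] imm[15:0]."""
    return ((opcode & 0x3F) << 26) | ((rs & 0x1F) << 21) | ((rt & 0x1F) << 16) | (imm & 0xFFFF)


def enc_addiu(rt: int, rs: int, imm: int) -> int:
    return _i_type(0x09, rs, rt, imm)


def enc_andi(rt: int, rs: int, imm: int) -> int:
    return _i_type(0x0C, rs, rt, imm)


def enc_lw(rt: int, base: int, offset: int) -> int:
    return _i_type(0x23, base, rt, offset)


def enc_bne(rs: int, rt: int, offset_words: int) -> int:
    """Branch offset is in instruction words, relative to the delay slot."""
    return _i_type(0x05, rs, rt, offset_words)


def enc_j(target_addr: int) -> int:
    """J-type word: opcode 2 plus bits [27:2] of the absolute target."""
    return (0x02 << 26) | ((target_addr >> 2) & 0x03FFFFFF)


def enc_xor(rd: int, rs: int, rt: int) -> int:
    """R-type word with SPECIAL opcode 0 and funct 0x26."""
    return ((rs & 0x1F) << 21) | ((rt & 0x1F) << 16) | ((rd & 0x1F) << 11) | 0x26


def enc_nop() -> int:
    """sll $zero, $zero, 0 encodes as all zeros."""
    return 0


if __name__ == "__main__":
    T0, T1, T2, ZERO = 8, 9, 10, 0
    print(f"{enc_addiu(T0, ZERO, 1):08X}")   # addiu $t0, $zero, 1
    print(f"{enc_xor(T0, T1, T2):08X}")      # xor   $t0, $t1, $t2
    print(f"{enc_bne(T0, ZERO, -1):08X}")    # bne   $t0, $zero, -1
```

Branches and jumps on MIPS have a delay slot, which is one reason a `enc_nop` helper tends to appear next to `enc_bne` and `enc_j`.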
#### Fixture sizes after Ch251

| File | Active words | Padded to | Notes |
|---|---|---|---|
| `bios.mem` | 43 | 1024 | Animated bootlet (loops forever) |
| `payload.mem` | 102 qwords | 256 qwords | 17 SPRITEs × 6 qwords; heartbeat RGBAQ at byte 0x730 |
| `bios_ch146.mem` | 19 | 1024 | One-shot, halts (legacy) |
| `payload_ch146.mem` | 24 qwords | 256 qwords | 4 SPRITEs (legacy) |

#### TB changes

- [`tb_de25_nano_psmct32_raster_demo_top`](../../sim/tb/top/tb_de25_nano_psmct32_raster_demo_top.sv) —
  removed the `wait (LED[0] == 0)` (would never fire under the loop).
  Now waits for `LED[1]` (dma_done_seen) and `LED[2]` (frame_seen).
  New explicit assertion: `LED[0] == 1` (unlit) at end of test, with a
  pointed error message if it's lit ("core_halt high: animated bootlet
  must loop, not SYSCALL").
- [`tb_top_psmct32_raster_demo_bram_ch171`](../../sim/tb/top/tb_top_psmct32_raster_demo_bram_ch171.sv) —
  TB name kept (Makefile entry stable), assertions retargeted at the
  Ch251 SPRITE layout. New `expected_*_at()` functions encode the
  bars / border / corners. 10 probe coordinates sample non-heartbeat
  regions so the assertions are independent of bootlet loop phase.
- All other TBs (`tb_top_psmct32_raster_demo`, `tb_top_psmct32_raster_demo_bram`,
  the `tb_gs_*_e2e` family) consume the **Ch146** legacy fixtures
  unchanged.

#### Acceptance (Codex Ch251 framing)

| Criterion | Result |
|---|---|
| Monitor shows richer pattern + obvious heartbeat | Hardware verification deferred to user's compile |
| Shared OSD still overlays correctly | No video-path changes; platform OSD path Ch245-Ch248 intact |
| Gamepad OSD nav still works | No bridge / input changes; Ch248 path intact |
| `RASTER_OVERFLOW_COUNT = 0` | Existing TB assertion still holds |
| Sim covers a few key pixels/states without exploding runtime | 10 probes inside the painted region; ~1.6 s sim time per run |

#### LED semantic flip

The pre-Ch251 success indicator was "LED[0..3] eventually lit." For
Ch251 the success indicator is:

- LED[0] **unlit** (EE running paint loop — `core_halt = 0`)
- LED[1..3] lit (DMAC done, PCRTC framing, HDMI configured)
- Visible 1 Hz heartbeat blink at screen center
- FRAME_COUNT advancing in `ps2_status.sh`
- RASTER_OVERFLOW_COUNT = 0

If LED[0] is LIT during the demo, the bootlet hit an unexpected
SYSCALL (decode mismatch on one of the new opcodes, branch target
miscalculation, etc.) — investigate the EE trace.

#### Regression

155 PASS / 0 FAIL. Top TB took ~1.6 s sim time (vs ~0.4 s for the
old 24-qword payload) due to the 102-qword Ch251 DMAC drain through
GS raster.

#### Ch251 second addendum — EE → RAM write path was unconnected

On-monitor test of the addendum-1 retrodesd change exposed the
*actual* reason the heartbeat wasn't blinking: `DMA_DONE_COUNT`
was climbing (~2 Hz) but the heartbeat stayed CYAN regardless.
Sim reproduction with a runtime probe on `ee_ram_stub.mem[115][31:0]`
confirmed it stayed at the initial CYAN payload value across the
entire 30 s sim watchdog.

Root cause: `top_psmct32_raster_demo_bram.sv` instantiated
`ee_ram_stub` with its write port tied to zero
(`.wr_en(1'b0), .wr_addr('0), .wr_data(128'd0), .wr_be(16'd0)`)
because the pre-Ch251 one-shot bootlet **only read** from EE-RAM
(DMAC sourcing the static payload). The corresponding output ports
of `ee_memory_map_stub` (`ram_wr_en/addr/data/be/master_id`) were
left **unconnected** at the wrapper. The Ch251 looping bootlet's
SW writes to `0x8000_0730` (heartbeat RGBAQ) decoded as RAM hits
correctly inside `ee_memory_map_stub` but vanished at the wrapper
boundary.

Fix: wire `ee_memory_map_stub.ram_wr_*` → `ee_ram_stub.wr_*` in
the demo wrapper. New local wires `ram_wr_en / ram_wr_addr /
ram_wr_data / ram_wr_be / ram_wr_master_id` connect the two. The
`ram_master_id` mux now selects writer-id when writing, reader-id
when reading.

Sim verification (5-sec sim time with delay temporarily shortened
to 256 iters): the EE issues SWs to `0x8000_0730` alternating
between `0xFF0000FF` (RED) and `0xFFFFFF00` (CYAN), `ee_ram_stub`
sees those writes at `wr_addr=0x00000730 wr_be=0x000F`, and
`mem[115][31:0]` toggles in lock-step. Delay restored to the
production 16M-iter value (~1 Hz at 50 MHz hardware).

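The numbers in the sim-verification paragraph fit together as shown in this short Python check. The helper names are hypothetical; the colour constants, the byte offset, and the 128-bit (16-byte) memory width come from the text above, and the XOR mask is simply the difference between the two colours.

```python
CYAN = 0xFFFFFF00
RED = 0xFF0000FF
TOGGLE_MASK = CYAN ^ RED  # XOR-ing with this flips between the two colours

QWORD_BYTES = 16  # ee_ram_stub rows are 128 bits wide


def qword_index(byte_offset: int) -> int:
    """Row of the 128-bit memory that holds this byte offset."""
    return byte_offset // QWORD_BYTES


def byte_enable(byte_offset: int, size_bytes: int = 4) -> int:
    """16-bit byte-enable mask for a store of size_bytes at byte_offset."""
    lane = byte_offset % QWORD_BYTES
    return (((1 << size_bytes) - 1) << lane) & 0xFFFF


if __name__ == "__main__":
    print(f"mask   = 0x{TOGGLE_MASK:08X}")
    print(f"row    = {qword_index(0x730)}")
    print(f"wr_be  = 0x{byte_enable(0x730):04X}")
    colour = CYAN
    for _ in range(3):
        colour ^= TOGGLE_MASK
        print(f"colour = 0x{colour:08X}")
```

Because the offset is 16-byte aligned, the 32-bit store lands in the lowest four byte lanes of row 115, which is why the probe watches `mem[115][31:0]`.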
The TB now spot-checks the initial CYAN state of the heartbeat
qword after first DMAC drain; the dynamic blink remains a
hardware-monitor proof (the production delay needs >30 s sim
time to catch a toggle, not worth the regression cost).

Files touched:
- [`rtl/top/top_psmct32_raster_demo_bram.sv`](../../rtl/top/top_psmct32_raster_demo_bram.sv) — new
  `ram_wr_*` wires + `ee_memory_map_stub` → `ee_ram_stub` write-port
  hookup. Master-id passthrough updated.
- [`sim/tb/top/tb_top_psmct32_raster_demo_bram_ch171.sv`](../../sim/tb/top/tb_top_psmct32_raster_demo_bram_ch171.sv) —
  small heartbeat first-kick assertion added.

**Fitter follow-on**: enabling the EE-RAM write port blew the
Agilex 5 M20K budget (Quartus reported 516 needed vs 358
available). Pre-Ch251 the wrapper tied `wr_en=0` so Quartus
inferred `ee_ram_stub.mem` as ROM — packed into a few M20K blocks
implicitly. With the write path live, even with an explicit
`ramstyle = "M20K"` hint the synthesizer's inference on the
128-bit-wide / 16-byte-enable / dual-port-master backing came out
to 520 M20Ks — still 162 over budget. The ROM → byte-enable-RW
transition was substantially more expensive than a straight
"+14 M20Ks for write logic" estimate.

Final fix: **don't enable the full ee_ram_stub write port at
all, and don't consume ee_memory_map_stub's ram_wr_* output
either.** Both paths inflated the M20K count. Instead, the demo
wrapper snoops the EE's SW directly from `ee_cpu_wr_*` (the
EE core's existing output going into the memory map). It captures
any full-word SW whose physical address, after stripping the kseg
bits, is `0x0000_0730` into a 32-bit `hb_rgbaq_reg`, and splices
that register's value into the low 32 bits of the DMAC read
response when the DMAC fetches qword 115. `ee_ram_stub.wr_en`
stays tied to zero (Quartus continues to infer the memory as
ROM) and `ee_memory_map_stub.ram_wr_*` outputs are left
unconnected (so Quartus optimizes away the logic behind them). The
snoop is one 32-bit register + a 1-cycle delay flop + a 128-bit
2:1 mux. M20K cost: ~0 new blocks beyond pre-Ch251.

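The snoop-and-splice idea can be modelled behaviourally in a few lines of Python. This is a sketch, not the wrapper RTL: the class and method names are hypothetical, the `0x1FFF_FFFF` mask is the conventional MIPS kseg0/kseg1-to-physical mapping and is assumed here, and the 1-cycle delay flop is not modelled.

```python
HB_PHYS_ADDR = 0x0000_0730          # physical address of the heartbeat RGBAQ word
HB_QWORD = HB_PHYS_ADDR // 16       # row 115 of the 128-bit payload memory
KSEG_MASK = 0x1FFF_FFFF             # assumption: strip the kseg segment bits
LOW32 = 0xFFFF_FFFF


class HeartbeatSnoop:
    """Patch register that shadows one 32-bit word of an otherwise read-only memory."""

    def __init__(self, initial_rgbaq: int):
        self.hb_rgbaq_reg = initial_rgbaq & LOW32

    def ee_write(self, vaddr: int, data: int, byte_en: int = 0xF) -> None:
        """Capture full-word stores that hit the heartbeat address; ignore the rest."""
        if (vaddr & KSEG_MASK) == HB_PHYS_ADDR and byte_en == 0xF:
            self.hb_rgbaq_reg = data & LOW32

    def dmac_read(self, qword_idx: int, rom_qword: int) -> int:
        """Return ROM data, with the low 32 bits replaced when qword 115 is fetched."""
        if qword_idx == HB_QWORD:
            return ((rom_qword >> 32) << 32) | self.hb_rgbaq_reg
        return rom_qword


if __name__ == "__main__":
    CYAN, RED = 0xFFFFFF00, 0xFF0000FF
    snoop = HeartbeatSnoop(CYAN)
    rom_row = (0x0123456789ABCDEF << 64) | (0xCAFEF00D << 32) | CYAN
    snoop.ee_write(0x8000_0730, RED)
    print(f"{snoop.dmac_read(115, rom_row):032X}")
```

The design choice this illustrates is that the memory itself never sees a write, so it can stay ROM-shaped for inference, while the one mutable word lives in ordinary flops.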
Files touched:
|
||
- [`rtl/top/top_psmct32_raster_demo_bram.sv`](../../rtl/top/top_psmct32_raster_demo_bram.sv) — added
|
||
`hb_rgbaq_reg`, `hb_write_hit` detector (sourced from
|
||
`ee_cpu_wr_*`), and the `ram_rd_data_patched` mux.
|
||
`ee_ram_stub.wr_*` stays tied-zero;
|
||
`ee_memory_map_stub.ram_wr_*` outputs revert to unconnected.
|
||
|
||
Sibling memories (`useg_shadow_mem` in `ee_memory_map_stub`,
|
||
`bios_rom_stub.mem`) were not touched. The Ch232 BRAM-inference
|
||
memory in `~/.claude/projects/.../memory/` flagged a similar
|
||
issue earlier for the font ROM — for narrow / single-port memories
|
||
the `ramstyle = "M20K"` hint rescues; for wide-byte-enable RW
|
||
backings even the hint isn't enough and the cleanest fix is to
|
||
keep the memory ROM-shaped and snoop writes outside it.
|
||
|
||
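The Ch251.3 snoop-and-splice can be illustrated with a small behavioural model. This is a hedged Python sketch, not the RTL: the class name, the `KSEG_MASK` value (the usual MIPS kseg0/kseg1 strip) and the byte-enable argument are assumptions for illustration, and the 1-cycle alignment flop is not modelled.

```python
# Behavioural sketch of the heartbeat snoop register and the DMAC read splice.
HB_ADDR = 0x0000_0730        # physical address of the heartbeat RGBAQ word (from the text)
HB_QWORD = HB_ADDR // 16     # 16-byte qwords, so 0x730 / 16 = 115
KSEG_MASK = 0x1FFF_FFFF      # assumption: kseg bits are stripped by masking the top 3 bits
WORD_MASK = 0xFFFF_FFFF


class HeartbeatSnoop:
    """Holds hb_rgbaq_reg outside the ROM-shaped EE RAM (hypothetical model)."""

    def __init__(self, reset_value=0):
        self.hb_rgbaq_reg = reset_value & WORD_MASK

    def on_cpu_write(self, vaddr, data, byte_en=0xF):
        """Capture a full-word SW to the heartbeat address; ignore all other writes."""
        hit = (vaddr & KSEG_MASK) == HB_ADDR and byte_en == 0xF
        if hit:
            self.hb_rgbaq_reg = data & WORD_MASK
        return hit

    def patch_read(self, qword_index, rom_qword):
        """Replace the low 32 bits of qword 115; pass every other qword through."""
        if qword_index != HB_QWORD:
            return rom_qword
        return (rom_qword & ~WORD_MASK) | self.hb_rgbaq_reg


snoop = HeartbeatSnoop()
snoop.on_cpu_write(0x8000_0730, 0xFF0000FF)          # SW through a kseg0 alias
print(hex(snoop.patch_read(HB_QWORD, 0x1111_2222_3333_4444)))
```

Because the backing memory is never written, the model makes the key property visible: reads of any qword other than 115 are unchanged ROM data.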
#### Ch251.4 — Fitter resource report points at VRAM, not EE-RAM

The Ch251.3 patch-register fix above was the right architectural
move (it eliminated the RW BRAM cost for EE-RAM), but recompiling
**still** reported 516 M20Ks against a budget of 358. The Ch251.3
rework wasn't the source of the overrun; it was a parallel
correctness fix that happened to land in the same chapter.

The real culprit surfaced in
`output_files/de25_nano_psmct32_raster_demo_top.fit.rpt`
(Compilation Report → Fitter → Place Stage → **Fitter RAM Summary**):

```
u_demo|u_vram|mem_rtl_0 Logical Size: 4194304 bits M20K blocks: 204.800
u_demo|u_vram|mem_rtl_1 Logical Size: 4194304 bits M20K blocks: 204.800
```

VRAM alone was eating ~410 of the 516 reported M20Ks. The
`vram_bram_stub` module has **1 write + 2 independent read ports**:

- **read**: PCRTC scanout (every pixel)
- **read2**: PSMT4 RMW old-byte read (rasterizer write path)

An M20K block has only two physical ports, and in simple dual-port
mode one of them is the write port. To honour 1W + 2R, Quartus
**replicates** the entire 512 KiB storage into two simple-dual-port
(1W + 1R) banks with the write fanned out to both copies. True
dual-port would not help: TDP gives two physical ports, not three.

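The replication cost follows from capacity arithmetic alone. The sketch below is a rough, capacity-only estimate under the assumption that one M20K holds 20 Kib (20,480 bits) and that each extra read port costs one full replica; it ignores width/depth packing, so treat the numbers as a sanity check against the Fitter RAM Summary rather than a prediction. The function name is hypothetical.

```python
# Capacity-only M20K estimate for a 1W + nR memory built from 1W+1R replicas.
M20K_BITS = 20 * 1024   # assumed usable bits per M20K block
M20K_BUDGET = 358       # budget figure quoted in this runbook


def m20k_blocks(storage_bytes, read_ports):
    """Blocks needed when every read port gets its own full copy of the storage."""
    per_replica = storage_bytes * 8 / M20K_BITS
    return per_replica * read_ports


vram_bytes = 512 * 1024
two_reads = m20k_blocks(vram_bytes, read_ports=2)
one_read = m20k_blocks(vram_bytes, read_ports=1)
print(f"1W+2R: {two_reads:.1f} blocks, 1W+1R: {one_read:.1f} blocks, budget: {M20K_BUDGET}")
```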

The Ch251 hardware build draws PSMCT32 sprites only (color bars +
border + corners + heartbeat). The PSMT4 RMW pipe is wired but
never fires (`is_t4_emit` stays low for the entire frame), so the
second read port is dead weight on hardware.

Fix: parameterize `vram_bram_stub` with `ENABLE_READ2`. The default
of 1 keeps every simulation TB byte-identical (PSMT4 paths are still
exercised). The DE25 board top overrides it to 0, gating away the
`mem[read2_word_idx]` reference in a generate-if branch so
Quartus collapses the storage from two replicas (~410 M20Ks) to
one (~205 M20Ks). The PSMT4 RMW path still compiles inside the
wrapper (Ch157 logic intact) but is fed a tied-zero `read2_data`,
which is harmless because PSMCT32 emits never consult it.

This is a **scoped hardware-demo build profile**, not a general
fix. A formal decision record at
[`docs/decisions/0006-vram-roadmap.md`](../decisions/0006-vram-roadmap.md)
captures both the Ch251.4 rescue and the longer-term architectural
follow-up (arbitrated TDP VRAM scheduler or line-buffered scanout)
that must land before broader GS format coverage returns to the
hardware build.

Files touched:
- [`rtl/gif_gs/vram_bram_stub.sv`](../../rtl/gif_gs/vram_bram_stub.sv) —
  new `ENABLE_READ2` parameter, read2 `always_ff` gated by
  `generate-if` so the synthesizer never sees the second
  `mem[]` read when disabled.
- [`rtl/top/top_psmct32_raster_demo_bram.sv`](../../rtl/top/top_psmct32_raster_demo_bram.sv) —
  new `VRAM_ENABLE_READ2` parameter (default 1) passed through to
  `u_vram`.
- [`rtl/top/de25_nano_psmct32_raster_demo_top.sv`](../../rtl/top/de25_nano_psmct32_raster_demo_top.sv) —
  overrides `VRAM_ENABLE_READ2(1'b0)` with the explicit rationale
  inline.
- [`docs/decisions/0006-vram-roadmap.md`](../decisions/0006-vram-roadmap.md) —
  new decision record.

Lesson learned (logged to MEMORY.md): on a Quartus fit failure,
**read the Fitter RAM Summary in `fit.rpt` BEFORE proposing
fixes.** Four rounds of patching `ee_ram_stub` chased the wrong
module because the wrapper's most recently changed signal was the
RAM write path; the resource report cleanly fingered VRAM in one
pass. The "Resource Utilization by Entity" report under
`Compilation Report → Fitter → Place Stage` is the right entry
point.

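Reading the RAM summary first can be made mechanical. The following is a minimal sketch that ranks memories by M20K cost; the regular expression matches only the abbreviated line shape quoted in this section, which is an assumption, since the table in a real `fit.rpt` carries more columns and would need the pattern adapted. `rank_rams` is a hypothetical helper name.

```python
import re

# Pattern for the abbreviated excerpt shown in this runbook, not the full fit.rpt table.
RAM_LINE = re.compile(
    r"^(?P<name>\S+)\s+Logical Size:\s*(?P<bits>\d+)\s*bits\s+"
    r"M20K blocks:\s*(?P<m20k>[\d.]+)"
)


def rank_rams(report_text):
    """Return (name, logical_bits, m20k_blocks) tuples, most expensive first."""
    rows = []
    for line in report_text.splitlines():
        match = RAM_LINE.match(line.strip())
        if match:
            rows.append((match["name"], int(match["bits"]), float(match["m20k"])))
    return sorted(rows, key=lambda row: row[2], reverse=True)


excerpt = """
u_demo|u_vram|mem_rtl_0 Logical Size: 4194304 bits M20K blocks: 204.800
u_demo|u_vram|mem_rtl_1 Logical Size: 4194304 bits M20K blocks: 204.800
"""
ranked = rank_rams(excerpt)
for name, bits, m20k in ranked:
    print(f"{m20k:8.1f}  {bits:>9}  {name}")
print(f"total: {sum(row[2] for row in ranked):.1f} M20K blocks")
```

Two same-sized `mem_rtl_N` entries under one instance, as in the excerpt, are the signature of read-port replication.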
#### Ch252 — VRAM architecture checkpoint (docs + tripwire)

After Ch251 closed the visual milestone, the next chapter is
deliberately **not** a feature. It is an architecture checkpoint to
keep the M20K trade-off from drifting silently as the GS path grows.
The Ch251.4 read2-strip works because the demo doesn't need PSMT4
RMW; any later chapter that quietly relaxes that assumption would
re-introduce the replication pressure that blew the fitter.

**Hardware build profile snapshot** (de25_nano_psmct32_raster_demo):

| Item                            | Hardware build         | Sim defaults                     |
|---------------------------------|------------------------|----------------------------------|
| `VRAM_BYTES`                    | 512 KiB                | 8 KiB (wrapper default)          |
| `VRAM_ENABLE_READ2`             | `1'b0`                 | `1'b1` (default, PSMT4 live)     |
| `vram_bram_stub` M20K cost      | ~205 (one 1W+1R bank)  | trivial (small + replicated)     |
| Active GS formats               | PSMCT32                | PSMCT32 / CT16 / T8 / T4         |
| PSMT4 RMW path (`is_t4_emit`)   | wired, never fires     | exercised by gs_pcrtc / xfer TBs |

The sim profile keeps every PSMT4-exercising TB byte-identical with
the pre-Ch251.4 behaviour because the wrapper default for `BYTES`
(8 KiB) is small enough that two replicas cost a handful of M20Ks
in total, which is irrelevant in simulation.

**Elaboration tripwire.** A `$fatal` guard inside
[`vram_bram_stub.sv`](../../rtl/gif_gs/vram_bram_stub.sv)'s
`translate_off`'d `initial` block fires when both conditions hold:

```
ENABLE_READ2 == 1'b1 && BYTES >= 262_144 (256 KiB)
```

256 KiB is not a magic number: it is the size at which each
1W+1R replica reaches ~100 M20Ks, so replication suddenly becomes a
board-level architectural decision instead of a casual parameter
flip. The tripwire is a loud canary in iverilog / Verilator / lint;
the **real protection remains the board-top parameter profile**.
A future hardware build that wants both read2 and a large VRAM has
to either disable the guard intentionally or land one of the
architectural follow-ups first.

**Long-term triggers** for revisiting the architecture are tracked
in [`docs/decisions/0006-vram-roadmap.md`](../decisions/0006-vram-roadmap.md):

1. PSMT4 RMW returning to the rasterizer write path on hardware.
2. More than one VRAM read client during scanout (a second
   simultaneous read consumer recreates the 1W+nR replication
   shape).
3. `VRAM_BYTES` growing beyond the current 512 KiB profile.

Until one of those fires, the Ch251.4 + Ch251.5 combination is the
hardware build. Ch252 adds no RTL behaviour change on the active
path, only the sim-side tripwire, the decision-doc trigger list,
and this profile snapshot.

Files touched in Ch252:

- [`rtl/gif_gs/vram_bram_stub.sv`](../../rtl/gif_gs/vram_bram_stub.sv) —
  new `$display`-then-`$fatal` guard inside the existing
  `translate_off`'d `initial` block.
- [`docs/decisions/0006-vram-roadmap.md`](../decisions/0006-vram-roadmap.md) —
  new "Triggers — when to revisit" subsection with the three
  explicit triggers and the tripwire definition.
- This bringup section.

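The Ch252 tripwire condition and the reasoning behind the 256 KiB threshold can be checked with a few lines of Python. This is an illustrative model of the predicate quoted in this section, not the SystemVerilog guard itself; the 20,480-bit M20K capacity and the profile names are assumptions.

```python
# Model of the elaboration tripwire: ENABLE_READ2 == 1 && BYTES >= 256 KiB.
M20K_BITS = 20 * 1024        # assumed usable bits per M20K block
TRIPWIRE_BYTES = 262_144     # 256 KiB threshold from the guard


def tripwire_fires(enable_read2, vram_bytes):
    """True when the guard would abort elaboration."""
    return bool(enable_read2) and vram_bytes >= TRIPWIRE_BYTES


def replica_m20k(vram_bytes):
    """Capacity-only cost of one 1W+1R replica."""
    return vram_bytes * 8 / M20K_BITS


profiles = {
    "hardware (READ2=0, 512 KiB)": (0, 512 * 1024),
    "sim default (READ2=1, 8 KiB)": (1, 8 * 1024),
    "hypothetical (READ2=1, 512 KiB)": (1, 512 * 1024),
}
for name, (read2, size) in profiles.items():
    print(f"{name}: fires={tripwire_fires(read2, size)}, "
          f"per-replica={replica_m20k(size):.1f} M20K")
```

At exactly 256 KiB the per-replica estimate is 102.4 blocks, which is where the "~100 M20Ks per replica" wording comes from.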
#### Ch251 addendum — PS2 backend OSD no longer force-opens

The initial Ch251 hardware test surfaced a UX problem:
`ps2_backend.c::ps2_start()` was setting
`OSD_CTRL_ENABLE | OSD_CTRL_INPUT_LOCK | OSD_CTRL_FORCE_OPEN`
unconditionally (a holdover from when the PS2 core had no
"running content"), so the OSD covered the entire vertical
center of the screen, including the new heartbeat region.

Fix in `retroDE_splash/software/ps2_backend.c`: drop
`OSD_CTRL_INPUT_LOCK | OSD_CTRL_FORCE_OPEN`, leave only
`OSD_CTRL_ENABLE`. Rationale: the Ch251 animated demo IS the
PS2 core's running content (color bars + heartbeat), so it
should be treated like a running game, with the OSD opt-in via the
Select+Start combo (Ch248 menu FSM) rather than force-opened over
content. If a future variant of `ps2_backend` truly has no
running content, that variant can re-add the FORCE_OPEN +
INPUT_LOCK flags like `splash_backend` does.

The header docstring and the inline ctrl-write comment were updated
to match. Operator behavior:

- Boot into PS2 core → full demo visible (bars + border +
  corners + blinking center heartbeat; nominally 1 Hz at the time,
  ~0.5 Hz as characterized in Ch254).
- Hold **Select+Start** ~0.5 s → OSD menu opens over the demo
  (gamepad or keyboard nav, Ch248/Ch245 paths intact).
- Pick a core or hold Select+Start again → menu closes, demo
  visible again.

This is a `retroDE_splash` HPS-side change only: no RTL
delta, no regression impact. The operator must cross-compile
`retrodesd` and copy the new binary to the board.

#### What's NOT in Ch251

- **No input-driven element.** Codex's framing offered an optional
  Ch250-input toggle but explicitly noted "don't make input
  mandatory." The heartbeat carries the "alive" indicator on
  its own; coupling it to the gamepad would mean the demo only
  animates when a button is held, which is a worse default.
- **No analog/sweep effects.** The heartbeat just toggles between
  two static colors. A traveling block or color cycle could be a
  follow-up but adds bootlet complexity for marginal demo value.
- **No DISPLAY1 size change.** The demo still paints 320×240 inside
  the 640×480 active picture, and the OSD region is still configured
  for 640×480 (Ch247). The Ch243 region-truncation arc is unaffected.

#### Ch253 — Known-good Ch251+ field-test checklist

After Ch251 / Ch251.4 / Ch251.5 / Ch252 closed the visible-on-silicon
milestone with the M20K profile locked, this is the single checklist
to run any time a fresh build is flashed to the DE25-Nano. Two runtime
parts: the visual checks an operator does at the monitor and the
script-verifiable readback via `ps2_status.sh --delta`. A third,
build-time check is listed after the Ch255 notes.

**Visual (at the monitor / gamepad):**

| Check                                  | Expected behaviour                                           |
|----------------------------------------|--------------------------------------------------------------|
| 8 vertical color bars                  | white / yellow / cyan / green / magenta / red / blue / black |
| 4-pixel grey border                    | visible around the full 320×240 active region                |
| 4 orange corner squares                | one in each corner over the border                           |
| Center heartbeat                       | 16×16 square at (152,112) toggling **cyan ↔ red at ~0.5 Hz** (~2 s per color, Ch251.5; not 1 Hz, see Ch254) |
| OSD opens with Select+Start            | hold ~0.5 s on a wired DS2 pad → platform OSD menu appears   |
| OSD navigates with D-pad + A/B         | up/down moves cursor, A picks core, B closes                 |
| OSD closes with Select+Start           | second hold returns to the demo                              |
| HDMI output stable                     | no rolling, no NACK retries, no resync flashes               |
| **Ch255 — heartbeat override (A/B)**   | hold ○ (A) → next heartbeat redraws **red**; hold × (B) → redraws **cyan**; hold both → invert current color. Release → resumes EE-animated cyan↔red within 2 s |

**Script (`./ps2_status.sh --delta` on HPS Linux):**

```
Ch251+ animated-demo health verdict:
[ ✓ ] PCRTC alive (FRAME_COUNT Δ ≈ 120 over 2 s)
[ ✓ ] Raster healthy (RASTER_OVERFLOW_COUNT stable, bit[6] clear)
[ ✓ ] DMAC repaint liveness (DMA_DONE Δ ∈ {1, 2} over 2 s; bootlet animating)
[ ✓ ] EE core not halted (CORE_STATUS[1] = 0)
[ ✓ ] HDMI I²C clean (CORE_STATUS[5] = 0)
[ ✓ ] DS2 controller plugged (DS2_STATUS[0] = 1)

──> Ch251+ field health: PASS
```

Any `✗` in the verdict block means the unit failed bring-up. Triage
by section: the `Counters Δ` block above the verdict shows raw
numbers; the `CORE_STATUS` bit decode above that names the latched
fault bit.

A `[ ? ] DMAC repaint window miss` line is **not** a failure. The
2 s sample window can land entirely inside one color phase when the
~2 s toggle period and the script timing happen to align. Rerun
`./ps2_status.sh --delta` once; the verdict comes back PASS on the
next pass. Two or three consecutive `Δ=0` runs are the real "bootlet
loop dead" signal.

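The verdict rules described in this checklist can be summarized in a short model. This is a hedged sketch of the decision logic as this section describes it, not the contents of `ps2_status.sh`: the function names are hypothetical, the bit positions are taken from the verdict text (`CORE_STATUS[1]`, `CORE_STATUS[5]`, `DS2_STATUS[0]`), and the "two consecutive zero runs" default is an assumption within the "two or three" range given above.

```python
# Sketch of the Ch253 verdict rules; register values are plain integers here.
def classify_dma_delta(delta):
    """Map a 2 s DMA_DONE delta onto pass / window-miss / fail."""
    if delta in (1, 2):
        return "pass"           # bootlet animating
    if delta == 0:
        return "window-miss"    # rerun once before treating it as a fault
    return "fail"               # outside the validated band


def bootlet_dead(deltas, consecutive=2):
    """True once the most recent `consecutive` runs all reported delta == 0."""
    recent = deltas[-consecutive:]
    return len(recent) == consecutive and all(d == 0 for d in recent)


def decode_status(core_status, ds2_status):
    """Decode only the bits named in the verdict block."""
    return {
        "ee_halted": bool(core_status & (1 << 1)),
        "hdmi_i2c_fault": bool(core_status & (1 << 5)),
        "ds2_plugged": bool(ds2_status & 1),
    }


print(classify_dma_delta(1), classify_dma_delta(0), classify_dma_delta(7))
print(bootlet_dead([1, 0]), bootlet_dead([0, 0]))
print(decode_status(core_status=0x00, ds2_status=0x01))
```

The separate `window-miss` state matters because a single zero delta is expected occasionally at a ~2 s toggle period; only repetition turns it into a fault.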
#### Ch254 — Heartbeat cadence characterization

The heartbeat is a **liveness cue, not a precision timer.** Ch254
locks the empirical model of why it lands where it does and frames
the operator-facing expectations accordingly.

**Per-toggle cycle (from hardware measurement at Ch251.5 + Ch253):**

```
total / toggle = delay_loop_time + fixed_overhead
               ≈ (DELAY_HI * 0x10000) × 14 cyc / 50 MHz + ~1.2 s
```

Solving the model from two data points
(`DELAY_HI=0x100` → 6 s, `DELAY_HI=0x002B` → ~2 s) gives
**cyc/iter ≈ 14, overhead ≈ 1.2 s**. The ~1.2 s overhead is the
DMAC drain of 102 qwords + the GS rasterization of all 17 SPRITEs +
the CHCR poll + the re-arm sequence, all of which run *after* the
delay loop completes in each iteration.

**Why we don't chase a true 1 Hz:** the overhead is a hard floor.
Even with `DELAY_HI = 0` the bootlet caps at a ~0.8 Hz toggle rate
(one toggle per ~1.2 s). Going faster requires restructuring the
bootlet (let the delay run *during* the drain instead of after it,
or shrink the 17-SPRITE payload). That is deliberately out of scope
here: the demo's job is to confirm liveness on silicon, not deliver
a clock signal. Ch254 ships the model honestly and stops there.

**Operator-facing expectation:** the heartbeat toggles roughly
every 2 s (~0.5 Hz), with ±0.5 s jitter from overhead variation.
The validated `DMA_DONE` Δ band over a 2 s sample window is
**{0, 1, 2}**. `ps2_status.sh --delta` enforces exactly this band.
Δ outside that range is the operator's flag to investigate; Δ=0
on a single run is a phase-miss and gets a rerun.

Locked values:
- `DELAY_HI = 0x002B` in
  [`sim/data/top_psmct32_raster_demo/bake.py`](../../sim/data/top_psmct32_raster_demo/bake.py)
- Validated band `0–2` in
  [`docs/hardware/ps2_status.sh`](ps2_status.sh)
- `vram_bram_stub` profile unchanged from Ch252.

No RTL change. No bitstream change (the constant is locked at the
value already shipping). Ch254 is **characterization, not retune**.

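The two-point fit behind the Ch254 cadence model can be reproduced directly. The sketch below assumes only what this section states (50 MHz clock, `0x10000` inner iterations per `DELAY_HI` step, and the two measured points); the function names are hypothetical and the measured times are the rounded values quoted above, so the fitted numbers are approximate.

```python
# Two-point fit of: total = DELAY_HI * 0x10000 * cyc / f_clk + overhead
CLK_HZ = 50_000_000
INNER_ITERS = 0x10000


def fit(point_a, point_b):
    """Solve for (cycles per iteration, fixed overhead in seconds)."""
    (d_a, t_a), (d_b, t_b) = point_a, point_b
    cyc = (t_a - t_b) * CLK_HZ / ((d_a - d_b) * INNER_ITERS)
    overhead = t_a - d_a * INNER_ITERS * cyc / CLK_HZ
    return cyc, overhead


def toggle_period(delay_hi, cyc=14, overhead=1.2):
    """Seconds per heartbeat toggle under the rounded model."""
    return delay_hi * INNER_ITERS * cyc / CLK_HZ + overhead


cyc, overhead = fit((0x100, 6.0), (0x002B, 2.0))
print(f"fitted: {cyc:.2f} cyc/iter, {overhead:.2f} s overhead")
print(f"DELAY_HI=0x002B: {toggle_period(0x002B):.2f} s per toggle")
print(f"DELAY_HI=0 floor: {1 / toggle_period(0):.2f} toggles per second")
```

The floor case shows why a true 1 Hz toggle is out of reach without restructuring: with the delay loop removed entirely, the fixed overhead alone still takes about 1.2 s per toggle.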
#### Ch255 — Controller input drives the heartbeat color

After Ch254 closed cadence characterization, Ch255 is the first
chapter to put controller input on the demo's visible surface
without touching the OSD path or the DS2 controller itself. The
existing `INPUT_P1_RAW` bridge register that retrodesd's
`ds2_poll_thread` was already writing gets a new fabric consumer:
two of its bits feed a wrapper-side mux that overrides the
heartbeat color the splicer injects into the DMAC read response.

**Mapping** (matches the established retroDE INPUT_P1 bit layout
in `rtl/iop/sio2_input_stub.sv`):

- `INPUT_P1_RAW[9]` (Sony ○ / JOY_A) pressed → force RED
  (`0xFF0000FF`).
- `INPUT_P1_RAW[7]` (Sony × / JOY_B) pressed → force CYAN
  (`0xFFFFFF00`).
- Both pressed → invert the EE's current `hb_rgbaq_reg` value
  (XOR with `0x00FFFFFF`, swaps cyan↔red).
- Neither pressed → EE's animated cyan↔red toggle passes through
  unchanged.

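The mapping above amounts to a four-way priority mux. Here is a minimal Python model of it, offered as a sketch rather than the RTL: the function names are hypothetical, and it assumes a set bit in `INPUT_P1_RAW` means "pressed".

```python
# Model of the heartbeat colour override (values are RGBAQ words from the mapping).
RED = 0xFF0000FF
CYAN = 0xFFFFFF00
INVERT_MASK = 0x00FFFFFF   # flips the three colour bytes, leaves alpha alone


def buttons_from_raw(input_p1_raw):
    """Extract (joy_a, joy_b) from bits 9 and 7; assumes 1 = pressed."""
    return bool((input_p1_raw >> 9) & 1), bool((input_p1_raw >> 7) & 1)


def hb_rgbaq_effective(hb_rgbaq_reg, joy_a, joy_b):
    """Colour the splicer injects; hb_rgbaq_reg itself is never modified."""
    if joy_a and joy_b:
        return hb_rgbaq_reg ^ INVERT_MASK
    if joy_a:
        return RED
    if joy_b:
        return CYAN
    return hb_rgbaq_reg


for raw in (0, 1 << 9, 1 << 7, (1 << 9) | (1 << 7)):
    a, b = buttons_from_raw(raw)
    print(f"A={a!s:5} B={b!s:5} -> {hb_rgbaq_effective(CYAN, a, b):#010x}")
```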

The EE bootlet is **untouched**. It keeps owning `hb_rgbaq_reg`
via the Ch251.3 SW-to-`0x0000_0730` splicer path. The override is
a pure combinational mux in front of the splicer, so the EE's
animation is preserved between button presses and resumes
immediately on release.

**Response latency** is one DMAC drain cycle. The GS only repaints
the heartbeat sprite when the bootlet kicks DMAC channel 2, which
happens once per ~2 s loop iteration (Ch254). So a button press
takes effect on the NEXT heartbeat redraw: visible within 2 s,
typically under 1 s. Sub-second response would require either a
fast-path GS draw triggered by button edges or a direct VRAM
poke at the heartbeat pixel coordinates, both deliberately out
of scope for Ch255's "input affects demo" proof.

**Hardware surface:**

- Same flashed `.rbf` continues to work for retrodesd
  (`INPUT_P1_RAW` writes were already landing, just had no fabric
  consumer for the heartbeat-relevant bits).
- No new MMIO registers, no new HPS code, no ABI changes.
- No change to OSD ctrl/status registers, no change to
  `ds2_controller`, no change to `osd_menu_fsm`. Select+Start
  still opens the OSD, gamepad still navigates it.
- No VRAM profile regression. M20K cost unchanged.

**Sim coverage** lives in
[`tb_top_psmct32_raster_demo_bram_ch171`](../../sim/tb/top/tb_top_psmct32_raster_demo_bram_ch171.sv).
After the existing first-DMAC pixel-pattern + heartbeat-qword
assertions, the TB sweeps all four `{joy_a, joy_b}` combinations,
asserts `hb_rgbaq_effective` for each, and confirms `hb_rgbaq_reg`
itself isn't corrupted by the override. The four assertions catch
mux priority regressions, cover the XOR-both-pressed case, and
ensure the EE's background animation stays decoupled.

**Acceptance for hardware bring-up** (added to the Ch253 visual
checklist above):

- Press ○ → next heartbeat redraw shows red.
- Press × → next redraw shows cyan.
- Press both → next redraw shows the *inverse* of whatever the EE
  was about to paint (visual feedback that the XOR fired even
  when neither single override applies).
- Release → returns to EE-animated cyan↔red within 2 s.
- `ps2_status.sh --delta` still PASS (DMA_DONE keeps incrementing;
  the override only changes what gets painted, not the loop
  rate).

Files touched in Ch255:

- [`rtl/top/top_psmct32_raster_demo_bram.sv`](../../rtl/top/top_psmct32_raster_demo_bram.sv) —
  two new 1-bit input ports (`joy_a_pressed_i`, `joy_b_pressed_i`)
  + four-line combinational `hb_rgbaq_effective` mux + splicer now
  reads `hb_rgbaq_effective` instead of `hb_rgbaq_reg`.
- [`rtl/top/de25_nano_psmct32_raster_demo_top.sv`](../../rtl/top/de25_nano_psmct32_raster_demo_top.sv) —
  wires `bridge_input_p1_raw[9]` / `[7]` into the new wrapper
  ports.
- [`sim/tb/top/tb_top_psmct32_raster_demo_bram_ch171.sv`](../../sim/tb/top/tb_top_psmct32_raster_demo_bram_ch171.sv) —
  new override coverage sweep.
- [`sim/tb/top/tb_gs_raster_backpressure_stress.sv`](../../sim/tb/top/tb_gs_raster_backpressure_stress.sv) —
  joy inputs tied to `1'b0` (override dormant, regression
  byte-identical).

**Build-time (third part of the Ch253 checklist; one-time check per bitstream, not per boot):**

- `de25_nano_psmct32_raster_demo_top.sv` overrides
  `VRAM_ENABLE_READ2 = 1'b0` (Ch252 hardware profile). Verify by
  grepping the synth top before flashing if you suspect a misbuild:
  ```
  grep VRAM_ENABLE_READ2 rtl/top/de25_nano_psmct32_raster_demo_top.sv
  ```
  Expected: `.VRAM_ENABLE_READ2 (1'b0)`. Any other value with
  `VRAM_BYTES = 512 KiB` blows the Agilex 5 M20K budget; see
  [`docs/decisions/0006-vram-roadmap.md`](../decisions/0006-vram-roadmap.md).

If everything in all three Ch253 sections (visual, script,
build-time) is green, the unit is at the Ch251+ baseline and ready
for the next architectural chapter.

## Known caveats (deferred to Ch174+)

- **HDMI output timing**: `set_false_path -to HDMI_TX_*` placeholder.
  Real `set_output_delay` against ADV7513 setup/hold is on the
  Ch168+ list.
- **No audio**: I²C config touches audio registers (MCLK = 12.288 MHz,
  I²S 48 kHz / 16-bit / stereo) but no I²S data is generated. Audio
  streams will silently underrun.
- **No LPDDR4 / SDRAM / HPS**: unused by the demo. EE/IOP RAMs are
  on-FPGA BRAM.

## What's verified pre-silicon

The accelerated bring-up TB
([`sim/tb/top/tb_hdmi_i2c_wake_smoke.sv`](../../sim/tb/top/tb_hdmi_i2c_wake_smoke.sv))
locks down the I²C wake-up sequence at simulation time. As of
Ch167 it asserts:

- LUT walk reaches all 38 entries; `READY` (`hdmi_init_done`)
  rises after the walk completes.
- `HDMI_TX_INT` low pulse retriggers the LUT walk; `READY`
  re-asserts on the second walk.
- SDA is always `1'b0`, `1'b1`, or `1'bz`, never `1'bx`
  (which would indicate two drivers contending for strong-HIGH,
  an open-drain violation).
- The Ch166 NACK watchdog stays LOW on a healthy bus and
  rises (sticky) when the slave doesn't ACK.
- **Byte-sequence lock**: every one of the 38 transactions on
  the simulated wire matches the FSM-intent payload
  byte-for-byte, and every transaction's slave address is
  `8'h72` (ADV7513 write address). If a future RTL change
  alters the LUT order, the LUT contents, or the
  slave address, the TB fails.

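The byte-sequence lock is essentially an ordered comparison of observed bus transactions against the intended LUT. The Python sketch below shows the shape of that check; it is not the testbench. The `(slave_addr, reg, value)` tuple layout is an assumption about how a transaction might be recorded, and the demo LUT is synthetic placeholder data, not the real ADV7513 register list.

```python
# Sketch of a byte-sequence lock: order, contents and slave address must all match.
ADV7513_WRITE_ADDR = 0x72
LUT_ENTRIES = 38


def check_byte_sequence(observed, expected):
    """Return a list of human-readable mismatches (empty list means locked)."""
    errors = []
    if len(observed) != len(expected):
        errors.append(f"count: observed {len(observed)}, expected {len(expected)}")
    for index, (got, want) in enumerate(zip(observed, expected)):
        if got[0] != ADV7513_WRITE_ADDR:
            errors.append(f"txn {index}: slave address {got[0]:#04x}")
        if got[1:] != want[1:]:
            errors.append(f"txn {index}: payload {got[1:]} != {want[1:]}")
    return errors


# Synthetic stand-in for the 38-entry LUT (placeholder register/value pairs).
expected = [(ADV7513_WRITE_ADDR, i, i ^ 0x5A) for i in range(LUT_ENTRIES)]
print(check_byte_sequence(list(expected), expected))           # clean walk
swapped = list(expected)
swapped[3], swapped[4] = swapped[4], swapped[3]                 # reorder two entries
print(check_byte_sequence(swapped, expected))
```

Comparing position by position, rather than as sets, is what makes a reordered LUT fail even when every individual write is still present.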
## Reference

- RTL banner: `rtl/top/de25_nano_psmct32_raster_demo_top.sv`
- I²C wake-up FSM: `rtl/platform/I2C_HDMI_Config.v`
- I²C bit-bang master: `rtl/platform/I2C_Controller.v`
- Pin assignments + IO standards:
  `synth/de25_nano/top_psmct32_raster_demo/de25_nano_psmct32_raster_demo_top.qsf`
- Timing constraints:
  `synth/de25_nano/top_psmct32_raster_demo/de25_nano_psmct32_raster_demo_top.sdc`
- Accelerated bring-up TB:
  `sim/tb/top/tb_hdmi_i2c_wake_smoke.sv`