Initial commit: retroDE_ps2 — first-of-its-kind PS2 GS FPGA core (DE25-Nano / Agilex 5)

RTL (GS rasterizer, EE core stub, platform bridge, LPDDR4B path), sim regression
(272 TBs), docs, and tooling. Copyrighted PS2 content (BIOS, game code, GS dumps,
and all dump-derived textures/traces) is excluded via .gitignore and stays local.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-29 20:10:50 -04:00
commit ec82764bef
2462 changed files with 2174303 additions and 0 deletions
@@ -0,0 +1,365 @@
# Wave 2.5 Mini-Plan: Memory-Backed DMAC Payload
This document defines the consolidation step between Wave 2's
`programmable BGCOLOR via DMA/GIF` proof and later expansion into wider DMAC,
GIF, and EE-side behavior.
Goal:
- retire the biggest remaining Wave 2 shortcut,
- make `MADR` real instead of trace-only,
- keep the current Milestone A+ proof intact,
- improve ownership boundaries before expanding into SIF/IOP or wider GIF work.
Working milestone name:
`memory-backed BGCOLOR via DMA/GIF`
That means:
- DMAC channel 2 is programmed with a real source address,
- DMAC fetches payload from an addressable memory block,
- GIF receives the fetched payload,
- GS updates `BGCOLOR`,
- platform video reflects the new color,
- traces show the fetch path as well as the transfer path.
## Why this goes before SIF/IOP
The current Wave 2 path is structurally good but still has one major artificial
seam:
- the testbench injects payload directly into DMAC.
Before opening the EE↔IOP front, the design benefits more from tightening that
EE-side ownership seam than from starting a second subsystem family.
This step:
- makes `MADR` meaningful,
- introduces a real read-master relationship,
- prepares the ground for multi-beat transfers,
- keeps the work inside the already-active DMAC/GIF/GS lane.
## Deliverables
The first Wave 2.5 pass should land:
1. `rtl/memory/ee_ram_stub.sv`
2. an updated `rtl/dmac/dmac_reg_stub.sv` with a memory-read master port
3. an updated `sim/tb/integration/tb_bgcolor_via_dma.sv`
4. any trace-schema refinements needed for DMAC fetch visibility
5. `Makefile` / `sim/README.md` updates if the run flow changes
## Ownership decision: memory source shape
Recommendation:
- add a new `rtl/memory/ee_ram_stub.sv`
Do not:
- extend `bios_rom_stub` into a read-write block,
- add a DMAC-private payload RAM that lives inside `rtl/dmac/`.
### Why `ee_ram_stub.sv`
This is the cleanest ownership boundary for future growth:
- memory remains owned by the memory subsystem,
- DMAC becomes a memory client, not a memory owner,
- `ee_memory_map_stub` can grow toward multiple mapped regions without
refactoring a DMAC-private storage hack back out later.
It also tells the right architectural story:
- BIOS is ROM,
- EE RAM is RAM,
- DMAC reads from memory through an explicit master interface.
## `ee_ram_stub.sv` exact scope
Status target:
- tiny, addressable, read-first memory block for DMAC-backed tests
### Owns in this phase
- a small read/write memory array,
- byte-addressed external interface,
- one-cycle read latency,
- optional simple write port for testbench preload,
- trace events for read and write activity if useful.
### Explicit non-goals
- full 32 MiB EE RAM sizing,
- cache behavior,
- arbitration among multiple masters,
- full integration into the EE-visible memory map,
- timing fidelity beyond a simple deterministic latency.
### Recommended size
Use something small but qword-friendly:
- default `SIZE_BYTES = 4 KiB` or `16 KiB`
That is large enough for:
- multiple DMAC packets,
- future directed tests,

while staying small enough to cause no simulator pain.
### Recommended interface
Keep it simple and aligned with the current stub ecosystem.
Suggested external interface:
- read request:
  - `rd_en`
  - `rd_addr`
  - `rd_data`
  - `rd_valid`
- write request:
  - `wr_en`
  - `wr_addr`
  - `wr_data`
  - `wr_be`
Recommended data width:
- `128-bit` data path for read and write
Reason:
- aligns with current DMAC qword semantics,
- avoids extra packing/unpacking for the first memory-backed DMA proof,
- keeps the first step focused on ownership, not bus-width adaptation.
Addressing:
- byte addresses on the external interface, qword-aligned (16-byte) accesses
- the low 4 address bits are ignored as needed
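
The interface above can be sketched as a cycle-level behavioral model. This is a Python illustration, not RTL. The class name `EeRamModel`, the `tick` method, address wrap-around at the end of the array, and read-before-write ordering within a cycle are all assumptions made for the sketch; only the signal names, the 128-bit width, and the one-cycle read latency come from this plan.

```python
class EeRamModel:
    """Hypothetical behavioral model of a qword-wide RAM stub (not RTL)."""

    QWORD_BYTES = 16  # 128-bit data path

    def __init__(self, size_bytes=4096):
        assert size_bytes % self.QWORD_BYTES == 0
        self.mem = [0] * (size_bytes // self.QWORD_BYTES)
        self.rd_valid = 0
        self.rd_data = 0

    def _index(self, addr):
        # Byte address in, qword index out: the low 4 bits are dropped.
        # Wrapping at the end of the array is an assumption of this sketch.
        return (addr // self.QWORD_BYTES) % len(self.mem)

    def tick(self, rd_en=0, rd_addr=0, wr_en=0, wr_addr=0,
             wr_data=0, wr_be=0xFFFF):
        """Advance one clock edge.

        A read sampled at this edge is visible on rd_data/rd_valid
        afterwards, which models the one-cycle read latency.
        """
        # Read is evaluated before the write (assumed read-first ordering).
        if rd_en:
            self.rd_data = self.mem[self._index(rd_addr)]
            self.rd_valid = 1
        else:
            self.rd_valid = 0

        if wr_en:
            # Expand the 16-bit byte-enable vector into a 128-bit mask.
            mask = 0
            for byte in range(self.QWORD_BYTES):
                if (wr_be >> byte) & 1:
                    mask |= 0xFF << (8 * byte)
            i = self._index(wr_addr)
            self.mem[i] = (self.mem[i] & ~mask) | (wr_data & mask)
```

A testbench-style preload followed by a read would call `tick(wr_en=1, ...)` once and then `tick(rd_en=1, ...)`, checking `rd_valid` and `rd_data` after the second call.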
### Preload mechanism
For the first implementation, allow either:
- testbench writes through the normal write port, or
- optional `$readmemh` preload file parameter
Recommendation:
- use the normal write port in the integration testbench first
Why:
- makes the preload path explicit and testable,
- avoids adding file-format work unless it becomes useful later.
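
If the optional `$readmemh` path is ever enabled, the file-format work is small. The helper below is a sketch only: the function name `format_readmemh` is hypothetical, and it assumes the memory array is declared as 128-bit words so that each file entry is one qword written as 32 hex digits, with an `@` line giving the starting word index.

```python
def format_readmemh(qwords, base_index=0):
    """Render 128-bit values as text loadable by $readmemh.

    base_index is a word index into the memory array (not a byte address).
    """
    lines = [f"@{base_index:x}"]
    for q in qwords:
        if not 0 <= q < (1 << 128):
            raise ValueError("value does not fit in 128 bits")
        lines.append(f"{q:032x}")
    return "\n".join(lines) + "\n"
```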
## DMAC fetch interface
The direct payload input should be retired from the integrated proof path.
Recommended new DMAC-side interface:
- `mem_rd_en`
- `mem_rd_addr`
- `mem_rd_data`
- `mem_rd_valid`
That is enough for the first qword-fetch path.
### Ownership and routing
For Wave 2.5:
- connect `dmac_reg_stub` directly to `ee_ram_stub`
Do not route the fetch through `ee_memory_map_stub` yet.
Reason:
- `ee_memory_map_stub` currently only decodes BIOS and unmapped space,
- forcing DMAC through it now would either require a larger memory-map rewrite
or add a fake bypass anyway,
- direct connection still preserves the key ownership boundary:
DMAC is a memory client and RAM is a memory block.
This should be documented as a temporary topology, not the final architecture.
## `MADR` behavior in Wave 2.5
`MADR` stops being "recorded but ignored" and becomes the actual qword fetch
source address.
Recommended behavior:
- on `DMA_START`, latch `MADR`
- issue one or more qword reads starting at that address
- increment by `16` bytes per beat
Trace consequence:
- DMAC trace should show the fetch source address clearly
- memory trace should show the corresponding RAM read(s)
## QWC scope
Recommendation:
- support `QWC == 1` in the first Wave 2.5 implementation
- make the internal design compatible with `QWC > 1`
- do not require multi-beat support for initial signoff
Reason:
- the point of 2.5 is memory-backed ownership, not throughput expansion
- `QWC > 1` becomes more valuable once the fetch path is stable
Suggested follow-up:
- call `QWC > 1` a `Wave 2.6` or `Wave 3a` extension unless it falls out
naturally with low risk
## Trace schema additions / refinements
Wave 2 already added:
- `EV_DMA_CFG`
- `EV_DMA_START`
- `EV_DMA_BEAT`
- `EV_DMA_DONE`
- `EV_GIFTAG`
- `EV_GS_WRITE`
Wave 2.5 does not require new event names if the payloads are refined well.
Recommended refinement:
### `DMAC EV_DMA_START`
- `arg0 = channel`
- `arg1 = qwc`
- `arg2 = MADR`
- `arg3 = path id`
This is more useful now that `MADR` matters.
### `DMAC EV_DMA_BEAT`
- `arg0 = channel`
- `arg1 = beat index`
- `arg2 = source address for this beat`
- `arg3 = remaining count`
### `MEM READ` from `ee_ram_stub`
Reuse existing `MEM READ` event shape:
- `arg0 = address`
- `arg1 = low data summary or full qword-low summary`
- `arg2 = master id`
- `arg3 = region id`
Suggested ids for this phase:
- master id: `1 = DMAC`
- region id: `1 = EE_RAM`
This lets traces show the DMAC→RAM→GIF ownership chain without inventing a new
event family.
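
As an illustration of the refined payloads, the sketch below builds the three events in Python. The `TraceEvent` type and the helper names are hypothetical. Two details are assumptions not fixed by this plan: `arg3` of `EV_DMA_BEAT` is taken as the count remaining after the current beat, and `arg1` of `MEM READ` is taken as the low 32 bits of the qword.

```python
from dataclasses import dataclass

MASTER_DMAC = 1    # suggested master id for this phase
REGION_EE_RAM = 1  # suggested region id for this phase


@dataclass(frozen=True)
class TraceEvent:
    source: str
    name: str
    arg0: int
    arg1: int
    arg2: int
    arg3: int


def dma_start(channel, qwc, madr, path_id):
    return TraceEvent("DMAC", "EV_DMA_START", channel, qwc, madr, path_id)


def dma_beat(channel, beat_index, madr, qwc):
    source_addr = madr + 16 * beat_index  # 16 bytes per beat
    remaining = qwc - beat_index - 1      # assumption: count after this beat
    return TraceEvent("DMAC", "EV_DMA_BEAT", channel, beat_index,
                      source_addr, remaining)


def mem_read(addr, qword):
    low_summary = qword & 0xFFFFFFFF      # assumption: low 32 bits as summary
    return TraceEvent("MEM", "READ", addr, low_summary,
                      MASTER_DMAC, REGION_EE_RAM)
```

With these shapes, a single-beat transfer from `MADR = 0x100` yields a `DMA_BEAT` whose `arg2` equals the `arg0` of the matching `MEM READ`, which is the correlation the trace is meant to expose.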
## `dmac_reg_stub` internal changes
Recommended state flow:
- `IDLE`
- `FETCH_WAIT`
- `ACTIVE_SEND`
- `DONE`
Suggested behavior:
1. CPU/testbench writes `MADR`, `QWC`, `CHCR`
2. the `CHCR.start` write triggers `DMA_START`, which latches `MADR`
3. issue memory read at `MADR + beat_index * 16`
4. wait for `mem_rd_valid`
5. present fetched qword to GIF
6. on accept, increment beat count
7. if beats remain, fetch next qword
8. otherwise `DMA_DONE`
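
The state flow above can be sketched as a small behavioral model. This is Python, not RTL, and the names `DmacModel`, `run_transfer`, `gif_valid`, `gif_data`, and `gif_ready` are hypothetical. The memory is reduced to a dictionary that answers one cycle after `mem_rd_en`, and the GIF side is assumed to be always ready.

```python
from enum import Enum


class State(Enum):
    IDLE = "IDLE"
    FETCH_WAIT = "FETCH_WAIT"
    ACTIVE_SEND = "ACTIVE_SEND"
    DONE = "DONE"


class DmacModel:
    """Hypothetical model of the fetch-then-send flow (not RTL)."""

    def __init__(self):
        self.state = State.IDLE
        self.madr = self.qwc = self.beat = 0
        self.mem_rd_en = self.mem_rd_addr = 0
        self.gif_valid = self.gif_data = 0

    def start(self, madr, qwc):
        # Stands in for the MADR/QWC/CHCR writes followed by DMA_START.
        assert self.state in (State.IDLE, State.DONE) and qwc >= 1
        self.madr, self.qwc, self.beat = madr, qwc, 0
        self._issue_read()

    def _issue_read(self):
        self.mem_rd_en = 1
        self.mem_rd_addr = self.madr + 16 * self.beat
        self.state = State.FETCH_WAIT

    def tick(self, mem_rd_valid=0, mem_rd_data=0, gif_ready=1):
        self.mem_rd_en = 0  # the read strobe is a one-cycle pulse
        if self.state is State.FETCH_WAIT and mem_rd_valid:
            self.gif_data, self.gif_valid = mem_rd_data, 1
            self.state = State.ACTIVE_SEND
        elif self.state is State.ACTIVE_SEND and gif_ready:
            self.gif_valid = 0
            self.beat += 1
            if self.beat < self.qwc:
                self._issue_read()
            else:
                self.state = State.DONE


def run_transfer(dmac, memory, madr, qwc):
    """Drive the model against a dict-backed memory; return qwords sent."""
    sent = []
    dmac.start(madr, qwc)
    while dmac.state is not State.DONE:
        rd_valid, rd_data = 0, 0
        if dmac.mem_rd_en:
            rd_valid, rd_data = 1, memory[dmac.mem_rd_addr]
        if dmac.gif_valid:
            sent.append(dmac.gif_data)  # GIF accepts in this cycle
        dmac.tick(rd_valid, rd_data, gif_ready=1)
    return sent
```

The model already iterates over `QWC > 1`, which matches the recommendation to keep the internal design multi-beat compatible even if signoff only requires `QWC == 1`.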
### Guardrail
Do not keep both the direct payload port and the memory-fetch path active in
the integrated milestone as equal options.
If a compatibility/debug path is retained temporarily, it must be:
- clearly labeled as deprecated,
- unused by the main integration testbench.
## Integration testbench migration
`tb_bgcolor_via_dma.sv` should stop driving the direct payload port and instead:
1. reset the chain
2. preload one qword into `ee_ram_stub` through its write port
3. program `MADR` to that location
4. program `QWC = 1`
5. trigger `CHCR.start`
6. wait for `DMA_DONE`
7. verify the same visible color-change outcome as before
### Payload preload mechanism
Recommended first mechanism:
- direct writes into `ee_ram_stub` through its write port from the testbench
This keeps the setup explicit and simple.
## Integrated testbench pass criteria
Keep the current Wave 2 pass criteria stable, with the following additions:
- at least one `MEM READ` from `ee_ram_stub` attributable to DMAC
- `DMA_START` trace includes a meaningful `MADR`
- `DMA_BEAT` source-address trace lines match the programmed `MADR`
Suggested summary line:
```text
[tb_bgcolor_via_dma] mem_reads=1 dma_start=1 dma_done=1 giftag=1 gs_bgcolor=1 bg=(ff,00,00) errors=0
```
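
A regression script could check that summary line mechanically. The sketch below is one possible Python checker. The function names are hypothetical, and the thresholds (every counter at least 1, `errors` exactly 0) are an assumed reading of the pass criteria rather than a fixed rule.

```python
import re


def parse_summary(line):
    """Split '[tb_name] key=value ...' into (tb_name, {key: value})."""
    m = re.match(r"^\[(?P<tb>[^\]]+)\]\s+(?P<rest>.*)$", line.strip())
    if not m:
        raise ValueError("not a summary line")
    fields = {}
    for token in m.group("rest").split():
        key, _, value = token.partition("=")
        fields[key] = value
    return m.group("tb"), fields


def summary_passes(line):
    """Assumed criteria: each counter >= 1 and errors == 0."""
    _, f = parse_summary(line)
    counters = ("mem_reads", "dma_start", "dma_done", "giftag", "gs_bgcolor")
    return all(int(f[k]) >= 1 for k in counters) and int(f["errors"]) == 0
```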
## Recommended implementation order
1. add `ee_ram_stub.sv`
2. update `dmac_reg_stub.sv` to fetch from memory
3. update `tb_bgcolor_via_dma.sv` to preload RAM and drive `MADR`
4. adjust trace payload details if needed
5. wire docs / Makefile updates
## Guardrails
- Do not route DMAC through `ee_memory_map_stub` yet unless Claude wants to
deliberately broaden scope.
- Do not broaden to `QWC > 1` unless it falls out naturally.
- Do not let the preload mechanism become a hidden debug-only side channel.
- Do not erase the existing Wave 2 trace visibility while refactoring the fetch
path.
## Exit criteria for mini-plan acceptance
This mini-plan is accepted when Claude can implement:
1. a new `ee_ram_stub.sv`,
2. a memory-master version of `dmac_reg_stub`,
3. an updated `tb_bgcolor_via_dma.sv` that preloads RAM rather than driving the
direct payload port,

all while preserving the visible Milestone A+ outcome and adding trace-visible
proof that `MADR` is the actual payload source.