Initial commit: retroDE_ps2 — first-of-its-kind PS2 GS FPGA core (DE25-Nano / Agilex 5)

RTL (GS rasterizer, EE core stub, platform bridge, LPDDR4B path), sim regression (272 TBs), docs, and tooling. Copyrighted PS2 content (BIOS, game code, GS dumps, and all dump-derived textures/traces) is excluded via .gitignore and stays local.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@@ -0,0 +1,365 @@
# Wave 2.5 Mini-Plan: Memory-Backed DMAC Payload

This document defines the consolidation step between Wave 2's
`programmable BGCOLOR via DMA/GIF` proof and later expansion into wider DMAC,
GIF, and EE-side behavior.

Goal:

- retire the biggest remaining Wave 2 shortcut,
- make `MADR` real instead of trace-only,
- keep the current Milestone A+ proof intact,
- improve ownership boundaries before expanding into SIF/IOP or wider GIF work.

Working milestone name:

`memory-backed BGCOLOR via DMA/GIF`

That means:

- DMAC channel 2 is programmed with a real source address,
- DMAC fetches payload from an addressable memory block,
- GIF receives the fetched payload,
- GS updates `BGCOLOR`,
- platform video reflects the new color,
- traces show the fetch path as well as the transfer path.

## Why this goes before SIF/IOP

The current Wave 2 path is structurally good but still has one major artificial
seam:

- the testbench injects payload directly into DMAC.

Before opening the EE↔IOP front, the design benefits more from tightening that
EE-side ownership seam than from starting a second subsystem family.

This step:

- makes `MADR` meaningful,
- introduces a real read-master relationship,
- prepares the ground for multi-beat transfers,
- keeps the work inside the already-active DMAC/GIF/GS lane.

## Deliverables

The first Wave 2.5 pass should land:

1. `rtl/memory/ee_ram_stub.sv`
2. an updated `rtl/dmac/dmac_reg_stub.sv` with a memory-read master port
3. an updated `sim/tb/integration/tb_bgcolor_via_dma.sv`
4. any trace-schema refinements needed for DMAC fetch visibility
5. `Makefile` / `sim/README.md` updates if the run flow changes

## Ownership decision: memory source shape

Recommendation:

- add a new `rtl/memory/ee_ram_stub.sv`

Do not:

- extend `bios_rom_stub` into a read-write block,
- add a DMAC-private payload RAM that lives inside `rtl/dmac/`.

### Why `ee_ram_stub.sv`

This is the cleanest ownership boundary for future growth:

- memory remains owned by the memory subsystem,
- DMAC becomes a memory client, not a memory owner,
- `ee_memory_map_stub` can grow toward multiple mapped regions without
  refactoring a DMAC-private storage hack back out later.

It also tells the right architectural story:

- BIOS is ROM,
- EE RAM is RAM,
- DMAC reads from memory through an explicit master interface.

## `ee_ram_stub.sv` exact scope

Status target:

- tiny, addressable, read-first memory block for DMAC-backed tests

### Owns in this phase

- a small read/write memory array,
- byte-addressed external interface,
- one-cycle read latency,
- optional simple write port for testbench preload,
- trace events for read and write activity if useful.

### Explicit non-goals

- full 32 MiB EE RAM sizing,
- cache behavior,
- arbitration among multiple masters,
- full integration into the EE-visible memory map,
- timing fidelity beyond a simple deterministic latency.

### Recommended size

Use something small but qword-friendly:

- default `SIZE_BYTES = 4 KiB` or `16 KiB`

That is large enough for:

- multiple DMAC packets,
- future directed tests,
- no simulator pain.

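
As a quick check on the two candidate sizes, the sketch below works out how many 128-bit qwords each one holds and how many address bits are needed to index them. It is plain arithmetic in Python, not part of the RTL; the helper name `qword_capacity` is hypothetical.

```python
QWORD_BYTES = 16  # one 128-bit qword


def qword_capacity(size_bytes):
    """Return (qword depth, qword index width in bits) for a power-of-two size."""
    assert size_bytes % QWORD_BYTES == 0, "size must be a whole number of qwords"
    depth = size_bytes // QWORD_BYTES
    index_bits = (depth - 1).bit_length()
    return depth, index_bits


for size in (4 * 1024, 16 * 1024):
    print(size, qword_capacity(size))
# 4096 (256, 8)
# 16384 (1024, 10)
```

Either size leaves room for hundreds of single-qword payloads, which is consistent with the "multiple DMAC packets" goal above.
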
### Recommended interface

Keep it simple and aligned with the current stub ecosystem.

Suggested external interface:

- read request:
  - `rd_en`
  - `rd_addr`
  - `rd_data`
  - `rd_valid`
- write request:
  - `wr_en`
  - `wr_addr`
  - `wr_data`
  - `wr_be`

Recommended data width:

- `128-bit` data path for read and write

Reason:

- aligns with current DMAC qword semantics,
- avoids extra packing/unpacking for the first memory-backed DMA proof,
- keeps the first step focused on ownership, not bus-width adaptation.

Addressing:

- qword-aligned externally
- low address bits ignored as needed

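
The interface above can be illustrated with a small cycle-based behavioral model. This is a Python sketch for reasoning about the intended behavior, not the SystemVerilog implementation. The class name `EeRamStubModel`, the `tick` method, the 16-bit `wr_be` with one bit per byte lane, little-endian lane order, and address wrapping at the array size are all assumptions made for the example.

```python
QWORD_BYTES = 16  # 128-bit data path


class EeRamStubModel:
    """Behavioral model only (not RTL): read-first RAM, one-cycle read latency."""

    def __init__(self, size_bytes=4096):
        assert size_bytes % QWORD_BYTES == 0
        self.mem = bytearray(size_bytes)
        self.rd_valid = False
        self.rd_data = 0
        self._captured = None  # qword sampled by a read issued on the last edge

    def _base(self, addr):
        # Low 4 address bits are ignored; wrapping at the array size is an assumption.
        return (addr & ~(QWORD_BYTES - 1)) % len(self.mem)

    def tick(self, rd_en=False, rd_addr=0, wr_en=False, wr_addr=0,
             wr_data=0, wr_be=0xFFFF):
        """Advance one clock edge with the given request inputs."""
        # A read issued on the previous edge becomes visible now.
        self.rd_valid = self._captured is not None
        if self.rd_valid:
            self.rd_data = self._captured
        # Sample before applying the write, so a same-cycle read sees old data.
        self._captured = None
        if rd_en:
            base = self._base(rd_addr)
            self._captured = int.from_bytes(
                self.mem[base:base + QWORD_BYTES], "little")
        if wr_en:
            base = self._base(wr_addr)
            lanes = wr_data.to_bytes(QWORD_BYTES, "little")
            for lane in range(QWORD_BYTES):
                if (wr_be >> lane) & 1:  # one enable bit per byte lane (assumed)
                    self.mem[base + lane] = lanes[lane]
```

A read request on one edge produces `rd_valid` with `rd_data` on the next edge, and an unaligned `rd_addr` such as `0x47` returns the qword at `0x40`.
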
### Preload mechanism

For the first implementation, allow either:

- testbench writes through the normal write port, or
- optional `$readmemh` preload file parameter

Recommendation:

- use the normal write port in the integration testbench first

Why:

- makes the preload path explicit and testable,
- avoids adding file-format work unless it becomes useful later.

## DMAC fetch interface

The direct payload input should be retired from the integrated proof path.

Recommended new DMAC-side interface:

- `mem_rd_en`
- `mem_rd_addr`
- `mem_rd_data`
- `mem_rd_valid`

That is enough for the first qword-fetch path.

### Ownership and routing

For Wave 2.5:

- connect `dmac_reg_stub` directly to `ee_ram_stub`

Do not route the fetch through `ee_memory_map_stub` yet.

Reason:

- `ee_memory_map_stub` currently only decodes BIOS and unmapped space,
- forcing DMAC through it now would either require a larger memory-map rewrite
  or add a fake bypass anyway,
- direct connection still preserves the key ownership boundary:
  DMAC is a memory client and RAM is a memory block.

This should be documented as a temporary topology, not the final architecture.

## `MADR` behavior in Wave 2.5

`MADR` stops being "recorded but ignored" and becomes the actual qword fetch
source address.

Recommended behavior:

- on `DMA_START`, latch `MADR`
- issue one or more qword reads starting at that address
- increment by `16` bytes per beat

Trace consequence:

- DMAC trace should show the fetch source address clearly
- memory trace should show the corresponding RAM read(s)

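
The per-beat source addresses implied by this behavior can be written down directly. The following Python sketch is illustrative only; the function name `beat_addresses` is hypothetical, and masking off the low 4 bits of `MADR` is an assumption based on the qword-aligned addressing described for `ee_ram_stub`.

```python
QWORD_BYTES = 16  # bytes fetched per beat


def beat_addresses(madr, qwc):
    """Return the source address of each qword beat for a latched MADR and QWC."""
    base = madr & ~(QWORD_BYTES - 1)  # assumption: low 4 bits are ignored
    return [base + beat * QWORD_BYTES for beat in range(qwc)]


print([hex(a) for a in beat_addresses(0x100, 1)])  # ['0x100']
print([hex(a) for a in beat_addresses(0x100, 3)])  # ['0x100', '0x110', '0x120']
```

These are the addresses that the DMAC trace and the memory trace should agree on for each beat.
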
## QWC scope

Recommendation:

- support `QWC == 1` in the first Wave 2.5 implementation
- make the internal design compatible with `QWC > 1`
- do not require multi-beat support for initial signoff

Reason:

- the point of 2.5 is memory-backed ownership, not throughput expansion
- `QWC > 1` becomes more valuable once the fetch path is stable

Suggested follow-up:

- call `QWC > 1` a `Wave 2.6` or `Wave 3a` extension unless it falls out
  naturally with low risk

## Trace schema additions / refinements

Wave 2 already added:

- `EV_DMA_CFG`
- `EV_DMA_START`
- `EV_DMA_BEAT`
- `EV_DMA_DONE`
- `EV_GIFTAG`
- `EV_GS_WRITE`

Wave 2.5 does not require new event names if the payloads are refined well.

Recommended refinement:

### `DMAC EV_DMA_START`

- `arg0 = channel`
- `arg1 = qwc`
- `arg2 = MADR`
- `arg3 = path id`

This is more useful now that `MADR` matters.

### `DMAC EV_DMA_BEAT`

- `arg0 = channel`
- `arg1 = beat index`
- `arg2 = source address for this beat`
- `arg3 = remaining count`

### `MEM READ` from `ee_ram_stub`

Reuse existing `MEM READ` event shape:

- `arg0 = address`
- `arg1 = low data summary or full qword-low summary`
- `arg2 = master id`
- `arg3 = region id`

Suggested ids for this phase:

- master id: `1 = DMAC`
- region id: `1 = EE_RAM`

This lets traces show the DMAC→RAM→GIF ownership chain without inventing a new
event family.

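
To make the refined payloads concrete, here is a Python sketch that builds the three events with the argument layout listed above. It models the schema only and does not reflect the project's actual trace emitter. The `TraceEvent` record, the helper names, the choice of the low 32 bits as the "low data summary", and the reading of "remaining count" as the beats left after the current one are assumptions.

```python
from dataclasses import dataclass

MASTER_DMAC = 1    # suggested master id for this phase
REGION_EE_RAM = 1  # suggested region id for this phase
QWORD_BYTES = 16


@dataclass(frozen=True)
class TraceEvent:
    source: str  # emitting block, e.g. "DMAC" or "MEM"
    name: str    # event name, e.g. "EV_DMA_START"
    arg0: int
    arg1: int
    arg2: int
    arg3: int


def dma_start(channel, qwc, madr, path_id):
    """EV_DMA_START: channel, qwc, MADR, path id."""
    return TraceEvent("DMAC", "EV_DMA_START", channel, qwc, madr, path_id)


def dma_beat(channel, beat_index, madr, qwc):
    """EV_DMA_BEAT: channel, beat index, source address, remaining count."""
    source_addr = madr + beat_index * QWORD_BYTES
    remaining = qwc - beat_index - 1  # assumption: beats left after this one
    return TraceEvent("DMAC", "EV_DMA_BEAT", channel, beat_index,
                      source_addr, remaining)


def mem_read(address, qword):
    """MEM READ: address, low data summary, master id, region id."""
    low_summary = qword & 0xFFFF_FFFF  # assumption: low 32 bits of the qword
    return TraceEvent("MEM", "READ", address, low_summary,
                      MASTER_DMAC, REGION_EE_RAM)
```

With this layout, a checker can match `arg2` of each `EV_DMA_BEAT` against `arg0` of a `MEM READ` whose master id is DMAC, which is the ownership chain the traces are meant to show.
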
## `dmac_reg_stub` internal changes

Recommended state flow:

- `IDLE`
- `FETCH_WAIT`
- `ACTIVE_SEND`
- `DONE`

Suggested behavior:

1. CPU/testbench writes `MADR`, `QWC`, `CHCR`
2. `DMA_START`
3. issue memory read at `MADR + beat_index * 16`
4. wait for `mem_rd_valid`
5. present fetched qword to GIF
6. on accept, increment beat count
7. if beats remain, fetch next qword
8. otherwise `DMA_DONE`

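
The state flow above can be sketched as a small behavioral state machine. This is a Python model for walking through the sequence, not the RTL. The class `DmacModel`, the `gif_ready` handshake input, and the dictionary standing in for `ee_ram_stub` are assumptions; the one-tick wait in `FETCH_WAIT` mirrors the one-cycle read latency proposed for the RAM stub.

```python
from enum import Enum, auto

QWORD_BYTES = 16


class State(Enum):
    IDLE = auto()
    FETCH_WAIT = auto()
    ACTIVE_SEND = auto()
    DONE = auto()


class DmacModel:
    """Behavioral model only: fetch each qword from memory, then hand it to GIF."""

    def __init__(self, memory):
        self.memory = memory  # dict: qword-aligned address -> 128-bit value
        self.state = State.IDLE
        self.madr = 0
        self.qwc = 0
        self.beat = 0
        self.qword = None
        self.fetches = []  # source addresses read, in order
        self.sent = []     # qwords accepted by GIF, in order

    def start(self, madr, qwc):
        """DMA_START: latch MADR/QWC and issue the first read."""
        assert self.state in (State.IDLE, State.DONE) and qwc >= 1
        self.madr, self.qwc, self.beat = madr, qwc, 0
        self.state = State.FETCH_WAIT

    def tick(self, gif_ready=True):
        """Advance one clock edge."""
        if self.state is State.FETCH_WAIT:
            # mem_rd_valid arrives one edge after the read was issued.
            addr = self.madr + self.beat * QWORD_BYTES
            self.fetches.append(addr)
            self.qword = self.memory.get(addr, 0)
            self.state = State.ACTIVE_SEND
        elif self.state is State.ACTIVE_SEND and gif_ready:
            # GIF accepted the qword: count the beat, then fetch again or finish.
            self.sent.append(self.qword)
            self.beat += 1
            self.state = (State.FETCH_WAIT if self.beat < self.qwc
                          else State.DONE)
```

For `QWC == 1` the model goes `IDLE`, `FETCH_WAIT`, `ACTIVE_SEND`, `DONE`; for larger counts it loops between `FETCH_WAIT` and `ACTIVE_SEND`, which is the sense in which the internal design stays compatible with `QWC > 1`.
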
### Guardrail

Do not keep both the direct payload port and the memory-fetch path active in
the integrated milestone as equal options.

If a compatibility/debug path is retained temporarily, it must be:

- clearly labeled as deprecated,
- unused by the main integration testbench.

## Integration testbench migration

`tb_bgcolor_via_dma.sv` should stop driving the direct payload port and instead:

1. reset the chain
2. preload one qword into `ee_ram_stub` through its write port
3. program `MADR` to that location
4. program `QWC = 1`
5. trigger `CHCR.start`
6. wait for `DMA_DONE`
7. verify the same visible color-change outcome as before

### Payload preload mechanism

Recommended first mechanism:

- direct writes into `ee_ram_stub` through its write port from the testbench

This keeps the setup explicit and simple.

## Integrated testbench pass criteria

Keep the current Wave 2 pass criteria stable, with the following additions:

- at least one `MEM READ` from `ee_ram_stub` attributable to DMAC
- `DMA_START` trace includes a meaningful `MADR`
- `DMA_BEAT` source-address trace lines match the programmed `MADR`

Suggested summary line:

```text
[tb_bgcolor_via_dma] mem_reads=1 dma_start=1 dma_done=1 giftag=1 gs_bgcolor=1 bg=(ff,00,00) errors=0
```

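
A regression script could check the suggested summary line mechanically. The Python sketch below parses the `key=value` fields and applies one possible pass rule. The helper names `parse_summary` and `wave25_pass` are hypothetical, and requiring every counter to be at least 1 with `errors=0` is an assumption layered on the criteria above, not an existing project tool.

```python
import re


def parse_summary(line):
    """Split a summary line into (testbench tag, dict of key=value fields)."""
    tag = re.match(r"\[(\w+)\]", line)
    fields = dict(re.findall(r"(\w+)=(\S+)", line))
    return (tag.group(1) if tag else None), fields


def wave25_pass(fields):
    """Assumed rule: every counter is at least 1 and no errors were reported."""
    counters = ("mem_reads", "dma_start", "dma_done", "giftag", "gs_bgcolor")
    counted = all(int(fields.get(key, 0)) >= 1 for key in counters)
    return counted and int(fields.get("errors", 1)) == 0


line = ("[tb_bgcolor_via_dma] mem_reads=1 dma_start=1 dma_done=1 "
        "giftag=1 gs_bgcolor=1 bg=(ff,00,00) errors=0")
tag, fields = parse_summary(line)
print(tag, fields["bg"], wave25_pass(fields))
# tb_bgcolor_via_dma (ff,00,00) True
```

A run that never touches `ee_ram_stub` would report `mem_reads=0` and fail this check even if the color still changed, which is the regression the new criteria are meant to catch.
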
## Recommended implementation order

1. add `ee_ram_stub.sv`
2. update `dmac_reg_stub.sv` to fetch from memory
3. update `tb_bgcolor_via_dma.sv` to preload RAM and drive `MADR`
4. adjust trace payload details if needed
5. wire docs / Makefile updates

## Guardrails

- Do not route DMAC through `ee_memory_map_stub` yet unless Claude wants to
  deliberately broaden scope.
- Do not broaden to `QWC > 1` unless it falls out naturally.
- Do not let the preload mechanism become a hidden debug-only side channel.
- Do not erase the existing Wave 2 trace visibility while refactoring the fetch
  path.

## Exit criteria for mini-plan acceptance

This mini-plan is accepted when Claude can implement the following while
preserving the visible Milestone A+ outcome and adding trace-visible proof that
`MADR` is the actual payload source:

1. a new `ee_ram_stub.sv`,
2. a memory-master version of `dmac_reg_stub`,
3. an updated `tb_bgcolor_via_dma.sv` that preloads RAM rather than driving the
   direct payload port.