retroDE_ps2/docs/wave25_memory_backed_dma_plan.md
thejayman77 ec82764bef Initial commit: retroDE_ps2 — first-of-its-kind PS2 GS FPGA core (DE25-Nano / Agilex 5)
RTL (GS rasterizer, EE core stub, platform bridge, LPDDR4B path), sim regression
(272 TBs), docs, and tooling. Copyrighted PS2 content (BIOS, game code, GS dumps,
and all dump-derived textures/traces) is excluded via .gitignore and stays local.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-29 20:10:50 -04:00

Wave 2.5 Mini-Plan: Memory-Backed DMAC Payload

This document defines the consolidation step between Wave 2's "programmable BGCOLOR via DMA/GIF" proof and the later expansion into wider DMAC, GIF, and EE-side behavior.

Goal:

  • retire the biggest remaining Wave 2 shortcut,
  • make MADR real instead of trace-only,
  • keep the current Milestone A+ proof intact,
  • improve ownership boundaries before expanding into SIF/IOP or wider GIF work.

Working milestone name:

memory-backed BGCOLOR via DMA/GIF

That means:

  • DMAC channel 2 is programmed with a real source address,
  • DMAC fetches payload from an addressable memory block,
  • GIF receives the fetched payload,
  • GS updates BGCOLOR,
  • platform video reflects the new color,
  • traces show the fetch path as well as the transfer path.

Why this goes before SIF/IOP

The current Wave 2 path is structurally good but still has one major artificial seam:

  • the testbench injects payload directly into DMAC.

Before opening the EE↔IOP front, the design benefits more from tightening that EE-side ownership seam than from starting a second subsystem family.

This step:

  • makes MADR meaningful,
  • introduces a real read-master relationship,
  • prepares the ground for multi-beat transfers,
  • keeps the work inside the already-active DMAC/GIF/GS lane.

Deliverables

The first Wave 2.5 pass should land:

  1. rtl/memory/ee_ram_stub.sv
  2. an updated rtl/dmac/dmac_reg_stub.sv with a memory-read master port
  3. an updated sim/tb/integration/tb_bgcolor_via_dma.sv
  4. any trace-schema refinements needed for DMAC fetch visibility
  5. Makefile / sim/README.md updates if the run flow changes

Ownership decision: memory source shape

Recommendation:

  • add a new rtl/memory/ee_ram_stub.sv

Do not:

  • extend bios_rom_stub into a read-write block,
  • add a DMAC-private payload RAM that lives inside rtl/dmac/.

Why ee_ram_stub.sv

This is the cleanest ownership boundary for future growth:

  • memory remains owned by the memory subsystem,
  • DMAC becomes a memory client, not a memory owner,
  • ee_memory_map_stub can grow toward multiple mapped regions without refactoring a DMAC-private storage hack back out later.

It also tells the right architectural story:

  • BIOS is ROM,
  • EE RAM is RAM,
  • DMAC reads from memory through an explicit master interface.

ee_ram_stub.sv exact scope

Status target:

  • tiny, addressable, read-first memory block for DMAC-backed tests

Owns in this phase

  • a small read/write memory array,
  • byte-addressed external interface,
  • one-cycle read latency,
  • optional simple write port for testbench preload,
  • trace events for read and write activity if useful.

Explicit non-goals

  • full 32 MiB EE RAM sizing,
  • cache behavior,
  • arbitration among multiple masters,
  • full integration into the EE-visible memory map,
  • timing fidelity beyond a simple deterministic latency.

Use something small but qword-friendly:

  • default SIZE_BYTES = 4 KiB or 16 KiB

That is large enough for multiple DMAC packets and future directed tests, yet small enough to cause no simulator pain.

Keep it simple and aligned with the current stub ecosystem.

Suggested external interface:

  • read request:
    • rd_en
    • rd_addr
    • rd_data
    • rd_valid
  • write request:
    • wr_en
    • wr_addr
    • wr_data
    • wr_be

Recommended data width:

  • 128-bit data path for read and write

Reason:

  • aligns with current DMAC qword semantics,
  • avoids extra packing/unpacking for the first memory-backed DMA proof,
  • keeps the first step focused on ownership, not bus-width adaptation.

Addressing:

  • qword-aligned externally
  • low address bits (the 4-bit byte offset within a 16-byte qword) ignored as needed
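
The interface, data width, and addressing rules above can be sketched as a small behavioral model. This is a Python stand-in for reasoning about the contract, not RTL. The class name `EeRamModel`, the `tick` method, the one-bit-per-byte-lane reading of `wr_be`, and the wrap-around on out-of-range addresses are all assumptions made for illustration.

```python
QWORD_BYTES = 16
QWORD_MASK = (1 << 128) - 1


class EeRamModel:
    """Hypothetical behavioral model of ee_ram_stub (illustration only)."""

    def __init__(self, size_bytes=4096):
        assert size_bytes % QWORD_BYTES == 0, "size must be a whole number of qwords"
        self.mem = [0] * (size_bytes // QWORD_BYTES)
        self.rd_valid = False
        self.rd_data = 0

    def _index(self, addr):
        # Byte address in, qword index out: the low 4 address bits are ignored.
        # Assumption: out-of-range addresses wrap instead of raising an error.
        return (addr >> 4) % len(self.mem)

    def write(self, wr_addr, wr_data, wr_be=0xFFFF):
        # Assumption: wr_be carries one enable bit per byte lane (bit 0 = bits 7:0).
        idx = self._index(wr_addr)
        word = self.mem[idx]
        for lane in range(QWORD_BYTES):
            if (wr_be >> lane) & 1:
                mask = 0xFF << (8 * lane)
                word = (word & ~mask) | (wr_data & mask)
        self.mem[idx] = word & QWORD_MASK

    def tick(self, rd_en=False, rd_addr=0):
        # One clock edge: a read sampled on this edge is visible on
        # rd_data/rd_valid afterwards, i.e. one cycle of latency.
        self.rd_valid = bool(rd_en)
        if rd_en:
            self.rd_data = self.mem[self._index(rd_addr)]
```

In this model a read of 0x10C after a write to 0x100 returns the same qword, because both byte addresses map to qword index 0x10.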

Preload mechanism

For the first implementation, allow either:

  • testbench writes through the normal write port, or
  • optional $readmemh preload file parameter

Recommendation:

  • use the normal write port in the integration testbench first

Why:

  • makes the preload path explicit and testable,
  • avoids adding file-format work unless it becomes useful later.
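
If the optional file preload is added later, the file-format work is modest. The sketch below renders qwords as text that SystemVerilog's `$readmemh` can load, one 32-digit hex word per line in address order. The helper name `to_readmemh` is hypothetical, and the sketch assumes the memory array is declared 128 bits wide.

```python
def to_readmemh(qwords):
    """Render 128-bit integers as $readmemh text, one word per line."""
    lines = []
    for value in qwords:
        if not 0 <= value < (1 << 128):
            raise ValueError("each entry must fit in 128 bits")
        # 128 bits = 32 hex digits, zero-padded so every line has equal width.
        lines.append(f"{value:032x}")
    return "\n".join(lines) + "\n"
```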

DMAC fetch interface

The direct payload input should be retired from the integrated proof path.

Recommended new DMAC-side interface:

  • mem_rd_en
  • mem_rd_addr
  • mem_rd_data
  • mem_rd_valid

That is enough for the first qword-fetch path.

Ownership and routing

For Wave 2.5:

  • connect dmac_reg_stub directly to ee_ram_stub

Do not route the fetch through ee_memory_map_stub yet.

Reason:

  • ee_memory_map_stub currently only decodes BIOS and unmapped space,
  • forcing DMAC through it now would either require a larger memory-map rewrite or add a fake bypass anyway,
  • direct connection still preserves the key ownership boundary: DMAC is a memory client and RAM is a memory block.

This should be documented as a temporary topology, not the final architecture.

MADR behavior in Wave 2.5

MADR stops being "recorded but ignored" and becomes the actual qword fetch source address.

Recommended behavior:

  • on DMA_START, latch MADR
  • issue one or more qword reads starting at that address
  • increment by 16 bytes per beat

Trace consequence:

  • DMAC trace should show the fetch source address clearly
  • memory trace should show the corresponding RAM read(s)
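
The per-beat address rule can be checked with a few lines of arithmetic. This is a sketch only: the function name `beat_addresses` is hypothetical, and dropping the low 4 bits of MADR is an assumption that mirrors the RAM stub's qword alignment.

```python
QWORD_BYTES = 16


def beat_addresses(madr, qwc):
    """Source byte address of each beat for a transfer latched at DMA_START."""
    # Assumption: MADR's low 4 bits are dropped to match qword alignment.
    base = madr & ~(QWORD_BYTES - 1)
    return [base + beat * QWORD_BYTES for beat in range(qwc)]
```

For the first signoff case, `beat_addresses(0x100, 1)` yields the single address 0x100, which is the value both the DMAC trace and the memory trace should show.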

QWC scope

Recommendation:

  • support QWC == 1 in the first Wave 2.5 implementation
  • make the internal design compatible with QWC > 1
  • do not require multi-beat support for initial signoff

Reason:

  • the point of 2.5 is memory-backed ownership, not throughput expansion
  • QWC > 1 becomes more valuable once the fetch path is stable

Suggested follow-up:

  • treat QWC > 1 as a Wave 2.6 or Wave 3a extension unless it falls out naturally with low risk

Trace schema additions / refinements

Wave 2 already added:

  • EV_DMA_CFG
  • EV_DMA_START
  • EV_DMA_BEAT
  • EV_DMA_DONE
  • EV_GIFTAG
  • EV_GS_WRITE

Wave 2.5 does not require new event names if the payloads are refined well.

Recommended refinement:

DMAC EV_DMA_START

  • arg0 = channel
  • arg1 = qwc
  • arg2 = MADR
  • arg3 = path id

This is more useful now that MADR matters.

DMAC EV_DMA_BEAT

  • arg0 = channel
  • arg1 = beat index
  • arg2 = source address for this beat
  • arg3 = remaining count

MEM READ from ee_ram_stub

Reuse existing MEM READ event shape:

  • arg0 = address
  • arg1 = low data summary or full qword-low summary
  • arg2 = master id
  • arg3 = region id

Suggested ids for this phase:

  • master id: 1 = DMAC
  • region id: 1 = EE_RAM

This lets traces show the DMAC→RAM→GIF ownership chain without inventing a new event family.
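
The refined payloads can be written down as plain records to make the argument positions concrete. This is a sketch, not the project's trace API: the `TraceEvent` type, the builder function names, the `"MEM_READ"` name string, the path id value, the meaning of "remaining" as beats left after the current one, and the choice of the low 32 bits as the data summary are all assumptions.

```python
from typing import NamedTuple

MASTER_DMAC = 1      # from the suggested ids above
REGION_EE_RAM = 1    # from the suggested ids above
PATH_ID_GIF = 3      # hypothetical encoding; the plan does not fix path ids


class TraceEvent(NamedTuple):
    name: str
    arg0: int
    arg1: int
    arg2: int
    arg3: int


def ev_dma_start(channel, qwc, madr, path_id=PATH_ID_GIF):
    return TraceEvent("EV_DMA_START", channel, qwc, madr, path_id)


def ev_dma_beat(channel, beat_index, madr, qwc):
    src = madr + beat_index * 16          # source address for this beat
    remaining = qwc - beat_index - 1      # assumption: beats left after this one
    return TraceEvent("EV_DMA_BEAT", channel, beat_index, src, remaining)


def ev_mem_read(addr, qword):
    summary = qword & 0xFFFF_FFFF         # assumption: summary = low 32 bits
    return TraceEvent("MEM_READ", addr, summary, MASTER_DMAC, REGION_EE_RAM)
```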

dmac_reg_stub internal changes

Recommended state flow:

  • IDLE
  • FETCH_WAIT
  • ACTIVE_SEND
  • DONE

Suggested behavior:

  1. CPU/testbench writes MADR, QWC, CHCR
  2. DMA_START
  3. issue memory read at MADR + beat_index * 16
  4. wait for mem_rd_valid
  5. present fetched qword to GIF
  6. on accept, increment beat count
  7. if beats remain, fetch next qword
  8. otherwise DMA_DONE
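
The state flow and the eight steps above can be sketched as a small cycle-stepped model. It is a behavioral illustration, not the RTL: the class name `DmacFsm`, the `gif_ready` handshake input, and the dictionary-backed memory in `run_dma` are assumptions, and the read request is treated as simply outstanding while the model sits in FETCH_WAIT.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    FETCH_WAIT = auto()
    ACTIVE_SEND = auto()
    DONE = auto()


class DmacFsm:
    def __init__(self):
        self.state = State.IDLE
        self.base = self.qwc = self.beat = 0
        self.qword = 0

    def start(self, madr, qwc):
        # DMA_START: latch MADR and QWC, then request the first qword.
        self.base, self.qwc, self.beat = madr, qwc, 0
        self.state = State.FETCH_WAIT

    @property
    def mem_rd_addr(self):
        return self.base + self.beat * 16

    def step(self, mem_rd_valid=False, mem_rd_data=0, gif_ready=False):
        """Advance one clock; return the qword GIF accepted, else None."""
        if self.state is State.FETCH_WAIT and mem_rd_valid:
            self.qword = mem_rd_data
            self.state = State.ACTIVE_SEND
        elif self.state is State.ACTIVE_SEND and gif_ready:
            self.beat += 1
            more = self.beat < self.qwc
            self.state = State.FETCH_WAIT if more else State.DONE
            return self.qword
        return None


def run_dma(memory, madr, qwc):
    """Drive the FSM against a dict of byte address -> qword; return sent qwords."""
    fsm = DmacFsm()
    fsm.start(madr, qwc)
    sent = []
    while fsm.state is not State.DONE:
        if fsm.state is State.FETCH_WAIT:
            out = fsm.step(mem_rd_valid=True, mem_rd_data=memory[fsm.mem_rd_addr])
        else:
            out = fsm.step(gif_ready=True)
        if out is not None:
            sent.append(out)
    return sent
```

The model already loops for QWC > 1, which matches the recommendation to keep the internal design compatible with multi-beat transfers even though signoff only needs QWC == 1.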

Guardrail

Do not keep both the direct payload port and the memory-fetch path active in the integrated milestone as equal options.

If a compatibility/debug path is retained temporarily, it must be:

  • clearly labeled as deprecated,
  • unused by the main integration testbench.

Integration testbench migration

tb_bgcolor_via_dma.sv should stop driving the direct payload port and instead:

  1. reset the chain
  2. preload one qword into ee_ram_stub through its write port
  3. program MADR to that location
  4. program QWC = 1
  5. trigger CHCR.start
  6. wait for DMA_DONE
  7. verify the same visible color-change outcome as before
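
The migrated sequence can be walked through against dictionary-backed stand-ins. This is a sketch of the ordering only, not the SystemVerilog testbench: the function name, the tuple-shaped trace entries, the "any nonzero CHCR write starts the transfer" shortcut, and the placement of R, G, B in bits 7:0, 15:8, and 23:16 of the payload qword are all assumptions.

```python
def run_bgcolor_via_dma(madr=0x100, payload=0x0000FF):
    """Walk the seven testbench steps and return (bgcolor, trace)."""
    # 1. Reset: empty RAM, cleared registers, black background.
    ram, regs, trace = {}, {"MADR": 0, "QWC": 0, "CHCR": 0}, []
    bgcolor = (0, 0, 0)

    # 2. Preload one qword through the modeled RAM write port.
    ram[madr >> 4] = payload

    # 3-5. Program MADR and QWC, then set the start bit.
    regs["MADR"], regs["QWC"] = madr, 1
    regs["CHCR"] = 1  # assumption: any nonzero write means "start" here
    trace.append(("EV_DMA_START", 2, regs["QWC"], regs["MADR"]))

    # DMAC fetches from RAM at MADR instead of taking an injected payload.
    qword = ram[regs["MADR"] >> 4]
    trace.append(("MEM_READ", regs["MADR"]))

    # GIF forwards the qword and GS updates BGCOLOR.
    # Assumption: R, G, B sit in bits 7:0, 15:8, 23:16 of the qword.
    bgcolor = (qword & 0xFF, (qword >> 8) & 0xFF, (qword >> 16) & 0xFF)
    trace.append(("EV_GS_WRITE",))

    # 6. DMA_DONE ends the wait.
    trace.append(("EV_DMA_DONE", 2))

    # 7. The caller verifies the visible color, as in Wave 2.
    return bgcolor, trace
```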

Payload preload mechanism

Recommended first mechanism:

  • direct writes into ee_ram_stub through its write port from the testbench

This keeps the setup explicit and simple.

Integrated testbench pass criteria

Keep the current Wave 2 pass criteria stable, with the following additions:

  • at least one MEM READ from ee_ram_stub attributable to DMAC
  • DMA_START trace includes a meaningful MADR
  • DMA_BEAT source-address trace lines match the programmed MADR

Suggested summary line:

[tb_bgcolor_via_dma] mem_reads=1 dma_start=1 dma_done=1 giftag=1 gs_bgcolor=1 bg=(ff,00,00) errors=0
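
The added criteria and the summary line can be expressed as a small trace checker. This is a sketch under assumptions: events are modeled as `(name, arg0, arg1, arg2, arg3)` tuples, `"MEM_READ"` is a placeholder name for the MEM READ event, and counting `gs_bgcolor` from `EV_GS_WRITE` events is a guess at how the testbench derives that field.

```python
def check_and_summarize(events, madr, bg, errors=0):
    """Assert the Wave 2.5 trace criteria, then format the summary line."""
    def named(name):
        return [e for e in events if e[0] == name]

    # At least one MEM READ attributable to DMAC (arg2 = master id 1).
    assert any(e[3] == 1 for e in named("MEM_READ")), "no DMAC MEM READ"

    # DMA_START carries the programmed MADR in arg2.
    starts = named("EV_DMA_START")
    assert starts and all(e[3] == madr for e in starts), "MADR missing from start"

    # Beat source addresses (arg2) follow MADR in 16-byte steps per beat (arg1).
    for e in named("EV_DMA_BEAT"):
        assert e[3] == madr + e[2] * 16, "beat address does not match MADR"

    r, g, b = bg
    return (
        f"[tb_bgcolor_via_dma] mem_reads={len(named('MEM_READ'))} "
        f"dma_start={len(starts)} dma_done={len(named('EV_DMA_DONE'))} "
        f"giftag={len(named('EV_GIFTAG'))} gs_bgcolor={len(named('EV_GS_WRITE'))} "
        f"bg=({r:02x},{g:02x},{b:02x}) errors={errors}"
    )
```
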

Implementation order

  1. add ee_ram_stub.sv
  2. update dmac_reg_stub.sv to fetch from memory
  3. update tb_bgcolor_via_dma.sv to preload RAM and drive MADR
  4. adjust trace payload details if needed
  5. wire docs / Makefile updates

Guardrails

  • Do not route DMAC through ee_memory_map_stub yet unless Claude wants to deliberately broaden scope.
  • Do not broaden to QWC > 1 unless it falls out naturally.
  • Do not let the preload mechanism become a hidden debug-only side channel.
  • Do not erase the existing Wave 2 trace visibility while refactoring the fetch path.

Exit criteria for mini-plan acceptance

This mini-plan is accepted when Claude can implement:

  1. a new ee_ram_stub.sv,
  2. a memory-master version of dmac_reg_stub,
  3. an updated tb_bgcolor_via_dma.sv that preloads RAM rather than driving the direct payload port,

all while preserving the visible Milestone A+ outcome and adding trace-visible proof that MADR is the actual payload source.