# 0012 — Ch347: CLUT (PSMT8) textured-alpha sprites
Status: planned (synthetic brick buildable now; authentic acceptance gated on a real capture)
Date: 2026-06-23
## Goal
Extend the Ch344/Ch345a textured-alpha SPRITE path from PSMCT32-only to **PSMT8 indexed (CLUT) textures**:
`TEX0.PSM=0x13` → fetch 8-bit index from VRAM → CLUT → ABGR texel → MODULATE → source-over alpha. This is
the first "real game" GS feature beyond the homebrew corpus, which is anomalously all-PSMCT32. PS2 titles
lean on palettized textures to fit VRAM, so a richer free corpus (Ch347 target: a ScummVM-freeware capture of
Beneath a Steel Sky) forces CLUT support. Scope is **PSMT8 only**; PSMT4 (nibble/RMW) is deferred unless the census forces it.
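
As a software reference for the goal pipeline (index → CLUT → ABGR texel → MODULATE → source-over), a minimal sketch might look like the following. It is not the RTL: the function names are hypothetical, the CLUT entry is assumed to be a 32-bit word with A in bits 31:24 down to R in bits 7:0, and the MODULATE and blend arithmetic assume the usual GS convention where 0x80 means 1.0 (`>> 7`). Those conventions should be checked against `gs_texture_unit.sv` and the blend stage before this is used as a golden model.

```python
def unpack_abgr(word):
    """Split a 32-bit texel into (r, g, b, a); assumes A=[31:24], B=[23:16], G=[15:8], R=[7:0]."""
    return (word & 0xFF, (word >> 8) & 0xFF, (word >> 16) & 0xFF, (word >> 24) & 0xFF)


def clamp8(value):
    """Clamp an intermediate result to the 0..255 channel range."""
    return max(0, min(255, value))


def modulate(texel_rgba, vertex_rgba):
    """MODULATE: per-channel texel * vertex colour, with 0x80 as 1.0 (assumed convention)."""
    return tuple(clamp8((t * f) >> 7) for t, f in zip(texel_rgba, vertex_rgba))


def source_over(src_rgba, dst_rgb):
    """Source-over blend: ((Cs - Cd) * As >> 7) + Cd, with As = 0x80 as fully opaque (assumed)."""
    alpha = src_rgba[3]
    return tuple(clamp8((((s - d) * alpha) >> 7) + d) for s, d in zip(src_rgba[:3], dst_rgb))


def psmt8_alpha_texel(index, clut_words, vertex_rgba, dst_rgb):
    """Whole path for one pixel: 8-bit index -> CLUT word -> RGBA -> MODULATE -> blend."""
    texel = unpack_abgr(clut_words[index])
    return source_over(modulate(texel, vertex_rgba), dst_rgb)


# Example: palette entry 5 is a half-transparent orange; vertex colour is neutral (0x80).
clut = [0] * 256
clut[5] = 0x403264C8  # A=0x40, B=0x32, G=0x64, R=0xC8
pixel = psmt8_alpha_texel(5, clut, (0x80, 0x80, 0x80, 0x80), (100, 100, 100))
```

The point the sketch makes is that the blend alpha comes from the CLUT entry, not from the index byte, which is why the palette has to be loaded before the sprite draws.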
## Key finding: the CLUT machinery is ~95% already built (search-before-reimplement)
The platform already has, and PROVES for textured TRI/SPRITE **DECAL** (Ch296/297/299/314):
- `clut_stub.sv` — 256×32 CLUT RAM, **two** combinational read ports; one is already dedicated to the
texture sampler (`tex_read_idx` → `tex_read_data`).
- `clut_loader_stub.sv` — VRAM→CLUT load FSM, CLD-mode policy, PSMCT32/PSMCT16 unpack, `load_busy` guards read2.
- `gs_texel_addr.sv` PSMT8 path — 1 byte/texel linear byte address; `gs_swizzle_psmt8_stub.sv` for swizzle.
- `gs_texture_unit.sv` (Ch296) — byte-lane extract from the 32-bit word + CLUT lookup; output is `.tex_color`.
- `gs_stub` already decodes TEX0 CLUT fields (CBP/CPSM/CSM/CSA/CLD), and the textured-DECAL gate already
admits PSM 0x13/0x14.
Critically: the Ch344 half-rate sprite datapath captures **`s1_tex_color`**, and `s1_tex_color` IS the
`gs_texture_unit` output (gs_stub.sv:4352) — i.e. already CLUT-decoded for PSMT8. So the CLUT decode happens
upstream of the half-rate capture.
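
The PSMT8 index fetch described above (linear byte address, then a byte-lane extract from the 32-bit word) can be sketched as below. This is an illustration only: the parameter names are hypothetical, the base and stride are taken in plain bytes/texels rather than in TEX0 register units, lane 0 is assumed to be bits 7:0, and the swizzled layout handled by `gs_swizzle_psmt8_stub.sv` is not modelled.

```python
def psmt8_linear_byte_addr(base_bytes, row_stride_texels, u, v):
    """Linear PSMT8 address: one byte per texel, rows laid out back to back."""
    return base_bytes + v * row_stride_texels + u


def fetch_index(vram_words, byte_addr):
    """Read the 32-bit word containing byte_addr and extract the addressed byte lane."""
    word = vram_words[byte_addr >> 2]      # word address drops the two lane bits
    lane = byte_addr & 3                   # lane select; lane 0 = bits 7:0 (assumed)
    return (word >> (8 * lane)) & 0xFF


# Two words of VRAM holding index bytes 0x11..0x88 at byte addresses 0..7.
vram = [0x44332211, 0x88776655]
addr = psmt8_linear_byte_addr(base_bytes=0, row_stride_texels=4, u=1, v=1)
index = fetch_index(vram, addr)
```

With a 4-texel row stride, texel (1, 1) sits at byte address 5, which is lane 1 of word 1.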
## What actually needs doing
1. **Relax the textured-alpha SPRITE eligibility gate** (`new_tex_abe_active`, gs_stub.sv ~:5114):
`(tex0_psm==6'h00)` → `(tex0_psm==6'h00 || tex0_psm==6'h13)` (PSMT8). PSMT4 (0x14) left out for v1.
2. **Validate the timing** — the one real risk. PSMT8 adds a byte-lane SELECT; under `TEX_RD_REGISTERED=1`
(the board config) the selector is realigned (`SEL_DELAY`). The Ch344 half-rate capture (ta_tex_q/ta_tex_q1,
the 1-deep texel delay) was tuned to PSMCT32's registered-read latency. We must prove the CLUT-decoded
texel is still valid at the frozen-beat capture for PSMT8 — a COMBINATIONAL-read TB would be a FALSE GREEN
(this exact trap bit Ch344). Use a **registered-read** TB.
3. **CLUT precondition**: a TEX0_1 write with CLD≠0 must fire (loading clut_stub) before the sprite draws —
same precondition as the proven indexed-DECAL path; it is declared and asserted in the TB.
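
To make the timing risk in step 2 concrete, here is a toy cycle model of a byte-lane mux fed by a read with configurable latency. It is not a model of the Ch344 half-rate datapath; `read_latency` and `sel_delay` are hypothetical stand-ins for the registered-read latency and the `SEL_DELAY` realignment. It only shows why a combinational-read TB cannot expose a selector that is misaligned under a registered read.

```python
def sampled_bytes(vram_words, byte_addrs, read_latency, sel_delay):
    """Return (request_index, byte_seen) for each request whose data reaches the mux.

    One address is issued per cycle. The word for request i arrives on cycle
    i + read_latency; the lane bits on the mux that cycle belong to the request
    issued sel_delay cycles earlier.
    """
    seen = []
    for i, addr in enumerate(byte_addrs):
        arrival_cycle = i + read_latency
        lane_source = arrival_cycle - sel_delay   # request whose lane bits are on the mux
        if not 0 <= lane_source < len(byte_addrs):
            continue                              # selector has no valid value here
        word = vram_words[addr >> 2]
        lane = byte_addrs[lane_source] & 3
        seen.append((i, (word >> (8 * lane)) & 0xFF))
    return seen


def expected_bytes(vram_words, byte_addrs):
    """Golden result: every request returns the byte at its own address."""
    return [(i, (vram_words[a >> 2] >> (8 * (a & 3))) & 0xFF) for i, a in enumerate(byte_addrs)]


vram = [0x44332211, 0x88776655]
addrs = [0, 1, 2, 3, 4, 5]
golden = expected_bytes(vram, addrs)

combinational = sampled_bytes(vram, addrs, read_latency=0, sel_delay=0)  # passes
aligned = sampled_bytes(vram, addrs, read_latency=1, sel_delay=1)        # passes
misaligned = sampled_bytes(vram, addrs, read_latency=1, sel_delay=0)     # wrong lanes
```

In this toy, the design with `sel_delay=0` matches the golden bytes when the read is combinational and fails only once the read is registered, which is the shape of a false green.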
## Pre-fit synthetic TB (buildable NOW — no capture needed), proving Codex's 5 points
`tb_gs_psmt8_alpha_sprite` (registered-read model, SPRITE_TEX_ALPHA=1, TEX_RD_REGISTERED=1):
1. index fetch hits the right byte (PSMT8 linear address → correct VRAM byte lane);
2. CLUT maps index → ABGR (program clut_stub via a CLD≠0 TEX0 / loader);
3. the **texel's** alpha (from the CLUT entry) drives source-over against the dest;
4. **no read2 collision** regression (texel read on primary beat, dest on frozen beat, CLUT lookup is
combinational — assert no overlap, incl. vs `load_busy`);
5. the **PSMCT32** sprite path stays green (cross-check the existing tb_gs_textured_alpha_sprite + regression).
Acceptance for the synthetic brick: the TB passes, the full regression stays green, and quartus_syn reports 0 errors.
This banks the hardware without claiming authentic content.
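
Point 4 (no read2 collision) amounts to a per-cycle mutual-exclusion check. A minimal trace checker, assuming the texel read, the dest read, and the CLUT loader all contend for the same port, could look like this. The `Beat` record and its field names other than `load_busy` are hypothetical; in the real TB this would be an assertion on the actual read2 request signals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Beat:
    """One clock cycle of read2 activity (field names are illustrative)."""
    tex_rd: bool      # texel index read, expected on the primary beat
    dst_rd: bool      # destination read, expected on the frozen beat
    load_busy: bool   # CLUT loader owns the port


def read2_collisions(trace):
    """Return the cycle numbers where more than one requester is active."""
    return [
        cycle
        for cycle, beat in enumerate(trace)
        if int(beat.tex_rd) + int(beat.dst_rd) + int(beat.load_busy) > 1
    ]


# Palette load first, then alternating primary/frozen beats: no overlap.
clean_trace = [
    Beat(False, False, True),
    Beat(False, False, True),
    Beat(True, False, False),
    Beat(False, True, False),
    Beat(True, False, False),
    Beat(False, True, False),
]

# Same draw, but the loader is still busy on the first texel read.
bad_trace = [
    Beat(False, False, True),
    Beat(False, False, True),
    Beat(True, False, True),
    Beat(False, True, False),
]
```

`read2_collisions(clean_trace)` returns an empty list, while `read2_collisions(bad_trace)` reports cycle 2, the case the `load_busy` guard exists to prevent.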
## Synthetic ≠ authentic — two separate labels (Codex)
The datapath proof (`tb_gs_psmt8_alpha_sprite`) proves index→CLUT→ABGR→source-over works. It is NOT authentic
CLUT *ingestion*. Authentic PSMT8 additionally requires the emitted TEX0's CLUT-side fields to select a CLUT
that is actually loaded and resident:
- **Screening (DONE, Ch346):** `gs_texture_residency.py` now decodes CBP/CPSM/CSM/CSA/CLD and, for indexed-PSM
(0x13/0x14) candidates, REQUIRES a resident CLUT upload at CBP before the draw (epoch-tracked, same as the
texture); otherwise REJECT. It also flags CLD=0 (no load trigger → possibly-stale palette). So `residency_ok()`
won't green-light a PSMT8 candidate whose palette isn't resident.
- **Emission (capture-step TODO):** the feeder/translator must carry the CLUT-side TEX0 fields. Today
`ps2_feeder.c`'s `tex0 TBP TBW TW TH TFX` grammar packs ONLY texture-side fields — it needs CBP/CPSM/CSM/CSA/
CLD added (and the fixture must upload the palette to CBP and issue a CLD≠0 TEX0 write so `clut_loader_stub` fires). Build
this around the exact Ch346-selected candidate, not speculatively.
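
The screening rule can be illustrated with a simplified stream walker. This is not the code in `gs_texture_residency.py`: the event format and the `screen_stream` name are hypothetical, and epoch tracking is collapsed to "uploaded earlier in the stream", so overwrites and invalidation are ignored.

```python
INDEXED_PSMS = {0x13, 0x14}  # PSMT8, PSMT4


def screen_stream(events):
    """Walk uploads and draws in order; return (verdict, notes) per draw.

    events: dicts with kind="upload" and "bp", or kind="draw" and
    "psm", "tbp", "cbp", "cld".
    """
    resident = set()   # base pointers uploaded so far (simplified residency)
    results = []
    for event in events:
        if event["kind"] == "upload":
            resident.add(event["bp"])
            continue
        notes = []
        ok = event["tbp"] in resident
        if not ok:
            notes.append("texture not resident")
        if event["psm"] in INDEXED_PSMS:
            if event["cbp"] not in resident:
                ok = False
                notes.append("CLUT not resident at CBP")
            if event["cld"] == 0:
                # Not a rejection on its own: the palette may be stale.
                notes.append("CLD=0: possibly stale palette")
        results.append(("ACCEPT" if ok else "REJECT", notes))
    return results


stream = [
    {"kind": "upload", "bp": 0x100},                                       # texture
    {"kind": "draw", "psm": 0x13, "tbp": 0x100, "cbp": 0x200, "cld": 1},   # palette missing
    {"kind": "upload", "bp": 0x200},                                       # palette
    {"kind": "draw", "psm": 0x13, "tbp": 0x100, "cbp": 0x200, "cld": 1},   # both resident
    {"kind": "draw", "psm": 0x13, "tbp": 0x100, "cbp": 0x200, "cld": 0},   # flagged only
]
verdicts = screen_stream(stream)
```

The first draw is rejected because its palette upload comes later in the stream; the third is accepted but carries the CLD=0 flag.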
## Board-fit guardrail (Codex guardrail 1) — RESOLVED
The "missing HDMI IO_STANDARD" the synth smoke reported was a FALSE alarm: the assignments are present + correct
in the QSF (with an `-entity` qualifier); the scaffold check's regexes were EOL-anchored and didn't tolerate the
qualifier. Fixed 3 checks in sim/Makefile (VIRTUAL_PIN + HDMI/ADV7513 IO_STANDARD). The QSF carries the full
77-source list (incl. osd/qsys platform modules under USE_QSYS_TOP) so the owner's board fit is unaffected.
NOTE: `quartus_syn_only` itself is a reduced smoke (files.f, 115 entries) that OMITS the platform modules, so it
can't fully elaborate the de25 top — a pre-existing smoke-scope limitation, not a board-fit blocker. Quartus
analyzed the Ch347 gs_stub change clean (the 7 elaboration errors are all unrelated platform entities).
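
The false alarm is easy to reproduce in miniature. The sketch below is a Python analogue, not the actual pattern in sim/Makefile; the pin name, I/O standard value, and entity name in the sample lines are placeholders. It contrasts an EOL-anchored pattern with one that tolerates a trailing `-entity` qualifier.

```python
import re

# Anchored at end of line right after the -to target: rejects any trailing qualifier.
STRICT = re.compile(
    r'^set_instance_assignment\s+-name\s+IO_STANDARD\s+"[^"]+"\s+-to\s+(\S+)\s*$'
)

# Same pattern, with an optional "-entity <name>" allowed before the end of line.
TOLERANT = re.compile(
    r'^set_instance_assignment\s+-name\s+IO_STANDARD\s+"[^"]+"\s+-to\s+(\S+)'
    r'(?:\s+-entity\s+\S+)?\s*$'
)

plain_line = 'set_instance_assignment -name IO_STANDARD "1.8 V" -to HDMI_TX_CLK'
qualified_line = plain_line + ' -entity top_entity'

strict_hits = [bool(STRICT.match(line)) for line in (plain_line, qualified_line)]
tolerant_hits = [bool(TOLERANT.match(line)) for line in (plain_line, qualified_line)]
pin = TOLERANT.match(qualified_line).group(1)
```

The strict pattern matches only the unqualified line, so a QSF that carries the qualifier looks as if the assignment were missing; the tolerant pattern matches both and still captures the pin name.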
## Authentic acceptance (gated on the capture — do NOT commit the target until it exists)
1. Capture a Beneath a Steel Sky (ScummVM-freeware) GS dump.
2. `gs_texture_residency.py` (Ch346) picks a RESIDENT, plausible PSMT8 candidate WITH a resident CLUT —
**prefer a no-wrap footprint** so we don't repeat the Ch345b wrap-mode ambiguity.
3. Extend `ps2_feeder.c`/translator with CLUT-side TEX0 fields + palette upload; emit the scene; software
reference pixel-diffs; then board fit (after confirming the board profile's clut_load_busy wiring).
Provenance: all dump-derived content stays LOCAL/gitignored, same discipline as the cube/sprite fixtures.
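
For step 3, the feeder has to emit a TEX0 word that includes the CLUT-side fields. The sketch below packs and unpacks a 64-bit TEX0 value in Python. The bit positions are the layout as recalled from public GS documentation and are an assumption here; they should be verified against the `gs_stub` decode before being ported to `ps2_feeder.c`. `pack_tex0` and `unpack_tex0` are hypothetical helper names.

```python
# (name, lsb, width): assumed TEX0 layout; widths sum to 64 bits.
TEX0_FIELDS = (
    ("TBP0", 0, 14), ("TBW", 14, 6), ("PSM", 20, 6), ("TW", 26, 4), ("TH", 30, 4),
    ("TCC", 34, 1), ("TFX", 35, 2),
    ("CBP", 37, 14), ("CPSM", 51, 4), ("CSM", 55, 1), ("CSA", 56, 5), ("CLD", 61, 3),
)


def pack_tex0(**values):
    """Pack named fields into one 64-bit word; unspecified fields default to 0."""
    known = {name for name, _, _ in TEX0_FIELDS}
    unknown = set(values) - known
    if unknown:
        raise ValueError(f"unknown TEX0 field(s): {sorted(unknown)}")
    word = 0
    for name, lsb, width in TEX0_FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value:#x} does not fit in {width} bits")
        word |= value << lsb
    return word


def unpack_tex0(word):
    """Inverse of pack_tex0: return a dict of every field."""
    return {name: (word >> lsb) & ((1 << width) - 1) for name, lsb, width in TEX0_FIELDS}


# A PSMT8 texture with a PSMCT32 palette at CBP and a load trigger (CLD != 0).
fields = dict(TBP0=0x100, TBW=4, PSM=0x13, TW=8, TH=8, TCC=1, TFX=0,
              CBP=0x200, CPSM=0, CSM=0, CSA=0, CLD=1)
tex0 = pack_tex0(**fields)
round_trip = unpack_tex0(tex0)
```

A round trip through `unpack_tex0` is a cheap self-check that the CLUT-side fields survive emission, independent of whether the texture-side grammar changes.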