retroDE_ps2/docs/ch281_closeout.md
thejayman77 ec82764bef Initial commit: retroDE_ps2 — first-of-its-kind PS2 GS FPGA core (DE25-Nano / Agilex 5)
RTL (GS rasterizer, EE core stub, platform bridge, LPDDR4B path), sim regression
(272 TBs), docs, and tooling. Copyrighted PS2 content (BIOS, game code, GS dumps,
and all dump-derived textures/traces) is excluded via .gitignore and stays local.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-29 20:10:50 -04:00

# Ch281 closeout — MMI3/PNOR (canonical NOT); next blocker is PAND
**Status:** Closed. **Verdict from re-running qbert.elf:**
`elf_first_unsupported_opcode (pc=0x00112C98 instr=0x70431489)`
opcode `0x1C` (MMI) + funct `0x09` (MMI2) + sa `0x12` = **PAND**
(Parallel AND). qbert is now deep into the SIMD byte-walker's
mask-and-reduce stage: PSUBB → PNOR → PAND.
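The field split behind this verdict can be reproduced in software. The following is a minimal Python sketch, not code from the repo: `decode_rtype` is a hypothetical helper that assumes the standard MIPS R-type layout (opcode 31:26, rs 25:21, rt 20:16, rd 15:11, sa 10:6, funct 5:0).

```python
def decode_rtype(instr: int) -> dict:
    """Split a 32-bit MIPS R-type word into its six fields (hypothetical helper)."""
    return {
        "opcode": (instr >> 26) & 0x3F,  # 0x1C selects the MMI class
        "rs": (instr >> 21) & 0x1F,
        "rt": (instr >> 16) & 0x1F,
        "rd": (instr >> 11) & 0x1F,
        "sa": (instr >> 6) & 0x1F,       # sub-operation within the MMIx group
        "funct": instr & 0x3F,           # selects the MMIx group
    }


fields = decode_rtype(0x70431489)
print({name: hex(value) for name, value in fields.items()})
```

Registers 2 and 3 are `$v0` and `$v1` in the usual MIPS numbering, which matches the `pand $v0, $v0, $v1` reading.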
## Numbers
| Chapter | Blocker | qbert retire_count |
|---------|---------|---------------------|
| Post-Ch279 (LQ) | PSUBB at 0x00112C90 | 27,020 |
| Post-Ch280 (PSUBB) | PNOR at 0x00112C94 | 27,021 |
| **Post-Ch281 (PNOR)** | **PAND at 0x00112C98** | **27,022** |
1-retire delta — PNOR retired, PAND traps next.
## What landed
### RTL — 5 surgical edits in `ee_core_stub.sv`
1. **Constants**: `FUNC_MMI3 = 6'h29`, `MMI3_PNOR = 5'h13`.
2. **Decode**: `is_pnor = is_mmi && (func == FUNC_MMI3) &&
(shamt == MMI3_PNOR)`. Same three-way AND as Ch278/Ch280.
3. **`is_rtype_alu` group**: added `is_pnor`.
4. **Writeback (REUSE)**: extended the existing NOR arm to
`else if (is_nor || is_pnor) rtype_alu_wb = ~(rs_val | rt_val)`.
Architectural 128-bit PNOR collapses to a regular 32-bit
bitwise NOR for the low lane.
5. **`is_nop_class` allow**: `!is_pnor` added.
5 LOC of real change. Pure pattern reuse from Ch280 PSUBB
(same MMI narrow-decode shape) plus reuse of the existing
NOR writeback arm.
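The reuse of the NOR arm rests on bitwise operations being lane-independent: the low 32 bits of a 128-bit NOR depend only on the low 32 bits of the operands. A small Python check of that property follows, as a sketch only. `pnor128`, `nor32`, and `low_lane_matches` are hypothetical names, and this models the arithmetic rather than the RTL.

```python
import random

MASK32 = (1 << 32) - 1
MASK128 = (1 << 128) - 1


def pnor128(rs: int, rt: int) -> int:
    """Architectural PNOR: bitwise NOR over the full 128-bit registers."""
    return ~(rs | rt) & MASK128


def nor32(rs: int, rt: int) -> int:
    """Plain 32-bit NOR, as the existing writeback arm computes it."""
    return ~(rs | rt) & MASK32


def low_lane_matches(trials: int = 1000, seed: int = 0) -> bool:
    """Compare the low word of pnor128 against nor32 on random operands."""
    rng = random.Random(seed)
    for _ in range(trials):
        rs = rng.getrandbits(128)
        rt = rng.getrandbits(128)
        if pnor128(rs, rt) & MASK32 != nor32(rs & MASK32, rt & MASK32):
            return False
    return True


print(low_lane_matches())
```

The same argument would not hold for instructions that move data between lanes, so it applies to PNOR and PAND but not automatically to other MMI operations.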
### Focused TB — `tb_ee_core_pnor.sv`
Three cases:
1. **qbert exact encoding**: `pnor $v1, $zero, $t1`. Encoder
asserted == `0x70091CE9`. With `$t1 = 0x12345678` → `$v1
= ~0x12345678 = 0xEDCBA987`.
2. **NOT-of-zero**: `pnor $t2, $0, $0` → `0xFFFFFFFF`. Both
operands zero; result is all-ones.
3. **General NOR**: `$t3 = 0xF0F0F0F0`, `$t4 = 0x0F0F0F0F`
→ `$t5 = ~(0xF0F0F0F0 | 0x0F0F0F0F) = ~0xFFFFFFFF = 0`.
Locks in the "general two-operand NOR" path even though
qbert's specific usage is the NOT-pseudo form.
Result: `retired=22 halt=1 trap=0 pc=0xbfc0014c errors=0 PASS`.
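The three cases can be cross-checked outside the simulator. This is a hedged Python sketch of the expected values only; `encode_pnor` and `nor32` are hypothetical helpers, the register numbers follow the usual MIPS convention, and nothing here reflects how `tb_ee_core_pnor.sv` is actually written.

```python
MASK32 = 0xFFFFFFFF
OPC_MMI, FUNC_MMI3, MMI3_PNOR = 0x1C, 0x29, 0x13
REG = {"$zero": 0, "$v1": 3, "$t1": 9}  # standard MIPS register numbers


def encode_pnor(rd: str, rs: str, rt: str) -> int:
    """Assemble `pnor rd, rs, rt` as a 32-bit instruction word."""
    return ((OPC_MMI << 26) | (REG[rs] << 21) | (REG[rt] << 16)
            | (REG[rd] << 11) | (MMI3_PNOR << 6) | FUNC_MMI3)


def nor32(rs_val: int, rt_val: int) -> int:
    """Low-lane model of PNOR: 32-bit bitwise NOR."""
    return ~(rs_val | rt_val) & MASK32


# Case 1: qbert's exact encoding and the NOT-pseudo form.
assert encode_pnor("$v1", "$zero", "$t1") == 0x70091CE9
assert nor32(0, 0x12345678) == 0xEDCBA987
# Case 2: NOT of zero is all-ones.
assert nor32(0, 0) == 0xFFFFFFFF
# Case 3: general two-operand NOR.
assert nor32(0xF0F0F0F0, 0x0F0F0F0F) == 0
print("all three PNOR cases match")
```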
### Makefile + regression
- `tb_ee_core_pnor` target.
- Added to both regression lists.
- Regression: 168 → **169**.
## qbert's SIMD byte-walker — pipeline shape now clear
Five MMI and quadword load/store chapters (Ch271 SQ plus Ch278
through Ch281, which include Ch279 LQ) have surfaced the full
byte-walker shape:
```
0x00112C88: lq $t1, 0($a1) ; Ch279 — load 128-bit chunk
0x00112C8C: <one instr, already supported; it never surfaced as a blocker>
0x00112C90: psubb $v0, $t1, $t2 ; Ch280 — per-byte subtract
0x00112C94: pnor $v1, $zero, $t1 ; Ch281 — ~$t1 (mask gen)
0x00112C98: pand $v0, $v0, $v1 ; Ch282 — mask the result
... reduction continues ...
```
This is the classic "find a zero byte" or "detect sentinel byte"
SIMD loop: `PSUBB` subtracts a per-byte key, `PNOR` against `$zero`
inverts the loaded chunk, and `PAND` combines the two so that only
the lanes where the condition holds stay set. `PMFHL` or a similar
instruction would then reduce the vector to a single GPR for a
branch test.
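For reference, the well-known zero-byte test has the form `(x - 0x01) & ~x & 0x80` in every byte lane. The Python sketch below models that idiom on a 16-byte chunk. It is an assumption that qbert uses exactly these constants: this note does not show the contents of `$t2` or the final mask, and `psubb` and `zero_byte_mask` are hypothetical helper names.

```python
def psubb(a: int, b: int, nbytes: int = 16) -> int:
    """Per-byte wrapping subtract with no borrow between lanes."""
    out = 0
    for i in range(nbytes):
        lane_a = (a >> (8 * i)) & 0xFF
        lane_b = (b >> (8 * i)) & 0xFF
        out |= ((lane_a - lane_b) & 0xFF) << (8 * i)
    return out


def zero_byte_mask(x: int, nbytes: int = 16) -> int:
    """Return 0x80 in every byte lane of x that holds 0x00, else 0."""
    full = (1 << (8 * nbytes)) - 1
    ones = int.from_bytes(b"\x01" * nbytes, "little")   # assumed key
    highs = int.from_bytes(b"\x80" * nbytes, "little")  # assumed final mask
    diff = psubb(x, ones, nbytes)   # PSUBB stage
    inverted = ~x & full            # PNOR with $zero stage
    return diff & inverted & highs  # PAND stages


chunk = int.from_bytes(b"qbert\x00ABCDEFGHIJ", "little")
mask = zero_byte_mask(chunk)
first_zero = ((mask & -mask).bit_length() - 1) // 8  # lowest flagged lane
print(hex(mask), first_zero)
```

Because the subtract is per byte, no borrow leaks between lanes, so a lane is flagged only when that byte is zero.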
## Recommendation for Codex's Ch282 — PAND
`0x70431489` at PC `0x00112C98`:
- opcode 0x1C (MMI)
- funct 0x09 (MMI2)
- sa 0x12 (PAND within MMI2)
- rs=$v0, rt=$v1, rd=$v0
- → `pand $v0, $v0, $v1`
Architectural: 128-bit `$rd = $rs & $rt`. For our 32-bit model:
**bit-identical to standard AND** (SPECIAL funct 0x24). Same
shape as PNOR/NOR — different opcode, reused writeback arm.
Implementation outline (mirrors Ch281 PNOR exactly):
1. `localparam MMI2_PAND = 5'h12`.
2. `is_pand = is_mmi && (func == FUNC_MMI2) && (shamt ==
MMI2_PAND)`. The MMI2 funct constant already exists from
Ch278.
3. Add to `is_rtype_alu`.
4. **Reuse the existing AND writeback arm**:
```sv
else if (is_and || is_pand) rtype_alu_wb = rs_val & rt_val;
```
5. Add `!is_pand` to `is_nop_class`.
~4 LOC.
Focused TB:
- Exact qbert encoding asserted == `0x70431489`.
- General AND case: `$rs = 0xFFFFFFFF`, `$rt = 0xAAAAAAAA` →
`$rd = 0xAAAAAAAA`.
- All-zero case: `$rs = 0xFFFFFFFF`, `$rt = 0x00000000` → `$rd = 0`.
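The expected values for these TB cases can be precomputed in the same way as for PNOR. This is a minimal Python sketch under the same assumptions as the outline: `encode_pand` and `and32` are hypothetical helpers, and the register numbers for `$v0`/`$v1` follow the standard MIPS convention.

```python
MASK32 = 0xFFFFFFFF
OPC_MMI, FUNC_MMI2, MMI2_PAND = 0x1C, 0x09, 0x12
V0, V1 = 2, 3  # standard MIPS register numbers for $v0 and $v1


def encode_pand(rd: int, rs: int, rt: int) -> int:
    """Assemble `pand rd, rs, rt` as a 32-bit instruction word."""
    return ((OPC_MMI << 26) | (rs << 21) | (rt << 16)
            | (rd << 11) | (MMI2_PAND << 6) | FUNC_MMI2)


def and32(rs_val: int, rt_val: int) -> int:
    """Low-lane model of PAND: 32-bit bitwise AND."""
    return rs_val & rt_val & MASK32


# Exact qbert encoding: pand $v0, $v0, $v1.
assert encode_pand(V0, V0, V1) == 0x70431489
# General AND case and all-zero case from the list above.
assert and32(0xFFFFFFFF, 0xAAAAAAAA) == 0xAAAAAAAA
assert and32(0xFFFFFFFF, 0x00000000) == 0
print("PAND encoding and both value cases match")
```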
**Likely follow-ons** after PAND: **PMFHL** (Parallel Move From
HI/LO) for the reduction, since the byte-walker needs to fold
the masked vector down to a scalar for branching. Alternatively,
**PEXTLW** (Parallel Extend Lower from Word) for a different
reduction shape.
## Pattern review (11 chapters)
| Ch | Blocker | Edits | Pattern |
|----|---------|-------|---------|
| 271 | SQ (first) | 5 | NEW 4-beat write |
| 272 | DADDU | 4 | NEW ALU-low-32 |
| 273 | SYSCALL HLE | 2 | NEW gated dispatcher |
| 274 | BEQL | 6 | NEW branch+squash |
| 275 | SD | 7 | REUSE SQ counter |
| 276 | DSLL | 4 | REUSE DADDU |
| 277 | BNEL | 6 | REUSE BEQL squash |
| 278 | PCPYLD | 4 | NEW MMI narrow-decode |
| 279 | LQ | 5 | REUSE LW path |
| 280 | PSUBB | 5 | REUSE MMI narrow (byte-SIMD) |
| **281** | **PNOR** | **5** | **REUSE MMI narrow + reuse NOR arm** |
5 NEW patterns + 6 REUSE chapters. The reuse density continues
to climb — Ch282 PAND will be the most-reused chapter yet (MMI
narrow-decode + standard-AND writeback, both already in place).
## Files changed
- `rtl/ee/ee_core_stub.sv` — 5 surgical edits.
- `sim/tb/integration/tb_ee_core_pnor.sv` — new focused TB.
- `sim/Makefile` — target + both regression lists.
## Regression
In flight; expected **169/169**.