Ch281 closeout — MMI3/PNOR (canonical NOT); next blocker is PAND
Status: Closed. Verdict from re-running qbert.elf:
elf_first_unsupported_opcode (pc=0x00112C98 instr=0x70431489) —
opcode 0x1C (MMI) + funct 0x09 (MMI2) + sa 0x12 = PAND
(Parallel AND). qbert is now deep into the SIMD byte-walker's
mask-and-reduce stage: PSUBB → PNOR → PAND.
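The field split quoted above can be reproduced with a few shifts and masks. This is a minimal Python sketch, not part of the repo's tooling; `decode_rtype` is a hypothetical helper name, and the field layout is the standard MIPS R-type layout that MMI instructions share.

```python
def decode_rtype(instr: int) -> dict:
    """Split a 32-bit MIPS R-type word into its six fields."""
    return {
        "opcode": (instr >> 26) & 0x3F,  # bits 31..26
        "rs": (instr >> 21) & 0x1F,      # bits 25..21
        "rt": (instr >> 16) & 0x1F,      # bits 20..16
        "rd": (instr >> 11) & 0x1F,      # bits 15..11
        "sa": (instr >> 6) & 0x1F,       # bits 10..6
        "funct": instr & 0x3F,           # bits 5..0
    }


# The trapping word reported by elf_first_unsupported_opcode.
fields = decode_rtype(0x70431489)
print({name: hex(value) for name, value in fields.items()})
```

For `0x70431489` this yields opcode 0x1C, funct 0x09, sa 0x12, with rs=2 ($v0), rt=3 ($v1), rd=2 ($v0), matching the PAND reading.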
Numbers
| Chapter | Blocker | qbert retire_count |
|---|---|---|
| Post-Ch279 (LQ) | PSUBB at 0x00112C90 | 27,020 |
| Post-Ch280 (PSUBB) | PNOR at 0x00112C94 | 27,021 |
| Post-Ch281 (PNOR) | PAND at 0x00112C98 | 27,022 |
1-retire delta — PNOR retired, PAND traps next.
What landed
RTL — 5 surgical edits in ee_core_stub.sv
- Constants: `FUNC_MMI3 = 6'h29`, `MMI3_PNOR = 5'h13`.
- Decode: `is_pnor = is_mmi && (func == FUNC_MMI3) && (shamt == MMI3_PNOR)`. Same three-way AND as Ch278/Ch280.
- `is_rtype_alu` group: added `is_pnor`.
- Writeback (REUSE): extended the existing NOR arm to `else if (is_nor || is_pnor) rtype_alu_wb = ~(rs_val | rt_val)`. Architectural 128-bit PNOR collapses to a regular 32-bit bitwise NOR for the low lane.
- `is_nop_class` allow: `!is_pnor` added.
5 LOC of real change. Pure pattern reuse from Ch280 PSUBB (same MMI narrow-decode shape) plus reuse of the existing NOR writeback arm.
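As a cross-check on the decode and writeback edits, the same logic can be modelled in software. The sketch below is a Python reference model, not the RTL: `OPC_MMI`, `is_pnor`, and `nor_wb` are illustrative names, and the model assumes `is_mmi` in the RTL means "major opcode is 0x1C".

```python
OPC_MMI = 0x1C      # assumed meaning of is_mmi: major opcode 0x1C
FUNC_MMI3 = 0x29    # mirrors FUNC_MMI3 = 6'h29
MMI3_PNOR = 0x13    # mirrors MMI3_PNOR = 5'h13
MASK32 = 0xFFFFFFFF


def is_pnor(instr: int) -> bool:
    """Three-way AND: MMI opcode, MMI3 funct, PNOR shamt."""
    opcode = (instr >> 26) & 0x3F
    shamt = (instr >> 6) & 0x1F
    func = instr & 0x3F
    return opcode == OPC_MMI and func == FUNC_MMI3 and shamt == MMI3_PNOR


def nor_wb(rs_val: int, rt_val: int) -> int:
    """Low-lane writeback shared by NOR and PNOR, truncated to 32 bits."""
    return ~(rs_val | rt_val) & MASK32


print(is_pnor(0x70091CE9))           # True: pnor $v1, $zero, $t1
print(is_pnor(0x70431489))           # False: that word is PAND (MMI2)
print(hex(nor_wb(0, 0x12345678)))    # 0xedcba987
```

The explicit `& MASK32` stands in for the fixed 32-bit width of `rtype_alu_wb`; Python integers are unbounded, so the truncation has to be written out.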
Focused TB — tb_ee_core_pnor.sv
Three cases:
- qbert exact encoding: `pnor $v1, $zero, $t1`. Encoder asserted == `0x70091CE9`. With `$t1 = 0x12345678` → `$v1 = ~0x12345678 = 0xEDCBA987`.
- NOT-of-zero: `pnor $t2, $0, $0` → `0xFFFFFFFF`. Both operands zero; result is all-ones.
- General NOR: `$t3 = 0xF0F0F0F0`, `$t4 = 0x0F0F0F0F` → `$t5 = ~(0xF0F0F0F0 | 0x0F0F0F0F) = ~0xFFFFFFFF = 0`. Locks in the "general two-operand NOR" path even though qbert's specific usage is the NOT-pseudo form.
Result: retired=22 halt=1 trap=0 pc=0xbfc0014c errors=0 PASS.
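The encoding assertion and the three expected values can also be checked outside the simulator. The following Python sketch is illustrative only; `encode_mmi` and `nor32` are hypothetical helpers, and the register numbers follow the usual MIPS convention ($zero=0, $v1=3, $t1=9).

```python
def encode_mmi(rs: int, rt: int, rd: int, sa: int, funct: int) -> int:
    """Assemble an MMI-class word: opcode 0x1C plus the R-type fields."""
    return (0x1C << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (sa << 6) | funct


def nor32(a: int, b: int) -> int:
    """32-bit bitwise NOR, the low-lane behaviour the TB checks."""
    return ~(a | b) & 0xFFFFFFFF


ZERO, V1, T1 = 0, 3, 9  # conventional MIPS register numbers

# Case 1: qbert's exact encoding, pnor $v1, $zero, $t1 (MMI3, sa 0x13).
word = encode_mmi(rs=ZERO, rt=T1, rd=V1, sa=0x13, funct=0x29)
assert word == 0x70091CE9

# (rs_val, rt_val, expected) for the three focused cases.
cases = [
    (0x00000000, 0x12345678, 0xEDCBA987),  # NOT-pseudo form
    (0x00000000, 0x00000000, 0xFFFFFFFF),  # NOT-of-zero
    (0xF0F0F0F0, 0x0F0F0F0F, 0x00000000),  # general two-operand NOR
]
results = [nor32(rs_val, rt_val) for rs_val, rt_val, _ in cases]
print([hex(r) for r in results])
```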
Makefile + regression
- `tb_ee_core_pnor` target.
- Added to both regression lists.
- Regression: 168 → 169.
qbert's SIMD byte-walker — pipeline shape now clear
Five MMI and quadword load/store chapters (Ch278–Ch281, which include Ch279 LQ, plus Ch271 SQ) have surfaced the byte-walker shape:
0x00112C88: lq $t1, 0($a1) ; Ch279 — load 128-bit chunk
0x00112C8C: <one instr that has not surfaced as a blocker>
0x00112C90: psubb $v0, $t1, $t2 ; Ch280 — per-byte subtract
0x00112C94: pnor $v1, $zero, $t1 ; Ch281 — ~$t1 (mask gen)
0x00112C98: pand $v0, $v0, $v1 ; Ch282 — mask the result
... reduction continues ...
This is the classic "find a zero byte" or "detect sentinel byte"
SIMD loop — PSUBB against a key, PNOR to invert the bits,
PAND with a mask to isolate the lanes where the condition
holds, then PMFHL or similar to reduce to a single GPR for a
branch test.
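To make the "find a zero byte" reading concrete, here is a Python sketch of the three-instruction sequence on full 16-byte lanes. Two things in it are assumptions that the trace does not confirm: that `$t2` holds 0x01 in every byte lane, and that a later step masks each lane with 0x80. The helper names are hypothetical, and the 32-bit RTL model does not implement per-byte lanes like this.

```python
LANES = 16  # a 128-bit quadword holds 16 byte lanes


def psubb(a: bytes, b: bytes) -> bytes:
    """Per-byte wrapping subtract (no borrow between lanes)."""
    return bytes((x - y) & 0xFF for x, y in zip(a, b))


def pnor(a: bytes, b: bytes) -> bytes:
    """Per-byte NOR; with a == 0 this is a plain NOT of b."""
    return bytes(~(x | y) & 0xFF for x, y in zip(a, b))


def pand(a: bytes, b: bytes) -> bytes:
    """Per-byte AND."""
    return bytes(x & y for x, y in zip(a, b))


def zero_byte_lanes(chunk: bytes) -> list:
    """Return the lane indices holding 0x00, via (b - 1) & ~b & 0x80."""
    ones = bytes([0x01]) * LANES   # ASSUMED contents of $t2
    high = bytes([0x80]) * LANES   # ASSUMED later mask, not seen in the trace
    diff = psubb(chunk, ones)              # psubb $v0, $t1, $t2
    inverted = pnor(bytes(LANES), chunk)   # pnor  $v1, $zero, $t1
    masked = pand(diff, inverted)          # pand  $v0, $v0, $v1
    hits = pand(masked, high)
    return [lane for lane, value in enumerate(hits) if value]


chunk = b"hello\x00world\x00\x80\xff\x01!"
print(zero_byte_lanes(chunk))  # [5, 11]
```

Because the subtract is per byte, no borrow leaks between lanes, so under these assumptions the test is exact for each lane: only a 0x00 byte has the high bit set in both `b - 1` and `~b`.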
Recommendation for Codex's Ch282 — PAND
0x70431489 at PC 0x00112C98:
- opcode 0x1C (MMI)
- funct 0x09 (MMI2)
- sa 0x12 (PAND within MMI2)
- rs=$v0, rt=$v1, rd=$v0
- → `pand $v0, $v0, $v1`
Architectural: 128-bit $rd = $rs & $rt. For our 32-bit model:
bit-identical to standard AND (SPECIAL funct 0x24). Same
shape as PNOR/NOR — different opcode, reused writeback arm.
Implementation outline (mirrors Ch281 PNOR exactly):
- `localparam MMI2_PAND = 5'h12`.
- `is_pand = is_mmi && (func == FUNC_MMI2) && (shamt == MMI2_PAND)`. The MMI2 funct constant already exists from Ch278.
- Add to `is_rtype_alu`.
- Reuse the existing AND writeback arm: `else if (is_and || is_pand) rtype_alu_wb = rs_val & rt_val;`
- Add `!is_pand` to `is_nop_class`.
~4 LOC.
Focused TB:
- Exact qbert encoding asserted == `0x70431489`.
- General AND case: `pand $rd, 0xFFFFFFFF, 0xAAAAAAAA` → `0xAAAAAAAA`.
- All-zero case: `pand $rd, 0xFFFFFFFF, 0x00000000` → 0.
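The proposed Ch282 vectors can be sanity-checked before any RTL is written. This Python sketch is a software stand-in under the same 32-bit low-lane assumption as the outline; `is_pand` and `and_wb` are hypothetical names, and `FUNC_MMI2 = 0x09` is taken from the trap decode rather than read out of the RTL source.

```python
FUNC_MMI2 = 0x09   # funct value from the trap decode
MMI2_PAND = 0x12   # sa value from the trap decode


def is_pand(instr: int) -> bool:
    """MMI opcode, MMI2 funct, PAND shamt: the narrow-decode shape."""
    return (
        (instr >> 26) & 0x3F == 0x1C
        and instr & 0x3F == FUNC_MMI2
        and (instr >> 6) & 0x1F == MMI2_PAND
    )


def and_wb(rs_val: int, rt_val: int) -> int:
    """Low-lane writeback shared by AND and PAND."""
    return (rs_val & rt_val) & 0xFFFFFFFF


# pand $v0, $v0, $v1: rs=2, rt=3, rd=2.
pand_word = (0x1C << 26) | (2 << 21) | (3 << 16) | (2 << 11) | (MMI2_PAND << 6) | FUNC_MMI2
assert pand_word == 0x70431489 and is_pand(pand_word)

# (rs_val, rt_val, expected) for the two value cases in the outline.
vectors = [
    (0xFFFFFFFF, 0xAAAAAAAA, 0xAAAAAAAA),  # general AND
    (0xFFFFFFFF, 0x00000000, 0x00000000),  # all-zero
]
for rs_val, rt_val, expected in vectors:
    assert and_wb(rs_val, rt_val) == expected
print("PAND vectors OK")
```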
Likely follow-ons after PAND: PMFHL (parallel move from HI/LO) for the reduction, since the byte-walker needs to fold the masked vector down to a scalar for branching. Alternatively, PEXTLW (parallel extend lower from word) for a different reduction shape.
Pattern review (11 chapters)
| Ch | Blocker | Edits | Pattern |
|---|---|---|---|
| 271 | SQ (first) | 5 | NEW 4-beat write |
| 272 | DADDU | 4 | NEW ALU-low-32 |
| 273 | SYSCALL HLE | 2 | NEW gated dispatcher |
| 274 | BEQL | 6 | NEW branch+squash |
| 275 | SD | 7 | REUSE SQ counter |
| 276 | DSLL | 4 | REUSE DADDU |
| 277 | BNEL | 6 | REUSE BEQL squash |
| 278 | PCPYLD | 4 | NEW MMI narrow-decode |
| 279 | LQ | 5 | REUSE LW path |
| 280 | PSUBB | 5 | REUSE MMI narrow (byte-SIMD) |
| 281 | PNOR | 5 | REUSE MMI narrow + reuse NOR arm |
5 NEW patterns + 6 REUSE chapters. The reuse density continues to climb — Ch282 PAND will be the most-reused chapter yet (MMI narrow-decode + standard-AND writeback, both already in place).
Files changed
- `rtl/ee/ee_core_stub.sv`: 5 surgical edits.
- `sim/tb/integration/tb_ee_core_pnor.sv`: new focused TB.
- `sim/Makefile`: target + both regression lists.
Regression
In flight; expected 169/169.