# Ch272 closeout: DADDU implemented; qbert clears the prolog ALU work, hits SYSCALL #60

**Status:** Closed.

**Verdict from re-running qbert.elf:** `elf_halted`. qbert ran past DADDU cleanly and **executed `SYSCALL` at PC 0x00100070** (= `SYSCALL #60`, `InitMainThread`, called `SetupThread` in ps2sdk; the first kernel call in the standard PS2 crt0 prolog). That frames Ch273.

## Numbers

| Metric               | Ch270 (init)             | Post-Ch271 (SQ)          | **Post-Ch272 (DADDU)**   |
|----------------------|--------------------------|--------------------------|--------------------------|
| qbert retire_count   | 12                       | 26,958                   | **26,960**               |
| Verdict              | first_unsupported_opcode | first_unsupported_opcode | **`elf_halted`** (new)   |
| Blocker PC           | 0x00100024               | 0x00100068               | 0x00100070               |
| Blocker instr / kind | 0x7C400000 (SQ)          | 0x0080E02D (DADDU)       | 0x0000000C (**SYSCALL**) |

The retire delta from Ch271 → Ch272 is small (+2) because the DADDU we implemented is at PC 0x00100068, immediately followed by `addiu $v1, $0, 0x3C` (the syscall number) and `syscall`. The core retires the DADDU and the ADDIU, then halts on the SYSCALL. The next two syscalls (61, 100) are queued up at 0x0010008C / 0x0010009C.

## What landed

### RTL: 4 surgical edits in `ee_core_stub.sv`

1. `localparam logic [5:0] FUNC_DADDU = 6'h2D` alongside FUNC_ADDU.
2. `is_daddu` logic decl + `assign is_daddu = is_special && (func == FUNC_DADDU)`.
3. Added `is_daddu` to the `is_rtype_alu` group.
4. Added `is_daddu` to the `(is_add || is_addu)` arm of `rtype_alu_wb`: same low-32-bit add, no overflow trap. The upper 32 bits of the 64-bit DADDU result are silently dropped, exactly matching how ADDU already behaves in this stub. Documented in the RTL comment.

### Focused TB: `tb_ee_core_daddu`

Three cases per Codex's spec:

1. **Normal add**: `daddu $t0, $a0, $a1` with `$a0=5, $a1=3` → `$t0 = 8`.
2. **Move case (exact qbert encoding)**: builds the literal `0x0080E02D` via `enc_rtype()` and **asserts the produced word equals 0x0080E02D** before installing it, so a future regression in the encoder helper trips loudly here. Then `daddu $gp, $a0, $zero` with `$a0=5` → `$gp = 5`.
3. **Wraparound**: `daddu $t3, $a2, $a2` with `$a2 = 0x80000000` → `$t3 = 0` (low 32 bits wrap). No overflow trap; post-halt, `trap_events == 0` confirms.

Belt-and-braces hierarchical register peeks after halt for `$t0`/`$gp`/`$t3`, so a future BNE-chain regression can't silently pass with wrong values.

Result: `retired=17 halt=1 trap=0 pc=0xbfc00138 errors=0 PASS`. Final PC is at the PASS syscall slot.

### Makefile + regression

- `tb_ee_core_daddu` target.
- Added to both the PHONY list and the `run:` master.
- Regression bumps 159 → 160.

## qbert disassembly around the new blocker (PC 0x00100070)

Decoded from the qbert.elf file (`python3 -c "..." with struct.unpack`):

```
0x00100060: 0x3C080010  lui   $t0, 0x0010
0x00100064: 0x25080188  addiu $t0, $t0, 0x0188    ; $t0 = 0x00100188 (5th arg to syscall 60, root func?)
0x00100068: 0x0080E02D  daddu $gp, $a0, $0        ; Ch272: $gp <- $a0
0x0010006C: 0x2403003C  addiu $v1, $0, 0x003C     ; $v1 = 60 = InitMainThread
0x00100070: 0x0000000C  syscall                   ; <-- CURRENT BLOCKER
0x00100074: 0x0040E82D  daddu $sp, $v0, $0        ; $sp <- $v0 (initial stack pointer)
0x00100078: 0x2403003D  addiu $v1, $0, 0x003D     ; $v1 = 61 = InitHeap
0x0010007C: 0x3C040014  lui   $a0, 0x0014
0x00100080: 0x2484B6E8  addiu $a0, $a0, -0x4918   ; $a0 = 0x0013B6E8 (heap start)
0x00100084: 0x3C050000  lui   $a1, 0x0000
0x00100088: 0x24A5FFFF  addiu $a1, $a1, -1        ; $a1 = -1 (default heap size)
0x0010008C: 0x0000000C  syscall                   ; SYSCALL #61
0x00100090: 0x00000000  nop
0x00100094: 0x24030064  addiu $v1, $0, 0x0064     ; $v1 = 100 = FlushCache
0x00100098: 0x0000202D  daddu $a0, $0, $0         ; $a0 = 0
0x0010009C: 0x0000000C  syscall                   ; SYSCALL #100
```

This is **textbook PS2 crt0 init**:

1. `InitMainThread(gp, stack, stack_size, args, root)` sets up the main thread and returns the initial stack pointer; the result becomes `$sp`.
2. `InitHeap(heap_start=0x0013B6E8, heap_size=-1)` sets up the heap; the prolog shown does not consume its result.
3. `FlushCache(0)` writes back the data cache.

If we don't model these, qbert can't even reach `main()`.

## Recommendation for Codex's Ch273

The next blocker is **SYSCALL**, not an opcode. Three Ch273 framings:

**(A) Minimal "kernel-stub" SYSCALL dispatch.** Replace the current "halt on any non-Ch199 syscall" with a small case statement keyed on `$v1`. For the three syscalls qbert needs immediately:

| `$v1` | name           | minimum needed                                                          |
|-------|----------------|-------------------------------------------------------------------------|
| 0x3C  | InitMainThread | return `$v0 = 0x001E0000` (or any plausible top-of-RAM stack address); advance PC; RFE |
| 0x3D  | InitHeap       | return `$v0 = $a0` (or `$a0+$a1`; "heap-base" pattern); advance PC; RFE |
| 0x64  | FlushCache     | return `$v0 = 0` (no modeled cache); advance PC; RFE                    |

Each case is "set `$v0`, RFE back to EPC+4." Unhandled syscalls fall through to the existing halt, so we still find the next real blocker.

**(B) "Generic-return" SYSCALL.** Make EVERY SYSCALL (other than the Ch199 special case) just set `$v0 = 0` and RFE. Even faster to land, but a syscall that EXPECTS a non-zero return (like `InitMainThread` returning the initial stack pointer) would silently misbehave: `$sp` would become 0, and the next stack store would AdES-trap or write to garbage. Probably the wrong choice.

**(C) Full PS2 EE kernel-call dispatcher.** Hundreds of syscalls (`InitMainThread`, `CreateThread`, `WaitSema`, `SifSetReg`, `GsPutIMR`, ...). Out of scope for one chapter.

**My read: (A).** Three syscalls, three case arms, three focused TB checks. Same incremental-growth pattern as Ch271/272, but at the system-call level instead of the opcode level.

The three values returned (InitMainThread, InitHeap, FlushCache) need to be plausible for qbert's downstream code to work. `InitMainThread` returning 0x001E0000 (1.875 MiB) keeps the stack below the 2 MiB EE-RAM ceiling our TB allocates. The exact return value for `InitHeap` can probably be "return what would be sensible"; Codex can pick.

## Files changed

- `rtl/ee/ee_core_stub.sv`: 4 surgical edits (~6 LOC total).
- `sim/tb/integration/tb_ee_core_daddu.sv`: new focused TB.
- `sim/Makefile`: `tb_ee_core_daddu` target + both regression lists.

## Regression

In flight; expected 160/160 (was 159, +1 for `tb_ee_core_daddu`).

## Pattern-summary

Ch271 + Ch272 = the opcode-by-opcode growth track Codex originally framed. Two chapters, two opcodes, two focused TBs; qbert progresses from 12 → 26,960 retires and clears the entire ALU portion of the prolog. **The runner is doing exactly what it's supposed to do**: surface the next concrete blocker, chapter by chapter.

Ch273 is the first non-opcode blocker. It still fits the "one-question-one-chapter" pattern, but now the question is "what should the kernel return for this syscall?" instead of "what does this opcode do?".
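
As a starting point for that Ch273 question, framing (A) can be sketched as a small behavioural model. This is a hedged Python sketch, not the RTL. The names `enc_rtype` (an assumed-argument-order mirror of the TB helper), `daddu_low32`, `dispatch_syscall`, and `run` are hypothetical. The register file is a plain 32-entry list, the initial `$a0` is an arbitrary stand-in, and `0x001E0000` is the placeholder return value from the option (A) table. It replays the sixteen prolog words from 0x00100060 and checks that syscalls 0x3C, 0x3D, and 0x64 are serviced in order, while any other syscall number still falls through to a halt.

```python
MASK32 = 0xFFFFFFFF
FUNC_DADDU = 0x2D      # SPECIAL function field, same value as the RTL localparam
FUNC_SYSCALL = 0x0C
ZERO, V0, V1, A0, A1, GP, SP = 0, 2, 3, 4, 5, 28, 29

STACK_TOP = 0x001E0000  # placeholder from the option (A) table, not a real kernel value


def enc_rtype(rs, rt, rd, shamt=0, func=0):
    """Encode a MIPS SPECIAL (opcode 0) R-type word; argument order is an assumption."""
    return (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | func


def daddu_low32(a, b):
    """DADDU as the stub models it: low-32-bit add, upper bits dropped, no trap."""
    return (a + b) & MASK32


def dispatch_syscall(regs):
    """Option (A): set $v0 for the three known calls; return False to halt otherwise."""
    num = regs[V1]
    if num == 0x3C:      # result is moved into $sp by the prolog
        regs[V0] = STACK_TOP
    elif num == 0x3D:    # result is not consumed by the prolog shown
        regs[V0] = regs[A0]
    elif num == 0x64:    # no cache is modeled
        regs[V0] = 0
    else:
        return False
    return True


def sext16(x):
    """Sign-extend a 16-bit immediate."""
    return x - 0x10000 if x & 0x8000 else x


def run(words, regs):
    """Execute the handful of opcodes the prolog uses; return serviced syscall numbers."""
    serviced = []
    for w in words:
        op, rs, rt, rd = w >> 26, (w >> 21) & 31, (w >> 16) & 31, (w >> 11) & 31
        imm, func = w & 0xFFFF, w & 0x3F
        if op == 0x0F:                            # lui
            regs[rt] = (imm << 16) & MASK32
        elif op == 0x09:                          # addiu
            regs[rt] = (regs[rs] + sext16(imm)) & MASK32
        elif op == 0 and func == FUNC_DADDU:      # daddu (low 32 bits only)
            regs[rd] = daddu_low32(regs[rs], regs[rt])
        elif op == 0 and func == FUNC_SYSCALL:    # syscall
            if not dispatch_syscall(regs):
                break                             # unhandled: behave like the existing halt
            serviced.append(regs[V1])
        elif w == 0:                              # nop
            pass
        else:
            raise ValueError(f"unmodeled word 0x{w:08X}")
        regs[ZERO] = 0                            # $zero stays hard-wired
    return serviced


# Prolog words 0x00100060..0x0010009C, copied from the disassembly.
PROLOG = [
    0x3C080010, 0x25080188, 0x0080E02D, 0x2403003C, 0x0000000C, 0x0040E82D,
    0x2403003D, 0x3C040014, 0x2484B6E8, 0x3C050000, 0x24A5FFFF, 0x0000000C,
    0x00000000, 0x24030064, 0x0000202D, 0x0000000C,
]

regs = [0] * 32
regs[A0] = 0x0013A570   # arbitrary stand-in for whatever $a0 holds at 0x00100068
serviced = run(PROLOG, regs)
print("serviced:", [hex(n) for n in serviced])
print("$sp = 0x%08X, $gp = 0x%08X" % (regs[SP], regs[GP]))
```

Keying the dispatch on the syscall number alone keeps the model independent of what each call is named, and the explicit `False` return preserves the "halt on the next unknown blocker" behaviour the runner relies on.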