retroDE_ps2/docs/ch273_closeout.md
thejayman77 ec82764bef Initial commit: retroDE_ps2 — first-of-its-kind PS2 GS FPGA core (DE25-Nano / Agilex 5)
RTL (GS rasterizer, EE core stub, platform bridge, LPDDR4B path), sim regression
(272 TBs), docs, and tooling. Copyrighted PS2 content (BIOS, game code, GS dumps,
and all dump-derived textures/traces) is excluded via .gitignore and stays local.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-29 20:10:50 -04:00

# Ch273 closeout — minimal EE syscall HLE; qbert clears its kernel-call prolog, next blocker is BEQL
**Status:** Closed. Codex's spec implemented exactly: minimal
HLE dispatcher for three crt0 syscalls (`EndOfHeap`,
`InitMainThread`, `FlushCache`), gated behind a parameter so
existing TBs are unaffected. **Verdict from re-running
qbert.elf:** `elf_first_unsupported_opcode (pc=0x001000C0
instr=0x50600004)`, which is **BEQL** (branch on equal likely), MIPS-II.
That frames Ch274.
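The verdict word can be cross-checked by hand. The following is a minimal sketch of an I-type field decoder, not the runner's actual code; the helper names `decode_itype` and `branch_target` are hypothetical and exist only for this illustration.

```python
def decode_itype(instr: int) -> dict:
    """Split a 32-bit MIPS I-type word into opcode, rs, rt and a signed immediate."""
    imm = instr & 0xFFFF
    if imm & 0x8000:          # sign-extend the 16-bit immediate
        imm -= 0x10000
    return {
        "opcode": (instr >> 26) & 0x3F,
        "rs": (instr >> 21) & 0x1F,
        "rt": (instr >> 16) & 0x1F,
        "imm": imm,
    }


def branch_target(pc: int, imm: int) -> int:
    """MIPS branch offsets are in words, relative to the delay-slot address (pc + 4)."""
    return (pc + 4 + (imm << 2)) & 0xFFFFFFFF


fields = decode_itype(0x50600004)
target = branch_target(0x001000C0, fields["imm"])
print({k: hex(v) for k, v in fields.items()}, hex(target))
```

Opcode `0x14` with `rs=3` (`$v1`) and `rt=0` is `beql $v1, $0, +4`, and the computed target is the `jal` at `0x001000D4` in the disassembly further down.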
## Numbers across the opcode/syscall chapters
| Chapter | Blocker | qbert retire_count | Verdict |
|---------|---------|---------------------|---------|
| Ch270 (init) | SQ at 0x00100024 | 12 | first_unsupported_opcode |
| Post-Ch271 (SQ) | DADDU at 0x00100068 | 26,958 | first_unsupported_opcode |
| Post-Ch272 (DADDU) | SYSCALL at 0x00100070 | 26,960 | `elf_halted` |
| **Post-Ch273 (SYSCALL HLE)** | **BEQL at 0x001000C0** | **26,980** | **`elf_first_unsupported_opcode`** |
20 more retires this chapter: all 3 syscalls dispatched, the
prolog used the returns to set up `$sp` and a small initializer-
table walker, and the trap fires at the FIRST instruction the
crt0 emits that we don't decode — `BEQL`.
## What landed
### RTL — 2 surgical additions in `ee_core_stub.sv`
1. **Parameter**: `EE_SYSCALL_HLE_ENABLE` (default `1'b0`) +
`SYSCALL_HEAP_END` (default `32'h001E_0000`). Default-off so
every existing TB whose `syscall` is a "halt-PASS-marker"
(addi/slti/etc.) keeps its semantics.
2. **Dispatcher**: new `else if (EE_SYSCALL_HLE_ENABLE)` branch
after the Ch199 special case. `case (regfile[3])` on `$v1`:
| `$v1` | name | `$v0` returned | resume |
|-------|----------------|-----------------------|-------------|
| 0x3C | EndOfHeap | `SYSCALL_HEAP_END` | PC + 4 |
| 0x3D | InitMainThread | 0 | PC + 4 |
| 0x64 | FlushCache | 0 | PC + 4 |
| other | (unhandled) | (none) | **halt** |
`pc <= pc + 4` (per Codex's correction — this is normal
user-code SYSCALL resume, NOT RFE; RFE is Ch199's path).
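As a reading aid, the dispatch table can be expressed as a small behavioral model. This is a sketch in Python, not the RTL: `step_syscall` and `HLE_SYSCALLS` are hypothetical names, the Ch199 special case that precedes this branch is ignored, and the model deliberately makes no claim about where the PC rests after a halt.

```python
SYSCALL_HEAP_END = 0x001E0000  # default value of the parameter described above
V0, V1 = 2, 3                  # MIPS register numbers for $v0 and $v1

# $v1 selector -> (name, value written to $v0)
HLE_SYSCALLS = {
    0x3C: ("EndOfHeap", SYSCALL_HEAP_END),
    0x3D: ("InitMainThread", 0),
    0x64: ("FlushCache", 0),
}


def step_syscall(regfile, pc, hle_enable=True):
    """Model one retired `syscall`. Returns (next_pc, halted).

    next_pc is None on halt, because this sketch does not model the
    halted PC of the real core.
    """
    entry = HLE_SYSCALLS.get(regfile[V1]) if hle_enable else None
    if entry is None:
        # Unhandled selector, or HLE gated off: halt and leave $v0 alone.
        return None, True
    _name, result = entry
    regfile[V0] = result
    # Normal user-mode resume at the next instruction (not an exception return).
    return (pc + 4) & 0xFFFFFFFF, False


regs = [0] * 32
regs[V1] = 0x3C
print(step_syscall(regs, 0x00100070), hex(regs[V0]))
```

With `hle_enable=False` every `syscall` halts, which mirrors the default-off behavior that existing TBs rely on.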
### Focused TB — `tb_ee_core_syscall_hle`
Four cases:
1. `syscall` with `$v1=0x3C` → verify `$v0 = 0x001E0000`
2. `syscall` with `$v1=0x3D` → verify `$v0 = 0`
3. `syscall` with `$v1=0x64` → verify `$v0 = 0`
4. `syscall` with `$v1=0x7777` → verify HALT (PASS marker)
Two independent checks: the TB captures `$v0` on the cycle after each
known syscall retires, and the program itself runs a
`BNE $v0, expected, FAIL` chain. Both must agree. The final PC together
with `$v1=0x7777` after the halt confirms that the run ended on the
unhandled-syscall path.
Result: `retired=17 halt=1 trap=0 errors=0 PASS`.
### Runner update — `tb_ee_core_elf_runner.sv`
- Wires `EE_SYSCALL_HLE_ENABLE=1` on the ee_core_stub.
- Halt-time SUMMARY now includes the live register snapshot:
```
saw_halt = 1 at_pc=0x... $v1=0x... $a0=0x... $a1=0x... $a2=0x... $a3=0x...
```
- New verdict shape `elf_first_unhandled_syscall` when the halt
is on a `0x0000000C` instruction with unknown `$v1`. (For this
qbert run, the dispatcher handled all 3 and the trap was a
separate opcode issue — but the verdict shape is ready for
whenever the next unknown SYSCALL surfaces.)
### Makefile
- `tb_ee_core_syscall_hle` target.
- Added to both regression lists.
- Regression: 160 → **161**.
## Codex Ch273 acceptance — line-by-line
| Requirement | Status |
|----------------------------------------------------------------------------|--------|
| Minimal HLE handler in ee_core_stub for normal user-mode SYSCALL | ✅ |
| $v1=0x3C EndOfHeap → conservative top-of-RAM, PC+=4 | ✅ |
| $v1=0x3D InitMainThread → success ($v0=0), no scheduler mutation, PC+=4 | ✅ |
| $v1=0x64 FlushCache → no-op success, PC+=4 | ✅ |
| **Not RFE — PC = syscall PC + 4** | ✅ |
| Unhandled $v1 still halts; TB can read $v1/$a0-$a3 for verdict | ✅ |
| Focused TB: 3 syscalls in sequence + 1 unknown-fallback | ✅ |
| Regression unchanged for default-off | ✅ |
| Re-run qbert, report next blocker | ✅ |
## qbert disassembly around the new blocker
```
0x001000A0: lui $v0, 0x0013 ; $v0 = 0x00130000
0x001000A4: addiu $v0, $v0, 0xC800 ; $v0 = 0x0012C800
0x001000A8: lw $v1, 0($v0) ; $v1 = mem[0x0012C800]
0x001000AC: bne $v1, $0, +7*4 ; skip ahead if non-zero
0x001000B0: nop ; delay
0x001000B4: lui $v0, 0x0013
0x001000B8: addiu $v0, $v0, 0xC944 ; $v0 = 0x0012C944
0x001000BC: lw $v1, 0($v0) ; $v1 = mem[0x0012C944] (= 0 per halt $v1=0)
0x001000C0: beql $v1, $0, +4*4 ; <-- TRAPS HERE
0x001000C4: addiu $a0, $0, 0 ; delay slot (squashed if BEQL not taken)
0x001000C8: addiu $v0, $v1, 4
0x001000CC: lw $a0, 0($v0)
0x001000D0: addiu $a1, $v0, 4
0x001000D4: jal <constructor table walker>
```
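The address comments in the listing depend on `addiu` sign-extending its 16-bit immediate, which is why `lui 0x0013` plus `0xC800` lands below `0x00130000` rather than above it. A short sketch of that arithmetic (the helper names are hypothetical):

```python
def sext16(value: int) -> int:
    """Sign-extend a 16-bit immediate to a Python int."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def lui_addiu(hi: int, lo: int) -> int:
    """Effective 32-bit value of `lui rd, hi` followed by `addiu rd, rd, lo`."""
    return ((hi << 16) + sext16(lo)) & 0xFFFFFFFF


for hi, lo in [(0x0013, 0xC800), (0x0013, 0xC944)]:
    print(f"lui 0x{hi:04X} / addiu 0x{lo:04X} -> 0x{lui_addiu(hi, lo):08X}")
```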
This is the C++ static-constructor walker (or a similar
initialization table). The BEQL checks whether the table head
pointer is null — and **branch-likely semantics are
load-bearing**: the delay slot at `0x001000C4` clobbers `$a0`
to 0 only if the branch is taken. If we naïvely decode BEQL as
plain BEQ, the delay slot would execute on the not-taken path
too, silently corrupting `$a0`.
## Recommendation for Codex's Ch274
**Implement BEQL with proper "squash on not-taken" semantics.**
MIPS-II "branch likely" family: BEQL (0x14), BNEL (0x15), BLEZL
(0x16), BGTZL (0x17), and REGIMM BLTZL/BGEZL/BLTZALL/BGEZALL.
Compilers (especially older PS2 SDK gcc with `-fmove-loop-invariants`
or default for-loops) emit these as the canonical loop branch.
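For reference, the whole family can be recognized from the primary opcode plus, for REGIMM (opcode `0x01`), the `rt` field. The sketch below uses the standard MIPS-II encodings; the function name `branch_likely_mnemonic` is hypothetical and this is not how `ee_core_stub.sv` is structured.

```python
# Primary-opcode branch-likely instructions (bits 31..26).
PRIMARY_LIKELY = {0x14: "BEQL", 0x15: "BNEL", 0x16: "BLEZL", 0x17: "BGTZL"}

# REGIMM branch-likely instructions, selected by the rt field (bits 20..16).
REGIMM_OPCODE = 0x01
REGIMM_LIKELY = {0x02: "BLTZL", 0x03: "BGEZL", 0x12: "BLTZALL", 0x13: "BGEZALL"}


def branch_likely_mnemonic(instr: int):
    """Return the branch-likely mnemonic for a 32-bit word, or None if it is not one."""
    opcode = (instr >> 26) & 0x3F
    if opcode == REGIMM_OPCODE:
        return REGIMM_LIKELY.get((instr >> 16) & 0x1F)
    return PRIMARY_LIKELY.get(opcode)


print(branch_likely_mnemonic(0x50600004))  # the word from the qbert verdict
```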
Three Ch274 framings, in order of scope:
1. **BEQL only.** Smallest change. Decode `is_beql`, share
`branch_taken` logic with BEQ (rs==rt), but unlike BEQ, when
not taken: PC += 8 (skip both the branch and its delay slot),
no delay-slot execute. Adds `is_branch_likely` distinction
in the retire/PC-advance logic.
2. **BEQL + BNEL** (the two most common). BNEL is the inverse
condition (rs!=rt); same likely semantics. They decode as
opcodes `0x14` (BEQL) and `0x15` (BNEL) respectively.
3. **Full branch-likely family.** BEQL/BNEL/BLEZL/BGTZL + REGIMM
variants. Bigger surface; usually only 1-2 of these are needed
per chapter, until qbert or a later ELF surfaces another.
**My read: (1) — BEQL only.** Same one-question-one-chapter
pattern. The next blocker after BEQL might or might not be
BNEL; let the runner pick.
The implementation hook: existing ee_core_stub has
`branch_pending` + `instr_in_delay_slot` + a `branch_taken`
combinational signal. For BEQL we need to gate "set
branch_pending + queue delay-slot execution" on `branch_taken`,
and on not-taken just `pc <= pc + 8` directly (skip the delay
slot). Probably a 5-8 line change.
Focused TB: 3 cases mirroring Ch272 shape —
- BEQL taken: `$v1==$0`, target reached, delay slot executed
(writes $a0 to a sentinel value).
- BEQL not-taken: `$v1!=$0`, target NOT reached, delay slot
squashed (sentinel value NOT written; the original $a0
preserved).
- Cross-check vs BEQ: identical inputs through a BEQ should
produce different $a0 on the not-taken case (BEQ's delay
slot fires).
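The three cases can be pinned down with a tiny golden model before any RTL is written. This is a sketch under stated assumptions: `run_branch` and `SENTINEL` are hypothetical, the branch compares `$v1` against `$0`, and the delay slot is assumed to write the sentinel into `$a0`.

```python
SENTINEL = 0xA5A5A5A5  # value the (assumed) delay-slot instruction writes to $a0


def run_branch(mnemonic: str, v1: int, a0: int):
    """Model `<mnemonic> $v1, $0, target` followed by a delay slot that sets $a0.

    Returns (final_a0, target_reached, delay_slot_executed).
    """
    if mnemonic not in ("beq", "beql"):
        raise ValueError(f"unsupported mnemonic: {mnemonic}")
    taken = (v1 == 0)
    likely = (mnemonic == "beql")
    # BEQ always executes its delay slot; BEQL squashes it when not taken.
    delay_slot_executed = taken or not likely
    if delay_slot_executed:
        a0 = SENTINEL
    return a0, taken, delay_slot_executed


for mnemonic, v1 in [("beql", 0), ("beql", 1), ("beq", 1)]:
    a0, reached, ran = run_branch(mnemonic, v1, a0=0x1234)
    print(f"{mnemonic} v1={v1}: a0=0x{a0:08X} target={reached} delay_slot={ran}")
```

The second and third cases differ only in `$a0`, which is the observable the cross-check relies on.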
## Files changed
- `rtl/ee/ee_core_stub.sv` — 2 surgical additions (parameter +
dispatcher case statement, ~30 LOC).
- `sim/tb/integration/tb_ee_core_syscall_hle.sv` — new focused TB.
- `sim/tb/integration/tb_ee_core_elf_runner.sv` — enable
`EE_SYSCALL_HLE_ENABLE`; new halt-time register snapshot;
`elf_first_unhandled_syscall` verdict shape.
- `sim/Makefile` — target + both regression lists.
## Regression
In flight; expected **161/161** (was 160, +1 for
`tb_ee_core_syscall_hle`).
## Process notes
- **Codex's PC+4 correction was right.** My initial closeout
draft for Ch272 suggested "RFE-style return" — Codex caught
it. RFE is for the Ch199 `_ReturnFromException` path; normal
user-mode `syscall` resumes at PC+4, no Status stack pop.
Filed this in the memory entry so a future chapter doesn't
repeat the same wrong assumption.
- **Parameter gating is the right call.** Existing TBs that use
`syscall` as a halt-PASS-marker would have broken if their
`$v1` happened to be 0x3C/0x3D/0x64. Gating preserved 160
passing tests trivially; only the ELF runner opts in.
- **The verdict shape now distinguishes 4 halts**: trap (strict
opcode), unmapped MMIO, halt-on-syscall (with $v1/$a0..$a3),
halt-on-other (unexpected). The runner is becoming a real
triage tool.