upbeatBytes/goodnews
thejayman77 2ef0efd909 Perf: skip needless dedup re-cluster + interlock word-search grids
Two things found while chasing the recurring ~15min slowness:

- dedup.py: cluster_duplicates re-ran an O(n²) cosine pass over ALL ~3.7k
  articles and rewrote duplicate_of for every one of them EVERY cycle, even
  when nothing new arrived (embedded=0). That cost ~53s of CPU plus a large
  WAL commit that starved live API reads (/api/brief took 2-7s). Now skip the
  re-cluster entirely when nothing new was embedded (clusters can't have
  changed). Verified: the cycle drops from ~53s to ~1s and /api/brief stays at
  20ms through a cycle, vs 2-7s before. (A real new article still triggers a
  full re-cluster.)
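
The skip described above could look roughly like the sketch below. It is not the code in dedup.py: the body of cluster_duplicates, the run_cycle wrapper, the 0.9 threshold, and the in-memory list of vectors are all assumptions made for illustration. Only the idea (reuse the previous clusters when embedded=0) comes from this commit.

```python
import math
from typing import List, Optional, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_duplicates(vectors: Sequence[Vector],
                       threshold: float = 0.9) -> List[Optional[int]]:
    """O(n^2) pass: for each article, the index of the earlier article it
    duplicates, or None. The threshold is a hypothetical value."""
    duplicate_of: List[Optional[int]] = [None] * len(vectors)
    for i in range(len(vectors)):
        if duplicate_of[i] is not None:
            continue  # already assigned to an earlier article
        for j in range(i + 1, len(vectors)):
            if duplicate_of[j] is None and cosine(vectors[i], vectors[j]) >= threshold:
                duplicate_of[j] = i
    return duplicate_of


def run_cycle(vectors: Sequence[Vector], embedded: int,
              previous: Optional[List[Optional[int]]]) -> List[Optional[int]]:
    """Hypothetical cycle step: reuse the previous result when nothing new
    was embedded, since the clusters cannot have changed."""
    if embedded == 0 and previous is not None:
        return previous  # no O(n^2) pass, nothing to write back
    return cluster_duplicates(vectors)
```

Returning the previous result unchanged also means there is nothing to write back, which is what would avoid the large WAL commit on idle cycles.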

- games.py _build_grid: word placement took the first random valid spot, so
  words rarely crossed. Now gather all valid placements and PREFER ones that
  cross an already-placed word (shared matching letter), falling back to any
  valid spot, so the grid interlocks like a real word search. Every word is
  still placed (tests green). NOTE: this changes today's grid layouts, so an
  in-progress word search resets once.
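
A minimal sketch of the crossing-preferred placement described above, not the actual _build_grid in games.py. The helper find_placements, the DIRECTIONS set, the seed parameter, and the use of None for empty cells are assumptions. Filling the remaining cells with random letters is left out.

```python
import random
from typing import List, Optional, Sequence, Tuple

Cell = Tuple[int, int]
Grid = List[List[Optional[str]]]

# Assumed direction set: right, down, down-right, up-right.
DIRECTIONS = [(0, 1), (1, 0), (1, 1), (-1, 1)]


def find_placements(grid: Grid, word: str) -> Tuple[List[List[Cell]], List[List[Cell]]]:
    """Return (crossing, valid) placements for a non-empty word.

    A placement is valid if every cell is in bounds and is empty or already
    holds the matching letter. It is crossing if at least one of its cells
    already holds the matching letter.
    """
    size = len(grid)
    crossing: List[List[Cell]] = []
    valid: List[List[Cell]] = []
    last = len(word) - 1
    for r in range(size):
        for c in range(size):
            for dr, dc in DIRECTIONS:
                end_r, end_c = r + dr * last, c + dc * last
                if not (0 <= end_r < size and 0 <= end_c < size):
                    continue  # word would run off the grid
                cells = [(r + dr * i, c + dc * i) for i in range(len(word))]
                if any(grid[y][x] not in (None, ch) for (y, x), ch in zip(cells, word)):
                    continue  # collides with a different letter
                valid.append(cells)
                if any(grid[y][x] == ch for (y, x), ch in zip(cells, word)):
                    crossing.append(cells)
    return crossing, valid


def build_grid(words: Sequence[str], size: int, seed: Optional[int] = None) -> Grid:
    """Place each word, preferring a random crossing placement and falling
    back to any random valid placement."""
    rng = random.Random(seed)
    grid: Grid = [[None] * size for _ in range(size)]
    for word in words:
        crossing, valid = find_placements(grid, word)
        if not valid:
            raise ValueError(f"no room for {word!r}")
        cells = rng.choice(crossing or valid)
        for (y, x), ch in zip(cells, word):
            grid[y][x] = ch
    return grid
```

Because the choice is still random within the preferred set, layouts vary from seed to seed. A word with no letter in common with the grid so far simply lands in any free spot.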

Also added a systemd drop-in (Nice=19/CPUWeight=20/IOWeight=10/ionice-idle) to
deprioritize the batch cycle. This is minor; the dedup skip is the real fix.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-12 12:35:01 -04:00