- newsimg.purge_source(): when a source leaves 'cache' (permission revoked / re-classified),
the admin image-policy endpoint now deletes that source's re-hosted copies immediately,
rather than leaving them inaccessible-but-on-disk. Endpoint returns {purged}.
- Admin "Engaged readers" carries a warm-up note: tracking began 2026-06-30, so low
rolling windows are partly warm-up, not all bots (compare d7 after a week, the window
after its full span). Guards against misreading "6 engaged vs 135 visits" as 129 bots.
Tests: purge_source removes only the target source's copies; endpoint reports purged.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Admin now shows two numbers:
- Recorded visits: the existing raw count (one daily 'visit' beacon; still includes
UA-spoofing bots that slip past the UA filter).
- Engaged readers: distinct visitor-days with DELIBERATE activity, meaning either the new
gesture-gated 'engaged' beacon (fires once/day only after ~8s visible AND a real
scroll/pointer/key/touch) or a deliberate action (source_click, full_story, share,
replace_used, paywall_replace, not_today/less_like_this/hide_topic, game start/
complete/share). Explicitly EXCLUDES auto-fired visit/summary_viewed/open, replace_none,
and game *_arrival (a share-loop landing, not engagement).
armEngaged() in analytics.js (wired in the global layout) + a mirrored vanilla-JS beacon
on the server-rendered /a/<id> share pages. 'engaged' added to the event allowlist and
fired with article_id=0 so the uniqueness constraint dedups it per day. queries.admin_stats
gains engaged_today/d7/d30. Bots are doubly excluded (UA filter at the beacon + the
gesture gate). Tests cover the metric (engaged + deliberate counted; visit/summary/arrival
not). 447 backend + 36 frontend tests pass.
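A minimal sketch of the "Engaged readers" count described above, assuming events arrive as (visitor, day, name) tuples. The tuple shape, the helper names, and the game suffixes (taken from the <game>_{started,completed,shared} funnel naming) are illustrative assumptions, not the actual queries.admin_stats code.

```python
# Deliberate actions listed in the commit text, plus the gesture-gated beacon.
DELIBERATE_EVENTS = {
    "engaged", "source_click", "full_story", "share", "replace_used",
    "paywall_replace", "not_today", "less_like_this", "hide_topic",
}
# Assumed naming for game events; "<game>_arrival" is deliberately absent.
GAME_SUFFIXES = ("_started", "_completed", "_shared")


def is_engaged_event(name: str) -> bool:
    """True for deliberate activity; auto-fired events fall through to False."""
    return name in DELIBERATE_EVENTS or name.endswith(GAME_SUFFIXES)


def engaged_visitor_days(events) -> int:
    """Count distinct (visitor, day) pairs with at least one deliberate event."""
    return len({(visitor, day) for visitor, day, name in events
                if is_engaged_event(name)})
```

Because the result is a set of (visitor, day) pairs, several deliberate events from one visitor on one day still count once.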
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Fetcher (the two remaining bugs Codex found):
- Real redirects are now followed. _NoRedirect makes urllib RAISE HTTPError on 3xx, so
the old status-branch was dead code (mocked tests masked it). Handle 301/302/303/307/308
HTTPError as redirects (re-validate the destination); classify 4xx≠429 as PERMANENT
(negative-cached), 429/5xx/network as transient. Real-opener redirect + 404/5xx tests.
- The megapixel ceiling is now enforced: explicit `w*h > _MAX_PIXELS` check BEFORE load()
(Pillow only warns at MAX_IMAGE_PIXELS). Test with a lowered ceiling.
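The two fetcher rules above can be sketched as pure functions. The names Outcome, classify_status and exceeds_pixel_ceiling are hypothetical, and treating any other status as transient is an assumption of this sketch, not something the commit states.

```python
from enum import Enum

REDIRECT_STATUSES = {301, 302, 303, 307, 308}


class Outcome(Enum):
    REDIRECT = "redirect"    # follow, after re-validating the destination
    PERMANENT = "permanent"  # negative-cache, do not retry
    TRANSIENT = "transient"  # retry on a later cycle


def classify_status(status: int) -> Outcome:
    """Map an HTTP status raised as HTTPError to a fetch outcome."""
    if status in REDIRECT_STATUSES:
        return Outcome.REDIRECT
    if 400 <= status < 500 and status != 429:
        return Outcome.PERMANENT
    # 429, 5xx, and (assumption) anything else are treated as transient.
    return Outcome.TRANSIENT


def exceeds_pixel_ceiling(width: int, height: int, max_pixels: int) -> bool:
    """Explicit w*h check, meant to run on the header size before decoding."""
    return width * height > max_pixels
```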
Image-rights policy (per Codex + owner decision — only cache what's cleared):
- sources.image_policy: 'cache' (re-host a downscaled copy — license/permission/PD only),
'remote' (hotlink the publisher's image — the conservative DEFAULT), 'none' (no image).
- newsimg.display_url resolves the display URL per policy; applied in Article.from_row so
feed/brief/history return the right URL, and in share.py (og/twitter still reference the
publisher's own image, never re-hosted). warm() + /api/img both gated on 'cache'.
- Frontend uses the server-resolved image_url (reverted the hardcoded /api/img); the
graceful retry covers remote hotlinks too. Admin: per-source image-policy selector +
POST /api/admin/sources/{id}/image-policy. Default 'remote' → nothing re-hosted until
a source is explicitly cleared.
445 backend + 36 frontend tests pass.
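As an illustration of the per-policy resolution described above, here is a sketch of a display-URL resolver. The signature is hypothetical (the real newsimg.display_url may take different arguments), and falling back to 'remote' for an unknown policy is an assumption that follows the stated conservative default.

```python
from typing import Optional


def display_url(policy: str, article_id: int,
                image_url: Optional[str]) -> Optional[str]:
    """Resolve the URL the frontend should render for an article image."""
    if not image_url or policy == "none":
        return None                        # no image at all
    if policy == "cache":
        return f"/api/img/{article_id}"    # re-hosted, downscaled copy
    # 'remote', and (assumption) any unrecognized value: hotlink the publisher.
    return image_url
```

Under this sketch nothing points at /api/img until a source is explicitly switched to 'cache'.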
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Blocker fixes for the image cache:
- /api/img/{id} now serves cache HITS ONLY and is restricted to ACCEPTED, CANONICAL
articles. It never fetches — the cycle (newsimg.warm) owns all fetching — so the
public endpoint has no SSRF/worker-exhaustion surface. Dropped 1-year immutable
caching (image_url can change) → public, max-age=86400.
- newsimg._safe_fetch: SSRF-safe (reuses enrich._host_is_public + _NoRedirect, http(s)
only, every redirect hop re-validated, body capped). _FetchError distinguishes
permanent refusals (negative-cached via a .fail marker) from transient errors (retry).
- _encode re-encodes only decoded RASTER images to WebP and REJECTS everything else
(SVG, undecodable, decompression bombs via MAX_IMAGE_PIXELS, pathological dimensions);
originals are never retained. prune() also sweeps stale .fail markers.
- Concurrency: fetching only runs inside the cycle lock; writes stay atomic.
Smaller fixes:
- share.py visible image has onerror→this.remove() (degrade to the text unfurl, no
broken icon when an image isn't cached yet).
- share-page Back follows history only on a SAME-ORIGIN referrer (never bounce to an
external site); menu now honors Escape + resets when crossing back to desktop (HubBar parity).
Tests: private host, redirect-to-private, hostile SVG/non-image, transient-vs-permanent
failure, LRU prune, warm (accepted+canonical only, idempotent), cache-only endpoint
(404 on not-cached/unaccepted/duplicate, never fetches), share chrome parity. 441 pass.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Many modern crawlers (AI scrapers, headless Chrome, link-preview fetchers) run JS and
fire the visit/summary_viewed beacon, inflating "visitors" even though there's no
human discovery channel. Apply queries.is_bot_ua() at /api/events — the same filter
the load-error beacon uses — so honest bot UAs (GPTBot, AhrefsBot, headless Chrome,
python/curl, …) are dropped before recording. Response is identical so a bot can't
detect it. Counts read lower but truer going forward (past rows unchanged). Won't catch
UA-spoofing bots; that needs a heavier heuristic. Tests: bot UAs dropped, real browser
counted; existing event tests send a real UA (default client UA contains "python").
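A rough sketch of the drop-before-recording idea. The marker list, the record_event helper, and treating a missing UA as a bot are all assumptions for illustration; the real queries.is_bot_ua may match differently.

```python
# Illustrative markers only; the real filter's list is not shown in the commit.
_BOT_MARKERS = ("gptbot", "ahrefsbot", "headlesschrome", "python", "curl")


def is_bot_ua(user_agent) -> bool:
    """True when the User-Agent openly identifies as automation."""
    if not user_agent:
        return True  # assumption: a missing UA is treated as a bot
    ua = user_agent.lower()
    return any(marker in ua for marker in _BOT_MARKERS)


def record_event(user_agent, event: dict, store: list) -> dict:
    """Record the event for real browsers; reply identically either way."""
    if not is_bot_ua(user_agent):
        store.append(event)
    return {"ok": True}  # same body for bots, so they cannot detect the drop
```

The "python" marker is why a default test client has to send a browser-like UA to be counted.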
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Stop hotlinking news images from third-party CDNs (the source of the "blank until
you refresh a few times" graphic). New goodnews/newsimg.py caches a downscaled WebP
display copy (≤800px) beside the DB, like art_cache:
- GET/HEAD /api/img/{article_id} — resolves id→image_url (allowlisted to our corpus,
not an open proxy), fetch+cache on first miss, serve local after, immutable headers.
- cycle warms display copies for recent accepted-with-image articles (so the FIRST
view is already local) and prunes to a hard size cap (default 1 GB) by LRU eviction.
Frontend now points at /api/img/<id>: the hub lead, every ArticleCard (feed hero +
cards), and the /a/<id> share page's visible image. og:image/twitter:image stay the
source URL so social crawlers fetch the canonical image directly.
Storage is bounded by construction — over the cap, least-recently-used files are
evicted, so it can't grow without limit regardless of ingest rate. Tests cover
fetch/downscale, cache-hit (no refetch), bad-scheme/non-image rejection, fetch
failure, LRU prune, warm, and the endpoint allowlist.
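The size cap with LRU eviction can be sketched as a pure planning step, assuming each cached file is described by a (name, size_bytes, last_used) tuple. That tuple shape and the plan_eviction name are hypothetical; the real prune presumably reads sizes and timestamps from disk.

```python
def plan_eviction(files, cap_bytes: int):
    """Return names to delete, least recently used first, until under the cap."""
    total = sum(size for _, size, _ in files)
    evict = []
    for name, size, _ in sorted(files, key=lambda f: f[2]):  # oldest use first
        if total <= cap_bytes:
            break
        evict.append(name)
        total -= size
    return evict
```

Since eviction continues until the total fits, the cache size after a prune is bounded by the cap no matter how many files were ingested.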
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The load-error log had no way to clear reviewed entries. Add a read_at column to
client_errors and a read/unread model mirroring the feedback inbox:
- GET /api/admin/client-errors?show=unread|read|all (default unread; returns id+read)
- POST /api/admin/client-errors/read-all (mark all unread read)
- POST /api/admin/client-errors/{id}/read {read: bool} (per-row toggle)
Headline stat is now "Unread load errors" (admin_stats.client_errors.unread), so the
red badge clears as you triage. Admin UI: Unread/Read/All tabs, a "Mark all read"
button, and a per-row ✓/↩ toggle; reading an entry drops it from the default view.
14-day auto-prune still bounds the table. Tests cover filter, toggle, mark-all,
404, and gating.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Per the logo + brand: the name is upbeatBytes (camelCase). Swept all user-facing
strings — titles/og:site_name/og:title, logo alt text, share pages (share.py),
emails (email_send), classifier prompt (llm), digest/unsubscribe (api), PWA
manifest, game share text, sign-in, the SPA shell + patch-static-heads (play
title) — plus README/publish.sh and the email test fixture. (SMTP From env was
already upbeatBytes.) Domains (upbeatbytes.com) unchanged. 425 BE + 36 FE green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
queries.feed was the main chokepoint, but several discovery paths have their own
SQL. Apply the shared source exclusion to all of them so "no paywalls" is truly
site-wide:
- briefs.build_daily_brief: EXCLUDE paywalled candidates (was: demote) — never
stored in a new brief.
- queries.brief: stored-brief retrieval (covers /today + /api/brief) filters the
paywalled source.
- digest.digest_items + followed_digest_items: the morning email + "from what you
follow" omit paywalled sources.
- sitemap(): paywalled article pages excluded from the sitemap.
All reuse queries.paywalled_source_ids (admin override still wins).
Regression tests (test_paywall_exclusion.py): never stored in a new brief; /today
+ digest omit it; followed-source email omits it; Saved retains it; 'free'
override restores eligibility. 423 backend tests green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Per Jay: don't surface stories people can't read without paying — it's off-brand
("no paywalls") and pointless. Paywalled is source-level (domain rule, admin-
overridable): just 3 sources today (Nature, New Scientist, MIT Tech Review),
~5.4% of accepted articles.
- queries.paywalled_source_ids(conn): live source set (admin override wins).
- queries.feed gains include_paywalled=False (default) → adds `a.source_id NOT IN
(…)`. One chokepoint covers Latest/tags/sources/moods/topics/search/since AND
the brief top-up. Source-level + SQL → paging stays exact, no frontend change.
- brief(): filter the cached/home pool by the same rule; replacement already
avoids paywalled and now rides the feed exclusion too.
- Dropped the now-moot "paywalled below readable" demotion sort.
- Saved/history keep showing items you saved (their own queries, not excluded).
- test_source_paywall_override updated: paywalled source → excluded from the feed
(was: shown with a badge); 'free' override → returns, no badge. 418 tests green.
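A self-contained sketch of the source-level exclusion, using an in-memory SQLite database. The table layout, including the domain_paywalled column that stands in for the domain rule, and the demo_conn helper are assumptions; only the `a.source_id NOT IN (…)` shape and the override precedence come from the commit text.

```python
import sqlite3


def paywalled_source_ids(conn):
    """Sources that are effectively paywalled; an admin override wins."""
    rows = conn.execute(
        "SELECT id FROM sources WHERE paywall_override = 'paywalled' "
        "OR (paywall_override IS NULL AND domain_paywalled = 1)"
    ).fetchall()
    return [r[0] for r in rows]


def feed(conn, include_paywalled=False):
    """Return article ids, excluding paywalled sources unless asked not to."""
    sql, params = "SELECT a.id FROM articles a", []
    if not include_paywalled:
        ids = paywalled_source_ids(conn)
        if ids:
            marks = ",".join("?" * len(ids))
            sql += f" WHERE a.source_id NOT IN ({marks})"
            params = ids
    return [r[0] for r in conn.execute(sql + " ORDER BY a.id", params)]


def demo_conn():
    """Tiny fixture: source 2 is paywalled, source 3 is overridden to free."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        "CREATE TABLE sources (id INTEGER PRIMARY KEY, domain_paywalled INT,"
        " paywall_override TEXT);"
        "CREATE TABLE articles (id INTEGER PRIMARY KEY, source_id INT);"
        "INSERT INTO sources VALUES (1, 0, NULL), (2, 1, NULL), (3, 1, 'free');"
        "INSERT INTO articles VALUES (1, 1), (2, 2), (3, 3);"
    )
    return conn
```

Filtering in SQL rather than after the fetch is what keeps LIMIT/OFFSET paging exact.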
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- daily_art gains blurb + palette columns (idempotent migration).
- art._palette: Pillow median-cut to ~5 hex colors from the cached image (best-
effort → [] on any failure). art._blurb: a warm 2-3 sentence "what you're
looking at" note grounded in the Met catalogue (title/artist/bio/date/medium/
classification/culture/tags). Prompt leans on context/significance and the
title+tags for subject — explicitly NOT asserting literal composition (figure
counts/poses) it can't see, since the model can't view the image. Markdown
stripped from the output.
- pick_daily generates both (client optional → blurb skipped when absent); cycle
+ art CLI pass an LLM client. /api/art/today exposes blurb + palette.
- Backfilled the last 3 days on host (Veteran / Magnolia Vase / Bierstadt).
- scripts/art_blurb_palette_backfill.py for in-place backfill (no re-pick).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replaces the gist-based read-time with the SOURCE article's full read time — the
contrast that sells the gist ("calm 1-min version here; ~10 min for the deep dive").
- goodnews/readtime.py: word_count_from_html (strips script/style/nav/header/
footer/form/button/aside furniture before counting) + source_read_minutes
(~225 wpm, 200-word floor, None when extraction appears to have failed or is too thin).
- articles.source_words + read_checked_at columns (count only, never the body;
fits the privacy posture). Idempotent migration.
- enrich.fetch_source_words + enrich_read_times: a bounded, retry-guarded cycle
step (mirrors the image enrichers) that counts words for recent accepted
articles. Only ever writes a real count; never overwrites good with zero. Wired
into the cycle after recent-image enrichment.
- queries: source_words flows through _ARTICLE_COLUMNS; api exposes
source_read_minutes on Article (null when unknown).
- home3: News card shows "Full story · ~N min", hidden entirely when null (no
misleading "1 min").
- Tests: furniture stripping, threshold/rounding, enrich idempotency + no
zero-overwrite, API null handling. 412 backend.
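A sketch of the read-time calculation using only the standard library. The furniture tag list, the 225 wpm rate and the 200-word floor come from the commit text; the parser class, the rounding rule (nearest minute, at least 1) and the default arguments are assumptions.

```python
from html.parser import HTMLParser

FURNITURE = {"script", "style", "nav", "header", "footer",
             "form", "button", "aside"}


class _WordCounter(HTMLParser):
    """Counts words in text that is not nested inside a furniture tag."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # how many furniture tags we are currently inside
        self.words = 0

    def handle_starttag(self, tag, attrs):
        if tag in FURNITURE:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in FURNITURE and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0:
            self.words += len(data.split())


def word_count_from_html(html: str) -> int:
    parser = _WordCounter()
    parser.feed(html)
    parser.close()
    return parser.words


def source_read_minutes(words, wpm: int = 225, floor: int = 200):
    """Minutes to read the source, or None when the count is missing or thin."""
    if words is None or words < floor:
        return None
    return max(1, round(words / wpm))  # rounding rule is an assumption
```

Returning None below the floor is what lets the card hide the label instead of showing a misleading "1 min".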
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Content quality ("LLM polishes, dictionary anchors"):
- New wotd._polish: rewrites the real dictionary gloss into ONE warm plain
sentence + two clear everyday example sentences, grounded in the real
definition (no invented meanings). Stored in new wotd_pool/daily_wotd columns
gloss + usage, alongside the raw definition/examples which stay the anchor.
- harvest() polishes each new word; pick_daily() lazily polishes + caches back
any older pooled word that lacks a gloss (client threaded through run_daily).
- Admin word-add polishes on insert; re-pick passes an LLM client so quote
meaning / word gloss fill on a forced fresh pick.
- /api/word/today now prefers gloss + usage, falling back to the raw dictionary
def/examples when polish is absent (so it's always safe).
- db._migrate adds gloss/usage to wotd_pool + daily_wotd (idempotent ALTER).
Frontend — /word redesigned to CD's "Editorial Asymmetric": faded oversized
initial bleeding off the right, vertical part-of-speech rail, big Newsreader
word, airy definition, left-ruled italic example sentences, outline Listen
button + date. (Uses our self-hosted Newsreader/Hanken stack rather than the
mockup's Google fonts; the made-up syllable respelling is omitted since we only
have real IPA.)
Tests: _polish parse/trim/cap, harvest stores gloss/usage, pick lazy-polishes
older words, admin gloss flows through to /api/word/today. 403 backend + 27 fe.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Admin joy item route moved to /api/admin/joys/{kind}/items/{item_id} so the
/add and /repick verbs resolve to their own routes instead of 422-ing as a
non-int item id (the launch blocker). Frontend mutate URL updated to match.
- Re-pick now excludes the currently-shown item: the endpoint reads today's
daily pool_id and passes it as `avoid`, so "Re-pick today" yields a different
item. Added `avoid` to pick_daily/_candidates across wotd/quote/onthisday.
- WOTD sense selection: the LLM now proposes word + intended part of speech, and
_lookup prefers that sense (fixes "serene" returning the archaic noun).
- On This Day tone prompt tightened to favor genuinely uplifting events and
exclude merely procedural/political-administrative ones.
- Caddy @hidden now also noindexes /word /quote /onthisday /admin (+ .html).
- Regression tests: add/repick resolve (401 not 422), add/feature/block/delete,
re-pick excludes current; WOTD pos-preference + proposal parsing units.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- /home3: small-joys rail now reads live /api/word|quote|onthisday/today (placeholders only
as fallback); each cell links to its detail page.
- HubShell component (shared bar/footer/fonts/tokens) for the hub + detail pages.
- /word: big word, IPA, Listen (cached clip + browser-TTS fallback), definition, sentences.
- /quote: the quote, attribution, and the AI "what it means".
- /onthisday: the date, year + fact, image, summary, source.
- Admin "Small Joys" tab: per-pool list with feature/block/delete/add + re-pick, for all
three kinds. New admin API: GET/POST /api/admin/joys/{kind}[/{id}|/add|/repick].
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- /api/art/image/{id} now answers HEAD as well as GET (was 404 on HEAD) — mirrors the
/a/{id} fix. Added tests/test_art_api.py (GET+HEAD+size=full fallback + today payload).
- /textures/* served immutable (long cache) instead of no-cache; excluded from the
revalidate matcher. Live Caddyfile + repo snapshot both updated.
- Lightbox: Escape closes it, and focus moves to it on open (keyboard-friendly).
- Trimmed the gallery's top padding so "Daily Art" sits closer to the bar.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Virtual frames (Walnut/Gold/Silver/None), selectable + remembered in localStorage,
built as a beveled moulding around a cream museum mat.
- Header uses the real /logo.svg wordmark; the "No ads" pill is replaced by an
account icon (the pill doesn't need to follow every page).
- Lightbox now opens a full-resolution copy that fills the screen: art._download_image
caches a hi-res {id}-full copy alongside the web-large display copy, served via
/api/art/image/{id}?size=full (image_url_large in /api/art/today).
- Centered the placard bullet separators (explicit .sep spans, equal margins).
- Image no longer shifts on hover; a quiet "Click to expand" affordance sits on the art.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Hardening before it runs further on the cycle:
- DB-lock/network: all HTTP (metadata + image) happens before any write; the write txn
opens only at the brief INSERT and commits immediately. Images download to a temp file
then atomic os.replace into cache (a reader never sees a half-written file).
- Site-timezone "daily" already used local_today() (same rhythm as the Brief) — confirmed.
- Attribution from day one: store + return title/artist/date/medium/department/credit/
source_url/object_id/source + museum name + is_public_domain license marker + the full-
res source URL (for a richer /art view later). UI can show: Title · Artist · The Met.
- "highlight != always beautiful": added a manual `blocked` flag on art_pool (excluded
from picks) as the cheap curation lever; a featured override can follow.
Schema migrated (existing art tables get the new columns). 373 tests green.
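The temp-file-then-os.replace pattern mentioned above can be sketched as follows. The atomic_write name is hypothetical; creating the temp file in the destination directory is a choice of this sketch so the rename stays on one filesystem.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Write data so a concurrent reader sees the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        os.replace(tmp_path, path)  # atomic rename over any existing file
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)     # do not leave a stray temp file behind
        raise
```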
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The engine for the /art room (design-independent; deploy held for Codex review).
- goodnews/art.py: harvest a curated pool of public-domain HIGHLIGHT artworks from the
Met (isHighlight+isPublicDomain+hasImages -> masterworks, never potsherds; CC0). Daily
deterministic pick from the least-recently-shown (no soon-repeats, same for everyone),
fetch metadata + download the image to OUR cache (data/art_cache) so the homepage never
waits on or hotlinks the museum. Bulletproof: bad object/image falls through candidates;
a failed day keeps the last piece (room never empty). Injectable HTTP for tests.
- Schema: art_pool + daily_art. /api/art/today (edge-cacheable) + /api/art/image/{id}
(served from cache, immutable). CLI `art [--harvest] [--force]` + a non-fatal cycle step.
- Tests (5, mocked HTTP) + verified live against the Met: harvested 1641 works,
picked/cached "Repose" by John White Alexander. 371 tests green.
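A minimal sketch of a deterministic least-recently-shown pick with fall-through, assuming pool rows are dicts with object_id and last_shown (an ISO date string, or None if never shown). The field names and the injected fetch callable are illustrative, not the actual art.py interface.

```python
def candidates(pool):
    """Never-shown items first, then oldest shown; ties broken by object_id."""
    return sorted(pool, key=lambda p: (p["last_shown"] or "", p["object_id"]))


def pick_daily(pool, fetch):
    """Try candidates in order; a bad object or image falls through to the next."""
    for item in candidates(pool):
        try:
            return fetch(item)
        except Exception:
            continue
    return None  # caller keeps the previous piece, so the room is never empty
```

The ordering uses no randomness, so every process that sees the same pool arrives at the same pick.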
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Hero constraint: _pick_lead now runs only within the CLOSEST non-empty section of a
personalized Brief, so a "gentler" wider-region/world story can never be floated into
the hero slot above a local one. Only widens if the closest section is empty.
- Dial gains a visible Clear (alongside Change) so a reader never feels locked into
personalization; "World" stays the keep-home-but-go-global option.
366 tests green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Codex-approved evolution: the reader controls the "emotional radius" of the landing.
- Census-region "Regional" grain (geo.region_of / region_states). Scope-aware tiering
(queries.home_tiers): closest->widest lead, confidence-gated on state + region, never
a hard filter — blends outward so the set is always full. 'world' = the global brief.
- queries.home_brief takes a scope; /api/brief gains a scope param (nearby|region|
country|world). Country-only / non-US homes collapse to country.
- Homepage dial replaces the 2-button toggle: adaptive stops (4 with a US state, else
Country/World), persisted scope, "Good news closest first" framing. Concrete, soft
section labels (Around New Jersey / Across the Northeast / Across the US / Around the
world) so the reader sees the dial worked.
Backend 366 + frontend tests green. (Latest feed still on v1 local-first; aligning it
to the dial is the immediate follow-up.)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Per the owner's call (overrides the earlier "Brief sacred" stance): when a home is
set, the homepage opens with local good news first, not global. This is the hook —
you land and see awesome stories from YOUR corner first.
- queries.home_brief: local-first highlights (high/medium-confidence near, blended
out to country then world so it's always a full, strong set), preferring already-
summarized stories so the calm read stays rich. Recent window, ranked within tier.
- /api/brief gains a `home` param: private/no-store when set; over-fetches + caps so
dismissal/boundary filtering never thins it; falls back to global top-up if needed.
- Landing UI: a Local <-> Global toggle ("📍 Near you / 🌍 Everywhere") when a home
is set, the calm picker invite when not (dismissible), and Change. Default leads
local; one tap back to the global brief. No home set => exactly today's behavior.
Backend + frontend tests green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Completes the server side of "Closer to Home". /api/feed gains a `home` param
('US' or 'US-NY'); when set the response is private (like prefs) and sectioned:
- Near you (+ Elsewhere in your country when a state is set) is a ONE-TIME lead
block on page 0; the world is the paginated body. next_offset tells the client
where to continue, so the lead block never skews world paging.
- Thin tiers fold down (MIN_TIER=3) so a header is never shown empty (lead, don't trap).
- State match counts only on high/medium geo confidence; the "country" tier excludes
exactly what went to "near", so a low-confidence home-state story still surfaces
(it doesn't vanish between tiers — caught + tested).
- Items carry a `section` tag; paywalled sort is now within-section. No home => exact
prior behavior (section null, default/edge-cached feed unchanged), Brief untouched.
364 tests green. Frontend next: Home picker + sectioned feed rendering.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Foundation for "Closer to Home" (server-side, Codex-approved). No behavior change
yet — geo_scope defaults None, so the default/edge-cached feed is identical.
- queries.feed now returns each article's geo (breadth, confidence, and ISO-coded
places) via a LEFT JOIN + places subquery. Article.from_row parses geo_places
into [{country, state}]. Brief query doesn't select geo, so the Brief stays bare.
- queries.feed gains home-scope filters (home_country/home_state/geo_scope =
near|country|world): STATE match only counts on high/medium geo confidence;
untagged articles fall to 'world' so nothing is lost during backfill.
Next: API composition (home param + near/country/world sectioning with soft/blended
headers + a next_offset pagination model) and the Home picker UI. 360 tests green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Sharpen the existing daily-game share loop into something measurable (per Codex's
"instrument what you have, then feed people into it" plan), ahead of a Show HN launch.
Analytics:
- Per-game funnel events <game>_{arrival,started,completed,shared} (article_id=0).
arrival = landed via a shared link (utm_source=game_share); started = first move
(guess/find/flip); completed = solved/cleared/Full Bloom; shared = on share success.
- trackVisit() moved into the global layout so direct /play landings count; the
server-rendered /a/ share page now creates a visitor token + sends a daily visit
beacon (first-time /a/-only visitors were previously dropped).
- Admin "Games funnel" panel: arrivals / engaged / completed / shared, per game.
Sharing:
- Memory Match gains a Share button (it was the only game without one).
- All shares deep-link to the exact game+variant with a full https:// URL +
utm_source=game_share (gameShareUrl helper), instead of a bare /play.
- "shared" is counted only after navigator.share()/clipboard.writeText() succeeds.
/play social metadata:
- /play was serving the homepage's canonical/OG tags (static SPA, ssr=false). A postbuild script
patches build/play.html's head to /play canonical/title/description/OG; fails the
build if the homepage tags drift. Caddy try_files now serves {path}.html so /play
is served from the patched file (snapshot in deploy/caddy/).
Tests: backend 352, frontend 27.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The deploy pipeline runs from the working tree, so a wave of shipped features
had never been committed. This snapshots git to what's actually running.
SEO impression recovery (live + verified):
- Duplicate /a/{id} pages now 301-redirect to their canonical twin instead of 404ing
(a hard 404 silently dropped already-indexed URLs and tanked impressions).
- Dedup representative selection reworked: accepted/serveable -> established
rep (URL stability) -> quality score, so an accepted page never retires to a
rejected rep and an indexed canonical doesn't churn when a newer twin arrives.
- HEAD /a/{id} returns the same status as GET (api_route GET+HEAD) instead of
falling through to the static mount and 404ing.
- `dedup --force-recluster`: cycle-locked, model-free re-cluster to re-apply the
policy to the existing corpus (shared cycle_lock context manager).
- CLI honors GOODNEWS_DB for its default --db (was silently ignored).
Publishing Desk (admin tool to post highlights to X via Web Intents):
- publishing.py queue/rank/handle-resolution; admin UI; full searchable emoji
picker (bundled data, no CDN) for the blurb editor.
Play games + site:
- Bloom (word-wheel), Memory Match, daily ritual set, Zen Den (dev-gated).
- English-only language gate; source prospecting; paywall + dedup hardening.
Tests: full suite green (349). Ignores tightened (node_modules, data/*.db).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The Articles inspector revealed that the paywall flag is domain-coarse: nytimes.com is flagged,
so NY Times Learning's free Word-of-the-Day inherits 🔒 — and that flag isn't
cosmetic, it deprioritizes the content in feed sort + lead selection. Add a
per-source override so admins can correct it after inspecting.
- sources.paywall_override: NULL (domain rule) | 'free' | 'paywalled'.
- paywall.py: keep low-level is_paywalled(url) (domain); add is_paywalled_for_source
(url, override) for the EFFECTIVE decision — never patched the domain helper
globally (per Codex), so "domain says X" stays distinguishable from "overridden".
- Threaded everywhere ranking/UI touches paywall, via src.paywall_override on the
shared _ARTICLE_COLUMNS + the source-aware helper: feed sort, /api/since, replace,
lead selection, Article badge, brief composition (briefs.py), digest, source_health
(table 🔒), the Articles inspector, and the review/attention check — so ranking and
UI always agree.
- Endpoint POST /api/admin/sources/{id}/paywall {override}; admin UI: a select in the
inspector header (Use domain rule / Treat as free / Treat as paywalled) + the basis
("ON (domain)" / "OFF (override)"), optimistic so the panel stays open.
Test: domain rule → paywalled in table+inspector+feed badge; 'free' → off in all
three; validation 422 + 404. 242 pytest + 11 vitest.
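The split between the domain rule and the effective decision might look like the sketch below. The PAYWALLED_DOMAINS set is a one-entry illustration and the host matching is an assumption; only the two function names and the NULL / 'free' / 'paywalled' override values come from the commit text.

```python
from typing import Optional
from urllib.parse import urlsplit

PAYWALLED_DOMAINS = {"nytimes.com"}  # illustrative, not the real list


def is_paywalled(url: str) -> bool:
    """Low-level domain rule: what the domain alone says."""
    host = urlsplit(url).hostname or ""
    return any(host == d or host.endswith("." + d) for d in PAYWALLED_DOMAINS)


def is_paywalled_for_source(url: str, override: Optional[str]) -> bool:
    """Effective decision: a per-source override beats the domain rule."""
    if override == "free":
        return False
    if override == "paywalled":
        return True
    return is_paywalled(url)  # override is None: use the domain rule
```

Keeping is_paywalled untouched is what lets a UI report the basis, for example "ON (domain)" versus "OFF (override)".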
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
New per-row "Articles" button on the Sources table expands a read-only inline
panel of the source's ACTUAL ingested articles — so the automated metrics
(paywall/image/acceptance/duplicate) can be verified against evidence instead of
trusted blind. Distinct from "Check" (which re-samples the LIVE feed for
would-pass quality); this shows what's already in the DB, which is what the table
metrics are computed from.
- Backend: GET /api/admin/sources/{id}/articles?filter=&limit=&offset= (admin,
read-only). queries.source_articles + source_articles_summary — per article:
title, url, date, accepted, reason (the "why"), topic/flavor, paywalled
(domain rule), has_image, duplicate. Summary = counts + source-level paywall
rule.
- Frontend: expandable panel with a summary header ("27 ingested · 18 accepted
· … · paywall rule: ON (domain)"), filter chips (All/Accepted/Rejected/No
image/Duplicates), compact rows with title→link + badges + reason, Load more.
So "100% paywall" or "0% images" becomes clickable evidence: open two articles
to tell a real paywall from a mis-flagged domain, or a true image gap from an
enrichment failure. Test: test_source_articles_inspector. 241 pytest + 11 vitest.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Don't trust client JSON at the storage layer:
- sanitize_game_state() runs before merge AND on the merged result (heals legacy
rows). Word Search: keep only finds whose cells actually spell a real word in
that day's grid (validated when the puzzle exists, shape-only 4-12 alpha +
cell-length otherwise), dedupe, renumber ci. Word: validate status enum, guess
count/length/alpha, colour-row shape, terminal answer/why.
- Completion is now derived from the real puzzle word count (foundWords ==
expected), not a client-sent `ms` — so stats can't be inflated by junk.
- Date validated as YYYY-MM-DD at the API (400 otherwise) — no junk/future rows.
Tests: sanitizer-rejects-junk + bad-date 400; existing tests updated to use
real-shaped data (the sanitizer is a good forcing function). 237 pytest + 11
vitest green.
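The date check can be sketched with the standard library. The helper names are hypothetical, and rejecting dates after `today` is an assumption read from "no junk/future rows"; the caller would map a False result to a 400.

```python
import re
from datetime import date

_DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def parse_puzzle_date(value):
    """Return a date for a strict YYYY-MM-DD string, else None."""
    if not isinstance(value, str) or not _DATE_RE.fullmatch(value):
        return None
    try:
        return date.fromisoformat(value)  # rejects e.g. 2026-02-30
    except ValueError:
        return None


def is_valid_puzzle_date(value, today: date) -> bool:
    """Shape and calendar check, plus (assumption) no dates after today."""
    parsed = parse_puzzle_date(value)
    return parsed is not None and parsed <= today
```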
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Two game polish items:
- Word Search: overlapping cells now multiply-blend the crossing words' colours
(deepening to a darker shade with readable text) instead of the newest colour
stomping the rest — matches the new interlocking grids.
- Cross-device game-state sync (signed-in): per-puzzle progress + stats now
follow you between devices. New game_state table; server-side merge on every
save so two devices converge regardless of push order, tailored per game:
* Word Search → UNION of finds (monotonic; can't un-find), earliest start,
best completion time.
* Word → furthest-progress wins (terminal beats in-progress; more guesses
beats fewer) — picks one device's game whole, never splices guesses.
Stats (streak/distribution/best) derived server-side from the synced states,
so they're consistent instead of per-device counters. Endpoints GET/PUT
/api/games/state + GET /api/games/stats (signed-in; size-capped). Frontend is
local-first: games paint instantly from localStorage, then reconcile in the
background; both game components push debounced on each move and adopt the
merge. Conflict handling unit-tested + an API two-device convergence test.
235 tests + 11 vitest green.
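The two merge rules above can be sketched as order-independent functions. The state shapes (finds, started, best_ms, status, guesses), the terminal status names, and the final tie-break on the guess list are assumptions for illustration; the real game_state payload may differ.

```python
TERMINAL = {"won", "lost"}  # assumed names for finished Word games


def merge_word_search(a: dict, b: dict) -> dict:
    """Union of finds, earliest start, best (lowest) completion time."""
    times = [t for t in (a.get("best_ms"), b.get("best_ms")) if t is not None]
    return {
        "finds": sorted(set(a["finds"]) | set(b["finds"])),
        "started": min(a["started"], b["started"]),
        "best_ms": min(times) if times else None,
    }


def _word_progress(state: dict):
    # Terminal beats in-progress, then more guesses beats fewer; the guess
    # list itself is a last tie-break so the result ignores argument order.
    return (state["status"] in TERMINAL, len(state["guesses"]),
            tuple(state["guesses"]))


def merge_word(a: dict, b: dict) -> dict:
    """Pick one device's game whole; never splice guesses together."""
    return max((a, b), key=_word_progress)
```

Both functions give the same result with their arguments swapped, which is the property that lets two devices converge regardless of push order.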
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Two concrete latency wins found by measuring (server compute is 2-17ms; the time
is in the path, not the box):
- Admin panel fired its 6 API calls SEQUENTIALLY (await chain) — so it paid the
uncached origin round-trip six times back-to-back. Now one Promise.all batch.
This was the source of the admin lag.
- /api/brief (the home "Gathering the good news…" content) wasn't edge-cached, so
a distant anonymous visitor triggered a Cloudflare→residential-origin pull.
Same global/shareable boundary as /api/feed: public s-maxage=45 when no
prefs/exclude, else private,no-store. (Needs /api/brief added to the CF cache
rule path list to take effect at the edge.)
Tests: test_brief_cache_boundary. 228 pytest + 11 vitest.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
"Gathering the good news…" waits on the home's startup API calls, which were all
DYNAMIC → a round-trip to the residential origin every load (the occasional 2-3s
linger). These responses depend only on the URL, never the session, so they're
safe to share at the edge:
- /api/moods, /api/categories (static config) → public, s-maxage=900
- /api/lanes, /api/families (global, data-derived counts) → public, s-maxage=120
- /api/feed → public, s-maxage=45 ONLY when shareable (no following / prefs /
exclude); the following feed (reads the session) and personal filters stay
private, no-store.
Hard personalization boundary, explicit per-endpoint (no blanket /api/* rule).
Pairs with a Cloudflare cache rule (added separately) making these paths
eligible. Tests assert the global endpoints are public+s-maxage and the feed
boundary (default/topic public; following/prefs/exclude private). 227 pytest.
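One way to picture the explicit per-endpoint boundary, as a sketch. The s-maxage values and paths are from the commit text; the GLOBAL_TTL table, the function names, and returning a private header for unlisted paths are assumptions.

```python
# Endpoints whose response depends only on the URL, with their edge TTLs.
GLOBAL_TTL = {
    "/api/moods": 900,
    "/api/categories": 900,
    "/api/lanes": 120,
    "/api/families": 120,
}


def global_cache_control(path: str) -> str:
    """Explicit allowlist: anything not listed stays private (assumption)."""
    ttl = GLOBAL_TTL.get(path)
    return f"public, s-maxage={ttl}" if ttl else "private, no-store"


def feed_cache_control(following: bool = False, prefs=None, exclude=None) -> str:
    """The feed is shareable only when no personal input shapes it."""
    shareable = not following and not prefs and not exclude
    return "public, s-maxage=45" if shareable else "private, no-store"
```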
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Two small server-side tweaks so the endpoint matches the UI policy:
- Rename is refused (409) for promoted/rejected candidates — they're settled
history; the UI already hides Rename for them, now the server enforces it too.
- Name is capped at 160 chars before save, so an accidental pasted paragraph
can't wreck the queue layout.
Tests extended: 300-char name truncates to 160; renaming a promoted candidate
→ 409. 225 pytest + 11 vitest green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A staged candidate could only be renamed by rejecting and re-adding it, which
churns the queue and discards the preview just to fix a typo. Add an inline
Rename on each candidate: a "Rename" pill swaps the name for an input
(Enter saves · Esc cancels), POST /api/admin/candidates/{id}/rename →
sources.rename_candidate(). Empty clears the name (promote then derives one
from the feed host). Preview is preserved; the fixed name carries into promotion.
Tests: test_candidate_rename (rename in place keeps preview, promotes with the
new name, gated + 404). 225 pytest + 11 vitest green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Three follow-ups from Codex's audit of the deep-preview/search/dedup work:
- Promote-time duplicate guard: promote_candidate() now re-checks
find_existing_feed() and raises DuplicateFeedError → 409, so an
old/CLI/direct-DB candidate or a race can't bypass the add-time check and
silently overwrite a live source's settings via upsert. (sources scanned
first, so a real source collision wins over the candidate matching itself.)
- postJSON/putJSON/delJSON gain opt-in {timeout} (AbortController, default
none so other calls are unchanged); deep preview uses 120s and surfaces a
calm "timed out" message instead of pinning the button on "Deep-checking…"
if the LAN model stalls.
- feed_key() now lowercases the host only, not the whole URL — paths/queries
can be case-significant; scheme/www/trailing-slash/host-case still collapse.
Tests: test_candidate_deep_preview_and_dedup extended — promote succeeds once,
then a re-promote of the same candidate is refused 409. 224 pytest + 11 vitest.
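A minimal sketch of the normalization described for feed_key(), assuming absolute http(s) URLs. The body is a guess at the behaviour from the description (scheme, www, trailing slash and host case collapse; path and query case are kept), not the project's actual implementation.

```python
from urllib.parse import urlsplit


def feed_key(url: str) -> str:
    """Collapse near-duplicate feed URLs to one comparison key."""
    parts = urlsplit(url.strip())
    # urlsplit().hostname is already lowercased; the scheme is dropped
    # entirely so http:// and https:// variants compare equal.
    host = parts.hostname or ""
    if host.startswith("www."):
        host = host[len("www."):]
    if parts.port:
        host = f"{host}:{parts.port}"
    # Path and query keep their case: they can be case-significant.
    path = parts.path.rstrip("/")
    query = f"?{parts.query}" if parts.query else ""
    return f"{host}{path}{query}"
```

With this shape, "https://WWW.Example.com/Feed/" and "http://example.com/Feed" share a key, while "/feed" and "/Feed" on the same host do not.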
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Three admin Sources upgrades:
- Deep preview: a per-candidate "🔬 Deep preview" button runs the REAL
classifier on an 8-item sample (the same model that judges live articles),
versus the fast keyword heuristic the add/Re-preview path uses. Preview now
carries `classified`, surfaced as a "model-checked" vs "quick estimate"
badge — so the acceptance % is no longer ambiguously heuristic. conn is
released during the ~30-60s model pass; postJSON has no client timeout.
- Search: free-text box over the sources table (name / category / feed URL /
homepage), folded into the existing status filter, with a live match count
and empty state. Makes "is this already added?" a glance.
- Duplicate-add guard: sources.find_existing_feed() + feed_key() normalize
scheme/www/trailing-slash/case, so re-adding a feed that's already a live
source or a queued candidate is refused with a 409 naming where it lives
(DB already enforced exact-URL uniqueness; this catches the near-miss
variants and the overwrite-on-promote footgun).
Tests: test_candidate_deep_preview_and_dedup (deep flag wires the model +
uses the small sample; exact/www/slash/case variants all 409). 224 pytest +
11 vitest green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Codex's finding: cache-as-you-go would pin files Caddy deliberately serves
no-cache (version.json, manifest, word lists, icons) in the SW cache until the
next SW version — silently defeating the revalidate policy for controlled
clients. version.json is the critical one (it's how the app detects a fresh
deploy); stale word lists could drift from the server's validated answer pool.
New isMutablePath() exclusion: the SW steps aside and the browser HTTP cache
revalidates these per their headers.
Telemetry polish (also Codex): the boot beacon now fills the app_version
column with the entry chunk's hashed filename scraped from the shell's own
modulepreload link (no extra fetch) — deploy-correlated load errors become
obvious. Admin list returns + shows it.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The first boot-slow capture (5763ms total, html 68ms) proved the white screen
happens AFTER the shell arrives — but not which fetch eats the time. Append
the 3 slowest resource entries (path, start→end, transferSize; sz0 ≈ served
from SW/cache) so the next slow boot names its culprit. Reason cap 300→500
client+server to fit the detail.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Root cause of the intermittent white screen: the shell HTML is no-cache
(cf-cache-status: DYNAMIC), so every page-open does a synchronous round-trip
to the residential origin before any pixel renders — and the SW's network-first
navigation only fell back to the cached shell on REJECTION, never on slowness.
A stalled fetch meant staring at white with a perfectly good shell in cache.
The boot seatbelt couldn't see it either: it lives inside the HTML that hadn't
arrived yet, so slow boots left no telemetry.
- service-worker: race navigation fetch vs 2.5s grace timer. Network wins →
fresh HTML as before; timer/5xx/failure → cached shell instantly, network
response still refreshes the cache in the background. Safe due to the 14-day
immutable-chunk grace window. Caps the white screen at ~2.5s for repeat
visitors on any network.
- app.html: beacon `boot-slow: Nms (html Nms) on 4g` when mount takes >4s —
the "white screen, then it loaded" glitches finally leave a trace, with
HTML-arrival timing to separate slow-origin from slow-JS.
- admin: bot UAs (HeadlessChrome/bot/spider/crawl/…) excluded from the
headline "Load errors today" count — throttled crawlers trip the 10s boot
check routinely (the one recorded error was HeadlessChrome on X11, not a
phone). Bots stay visible in the list, tagged + dimmed.
Tests: telemetry test extended for bot flag + filtered counts. 223 pytest +
11 vitest green.
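The bot exclusion from the headline count can be sketched as below. The four tokens are the ones named above (the original list is longer and elided, so it is not completed here); the function names and the "user_agent" key are assumptions for illustration.

```python
import re

# Only the tokens named in the log; the real list has more entries.
BOT_UA = re.compile(r"headlesschrome|bot|spider|crawl", re.IGNORECASE)


def is_bot(user_agent):
    """True when the User-Agent string looks like an automated client."""
    return bool(BOT_UA.search(user_agent or ""))


def headline_error_count(errors):
    """Count load errors for the headline number, skipping bot UAs.

    Bot rows are not dropped from `errors`; callers can still list them
    (tagged via is_bot) while keeping them out of the headline.
    """
    return sum(1 for e in errors if not is_bot(e.get("user_agent")))
```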
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Daily Word pool curation, full add/delete/import — no redeploys to fix tone:
- Remove ANY pool word, curated or admin-added, via a word_pool_removed
tombstone table. Runtime pool = (static ∪ added) − removed, so even a
baked-in word can be pulled on negative feedback. Reversible: a "Removed"
list with one-tap Restore lifts the tombstone. Lookup now surfaces a Remove
button when in-pool, Restore when removed.
- Import a vetted list (paste or .txt/.csv upload, read client-side): validates
each word (alpha · 5–6 · in guess dictionary), ignores duplicates, and reports
rejects with reasons. Re-adding/importing a removed word lifts its tombstone.
- Word Search theme delete already existed (Edit/Remove per theme) — verified.
Pool stays the clean 251/224; today's noisy LLM enrichment is discarded.
Tests: +tests/test_pool_admin.py, extended test_word_pool_admin. 222 pytest +
11 vitest green.
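The pool rule and the import checks above, as a short sketch. The function names and reject messages are hypothetical; the set expression and the three validation rules (alpha, 5 or 6 letters, in the guess dictionary) are taken from the description.

```python
def runtime_pool(static, added, removed):
    """Runtime pool = (static ∪ added) − removed."""
    return (set(static) | set(added)) - set(removed)


def reject_reason(word, guess_dict):
    """Return None if the word is acceptable, else a short reason."""
    w = word.strip().lower()
    if not (w.isascii() and w.isalpha()):
        return "not alphabetic"
    if len(w) not in (5, 6):
        return "must be 5 or 6 letters"
    if w not in guess_dict:
        return "not in guess dictionary"
    return None


def import_words(candidates, guess_dict, pool):
    """Split a pasted list into accepted words and (word, reason) rejects.

    Words already in the pool, or repeated in the list, are ignored.
    """
    accepted, rejects = [], []
    seen = set(pool)
    for raw in candidates:
        w = raw.strip().lower()
        if w in seen:
            continue
        reason = reject_reason(w, guess_dict)
        if reason is None:
            accepted.append(w)
            seen.add(w)
        else:
            rejects.append((w, reason))
    return accepted, rejects
```

Because removal is a tombstone set rather than a delete, restoring a word is just removing it from `removed`; the static list never has to change.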
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* New "Word Search themes" panel in the Games tab: enter a theme name + words,
with live validation (4–8 letters, alpha, deduped) and a count vs the 28 needed
to fill all three sizes. An "✨ Suggest a word" button asks the LLM for one
fresh word that fits the theme. Save/edit/remove; authored themes join the daily
fallback rotation alongside the curated ones (wordsearch_themes table). The
system still handles word distribution across sizes + placement.
* Daily Word pool's added-word chips now scroll within a bounded area so the
console stays tidy as the list grows.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* client_error details, not just a count: new client_errors table + POST
/api/client-error (reason/path/user-agent/time) + GET /api/admin/client-errors.
The boot-seatbelt beacon now sends the reason + path (once per page); the admin
Overview lists the recent errors so we can tell chunk vs SW vs API vs JS — the
truth meter for the next day as the new SW propagates.
* Deploy warming now also hits the shell, routes (/play /account /admin), SW,
version.json, word lists, and icons/logo/font — not just immutable chunks.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Per Codex. A branded recovery card in app.html shows if the app hasn't mounted
in 7s, or on a pre-mount JS error/unhandledrejection — with a "Refresh Upbeat
Bytes" button. A chunk/preload failure (vite:preloadError) reloads once
(sessionStorage-guarded). +layout calls window.__ubBooted() on mount to clear
the card + timer. A pre-mount failure also fires a tiny anonymous client_error
beacon; the admin Overview now shows "Load errors today" (red if >0) so we can
see if blank-risk is happening in the wild.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
First games admin tool. A "Games" tab in the operator console for the Daily Word
answer pool.
* Lookup: is a word real (in the guess dictionary), the right length (5/6), and
already in the pool — instant as you type.
* Add: appends to the pool, enforcing the invariant (alpha · 5/6 letters · in the
guess dict) so the daily answer is always guessable. Remove: drops admin-added
words (curated static ones stay).
* Additions persist in a new word_pool table (survives redeploys, unlike the
baked-in JSON); the daily picker reads static pool ∪ DB additions. Guess dicts
shipped with the package (goodnews/data/words-5/6.json) for server-side
validation. Admin-gated endpoints + tests.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
From playtesting findings:
* Pools nearly doubled (115/104 → 228/201) with calm/neutral everyday words
(claps, dance, drench, beach…), not just strictly-upbeat ones — more variety,
~7-month runway. The post-solve "why" prompt reworded to fit neutral words.
* Word Search now stores one theme + word list per day; the grid is built per
request for three SIZE tiers — Small (8×8, 6 words), Medium (11×11, 9),
Large (14×14, 13). Large packs more words = a longer sit ("too fast" fix).
All sizes share the day's theme; every size still code-placed + solvable.
* Word Search themes can now be neutral everyday scenes ("Around the house",
"At the beach", "In the kitchen", "A walk outdoors", "Making music"…), not
only hopeful — same shape as the articles.
* Each found word gets its own colour from a calm palette, in the grid and its
word-list chip. Per-size local progress + best time.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A calm second daily game, same philosophy as Daily Word — LLM proposes, code
disposes.
* LLM proposes a hopeful theme + ~8 words; code validates (alpha/length/dedup)
and PLACES every word in a date-seeded grid, so the puzzle is always solvable.
Curated fallback themes if the LLM output is thin. Only placed words are returned;
the solution cells (placements) are never sent to the client.
* GET /api/puzzle/wordsearch → {theme, words, grid, size}. No answer to hide:
the grid and word list are meant to be seen — the play is finding them, which
the client validates by reading the selected line off the grid.
* WordSearchGame.svelte: pointer-drag selection snapped to the 8 straight
directions (mouse + touch), found-word highlighting, no-fail, no pressure
timer — time is recorded quietly and shown at the end with a personal best.
Spoiler-free share. localStorage progress (restores found cells + timer).
* Hub's Word Search card is now live with today's status; cycle pre-generates
both games with the LLM.
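A rough sketch of the "code disposes" half: seeded placement along the 8 straight directions, returning only the words that were actually placed, plus the line-reading check a client could use. Names, the retry budget and the seed format are assumptions; the real placer may differ.

```python
import random
import string

DIRECTIONS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def build_grid(words, size, seed, tries=200):
    """Place words in a size x size grid; return (rows, placed_words)."""
    rng = random.Random(seed)  # same seed (e.g. the date) gives the same grid
    grid = [[None] * size for _ in range(size)]
    placed = []
    # Longest first, ties broken alphabetically, so the order is deterministic.
    for word in sorted({w.upper() for w in words}, key=lambda w: (-len(w), w)):
        for _ in range(tries):
            dr, dc = rng.choice(DIRECTIONS)
            r, c = rng.randrange(size), rng.randrange(size)
            cells = [(r + dr * i, c + dc * i) for i in range(len(word))]
            if not all(0 <= y < size and 0 <= x < size for y, x in cells):
                continue
            # A cell may be reused only if it already holds the same letter.
            if all(grid[y][x] in (None, ch) for (y, x), ch in zip(cells, word)):
                for (y, x), ch in zip(cells, word):
                    grid[y][x] = ch
                placed.append(word)
                break
    rows = ["".join(ch or rng.choice(string.ascii_uppercase) for ch in row)
            for row in grid]
    return rows, placed


def read_line(rows, start, direction, length):
    """Read `length` letters from `start` along `direction`."""
    (r, c), (dr, dc) = start, direction
    return "".join(rows[r + dr * i][c + dc * i] for i in range(length))


def is_findable(rows, word):
    """True if the word can be read off the grid in some straight line."""
    size, n = len(rows), len(word)
    for r in range(size):
        for c in range(size):
            for dr, dc in DIRECTIONS:
                er, ec = r + dr * (n - 1), c + dc * (n - 1)
                if 0 <= er < size and 0 <= ec < size \
                        and read_line(rows, (r, c), (dr, dc), n) == word:
                    return True
    return False
```

A word that cannot be placed within the retry budget is simply left out of `placed`, which is why only placed words should be returned to the client.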
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Per Codex's v2 hardening. The GET /api/puzzle/word response no longer carries
the answer at all — guesses POST to /api/puzzle/word/guess and the server
returns the colour pattern, computed against the day's answer. The answer (and
the "why") are revealed only once solved or the guesses are spent. This removes
the "open DevTools, read the answer" issue without pretending to be a fortress
(a deliberate crafted request can still peek; there's no leaderboard or prize,
so that's fine). Client keeps local progress/stats; dict validation stays
client-side. Trade-off accepted: each guess needs the API (the site already
depends on it for today's content).
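For illustration, the server-side colour pattern could be computed with the usual two-pass scoring, which handles repeated letters correctly. The function name and the "correct"/"present"/"absent" labels are assumptions, not the endpoint's actual response format.

```python
from collections import Counter


def score_guess(guess, answer):
    """Return one label per letter: 'correct', 'present' or 'absent'."""
    guess, answer = guess.lower(), answer.lower()
    if len(guess) != len(answer):
        raise ValueError("guess and answer must be the same length")

    pattern = ["absent"] * len(guess)
    unmatched = Counter()  # answer letters not matched in place

    # Pass 1: exact position matches.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pattern[i] = "correct"
        else:
            unmatched[a] += 1

    # Pass 2: right letter, wrong place, limited by what is left over.
    for i, g in enumerate(guess):
        if pattern[i] != "correct" and unmatched[g] > 0:
            pattern[i] = "present"
            unmatched[g] -= 1

    return pattern
```

Since only the pattern is returned, the answer itself never needs to appear in a response until the puzzle is solved or the guesses are spent.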
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>