177 Commits

Author SHA1 Message Date
thejayman77 2dc4419024 images/analytics: purge on policy revoke + engagement warm-up note (Codex close-out)
- newsimg.purge_source(): when a source leaves 'cache' (permission revoked / re-classified),
  the admin image-policy endpoint now deletes that source's re-hosted copies immediately,
  rather than leaving them inaccessible-but-on-disk. Endpoint returns {purged}.
- Admin "Engaged readers" carries a warm-up note: tracking began 2026-06-30, so low
  rolling windows are partly warm-up, not all bots (compare d7 after a week, the window
  after its full span). Guards against misreading "6 engaged vs 135 visits" as 129 bots.
Tests: purge_source removes only the target source's copies; endpoint reports purged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 14:29:55 -04:00
thejayman77 f416e13700 analytics: honest engagement metric — Engaged readers vs Recorded visits (Codex)
Admin now shows two numbers:
- Recorded visits: the existing raw count (one daily 'visit' beacon; still includes
  UA-spoofing bots that slip past the UA filter).
- Engaged readers: distinct visitor-day with DELIBERATE activity — either the new
  gesture-gated 'engaged' beacon (fires once/day only after ~8s visible AND a real
  scroll/pointer/key/touch) or a deliberate action (source_click, full_story, share,
  replace_used, paywall_replace, not_today/less_like_this/hide_topic, game start/
  complete/share). Explicitly EXCLUDES auto-fired visit/summary_viewed/open, replace_none,
  and game *_arrival (a share-loop landing, not engagement).

armEngaged() in analytics.js (wired in the global layout) + a mirrored vanilla-JS beacon
on the server-rendered /a/<id> share pages. 'engaged' added to the event allowlist and
fired with article_id=0 so the uniqueness constraint dedups it per day. queries.admin_stats
gains engaged_today/d7/d30. Bots are doubly excluded (UA filter at the beacon + the
gesture gate). Tests cover the metric (engaged + deliberate counted; visit/summary/arrival
not). 447 backend + 36 frontend tests pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 14:07:24 -04:00
thejayman77 8a7606e20d images: fix two fetcher bugs + add source-level image-rights policy (Codex)
Fetcher (the two remaining bugs Codex found):
- Real redirects are now followed. _NoRedirect makes urllib RAISE HTTPError on 3xx, so
  the old status-branch was dead code (mocked tests masked it). Handle 301/302/303/307/308
  HTTPError as redirects (re-validate the destination); classify 4xx≠429 as PERMANENT
  (negative-cached), 429/5xx/network as transient. Real-opener redirect + 404/5xx tests.
- The megapixel ceiling is now enforced: explicit `w*h > _MAX_PIXELS` check BEFORE load()
  (Pillow only warns at MAX_IMAGE_PIXELS). Test with a lowered ceiling.

Image-rights policy (per Codex + owner decision — only cache what's cleared):
- sources.image_policy: 'cache' (re-host a downscaled copy — license/permission/PD only),
  'remote' (hotlink the publisher's image — the conservative DEFAULT), 'none' (no image).
- newsimg.display_url resolves the display URL per policy; applied in Article.from_row so
  feed/brief/history return the right URL, and in share.py (og/twitter still reference the
  publisher's own image, never re-hosted). warm() + /api/img both gated on 'cache'.
- Frontend uses the server-resolved image_url (reverted the hardcoded /api/img); the
  graceful retry covers remote hotlinks too. Admin: per-source image-policy selector +
  POST /api/admin/sources/{id}/image-policy. Default 'remote' → nothing re-hosted until
  a source is explicitly cleared.

445 backend + 36 frontend tests pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 14:01:11 -04:00
thejayman77 a55ba185a8 images: harden the cache per Codex audit (SSRF-safe, cache-only endpoint, WebP-only)
Blocker fixes for the image cache:
- /api/img/{id} now serves cache HITS ONLY and is restricted to ACCEPTED, CANONICAL
  articles. It never fetches — the cycle (newsimg.warm) owns all fetching — so the
  public endpoint has no SSRF/worker-exhaustion surface. Dropped 1-year immutable
  caching (image_url can change) → public, max-age=86400.
- newsimg._safe_fetch: SSRF-safe (reuses enrich._host_is_public + _NoRedirect, http(s)
  only, every redirect hop re-validated, body capped). _FetchError distinguishes
  permanent refusals (negative-cached via a .fail marker) from transient errors (retry).
- _encode re-encodes only decoded RASTER images to WebP and REJECTS everything else
  (SVG, undecodable, decompression bombs via MAX_IMAGE_PIXELS, pathological dimensions);
  originals are never retained. prune() also sweeps stale .fail markers.
- Concurrency: fetching only runs inside the cycle lock; writes stay atomic.

Smaller fixes:
- share.py visible image has onerror→this.remove() (degrade to the text unfurl, no
  broken icon when an image isn't cached yet).
- share-page Back follows history only on a SAME-ORIGIN referrer (never bounce to an
  external site); menu now honors Escape + resets crossing back to desktop (HubBar parity).

Tests: private host, redirect-to-private, hostile SVG/non-image, transient-vs-permanent
failure, LRU prune, warm (accepted+canonical only, idempotent), cache-only endpoint
(404 on not-cached/unaccepted/duplicate, never fetches), share chrome parity. 441 pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 12:19:57 -04:00
thejayman77 ee43bb0df6 analytics: filter known-bot User-Agents at /api/events (honest visitor counts)
Many modern crawlers (AI scrapers, headless Chrome, link-preview fetchers) run JS and
fire the visit/summary_viewed beacon, inflating "visitors" even though there's no
human discovery channel. Apply queries.is_bot_ua() at /api/events — the same filter
the load-error beacon uses — so honest bot UAs (GPTBot, AhrefsBot, headless Chrome,
python/curl, …) are dropped before recording. Response is identical so a bot can't
detect it. Counts read lower but truer going forward (past rows unchanged). Won't catch
UA-spoofing bots; that needs a heavier heuristic. Tests: bot UAs dropped, real browser
counted; existing event tests send a real UA (default client UA contains "python").

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 11:19:51 -04:00
thejayman77 86d9897113 ui: reserve the scrollbar gutter so the top bar stops shifting between pages
Pages tall enough to scroll showed a ~15px scrollbar; short pages didn't — so the
centered top bar jumped left/right as you navigated. scrollbar-gutter: stable on html
(SPA app.css + the server-rendered share pages) keeps the layout width constant. No-op
on overlay-scrollbar platforms (mobile), which never shifted.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 04:52:59 -04:00
thejayman77 3740e09d02 share pages: carry the real HubBar toolbar (consistency with the SPA)
The server-rendered /a/<id> and digest pages predated "HubBar everywhere" and showed
a stripped bar (logo + a bespoke Back pill). They can't run the Svelte component, so
add a hand-kept static replica of HubBar (logo + News/Play/Art nav + account glyph +
mobile burger/drop-panel) plus HubShell's borderless ← Back. A signed-in reader's
avatar paints from the same localStorage cache HubBar uses. /a/<id> now looks like any
detail page (/art, /word). Reusable _top_bar_html/_TOP_BAR_CSS/_TOP_BAR_JS/_back_link
helpers; applied to both share pages. Kept in sync with HubBar.svelte by hand (noted).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 04:26:01 -04:00
thejayman77 8a3c00db3b images: cache + serve article images from our own origin (bounded, LRU-evicted)
Stop hotlinking news images from third-party CDNs (the source of the "blank until
you refresh a few times" graphic). New goodnews/newsimg.py caches a downscaled WebP
display copy (≤800px) beside the DB, like art_cache:
- GET/HEAD /api/img/{article_id} — resolves id→image_url (allowlisted to our corpus,
  not an open proxy), fetch+cache on first miss, serve local after, immutable headers.
- cycle warms display copies for recent accepted-with-image articles (so the FIRST
  view is already local) and prunes to a hard size cap (default 1 GB) by LRU eviction.
Frontend now points at /api/img/<id>: the hub lead, every ArticleCard (feed hero +
cards), and the /a/<id> share page's visible image. og:image/twitter:image stay the
source URL so social crawlers fetch the canonical image directly.

Storage is bounded by construction — over the cap, least-recently-used files are
evicted, so it can't grow without limit regardless of ingest rate. Tests cover
fetch/downscale, cache-hit (no refetch), bad-scheme/non-image rejection, fetch
failure, LRU prune, warm, and the endpoint allowlist.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-29 20:28:33 -04:00
thejayman77 d98cec9ded admin: read/unread triage for load errors (unread by default, mark read/all)
The load-error log had no way to clear reviewed entries. Add a read_at column to
client_errors and a read/unread model mirroring the feedback inbox:
- GET /api/admin/client-errors?show=unread|read|all (default unread; returns id+read)
- POST /api/admin/client-errors/read-all  (mark all unread read)
- POST /api/admin/client-errors/{id}/read {read: bool}  (per-row toggle)
Headline stat is now "Unread load errors" (admin_stats.client_errors.unread), so the
red badge clears as you triage. Admin UI: Unread/Read/All tabs, a "Mark all read"
button, and a per-row ✓/↩ toggle; reading an entry drops it from the default view.
14-day auto-prune still bounds the table. Tests cover filter, toggle, mark-all,
404, and gating.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-29 10:38:22 -04:00
thejayman77 0ae789752e fix: QOTD/WOTD freshness — pick within the freshest cohort, not the rotated pool
Both selectors ordered candidates least-recently-shown, then daily.seeded_order()
ROTATED the whole list and took [0] — an arbitrary date-hashed item, undoing the
ordering. Result: repeats (quote id 2 on 6/28+6/29; word "harmony" on 6/25+6/28),
no guarantee a pool item is shown before it recurs.

Fix: daily.freshest(rows) returns the freshest cohort only — every NEVER-shown
item while any remain, else the oldest-shown group. quote/wotd _candidates use it;
seeded_order now picks deterministically WITHIN that cohort. So every pool item is
featured once before any repeat, then cycles oldest-first. Dropped the unused
_NO_REPEAT_POOL window. Tests: no-repeat-until-exhausted (quote + wotd) + a
freshest() unit test. 428 backend tests green.

(Separate follow-up: expand the QOTD pool from 16 → 90+ vetted public-domain
quotes for a longer no-repeat window.)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-29 05:39:06 -04:00
thejayman77 667b1a82c3 brand: standardize "Upbeat Bytes" → "upbeatBytes" everywhere
Per the logo + brand: the name is upbeatBytes (camelCase). Swept all user-facing
strings — titles/og:site_name/og:title, logo alt text, share pages (share.py),
emails (email_send), classifier prompt (llm), digest/unsubscribe (api), PWA
manifest, game share text, sign-in, the SPA shell + patch-static-heads (play
title) — plus README/publish.sh and the email test fixture. (SMTP From env was
already upbeatBytes.) Domains (upbeatbytes.com) unchanged. 425 BE + 36 FE green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 20:01:20 -04:00
thejayman77 2cfffdfd6a NEWS RELAUNCH CUTOVER: promote the hub to /, feed to /news, go public
The big flip. /home3 (hub) becomes /; the feed lives at /news; both indexable.
- PROMOTE: routes/+page.svelte is now the hub (was the interim NewsFeed wrapper);
  noindex removed; "Read more good news" → /news. routes/home3 + home2 deleted.
- routes/+page.js: redirects legacy root-query links (/?view=latest, /?tag, /?source,
  /?q, /?view=today→highlights) to /news before the hub renders (no flash).
- /news: noindex dropped (route meta + Caddy @newsHidden removed); now public.
- LINKS: HubBar brand/Home → /, News default → /news; HubShell/art/play back → /;
  account Following + share.py Explore/Browse/source → /news.
- FOOTER: one shared Footer.svelte (motto + Send feedback + slot) across Hub/News/
  Play/Art/HubShell/Account/Zen; global layout footer removed (FeedbackModal stays).
- SITEMAP: + /news /art /play /word /quote /onthisday; cap 5k→50k; gated on
  has-summary; paywalled excluded; HEAD now 200 (api_route GET+HEAD).
- Head-patcher: /news entry. PWA + shell description broadened to the hub.
- Caddy: @newsHidden dropped; @hidden now admin-only (word/quote/onthisday public);
  /home2,/home3 → / 301. Mirrored to deploy/caddy snapshot.

425 backend + 36 frontend tests green; build clean; Caddy valid.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 19:16:43 -04:00
thejayman77 1c1ecefde8 news: harden paywall exclusion at the candidate query + add the missing regressions
Codex's two non-blocking hardening items, folded in before cutover:
- _candidate_articles() now excludes paywalled sources IN-QUERY (before LIMIT 50),
  so flagged stories can't consume candidate slots and leave a full brief thin.
  Dropped the now-redundant post-fetch filter in build_daily_brief.
- Regressions: history retains a viewed paywalled article; sitemap omits a
  paywalled source AND restores it under override="free".
- Aligned test_brief_paywall to the source-level model (paywalled sources carry a
  paywalled homepage, as in production) — it had relied on article-URL detection.

425 backend tests green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 18:54:53 -04:00
thejayman77 c600145ba5 news: close the remaining no-paywall bypass paths (Codex audit)
queries.feed was the main chokepoint, but several discovery paths have their own
SQL. Apply the shared source exclusion to all of them so "no paywalls" is truly
site-wide:
- briefs.build_daily_brief: EXCLUDE paywalled candidates (was: demote) — never
  stored in a new brief.
- queries.brief: stored-brief retrieval (covers /today + /api/brief) filters the
  paywalled source.
- digest.digest_items + followed_digest_items: the morning email + "from what you
  follow" omit paywalled sources.
- sitemap(): paywalled article pages excluded from the sitemap.
All reuse queries.paywalled_source_ids (admin override still wins).

Regression tests (test_paywall_exclusion.py): never stored in a new brief; /today
+ digest omit it; followed-source email omits it; Saved retains it; 'free'
override restores eligibility. 423 backend tests green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 17:22:52 -04:00
thejayman77 0d21231597 news: hard-exclude paywalled sources from the feed + brief (no unreadable news)
Per Jay: don't surface stories people can't read without paying — it's off-brand
("no paywalls") and pointless. Paywalled is source-level (domain rule, admin-
overridable): just 3 sources today (Nature, New Scientist, MIT Tech Review),
~5.4% of accepted articles.

- queries.paywalled_source_ids(conn): live source set (admin override wins).
- queries.feed gains include_paywalled=False (default) → adds `a.source_id NOT IN
  (…)`. One chokepoint covers Latest/tags/sources/moods/topics/search/since AND
  the brief top-up. Source-level + SQL → paging stays exact, no frontend change.
- brief(): filter the cached/home pool by the same rule; replacement already
  avoids paywalled and now rides the feed exclusion too.
- Dropped the now-moot "paywalled below readable" demotion sort.
- Saved/history keep showing items you saved (their own queries, not excluded).
- test_source_paywall_override updated: paywalled source → excluded from the feed
  (was: shown with a badge); 'free' override → returns, no badge. 418 tests green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 17:10:00 -04:00
thejayman77 6c10ad99a9 On This Day: serve sharp images (originalimage, not the 330px thumbnail)
The Wikimedia feed's thumbnail is 330px, which upscales blurry in our hero. Use
originalimage.source instead — it's reliably sharp. (Can't just request a bigger
thumbnail width: for very large source images Wikimedia only serves pre-generated
bucket sizes and 400s on arbitrary widths — e.g. 500px ok, 800/1024px fail.)

- onthisday._best_image() prefers originalimage, falls back to the thumbnail.
- scripts/otd_image_upsize_backfill.py re-fetches each stored MM-DD and upgrades
  image_url in onthisday_pool + daily_onthisday in place (ran on host: pool + 6
  daily rows now sharp; today's hero verified 200). Only the /onthisday hero
  loads this image (home card is text-only), so larger files are a single-page,
  one-time load.
- test_best_image locks the prefer-original/fallback behavior.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-27 17:07:37 -04:00
thejayman77 ed814c97b9 Daily Art engine: museum-guide blurb (grounded LLM) + extracted palette
- daily_art gains blurb + palette columns (idempotent migration).
- art._palette: Pillow median-cut to ~5 hex colors from the cached image (best-
  effort → [] on any failure). art._blurb: a warm 2-3 sentence "what you're
  looking at" note grounded in the Met catalogue (title/artist/bio/date/medium/
  classification/culture/tags). Prompt leans on context/significance and the
  title+tags for subject — explicitly NOT asserting literal composition (figure
  counts/poses) it can't see, since the model can't view the image. Markdown
  stripped from the output.
- pick_daily generates both (client optional → blurb skipped when absent); cycle
  + art CLI pass an LLM client. /api/art/today exposes blurb + palette.
- Backfilled the last 3 days on host (Veteran / Magnolia Vase / Bierstadt).
- scripts/art_blurb_palette_backfill.py for in-place backfill (no re-pick).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 20:12:54 -04:00
thejayman77 dc23277b38 Read-time: full-article "Full story · ~N min" badge (Option B)
Replaces the gist-based read-time with the SOURCE article's full read time — the
contrast that sells the gist ("calm 1-min version here; ~10 min for the deep dive").

- goodnews/readtime.py: word_count_from_html (strips script/style/nav/header/
  footer/form/button/aside furniture before counting) + source_read_minutes
  (~225 wpm, 200-word floor, None when extraction looks failed/too thin).
- articles.source_words + read_checked_at columns (count only, never the body;
  fits the privacy posture). Idempotent migration.
- enrich.fetch_source_words + enrich_read_times: a bounded, retry-guarded cycle
  step (mirrors the image enrichers) that counts words for recent accepted
  articles. Only ever writes a real count; never overwrites good with zero. Wired
  into the cycle after recent-image enrichment.
- queries: source_words flows through _ARTICLE_COLUMNS; api exposes
  source_read_minutes on Article (null when unknown).
- home3: News card shows "Full story · ~N min", hidden entirely when null (no
  misleading "1 min").
- Tests: furniture stripping, threshold/rounding, enrich idempotency + no
  zero-overwrite, API null handling. 412 backend.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 08:09:00 -04:00
thejayman77 bdf3b1f47b WOTD _polish: enforce the "examples must use the word" contract
Per Codex audit: only accept a polish when there's a gloss AND at least one
example sentence that actually contains the word (case-insensitive). Examples
that don't use the word are dropped; if none remain, fall back to the raw
dictionary def/examples instead of shipping a gloss with empty/irrelevant usage.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 07:56:22 -04:00
thejayman77 cebbed58ab WOTD #4/#5 content quality + Editorial Asymmetric /word page (CD)
Content quality ("LLM polishes, dictionary anchors"):
- New wotd._polish: rewrites the real dictionary gloss into ONE warm plain
  sentence + two clear everyday example sentences, grounded in the real
  definition (no invented meanings). Stored in new wotd_pool/daily_wotd columns
  gloss + usage, alongside the raw definition/examples which stay the anchor.
- harvest() polishes each new word; pick_daily() lazily polishes + caches back
  any older pooled word that lacks a gloss (client threaded through run_daily).
- Admin word-add polishes on insert; re-pick passes an LLM client so quote
  meaning / word gloss fill on a forced fresh pick.
- /api/word/today now prefers gloss + usage, falling back to the raw dictionary
  def/examples when polish is absent (so it's always safe).
- db._migrate adds gloss/usage to wotd_pool + daily_wotd (idempotent ALTER).

Frontend — /word redesigned to CD's "Editorial Asymmetric": faded oversized
initial bleeding off the right, vertical part-of-speech rail, big Newsreader
word, airy definition, left-ruled italic example sentences, outline Listen
button + date. (Uses our self-hosted Newsreader/Hanken stack rather than the
mockup's Google fonts; the made-up syllable respelling is omitted since we only
have real IPA.)

Tests: _polish parse/trim/cap, harvest stores gloss/usage, pick lazy-polishes
older words, admin gloss flows through to /api/word/today. 403 backend + 27 fe.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 06:08:14 -04:00
thejayman77 84b1fb514f Small joys: Codex audit #2 fixes (route resolution, noindex, sense/tone, exclude-current re-pick)
- Admin joy item route moved to /api/admin/joys/{kind}/items/{item_id} so the
  /add and /repick verbs resolve to their own routes instead of 422-ing as a
  non-int item id (the launch blocker). Frontend mutate URL updated to match.
- Re-pick now excludes the currently-shown item: the endpoint reads today's
  daily pool_id and passes it as `avoid`, so "Re-pick today" yields a different
  item. Added `avoid` to pick_daily/_candidates across wotd/quote/onthisday.
- WOTD sense selection: the LLM now proposes word + intended part of speech, and
  _lookup prefers that sense (fixes "serene" returning the archaic noun).
- On This Day tone prompt tightened to favor genuinely uplifting events and
  exclude merely procedural/political-administrative ones.
- Caddy @hidden now also noindexes /word /quote /onthisday /admin (+ .html).
- Regression tests: add/repick resolve (401 not 422), add/feature/block/delete,
  re-pick excludes current; WOTD pos-preference + proposal parsing units.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 20:19:02 -04:00
thejayman77 3bde6534e9 Small joys: wire homepage rail to live data + rich pages (/word /quote /onthisday) + admin
- /home3: small-joys rail now reads live /api/word|quote|onthisday/today (placeholders only
  as fallback); each cell links to its detail page.
- HubShell component (shared bar/footer/fonts/tokens) for the hub + detail pages.
- /word: big word, IPA, Listen (cached clip + browser-TTS fallback), definition, sentences.
- /quote: the quote, attribution, and the AI "what it means".
- /onthisday: the date, year + fact, image, summary, source.
- Admin "Small Joys" tab: per-pool list with feature/block/delete/add + re-pick, for all
  three kinds. New admin API: GET/POST /api/admin/joys/{kind}[/{id}|/add|/repick].

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 18:52:38 -04:00
thejayman77 67d4bc32cb Small joys: Quote of the Day + Word of the Day engines
- quote.py: curated public-domain quote pool (16 seeded, admin-grows), deterministic daily
  pick, lazy AI "what it means" explainer of the real quote (cached). No LLM-invented quotes.
- wotd.py: LLM proposes positive words → validated/enriched against dictionaryapi.dev (real
  definition, IPA, examples, audio) → audio clip cached to our origin (TTS fallback) →
  deterministic daily pick. Tops the pool up toward 30/day.
- db.py: quote_pool/daily_quote + wotd_pool/daily_wotd tables.
- api.py: /api/quote/today, /api/word/today, /api/word/audio/{word} (GET+HEAD).
- cli.py: cycle steps for both (under --no-joys), shared LLM client.
- tests: test_quote.py (6) + test_wotd.py (5). 393 backend tests green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 17:28:55 -04:00
thejayman77 a7da8362ab Small joys backend: shared daily framework + On This Day engine
- goodnews/daily.py: shared helpers for the daily "small joys" (http_json, date-seeded
  deterministic pick, dedup key) so each joy is a small self-contained module.
- goodnews/onthisday.py: harvest today's MM-DD from Wikimedia's On-this-day feed →
  tone-filter to good/neutral (keyword floor + optional LLM refine) → pool → deterministic
  daily pick (idempotent, respects blocked/featured) → cached row. Network/LLM before any
  DB write. Multi-source ready (source column).
- db.py: onthisday_pool + daily_onthisday tables.
- api.py: GET /api/onthisday/today (edge-cacheable).
- cli.py: cycle step (run after Daily Art; --no-joys to skip), LLM client for tone refine.
- tests/test_onthisday.py: 7 tests (filter+dedup, pick idempotent, blocked/featured,
  never-empty, empty-pool, LLM-narrow). 382 backend tests green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 16:51:29 -04:00
thejayman77 dd8706e2fc Art post-audit polish (Codex): image HEAD, texture immutable cache, lightbox a11y, spacing
- /api/art/image/{id} now answers HEAD as well as GET (was 404 on HEAD) — mirrors the
  /a/{id} fix. Added tests/test_art_api.py (GET+HEAD+size=full fallback + today payload).
- /textures/* served immutable (long cache) instead of no-cache; excluded from the
  revalidate matcher. Live Caddyfile + repo snapshot both updated.
- Lightbox: Escape closes it, and focus moves to it on open (keyboard-friendly).
- Trimmed the gallery's top padding so "Daily Art" sits closer to the bar.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-21 18:17:30 -04:00
thejayman77 27788ba2a8 Art page round 2: virtual frames, real logo, hi-res zoom, spacing/affordance polish
- Virtual frames (Walnut/Gold/Silver/None), selectable + remembered in localStorage,
  built as a beveled moulding around a cream museum mat.
- Header uses the real /logo.svg wordmark; the "No ads" pill is replaced by an
  account icon (the pill doesn't need to follow every page).
- Lightbox now opens a full-resolution copy that fills the screen: art._download_image
  caches a hi-res {id}-full copy alongside the web-large display copy, served via
  /api/art/image/{id}?size=full (image_url_large in /api/art/today).
- Centered the placard bullet separators (explicit .sep spans, equal margins).
- Image no longer shifts on hover; a quiet "Click to expand" affordance sits on the art.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-21 16:25:31 -04:00
thejayman77 db967bb7fa Daily Art: Codex guardrails (atomic image, attribution/license, blocked lever)
Hardening before it runs further on the cycle:
- DB-lock/network: all HTTP (metadata + image) happens before any write; the write txn
  opens only at the brief INSERT and commits immediately. Images download to a temp file
  then atomic os.replace into cache (a reader never sees a half-written file).
- Site-timezone "daily" already used local_today() (same rhythm as the Brief) — confirmed.
- Attribution from day one: store + return title/artist/date/medium/department/credit/
  source_url/object_id/source + museum name + is_public_domain license marker + the full-
  res source URL (for a richer /art view later). UI can show: Title · Artist · The Met.
- "highlight != always beautiful": added a manual `blocked` flag on art_pool (excluded
  from picks) as the cheap curation lever; a featured override can follow.

Schema migrated (existing art tables get the new columns). 373 tests green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-21 15:28:38 -04:00
thejayman77 308516a263 Daily Art backend: curated Met pool, daily cached pick, /api/art (prototype)
The engine for the /art room (design-independent; deploy held for Codex review).

- goodnews/art.py: harvest a curated pool of public-domain HIGHLIGHT artworks from the
  Met (isHighlight+isPublicDomain+hasImages -> masterworks, never potsherds; CC0). Daily
  deterministic pick from the least-recently-shown (no soon-repeats, same for everyone),
  fetch metadata + download the image to OUR cache (data/art_cache) so the homepage never
  waits on or hotlinks the museum. Bulletproof: bad object/image falls through candidates;
  a failed day keeps the last piece (room never empty). Injectable HTTP for tests.
- Schema: art_pool + daily_art. /api/art/today (edge-cacheable) + /api/art/image/{id}
  (served from cache, immutable). CLI `art [--harvest] [--force]` + a non-fatal cycle step.
- Tests (5, mocked HTTP) + verified live against the Met: harvested 1641 works,
  picked/cached "Repose" by John White Alexander. 371 tests green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-21 14:50:20 -04:00
thejayman77 0c68c22221 Brand consistency: emails say "upbeatBytes" (From + digest body)
Per the brand-name standard (camelCase, one word). Updated the SMTP From default and
the digest email body/subject strings. Live env From values (auth.env + goodnews.env)
updated to match. (Web/OG brand strings in share.py + app.html are the remaining sweep.)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-21 11:38:16 -04:00
thejayman77 b4b02b5050 Scope dial polish (Codex): hero stays closest-first + visible Clear
- Hero constraint: _pick_lead now runs only within the CLOSEST non-empty section of a
  personalized Brief, so a "gentler" wider-region/world story can never be floated into
  the hero slot above a local one. Only widens if the closest section is empty.
- Dial gains a visible Clear (alongside Change) so a reader never feels locked into
  personalization; "World" stays the keep-home-but-go-global option.

366 tests green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 22:06:06 -04:00
thejayman77 3486f3102a Scope dial v2: Nearby / Region / Country / World radius on the homepage
Codex-approved evolution: the reader controls the "emotional radius" of the landing.

- Census-region "Regional" grain (geo.region_of / region_states). Scope-aware tiering
  (queries.home_tiers): closest->widest lead, confidence-gated on state + region, never
  a hard filter — blends outward so the set is always full. 'world' = the global brief.
- queries.home_brief takes a scope; /api/brief gains a scope param (nearby|region|
  country|world). Country-only / non-US homes collapse to country.
- Homepage dial replaces the 2-button toggle: adaptive stops (4 with a US state, else
  Country/World), persisted scope, "Good news closest first" framing. Concrete, soft
  section labels (Around New Jersey / Across the Northeast / Across the US / Around the
  world) so the reader sees the dial worked.

Backend 366 + frontend tests green. (Latest feed still on v1 local-first; aligning it
to the dial is the immediate follow-up.)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 21:59:32 -04:00
thejayman77 d2a6293a13 Local-first Brief: the landing leads with good news from your home
Per the owner's call (overrides the earlier "Brief sacred" stance): when a home is
set, the homepage opens with local good news first, not global. This is the hook —
you land and see awesome stories from YOUR corner first.

- queries.home_brief: local-first highlights (high/medium-confidence near, blended
  out to country then world so it's always a full, strong set), preferring already-
  summarized stories so the calm read stays rich. Recent window, ranked within tier.
- /api/brief gains a `home` param: private/no-store when set; over-fetches + caps so
  dismissal/boundary filtering never thins it; falls back to global top-up if needed.
- Landing UI: a Local <-> Global toggle ("📍 Near you / 🌍 Everywhere") when a home
  is set, the calm picker invite when not (dismissible), and Change. Default leads
  local; one tap back to the global brief. No home set => exactly today's behavior.

Backend + frontend tests green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 21:36:18 -04:00
thejayman77 2239549799 Closer to Home: gate "Near you" on high/medium confidence (both modes)
Codex polish before deploy: anything elevated as Near you / Close to home must have
geo_confidence in (high, medium) — the feature's promise is relevance. Country-only
mode now gates "near" too; since it has no "country" tier, the "world" scope is
widened to absorb low-confidence home-country stories so they surface there instead
of vanishing between tiers (the same edge-case class, fixed). State mode unchanged.

364 tests green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 20:29:31 -04:00
thejayman77 e7e8f5515e Geo Stage 4 (server): home-aware feed sectioning (Near you / country / world)
Completes the server side of "Closer to Home". /api/feed gains a `home` param
('US' or 'US-NY'); when set the response is private (like prefs) and sectioned:

- Near you (+ Elsewhere in your country when a state is set) is a ONE-TIME lead
  block on page 0; the world is the paginated body. next_offset tells the client
  where to continue, so the lead block never skews world paging.
- Thin tiers fold down (MIN_TIER=3) so a header is never shown empty (lead, don't trap).
- State match counts only on high/medium geo confidence; the "country" tier excludes
  exactly what went to "near", so a low-confidence home-state story still surfaces
  (it doesn't vanish between tiers — caught + tested).
- Items carry a `section` tag; paywalled sort is now within-section. No home => exact
  prior behavior (section null, default/edge-cached feed unchanged), Brief untouched.

364 tests green. Frontend next: Home picker + sectioned feed rendering.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 19:35:22 -04:00
thejayman77 ad4e88c8f2 Geo Stage 4 (data layer): geo on feed responses + home-scope query filters
Foundation for "Closer to Home" (server-side, Codex-approved). No behavior change
yet — geo_scope defaults None, so the default/edge-cached feed is identical.

- queries.feed now returns each article's geo (breadth, confidence, and ISO-coded
  places) via a LEFT JOIN + places subquery. Article.from_row parses geo_places
  into [{country, state}]. Brief query doesn't select geo, so the Brief stays bare.
- queries.feed gains home-scope filters (home_country/home_state/geo_scope =
  near|country|world): STATE match only counts on high/medium geo confidence;
  untagged articles fall to 'world' so nothing is lost during backfill.

Next: API composition (home param + near/country/world sectioning with soft/blended
headers + a next_offset pagination model) and the Home picker UI. 360 tests green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 19:30:43 -04:00
thejayman77 1c05554a28 Geo Stage 1-2: subject-geography model + classifier + pipeline wiring
"Closer to Home" foundation (audit greenlit by Codex). Durable geography, kept
decoupled from volatile scoring.

- Schema: article_geo (breadth/confidence/rationale/geo_version) + article_places
  (0..N ISO-coded places), separate from article_scores so re-runs/audits never
  disturb scoring or acceptance. "local" is never stored — it's relative to the
  reader; the UI computes "Near you" later.
- geo.py: LLM proposes place NAMES, code disposes to ISO codes (country alpha-2,
  US state 2-letter); region words like "Europe" can never become a country.
  'global'/placeless is first-class, not failure. Confidence calibrated so 'high'
  needs an explicit location. Geo is its OWN LLM pass, not merged into the scoring
  prompt (durable metadata, re-runnable, keeps the sensitive prompt untouched).
- store_geo replaces places (geo is re-derivable, unlike scores). tag_articles is
  idempotent by geo_version, only touches accepted non-duplicate articles.
- CLI `geo` command (cycle-locked, --limit/--reclassify) for backfill, plus a
  bounded geo step in the cycle (--geo-limit 60, --no-geo). scripts/geo_audit.py
  is the prototype audit tool.

360 tests green; live smoke tagged real articles correctly (Gaza->PS, London->GB,
placeless science->global). No UI / SEO pages yet — ranking/personalization only.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 16:56:49 -04:00
thejayman77 59ff48ae90 Game share-loop: instrument funnel, deep-link shares, /play metadata
Sharpen the existing daily-game share loop into something measurable (per Codex's
"instrument what you have, then feed people into it" plan), ahead of a Show HN launch.

Analytics:
- Per-game funnel events <game>_{arrival,started,completed,shared} (article_id=0).
  arrival = landed via a shared link (utm_source=game_share); started = first move
  (guess/find/flip); completed = solved/cleared/Full Bloom; shared = on share success.
- trackVisit() moved into the global layout so direct /play landings count; the
  server-rendered /a/ share page now creates a visitor token + sends a daily visit
  beacon (first-time /a/-only visitors were previously dropped).
- Admin "Games funnel" panel: arrivals / engaged / completed / shared, per game.

Sharing:
- Memory Match gains a Share button (it was the only game without one).
- All shares deep-link to the exact game+variant with a full https:// URL +
  utm_source=game_share (gameShareUrl helper), instead of a bare /play.
- "shared" is counted only after navigator.share()/clipboard.writeText() succeeds.

/play social metadata:
- /play served homepage canonical/OG (static SPA, ssr=false). postbuild script
  patches build/play.html's head to /play canonical/title/description/OG; fails the
  build if the homepage tags drift. Caddy try_files now serves {path}.html so /play
  is served from the patched file (snapshot in deploy/caddy/).

Tests: backend 352, frontend 27.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 16:22:06 -04:00
thejayman77 89c0fbe1f6 Sync repo to deployed state: SEO recovery, Publishing Desk, Play games, emoji picker
The deploy pipeline runs from the working tree, so a wave of shipped features
had never been committed. This snapshots git to what's actually running.

SEO impression recovery (live + verified):
- Duplicate /a/{id} now 301-redirect to their canonical twin instead of 404
  (a hard 404 silently dropped already-indexed URLs and tanked impressions).
- Dedup representative selection reworked: accepted/serveable -> established
  rep (URL stability) -> quality score, so an accepted page never retires to a
  rejected rep and an indexed canonical doesn't churn when a newer twin arrives.
- HEAD /a/{id} returns the same status as GET (api_route GET+HEAD) instead of
  falling through to the static mount and 404ing.
- `dedup --force-recluster`: cycle-locked, model-free re-cluster to re-apply the
  policy to the existing corpus (shared cycle_lock context manager).
- CLI honors GOODNEWS_DB for its default --db (was silently ignored).

Publishing Desk (admin tool to post highlights to X via Web Intents):
- publishing.py queue/rank/handle-resolution; admin UI; full searchable emoji
  picker (bundled data, no CDN) for the blurb editor.

Play games + site:
- Bloom (word-wheel), Memory Match, daily ritual set, Zen Den (dev-gated).
- English-only language gate; source prospecting; paywall + dedup hardening.

Tests: full suite green (349). Ignores tightened (node_modules, data/*.db).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 11:32:27 -04:00
thejayman77 2dbe73430c Sources: per-source paywall override (3-state) — fix domain-rule mis-flags
The Articles inspector revealed paywall is domain-coarse: nytimes.com is flagged,
so NY Times Learning's free Word-of-the-Day inherits 🔒 — and that flag isn't
cosmetic, it deprioritizes the content in feed sort + lead selection. Add a
per-source override so admins can correct it after inspecting.

- sources.paywall_override: NULL (domain rule) | 'free' | 'paywalled'.
- paywall.py: keep low-level is_paywalled(url) (domain); add is_paywalled_for_source
  (url, override) for the EFFECTIVE decision — never patched the domain helper
  globally (per Codex), so "domain says X" stays distinguishable from "overridden".
- Threaded everywhere ranking/UI touches paywall, via src.paywall_override on the
  shared _ARTICLE_COLUMNS + the source-aware helper: feed sort, /api/since, replace,
  lead selection, Article badge, brief composition (briefs.py), digest, source_health
  (table 🔒), the Articles inspector, and the review/attention check — so ranking and
  UI always agree.
- Endpoint POST /api/admin/sources/{id}/paywall {override}; admin UI: a select in the
  inspector header (Use domain rule / Treat as free / Treat as paywalled) + the basis
  ("ON (domain)" / "OFF (override)"), optimistic so the panel stays open.

Test: domain rule → paywalled in table+inspector+feed badge; 'free' → off in all
three; validation 422 + 404. 242 pytest + 11 vitest.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-12 22:10:44 -04:00
thejayman77 ddcfab3a11 Admin: source Articles inspector (verify metrics against real evidence)
New per-row "Articles" button on the Sources table expands a read-only inline
panel of the source's ACTUAL ingested articles — so the automated metrics
(paywall/image/acceptance/duplicate) can be verified against evidence instead of
trusted blind. Distinct from "Check" (which re-samples the LIVE feed for
would-pass quality); this shows what's already in the DB, which is what the table
metrics are computed from.

- Backend: GET /api/admin/sources/{id}/articles?filter=&limit=&offset= (admin,
  read-only). queries.source_articles + source_articles_summary — per article:
  title, url, date, accepted, reason (the "why"), topic/flavor, paywalled
  (domain rule), has_image, duplicate. Summary = counts + source-level paywall
  rule.
- Frontend: expandable panel with a summary header ("27 ingested · 18 accepted
  · … · paywall rule: ON (domain)"), filter chips (All/Accepted/Rejected/No
  image/Duplicates), compact rows with title→link + badges + reason, Load more.

So "100% paywall" or "0% images" becomes clickable evidence: open two articles
to tell a real paywall from a mis-flagged domain, or a true image gap from an
enrichment failure. Test: test_source_articles_inspector. 241 pytest + 11 vitest.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-12 21:37:51 -04:00
thejayman77 64339aafb0 Games: in-progress hub status + distribution-aware word-search placement (Codex)
- Play hub: word cards now surface IN-PROGRESS games too (not just won/lost) so
  "continue on another device" shows at a glance — card reads "5:3…" and the
  selection option says "Continue · 3/6".
- Word Search generator: replace "prefer any crossing" with a SCORED placement —
  score = overlap*4 - local crowding (filled neighbours that aren't crossings) —
  then pick among the best ~20%. Keeps the organic interlocking but spreads words
  across the board instead of clumping around the first-placed (longest) words.
  Every word still placed (tests green). NOTE: changes today's grid layouts, so
  an in-progress word search resets once.

237 pytest + 11 vitest green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-12 15:18:04 -04:00
thejayman77 065ab98598 Games sync hardening (Codex audit): server-side state normalization
Don't trust client JSON at the storage layer:
- sanitize_game_state() runs before merge AND on the merged result (heals legacy
  rows). Word Search: keep only finds whose cells actually spell a real word in
  that day's grid (validated when the puzzle exists, shape-only 4-12 alpha +
  cell-length otherwise), dedupe, renumber ci. Word: validate status enum, guess
  count/length/alpha, colour-row shape, terminal answer/why.
- Completion is now derived from the real puzzle word count (foundWords ==
  expected), not a client-sent `ms` — so stats can't be inflated by junk.
- Date validated as YYYY-MM-DD at the API (400 otherwise) — no junk/future rows.

Tests: sanitizer-rejects-junk + bad-date 400; existing tests updated to use
real-shaped data (the sanitizer is a good forcing function). 237 pytest + 11
vitest green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-12 13:51:24 -04:00
thejayman77 dd0df64d76 Games: cross-device sync + overlap colour-blend
Two game polish items:

- Word Search: overlapping cells now multiply-blend the crossing words' colours
  (deepening to a darker shade with readable text) instead of the newest colour
  stomping the rest — matches the new interlocking grids.

- Cross-device game-state sync (signed-in): per-puzzle progress + stats now
  follow you between devices. New game_state table; server-side merge on every
  save so two devices converge regardless of push order, tailored per game:
  * Word Search → UNION of finds (monotonic; can't un-find), earliest start,
    best completion time.
  * Word → furthest-progress wins (terminal beats in-progress; more guesses
    beats fewer) — picks one device's game whole, never splices guesses.
  Stats (streak/distribution/best) derived server-side from the synced states,
  so they're consistent instead of per-device counters. Endpoints GET/PUT
  /api/games/state + GET /api/games/stats (signed-in; size-capped). Frontend is
  local-first: games paint instantly from localStorage, then reconcile in the
  background; both game components push debounced on each move and adopt the
  merge. Conflict handling unit-tested + an API two-device convergence test.

235→ tests + 11 vitest green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-12 13:35:20 -04:00
thejayman77 2ef0efd909 Perf: skip needless dedup re-cluster + interlock word-search grids
Two things found while chasing the recurring ~15min slowness:

- dedup.py: cluster_duplicates re-ran an O(n²) cosine pass over ALL ~3.7k
  articles and rewrote duplicate_of for every one of them EVERY cycle — even
  when nothing new arrived (embedded=0) — ~53s CPU + a large WAL commit that
  starved live API reads (/api/brief 2-7s). Now skip the re-cluster entirely
  when nothing new was embedded (clusters can't have changed). Verified: cycle
  drops from ~53s to ~1s and /api/brief stays at 20ms through a cycle, vs 2-7s
  before. (A real new article still triggers a full re-cluster.)

- games.py _build_grid: word placement took the first random valid spot, so
  words rarely crossed. Now gather valid placements and PREFER ones that cross
  an already-placed word (shared matching letter), falling back to any valid
  spot — so the grid interlocks like a real word search. Every word still
  placed (tests green). NOTE: changes today's grid layouts, so an in-progress
  word search resets once.

Also added a systemd drop-in (Nice=19/CPUWeight=20/IOWeight=10/ionice-idle) to
deprioritize the batch cycle — minor, the dedup skip is the real fix.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-12 12:35:01 -04:00
thejayman77 ecf879fd1b Perf: parallelize admin loads + edge-cache /api/brief
Two concrete latency wins found by measuring (server compute is 2-17ms; the time
is in the path, not the box):
- Admin panel fired its 6 API calls SEQUENTIALLY (await chain) — so it paid the
  uncached origin round-trip six times back-to-back. Now one Promise.all batch.
  This is the admin lag.
- /api/brief (the home "Gathering the good news…" content) wasn't edge-cached, so
  a distant anonymous visitor triggered a Cloudflare→residential-origin pull.
  Same global/shareable boundary as /api/feed: public s-maxage=45 when no
  prefs/exclude, else private,no-store. (Needs /api/brief added to the CF cache
  rule path list to take effect at the edge.)

Tests: test_brief_cache_boundary. 228 pytest + 11 vitest.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-12 09:40:57 -04:00
thejayman77 a34a47fe22 API: edge-cacheable headers for global startup endpoints ("Gathering" speedup)
"Gathering the good news…" waits on the home's startup API calls, which were all
DYNAMIC → a round-trip to the residential origin every load (the occasional 2-3s
linger). These responses depend only on the URL, never the session, so they're
safe to share at the edge:
- /api/moods, /api/categories (static config) → public, s-maxage=900
- /api/lanes, /api/families (global, data-derived counts) → public, s-maxage=120
- /api/feed → public, s-maxage=45 ONLY when shareable (no following / prefs /
  exclude); the following feed (reads the session) and personal filters stay
  private, no-store.

Hard personalization boundary, explicit per-endpoint (no blanket /api/* rule).
Pairs with a Cloudflare cache rule (added separately) making these paths
eligible. Tests assert the global endpoints are public+s-maxage and the feed
boundary (default/topic public; following/prefs/exclude private). 227 pytest.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-12 04:34:11 -04:00
thejayman77 c4ea329f9b Candidate rename hardening (Codex): pending-only + length cap
Two small server-side tweaks so the endpoint matches the UI policy:
- Rename is refused (409) for promoted/rejected candidates — they're settled
  history; the UI already hides Rename for them, now the server enforces it too.
- Name is capped at 160 chars before save, so an accidental pasted paragraph
  can't wreck the queue layout.

Tests extended: 300-char name truncates to 160; renaming a promoted candidate
→ 409. 225 pytest + 11 vitest green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 21:55:38 -04:00
thejayman77 070b40584e Candidates: inline rename (fix a name typo without reject + re-add)
A staged candidate could only be renamed by rejecting and re-adding it, which
churns the queue and discards the preview just to fix a typo. Add an inline
Rename on each candidate: a "Rename" pill swaps the name for an input
(Enter saves · Esc cancels), POST /api/admin/candidates/{id}/rename →
sources.rename_candidate(). Empty clears the name (promote then derives one
from the feed host). Preview is preserved; the fixed name carries into promotion.

Tests: test_candidate_rename (rename in place keeps preview, promotes with the
new name, gated + 404). 225 pytest + 11 vitest green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 21:39:13 -04:00
thejayman77 3afc1ed37e Sources hardening (Codex audit): promote-time dedup, postJSON timeout, host-only feed_key
Three follow-ups from Codex's audit of the deep-preview/search/dedup work:
- Promote-time duplicate guard: promote_candidate() now re-checks
  find_existing_feed() and raises DuplicateFeedError → 409, so an
  old/CLI/direct-DB candidate or a race can't bypass the add-time check and
  silently overwrite a live source's settings via upsert. (sources scanned
  first, so a real source collision wins over the candidate matching itself.)
- postJSON/putJSON/delJSON gain opt-in {timeout} (AbortController, default
  none so other calls are unchanged); deep preview uses 120s and surfaces a
  calm "timed out" message instead of pinning the button on "Deep-checking…"
  if the LAN model stalls.
- feed_key() now lowercases the host only, not the whole URL — paths/queries
  can be case-significant; scheme/www/trailing-slash/host-case still collapse.

Tests: test_candidate_deep_preview_and_dedup extended — promote succeeds once,
then a re-promote of the same candidate is refused 409. 224 pytest + 11 vitest.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 21:31:39 -04:00
thejayman77 e1ac19351e Sources: LLM deep-preview, source search, duplicate-add guard
Three admin Sources upgrades:
- Deep preview: a per-candidate "🔬 Deep preview" button runs the REAL
  classifier on an 8-item sample (the same model that judges live articles),
  versus the fast keyword heuristic the add/Re-preview path uses. Preview now
  carries `classified`, surfaced as a "model-checked" vs "quick estimate"
  badge — so the acceptance % is no longer ambiguously heuristic. conn is
  released during the ~30-60s model pass; postJSON has no client timeout.
- Search: free-text box over the sources table (name / category / feed URL /
  homepage), folded into the existing status filter, with a live match count
  and empty state. Makes "is this already added?" a glance.
- Duplicate-add guard: sources.find_existing_feed() + feed_key() normalize
  scheme/www/trailing-slash/case, so re-adding a feed that's already a live
  source or a queued candidate is refused with a 409 naming where it lives
  (DB already enforced exact-URL uniqueness; this catches the near-miss
  variants and overwrite-on-promote footgun).

Tests: test_candidate_deep_preview_and_dedup (deep flag wires the model +
uses the small sample; exact/www/slash/case variants all 409). 224 pytest +
11 vitest green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 21:19:15 -04:00