upbeatBytes

Author	SHA1	Message	Date
thejayman77	2dc4419024	images/analytics: purge on policy revoke + engagement warm-up note (Codex close-out) - newsimg.purge_source(): when a source leaves 'cache' (permission revoked / re-classified), the admin image-policy endpoint now deletes that source's re-hosted copies immediately, rather than leaving them inaccessible-but-on-disk. Endpoint returns {purged}. - Admin "Engaged readers" carries a warm-up note: tracking began 2026-06-30, so low rolling windows are partly warm-up, not all bots (compare d7 after a week, the window after its full span). Guards against misreading "6 engaged vs 135 visits" as 129 bots. Tests: purge_source removes only the target source's copies; endpoint reports purged. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-30 14:29:55 -04:00
thejayman77	f416e13700	analytics: honest engagement metric — Engaged readers vs Recorded visits (Codex) Admin now shows two numbers: - Recorded visits: the existing raw count (one daily 'visit' beacon; still includes UA-spoofing bots that slip past the UA filter). - Engaged readers: distinct visitor-day with DELIBERATE activity — either the new gesture-gated 'engaged' beacon (fires once/day only after ~8s visible AND a real scroll/pointer/key/touch) or a deliberate action (source_click, full_story, share, replace_used, paywall_replace, not_today/less_like_this/hide_topic, game start/ complete/share). Explicitly EXCLUDES auto-fired visit/summary_viewed/open, replace_none, and game *_arrival (a share-loop landing, not engagement). armEngaged() in analytics.js (wired in the global layout) + a mirrored vanilla-JS beacon on the server-rendered /a/<id> share pages. 'engaged' added to the event allowlist and fired with article_id=0 so the uniqueness constraint dedups it per day. queries.admin_stats gains engaged_today/d7/d30. Bots are doubly excluded (UA filter at the beacon + the gesture gate). Tests cover the metric (engaged + deliberate counted; visit/summary/arrival not). 447 backend + 36 frontend tests pass. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-30 14:07:24 -04:00
thejayman77	8a7606e20d	images: fix two fetcher bugs + add source-level image-rights policy (Codex) Fetcher (the two remaining bugs Codex found): - Real redirects are now followed. _NoRedirect makes urllib RAISE HTTPError on 3xx, so the old status-branch was dead code (mocked tests masked it). Handle 301/302/303/307/308 HTTPError as redirects (re-validate the destination); classify 4xx≠429 as PERMANENT (negative-cached), 429/5xx/network as transient. Real-opener redirect + 404/5xx tests. - The megapixel ceiling is now enforced: explicit `w*h > _MAX_PIXELS` check BEFORE load() (Pillow only warns at MAX_IMAGE_PIXELS). Test with a lowered ceiling. Image-rights policy (per Codex + owner decision — only cache what's cleared): - sources.image_policy: 'cache' (re-host a downscaled copy — license/permission/PD only), 'remote' (hotlink the publisher's image — the conservative DEFAULT), 'none' (no image). - newsimg.display_url resolves the display URL per policy; applied in Article.from_row so feed/brief/history return the right URL, and in share.py (og/twitter still reference the publisher's own image, never re-hosted). warm() + /api/img both gated on 'cache'. - Frontend uses the server-resolved image_url (reverted the hardcoded /api/img); the graceful retry covers remote hotlinks too. Admin: per-source image-policy selector + POST /api/admin/sources/{id}/image-policy. Default 'remote' → nothing re-hosted until a source is explicitly cleared. 445 backend + 36 frontend tests pass. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-30 14:01:11 -04:00
thejayman77	a55ba185a8	images: harden the cache per Codex audit (SSRF-safe, cache-only endpoint, WebP-only) Blocker fixes for the image cache: - /api/img/{id} now serves cache HITS ONLY and is restricted to ACCEPTED, CANONICAL articles. It never fetches — the cycle (newsimg.warm) owns all fetching — so the public endpoint has no SSRF/worker-exhaustion surface. Dropped 1-year immutable caching (image_url can change) → public, max-age=86400. - newsimg._safe_fetch: SSRF-safe (reuses enrich._host_is_public + _NoRedirect, http(s) only, every redirect hop re-validated, body capped). _FetchError distinguishes permanent refusals (negative-cached via a .fail marker) from transient errors (retry). - _encode re-encodes only decoded RASTER images to WebP and REJECTS everything else (SVG, undecodable, decompression bombs via MAX_IMAGE_PIXELS, pathological dimensions); originals are never retained. prune() also sweeps stale .fail markers. - Concurrency: fetching only runs inside the cycle lock; writes stay atomic. Smaller fixes: - share.py visible image has onerror→this.remove() (degrade to the text unfurl, no broken icon when an image isn't cached yet). - share-page Back follows history only on a SAME-ORIGIN referrer (never bounce to an external site); menu now honors Escape + resets crossing back to desktop (HubBar parity). Tests: private host, redirect-to-private, hostile SVG/non-image, transient-vs-permanent failure, LRU prune, warm (accepted+canonical only, idempotent), cache-only endpoint (404 on not-cached/unaccepted/duplicate, never fetches), share chrome parity. 441 pass. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-30 12:19:57 -04:00
thejayman77	ee43bb0df6	analytics: filter known-bot User-Agents at /api/events (honest visitor counts) Many modern crawlers (AI scrapers, headless Chrome, link-preview fetchers) run JS and fire the visit/summary_viewed beacon, inflating "visitors" even though there's no human discovery channel. Apply queries.is_bot_ua() at /api/events — the same filter the load-error beacon uses — so honest bot UAs (GPTBot, AhrefsBot, headless Chrome, python/curl, …) are dropped before recording. Response is identical so a bot can't detect it. Counts read lower but truer going forward (past rows unchanged). Won't catch UA-spoofing bots; that needs a heavier heuristic. Tests: bot UAs dropped, real browser counted; existing event tests send a real UA (default client UA contains "python"). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-30 11:19:51 -04:00
thejayman77	86d9897113	ui: reserve the scrollbar gutter so the top bar stops shifting between pages Pages tall enough to scroll showed a ~15px scrollbar; short pages didn't — so the centered top bar jumped left/right as you navigated. scrollbar-gutter: stable on html (SPA app.css + the server-rendered share pages) keeps the layout width constant. No-op on overlay-scrollbar platforms (mobile), which never shifted. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-30 04:52:59 -04:00
thejayman77	3740e09d02	share pages: carry the real HubBar toolbar (consistency with the SPA) The server-rendered /a/<id> and digest pages predated "HubBar everywhere" and showed a stripped bar (logo + a bespoke Back pill). They can't run the Svelte component, so add a hand-kept static replica of HubBar (logo + News/Play/Art nav + account glyph + mobile burger/drop-panel) plus HubShell's borderless ← Back. A signed-in reader's avatar paints from the same localStorage cache HubBar uses. /a/<id> now looks like any detail page (/art, /word). Reusable _top_bar_html/_TOP_BAR_CSS/_TOP_BAR_JS/_back_link helpers; applied to both share pages. Kept in sync with HubBar.svelte by hand (noted). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-30 04:26:01 -04:00
thejayman77	8a3c00db3b	images: cache + serve article images from our own origin (bounded, LRU-evicted) Stop hotlinking news images from third-party CDNs (the source of the "blank until you refresh a few times" graphic). New goodnews/newsimg.py caches a downscaled WebP display copy (≤800px) beside the DB, like art_cache: - GET/HEAD /api/img/{article_id} — resolves id→image_url (allowlisted to our corpus, not an open proxy), fetch+cache on first miss, serve local after, immutable headers. - cycle warms display copies for recent accepted-with-image articles (so the FIRST view is already local) and prunes to a hard size cap (default 1 GB) by LRU eviction. Frontend now points at /api/img/<id>: the hub lead, every ArticleCard (feed hero + cards), and the /a/<id> share page's visible image. og:image/twitter:image stay the source URL so social crawlers fetch the canonical image directly. Storage is bounded by construction — over the cap, least-recently-used files are evicted, so it can't grow without limit regardless of ingest rate. Tests cover fetch/downscale, cache-hit (no refetch), bad-scheme/non-image rejection, fetch failure, LRU prune, warm, and the endpoint allowlist. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-29 20:28:33 -04:00
thejayman77	d98cec9ded	admin: read/unread triage for load errors (unread by default, mark read/all) The load-error log had no way to clear reviewed entries. Add a read_at column to client_errors and a read/unread model mirroring the feedback inbox: - GET /api/admin/client-errors?show=unread|read|all (default unread; returns id+read) - POST /api/admin/client-errors/read-all (mark all unread read) - POST /api/admin/client-errors/{id}/read {read: bool} (per-row toggle) Headline stat is now "Unread load errors" (admin_stats.client_errors.unread), so the red badge clears as you triage. Admin UI: Unread/Read/All tabs, a "Mark all read" button, and a per-row ✓/↩ toggle; reading an entry drops it from the default view. 14-day auto-prune still bounds the table. Tests cover filter, toggle, mark-all, 404, and gating. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-29 10:38:22 -04:00
thejayman77	0ae789752e	fix: QOTD/WOTD freshness — pick within the freshest cohort, not the rotated pool Both selectors ordered candidates least-recently-shown, then daily.seeded_order() ROTATED the whole list and took [0] — an arbitrary date-hashed item, undoing the ordering. Result: repeats (quote id 2 on 6/28+6/29; word "harmony" on 6/25+6/28), no guarantee a pool item is shown before it recurs. Fix: daily.freshest(rows) returns the freshest cohort only — every NEVER-shown item while any remain, else the oldest-shown group. quote/wotd _candidates use it; seeded_order now picks deterministically WITHIN that cohort. So every pool item is featured once before any repeat, then cycles oldest-first. Dropped the unused _NO_REPEAT_POOL window. Tests: no-repeat-until-exhausted (quote + wotd) + a freshest() unit test. 428 backend tests green. (Separate follow-up: expand the QOTD pool from 16 → 90+ vetted public-domain quotes for a longer no-repeat window.) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-29 05:39:06 -04:00
thejayman77	667b1a82c3	brand: standardize "Upbeat Bytes" → "upbeatBytes" everywhere Per the logo + brand: the name is upbeatBytes (camelCase). Swept all user-facing strings — titles/og:site_name/og:title, logo alt text, share pages (share.py), emails (email_send), classifier prompt (llm), digest/unsubscribe (api), PWA manifest, game share text, sign-in, the SPA shell + patch-static-heads (play title) — plus README/publish.sh and the email test fixture. (SMTP From env was already upbeatBytes.) Domains (upbeatbytes.com) unchanged. 425 BE + 36 FE green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-28 20:01:20 -04:00
thejayman77	2cfffdfd6a	NEWS RELAUNCH CUTOVER: promote the hub to /, feed to /news, go public The big flip. /home3 (hub) becomes /; the feed lives at /news; both indexable. - PROMOTE: routes/+page.svelte is now the hub (was the interim NewsFeed wrapper); noindex removed; "Read more good news" → /news. routes/home3 + home2 deleted. - routes/+page.js: redirects legacy root-query links (/?view=latest, /?tag, /?source, /?q, /?view=today→highlights) to /news before the hub renders (no flash). - /news: noindex dropped (route meta + Caddy @newsHidden removed); now public. - LINKS: HubBar brand/Home → /, News default → /news; HubShell/art/play back → /; account Following + share.py Explore/Browse/source → /news. - FOOTER: one shared Footer.svelte (motto + Send feedback + slot) across Hub/News/Play/Art/HubShell/Account/Zen; global layout footer removed (FeedbackModal stays). - SITEMAP: + /news /art /play /word /quote /onthisday; cap 5k→50k; gated on has-summary; paywalled excluded; HEAD now 200 (api_route GET+HEAD). - Head-patcher: /news entry. PWA + shell description broadened to the hub. - Caddy: @newsHidden dropped; @hidden now admin-only (word/quote/onthisday public); /home2,/home3 → / 301. Mirrored to deploy/caddy snapshot. 425 backend + 36 frontend tests green; build clean; Caddy valid. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-28 19:16:43 -04:00
thejayman77	1c1ecefde8	news: harden paywall exclusion at the candidate query + add the missing regressions Codex's two non-blocking hardening items, folded in before cutover: - _candidate_articles() now excludes paywalled sources IN-QUERY (before LIMIT 50), so flagged stories can't consume candidate slots and leave a full brief thin. Dropped the now-redundant post-fetch filter in build_daily_brief. - Regressions: history retains a viewed paywalled article; sitemap omits a paywalled source AND restores it under override="free". - Aligned test_brief_paywall to the source-level model (paywalled sources carry a paywalled homepage, as in production) — it had relied on article-URL detection. 425 backend tests green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-28 18:54:53 -04:00
thejayman77	c600145ba5	news: close the remaining no-paywall bypass paths (Codex audit) queries.feed was the main chokepoint, but several discovery paths have their own SQL. Apply the shared source exclusion to all of them so "no paywalls" is truly site-wide: - briefs.build_daily_brief: EXCLUDE paywalled candidates (was: demote) — never stored in a new brief. - queries.brief: stored-brief retrieval (covers /today + /api/brief) filters the paywalled source. - digest.digest_items + followed_digest_items: the morning email + "from what you follow" omit paywalled sources. - sitemap(): paywalled article pages excluded from the sitemap. All reuse queries.paywalled_source_ids (admin override still wins). Regression tests (test_paywall_exclusion.py): never stored in a new brief; /today + digest omit it; followed-source email omits it; Saved retains it; 'free' override restores eligibility. 423 backend tests green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-28 17:22:52 -04:00
thejayman77	0d21231597	news: hard-exclude paywalled sources from the feed + brief (no unreadable news) Per Jay: don't surface stories people can't read without paying; it's off-brand ("no paywalls") and pointless. Paywalled is source-level (domain rule, admin-overridable): just 3 sources today (Nature, New Scientist, MIT Tech Review), ~5.4% of accepted articles. - queries.paywalled_source_ids(conn): live source set (admin override wins). - queries.feed gains include_paywalled=False (default) → adds `a.source_id NOT IN (…)`. One chokepoint covers Latest/tags/sources/moods/topics/search/since AND the brief top-up. Source-level + SQL → paging stays exact, no frontend change. - brief(): filter the cached/home pool by the same rule; replacement already avoids paywalled and now rides the feed exclusion too. - Dropped the now-moot "paywalled below readable" demotion sort. - Saved/history keep showing items you saved (their own queries, not excluded). - test_source_paywall_override updated: paywalled source → excluded from the feed (was: shown with a badge); 'free' override → returns, no badge. 418 tests green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-28 17:10:00 -04:00
thejayman77	6c10ad99a9	On This Day: serve sharp images (originalimage, not the 330px thumbnail) The Wikimedia feed's thumbnail is 330px, which upscales blurry in our hero. Use originalimage.source instead — it's reliably sharp. (Can't just request a bigger thumbnail width: for very large source images Wikimedia only serves pre-generated bucket sizes and 400s on arbitrary widths — e.g. 500px ok, 800/1024px fail.) - onthisday._best_image() prefers originalimage, falls back to the thumbnail. - scripts/otd_image_upsize_backfill.py re-fetches each stored MM-DD and upgrades image_url in onthisday_pool + daily_onthisday in place (ran on host: pool + 6 daily rows now sharp; today's hero verified 200). Only the /onthisday hero loads this image (home card is text-only), so larger files are a single-page, one-time load. - test_best_image locks the prefer-original/fallback behavior. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-27 17:07:37 -04:00
thejayman77	ed814c97b9	Daily Art engine: museum-guide blurb (grounded LLM) + extracted palette - daily_art gains blurb + palette columns (idempotent migration). - art._palette: Pillow median-cut to ~5 hex colors from the cached image (best-effort → [] on any failure). art._blurb: a warm 2-3 sentence "what you're looking at" note grounded in the Met catalogue (title/artist/bio/date/medium/classification/culture/tags). Prompt leans on context/significance and the title+tags for subject, explicitly NOT asserting literal composition (figure counts/poses) it can't see, since the model can't view the image. Markdown stripped from the output. - pick_daily generates both (client optional → blurb skipped when absent); cycle + art CLI pass an LLM client. /api/art/today exposes blurb + palette. - Backfilled the last 3 days on host (Veteran / Magnolia Vase / Bierstadt). - scripts/art_blurb_palette_backfill.py for in-place backfill (no re-pick). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-23 20:12:54 -04:00
thejayman77	dc23277b38	Read-time: full-article "Full story · ~N min" badge (Option B) Replaces the gist-based read-time with the SOURCE article's full read time — the contrast that sells the gist ("calm 1-min version here; ~10 min for the deep dive"). - goodnews/readtime.py: word_count_from_html (strips script/style/nav/header/ footer/form/button/aside furniture before counting) + source_read_minutes (~225 wpm, 200-word floor, None when extraction looks failed/too thin). - articles.source_words + read_checked_at columns (count only, never the body; fits the privacy posture). Idempotent migration. - enrich.fetch_source_words + enrich_read_times: a bounded, retry-guarded cycle step (mirrors the image enrichers) that counts words for recent accepted articles. Only ever writes a real count; never overwrites good with zero. Wired into the cycle after recent-image enrichment. - queries: source_words flows through _ARTICLE_COLUMNS; api exposes source_read_minutes on Article (null when unknown). - home3: News card shows "Full story · ~N min", hidden entirely when null (no misleading "1 min"). - Tests: furniture stripping, threshold/rounding, enrich idempotency + no zero-overwrite, API null handling. 412 backend. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-23 08:09:00 -04:00
thejayman77	bdf3b1f47b	WOTD _polish: enforce the "examples must use the word" contract Per Codex audit: only accept a polish when there's a gloss AND at least one example sentence that actually contains the word (case-insensitive). Examples that don't use the word are dropped; if none remain, fall back to the raw dictionary def/examples instead of shipping a gloss with empty/irrelevant usage. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-23 07:56:22 -04:00
thejayman77	cebbed58ab	WOTD #4/#5 content quality + Editorial Asymmetric /word page (CD) Content quality ("LLM polishes, dictionary anchors"): - New wotd._polish: rewrites the real dictionary gloss into ONE warm plain sentence + two clear everyday example sentences, grounded in the real definition (no invented meanings). Stored in new wotd_pool/daily_wotd columns gloss + usage, alongside the raw definition/examples which stay the anchor. - harvest() polishes each new word; pick_daily() lazily polishes + caches back any older pooled word that lacks a gloss (client threaded through run_daily). - Admin word-add polishes on insert; re-pick passes an LLM client so quote meaning / word gloss fill on a forced fresh pick. - /api/word/today now prefers gloss + usage, falling back to the raw dictionary def/examples when polish is absent (so it's always safe). - db._migrate adds gloss/usage to wotd_pool + daily_wotd (idempotent ALTER). Frontend — /word redesigned to CD's "Editorial Asymmetric": faded oversized initial bleeding off the right, vertical part-of-speech rail, big Newsreader word, airy definition, left-ruled italic example sentences, outline Listen button + date. (Uses our self-hosted Newsreader/Hanken stack rather than the mockup's Google fonts; the made-up syllable respelling is omitted since we only have real IPA.) Tests: _polish parse/trim/cap, harvest stores gloss/usage, pick lazy-polishes older words, admin gloss flows through to /api/word/today. 403 backend + 27 fe. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-23 06:08:14 -04:00
thejayman77	84b1fb514f	Small joys: Codex audit #2 fixes (route resolution, noindex, sense/tone, exclude-current re-pick) - Admin joy item route moved to /api/admin/joys/{kind}/items/{item_id} so the /add and /repick verbs resolve to their own routes instead of 422-ing as a non-int item id (the launch blocker). Frontend mutate URL updated to match. - Re-pick now excludes the currently-shown item: the endpoint reads today's daily pool_id and passes it as `avoid`, so "Re-pick today" yields a different item. Added `avoid` to pick_daily/_candidates across wotd/quote/onthisday. - WOTD sense selection: the LLM now proposes word + intended part of speech, and _lookup prefers that sense (fixes "serene" returning the archaic noun). - On This Day tone prompt tightened to favor genuinely uplifting events and exclude merely procedural/political-administrative ones. - Caddy @hidden now also noindexes /word /quote /onthisday /admin (+ .html). - Regression tests: add/repick resolve (401 not 422), add/feature/block/delete, re-pick excludes current; WOTD pos-preference + proposal parsing units. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-22 20:19:02 -04:00
thejayman77	3bde6534e9	Small joys: wire homepage rail to live data + rich pages (/word /quote /onthisday) + admin - /home3: small-joys rail now reads live /api/word|quote|onthisday/today (placeholders only as fallback); each cell links to its detail page. - HubShell component (shared bar/footer/fonts/tokens) for the hub + detail pages. - /word: big word, IPA, Listen (cached clip + browser-TTS fallback), definition, sentences. - /quote: the quote, attribution, and the AI "what it means". - /onthisday: the date, year + fact, image, summary, source. - Admin "Small Joys" tab: per-pool list with feature/block/delete/add + re-pick, for all three kinds. New admin API: GET/POST /api/admin/joys/{kind}[/{id}|/add|/repick]. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-22 18:52:38 -04:00
thejayman77	67d4bc32cb	Small joys: Quote of the Day + Word of the Day engines - quote.py: curated public-domain quote pool (16 seeded, admin-grows), deterministic daily pick, lazy AI "what it means" explainer of the real quote (cached). No LLM-invented quotes. - wotd.py: LLM proposes positive words → validated/enriched against dictionaryapi.dev (real definition, IPA, examples, audio) → audio clip cached to our origin (TTS fallback) → deterministic daily pick. Tops the pool up toward 30/day. - db.py: quote_pool/daily_quote + wotd_pool/daily_wotd tables. - api.py: /api/quote/today, /api/word/today, /api/word/audio/{word} (GET+HEAD). - cli.py: cycle steps for both (under --no-joys), shared LLM client. - tests: test_quote.py (6) + test_wotd.py (5). 393 backend tests green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-22 17:28:55 -04:00
thejayman77	a7da8362ab	Small joys backend: shared daily framework + On This Day engine - goodnews/daily.py: shared helpers for the daily "small joys" (http_json, date-seeded deterministic pick, dedup key) so each joy is a small self-contained module. - goodnews/onthisday.py: harvest today's MM-DD from Wikimedia's On-this-day feed → tone-filter to good/neutral (keyword floor + optional LLM refine) → pool → deterministic daily pick (idempotent, respects blocked/featured) → cached row. Network/LLM before any DB write. Multi-source ready (source column). - db.py: onthisday_pool + daily_onthisday tables. - api.py: GET /api/onthisday/today (edge-cacheable). - cli.py: cycle step (run after Daily Art; --no-joys to skip), LLM client for tone refine. - tests/test_onthisday.py: 7 tests (filter+dedup, pick idempotent, blocked/featured, never-empty, empty-pool, LLM-narrow). 382 backend tests green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-22 16:51:29 -04:00
thejayman77	dd8706e2fc	Art post-audit polish (Codex): image HEAD, texture immutable cache, lightbox a11y, spacing - /api/art/image/{id} now answers HEAD as well as GET (was 404 on HEAD) — mirrors the /a/{id} fix. Added tests/test_art_api.py (GET+HEAD+size=full fallback + today payload). - /textures/* served immutable (long cache) instead of no-cache; excluded from the revalidate matcher. Live Caddyfile + repo snapshot both updated. - Lightbox: Escape closes it, and focus moves to it on open (keyboard-friendly). - Trimmed the gallery's top padding so "Daily Art" sits closer to the bar. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-21 18:17:30 -04:00
thejayman77	27788ba2a8	Art page round 2: virtual frames, real logo, hi-res zoom, spacing/affordance polish - Virtual frames (Walnut/Gold/Silver/None), selectable + remembered in localStorage, built as a beveled moulding around a cream museum mat. - Header uses the real /logo.svg wordmark; the "No ads" pill is replaced by an account icon (the pill doesn't need to follow every page). - Lightbox now opens a full-resolution copy that fills the screen: art._download_image caches a hi-res {id}-full copy alongside the web-large display copy, served via /api/art/image/{id}?size=full (image_url_large in /api/art/today). - Centered the placard bullet separators (explicit .sep spans, equal margins). - Image no longer shifts on hover; a quiet "Click to expand" affordance sits on the art. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-21 16:25:31 -04:00
thejayman77	db967bb7fa	Daily Art: Codex guardrails (atomic image, attribution/license, blocked lever) Hardening before it runs further on the cycle: - DB-lock/network: all HTTP (metadata + image) happens before any write; the write txn opens only at the brief INSERT and commits immediately. Images download to a temp file then atomic os.replace into cache (a reader never sees a half-written file). - Site-timezone "daily" already used local_today() (same rhythm as the Brief), confirmed. - Attribution from day one: store + return title/artist/date/medium/department/credit/ source_url/object_id/source + museum name + is_public_domain license marker + the full-res source URL (for a richer /art view later). UI can show: Title · Artist · The Met. - "highlight != always beautiful": added a manual `blocked` flag on art_pool (excluded from picks) as the cheap curation lever; a featured override can follow. Schema migrated (existing art tables get the new columns). 373 tests green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-21 15:28:38 -04:00
thejayman77	308516a263	Daily Art backend: curated Met pool, daily cached pick, /api/art (prototype) The engine for the /art room (design-independent; deploy held for Codex review). - goodnews/art.py: harvest a curated pool of public-domain HIGHLIGHT artworks from the Met (isHighlight+isPublicDomain+hasImages -> masterworks, never potsherds; CC0). Daily deterministic pick from the least-recently-shown (no soon-repeats, same for everyone), fetch metadata + download the image to OUR cache (data/art_cache) so the homepage never waits on or hotlinks the museum. Bulletproof: bad object/image falls through candidates; a failed day keeps the last piece (room never empty). Injectable HTTP for tests. - Schema: art_pool + daily_art. /api/art/today (edge-cacheable) + /api/art/image/{id} (served from cache, immutable). CLI `art [--harvest] [--force]` + a non-fatal cycle step. - Tests (5, mocked HTTP) + verified live against the Met: harvested 1641 works, picked/cached "Repose" by John White Alexander. 371 tests green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-21 14:50:20 -04:00
thejayman77	0c68c22221	Brand consistency: emails say "upbeatBytes" (From + digest body) Per the brand-name standard (camelCase, one word). Updated the SMTP From default and the digest email body/subject strings. Live env From values (auth.env + goodnews.env) updated to match. (Web/OG brand strings in share.py + app.html are the remaining sweep.) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-21 11:38:16 -04:00
thejayman77	b4b02b5050	Scope dial polish (Codex): hero stays closest-first + visible Clear - Hero constraint: _pick_lead now runs only within the CLOSEST non-empty section of a personalized Brief, so a "gentler" wider-region/world story can never be floated into the hero slot above a local one. Only widens if the closest section is empty. - Dial gains a visible Clear (alongside Change) so a reader never feels locked into personalization; "World" stays the keep-home-but-go-global option. 366 tests green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-19 22:06:06 -04:00
thejayman77	3486f3102a	Scope dial v2: Nearby / Region / Country / World radius on the homepage Codex-approved evolution: the reader controls the "emotional radius" of the landing. - Census-region "Regional" grain (geo.region_of / region_states). Scope-aware tiering (queries.home_tiers): closest->widest lead, confidence-gated on state + region, never a hard filter; blends outward so the set is always full. 'world' = the global brief. - queries.home_brief takes a scope; /api/brief gains a scope param (nearby|region|country|world). Country-only / non-US homes collapse to country. - Homepage dial replaces the 2-button toggle: adaptive stops (4 with a US state, else Country/World), persisted scope, "Good news closest first" framing. Concrete, soft section labels (Around New Jersey / Across the Northeast / Across the US / Around the world) so the reader sees the dial worked. Backend 366 + frontend tests green. (Latest feed still on v1 local-first; aligning it to the dial is the immediate follow-up.) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-19 21:59:32 -04:00
thejayman77	d2a6293a13	Local-first Brief: the landing leads with good news from your home Per the owner's call (overrides the earlier "Brief sacred" stance): when a home is set, the homepage opens with local good news first, not global. This is the hook — you land and see awesome stories from YOUR corner first. - queries.home_brief: local-first highlights (high/medium-confidence near, blended out to country then world so it's always a full, strong set), preferring already- summarized stories so the calm read stays rich. Recent window, ranked within tier. - /api/brief gains a `home` param: private/no-store when set; over-fetches + caps so dismissal/boundary filtering never thins it; falls back to global top-up if needed. - Landing UI: a Local <-> Global toggle ("📍 Near you / 🌍 Everywhere") when a home is set, the calm picker invite when not (dismissible), and Change. Default leads local; one tap back to the global brief. No home set => exactly today's behavior. Backend + frontend tests green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-19 21:36:18 -04:00
thejayman77	2239549799	Closer to Home: gate "Near you" on high/medium confidence (both modes) Codex polish before deploy: anything elevated as Near you / Close to home must have geo_confidence in (high, medium) — the feature's promise is relevance. Country-only mode now gates "near" too; since it has no "country" tier, the "world" scope is widened to absorb low-confidence home-country stories so they surface there instead of vanishing between tiers (the same edge-case class, fixed). State mode unchanged. 364 tests green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-19 20:29:31 -04:00
thejayman77	e7e8f5515e	Geo Stage 4 (server): home-aware feed sectioning (Near you / country / world) Completes the server side of "Closer to Home". /api/feed gains a `home` param ('US' or 'US-NY'); when set the response is private (like prefs) and sectioned: - Near you (+ Elsewhere in your country when a state is set) is a ONE-TIME lead block on page 0; the world is the paginated body. next_offset tells the client where to continue, so the lead block never skews world paging. - Thin tiers fold down (MIN_TIER=3) so a header is never shown empty (lead, don't trap). - State match counts only on high/medium geo confidence; the "country" tier excludes exactly what went to "near", so a low-confidence home-state story still surfaces (it doesn't vanish between tiers — caught + tested). - Items carry a `section` tag; paywalled sort is now within-section. No home => exact prior behavior (section null, default/edge-cached feed unchanged), Brief untouched. 364 tests green. Frontend next: Home picker + sectioned feed rendering. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-19 19:35:22 -04:00
thejayman77	ad4e88c8f2	Geo Stage 4 (data layer): geo on feed responses + home-scope query filters Foundation for "Closer to Home" (server-side, Codex-approved). No behavior change yet — geo_scope defaults None, so the default/edge-cached feed is identical. - queries.feed now returns each article's geo (breadth, confidence, and ISO-coded places) via a LEFT JOIN + places subquery. Article.from_row parses geo_places into [{country, state}]. Brief query doesn't select geo, so the Brief stays bare. - queries.feed gains home-scope filters (home_country/home_state/geo_scope = near|country|world): STATE match only counts on high/medium geo confidence; untagged articles fall to 'world' so nothing is lost during backfill. Next: API composition (home param + near/country/world sectioning with soft/blended headers + a next_offset pagination model) and the Home picker UI. 360 tests green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-19 19:30:43 -04:00
thejayman77	1c05554a28	Geo Stage 1-2: subject-geography model + classifier + pipeline wiring "Closer to Home" foundation (audit greenlit by Codex). Durable geography, kept decoupled from volatile scoring. - Schema: article_geo (breadth/confidence/rationale/geo_version) + article_places (0..N ISO-coded places), separate from article_scores so re-runs/audits never disturb scoring or acceptance. "local" is never stored — it's relative to the reader; the UI computes "Near you" later. - geo.py: LLM proposes place NAMES, code disposes to ISO codes (country alpha-2, US state 2-letter); region words like "Europe" can never become a country. 'global'/placeless is first-class, not failure. Confidence calibrated so 'high' needs an explicit location. Geo is its OWN LLM pass, not merged into the scoring prompt (durable metadata, re-runnable, keeps the sensitive prompt untouched). - store_geo replaces places (geo is re-derivable, unlike scores). tag_articles is idempotent by geo_version, only touches accepted non-duplicate articles. - CLI `geo` command (cycle-locked, --limit/--reclassify) for backfill, plus a bounded geo step in the cycle (--geo-limit 60, --no-geo). scripts/geo_audit.py is the prototype audit tool. 360 tests green; live smoke tagged real articles correctly (Gaza->PS, London->GB, placeless science->global). No UI / SEO pages yet — ranking/personalization only. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-19 16:56:49 -04:00
thejayman77	59ff48ae90	Game share-loop: instrument funnel, deep-link shares, /play metadata Sharpen the existing daily-game share loop into something measurable (per Codex's "instrument what you have, then feed people into it" plan), ahead of a Show HN launch. Analytics: - Per-game funnel events <game>_{arrival,started,completed,shared} (article_id=0). arrival = landed via a shared link (utm_source=game_share); started = first move (guess/find/flip); completed = solved/cleared/Full Bloom; shared = on share success. - trackVisit() moved into the global layout so direct /play landings count; the server-rendered /a/ share page now creates a visitor token + sends a daily visit beacon (first-time /a/-only visitors were previously dropped). - Admin "Games funnel" panel: arrivals / engaged / completed / shared, per game. Sharing: - Memory Match gains a Share button (it was the only game without one). - All shares deep-link to the exact game+variant with a full https:// URL + utm_source=game_share (gameShareUrl helper), instead of a bare /play. - "shared" is counted only after navigator.share()/clipboard.writeText() succeeds. /play social metadata: - /play served homepage canonical/OG (static SPA, ssr=false). postbuild script patches build/play.html's head to /play canonical/title/description/OG; fails the build if the homepage tags drift. Caddy try_files now serves {path}.html so /play is served from the patched file (snapshot in deploy/caddy/). Tests: backend 352, frontend 27. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-18 16:22:06 -04:00
thejayman77	89c0fbe1f6	Sync repo to deployed state: SEO recovery, Publishing Desk, Play games, emoji picker The deploy pipeline runs from the working tree, so a wave of shipped features had never been committed. This snapshots git to what's actually running. SEO impression recovery (live + verified): - Duplicate /a/{id} now 301-redirect to their canonical twin instead of 404 (a hard 404 silently dropped already-indexed URLs and tanked impressions). - Dedup representative selection reworked: accepted/serveable -> established rep (URL stability) -> quality score, so an accepted page never retires to a rejected rep and an indexed canonical doesn't churn when a newer twin arrives. - HEAD /a/{id} returns the same status as GET (api_route GET+HEAD) instead of falling through to the static mount and 404ing. - `dedup --force-recluster`: cycle-locked, model-free re-cluster to re-apply the policy to the existing corpus (shared cycle_lock context manager). - CLI honors GOODNEWS_DB for its default --db (was silently ignored). Publishing Desk (admin tool to post highlights to X via Web Intents): - publishing.py queue/rank/handle-resolution; admin UI; full searchable emoji picker (bundled data, no CDN) for the blurb editor. Play games + site: - Bloom (word-wheel), Memory Match, daily ritual set, Zen Den (dev-gated). - English-only language gate; source prospecting; paywall + dedup hardening. Tests: full suite green (349). Ignores tightened (node_modules, data/*.db). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-18 11:32:27 -04:00
thejayman77	2dbe73430c	Sources: per-source paywall override (3-state) — fix domain-rule mis-flags The Articles inspector revealed paywall is domain-coarse: nytimes.com is flagged, so NY Times Learning's free Word-of-the-Day inherits 🔒 — and that flag isn't cosmetic, it deprioritizes the content in feed sort + lead selection. Add a per-source override so admins can correct it after inspecting. - sources.paywall_override: NULL (domain rule) | 'free' | 'paywalled'. - paywall.py: keep low-level is_paywalled(url) (domain); add is_paywalled_for_source(url, override) for the EFFECTIVE decision — never patched the domain helper globally (per Codex), so "domain says X" stays distinguishable from "overridden". - Threaded everywhere ranking/UI touches paywall, via src.paywall_override on the shared _ARTICLE_COLUMNS + the source-aware helper: feed sort, /api/since, replace, lead selection, Article badge, brief composition (briefs.py), digest, source_health (table 🔒), the Articles inspector, and the review/attention check — so ranking and UI always agree. - Endpoint POST /api/admin/sources/{id}/paywall {override}; admin UI: a select in the inspector header (Use domain rule / Treat as free / Treat as paywalled) + the basis ("ON (domain)" / "OFF (override)"), optimistic so the panel stays open. Test: domain rule → paywalled in table+inspector+feed badge; 'free' → off in all three; validation 422 + 404. 242 pytest + 11 vitest. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-12 22:10:44 -04:00
thejayman77	ddcfab3a11	Admin: source Articles inspector (verify metrics against real evidence) New per-row "Articles" button on the Sources table expands a read-only inline panel of the source's ACTUAL ingested articles — so the automated metrics (paywall/image/acceptance/duplicate) can be verified against evidence instead of trusted blind. Distinct from "Check" (which re-samples the LIVE feed for would-pass quality); this shows what's already in the DB, which is what the table metrics are computed from. - Backend: GET /api/admin/sources/{id}/articles?filter=&limit=&offset= (admin, read-only). queries.source_articles + source_articles_summary — per article: title, url, date, accepted, reason (the "why"), topic/flavor, paywalled (domain rule), has_image, duplicate. Summary = counts + source-level paywall rule. - Frontend: expandable panel with a summary header ("27 ingested · 18 accepted · … · paywall rule: ON (domain)"), filter chips (All/Accepted/Rejected/No image/Duplicates), compact rows with title→link + badges + reason, Load more. So "100% paywall" or "0% images" becomes clickable evidence: open two articles to tell a real paywall from a mis-flagged domain, or a true image gap from an enrichment failure. Test: test_source_articles_inspector. 241 pytest + 11 vitest. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-12 21:37:51 -04:00
thejayman77	64339aafb0	Games: in-progress hub status + distribution-aware word-search placement (Codex) - Play hub: word cards now surface IN-PROGRESS games too (not just won/lost) so "continue on another device" shows at a glance — card reads "5:3…" and the selection option says "Continue · 3/6". - Word Search generator: replace "prefer any crossing" with a SCORED placement — score = overlap*4 - local crowding (filled neighbours that aren't crossings) — then pick among the best ~20%. Keeps the organic interlocking but spreads words across the board instead of clumping around the first-placed (longest) words. Every word still placed (tests green). NOTE: changes today's grid layouts, so an in-progress word search resets once. 237 pytest + 11 vitest green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-12 15:18:04 -04:00
thejayman77	065ab98598	Games sync hardening (Codex audit): server-side state normalization Don't trust client JSON at the storage layer: - sanitize_game_state() runs before merge AND on the merged result (heals legacy rows). Word Search: keep only finds whose cells actually spell a real word in that day's grid (validated when the puzzle exists, shape-only 4-12 alpha + cell-length otherwise), dedupe, renumber ci. Word: validate status enum, guess count/length/alpha, colour-row shape, terminal answer/why. - Completion is now derived from the real puzzle word count (foundWords == expected), not a client-sent `ms` — so stats can't be inflated by junk. - Date validated as YYYY-MM-DD at the API (400 otherwise) — no junk/future rows. Tests: sanitizer-rejects-junk + bad-date 400; existing tests updated to use real-shaped data (the sanitizer is a good forcing function). 237 pytest + 11 vitest green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-12 13:51:24 -04:00
thejayman77	dd0df64d76	Games: cross-device sync + overlap colour-blend Two game polish items: - Word Search: overlapping cells now multiply-blend the crossing words' colours (deepening to a darker shade with readable text) instead of the newest colour stomping the rest — matches the new interlocking grids. - Cross-device game-state sync (signed-in): per-puzzle progress + stats now follow you between devices. New game_state table; server-side merge on every save so two devices converge regardless of push order, tailored per game: * Word Search → UNION of finds (monotonic; can't un-find), earliest start, best completion time. * Word → furthest-progress wins (terminal beats in-progress; more guesses beats fewer) — picks one device's game whole, never splices guesses. Stats (streak/distribution/best) derived server-side from the synced states, so they're consistent instead of per-device counters. Endpoints GET/PUT /api/games/state + GET /api/games/stats (signed-in; size-capped). Frontend is local-first: games paint instantly from localStorage, then reconcile in the background; both game components push debounced on each move and adopt the merge. Conflict handling unit-tested + an API two-device convergence test. 235→ tests + 11 vitest green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-12 13:35:20 -04:00
thejayman77	2ef0efd909	Perf: skip needless dedup re-cluster + interlock word-search grids Two things found while chasing the recurring ~15min slowness: - dedup.py: cluster_duplicates re-ran an O(n²) cosine pass over ALL ~3.7k articles and rewrote duplicate_of for every one of them EVERY cycle — even when nothing new arrived (embedded=0) — ~53s CPU + a large WAL commit that starved live API reads (/api/brief 2-7s). Now skip the re-cluster entirely when nothing new was embedded (clusters can't have changed). Verified: cycle drops from ~53s to ~1s and /api/brief stays at 20ms through a cycle, vs 2-7s before. (A real new article still triggers a full re-cluster.) - games.py _build_grid: word placement took the first random valid spot, so words rarely crossed. Now gather valid placements and PREFER ones that cross an already-placed word (shared matching letter), falling back to any valid spot — so the grid interlocks like a real word search. Every word still placed (tests green). NOTE: changes today's grid layouts, so an in-progress word search resets once. Also added a systemd drop-in (Nice=19/CPUWeight=20/IOWeight=10/ionice-idle) to deprioritize the batch cycle — minor, the dedup skip is the real fix. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-12 12:35:01 -04:00
thejayman77	ecf879fd1b	Perf: parallelize admin loads + edge-cache /api/brief Two concrete latency wins found by measuring (server compute is 2-17ms; the time is in the path, not the box): - Admin panel fired its 6 API calls SEQUENTIALLY (await chain) — so it paid the uncached origin round-trip six times back-to-back. Now one Promise.all batch. This is the admin lag. - /api/brief (the home "Gathering the good news…" content) wasn't edge-cached, so a distant anonymous visitor triggered a Cloudflare→residential-origin pull. Same global/shareable boundary as /api/feed: public s-maxage=45 when no prefs/exclude, else private,no-store. (Needs /api/brief added to the CF cache rule path list to take effect at the edge.) Tests: test_brief_cache_boundary. 228 pytest + 11 vitest. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-12 09:40:57 -04:00
thejayman77	a34a47fe22	API: edge-cacheable headers for global startup endpoints ("Gathering" speedup) "Gathering the good news…" waits on the home's startup API calls, which were all DYNAMIC → a round-trip to the residential origin every load (the occasional 2-3s linger). These responses depend only on the URL, never the session, so they're safe to share at the edge: - /api/moods, /api/categories (static config) → public, s-maxage=900 - /api/lanes, /api/families (global, data-derived counts) → public, s-maxage=120 - /api/feed → public, s-maxage=45 ONLY when shareable (no following / prefs / exclude); the following feed (reads the session) and personal filters stay private, no-store. Hard personalization boundary, explicit per-endpoint (no blanket /api/* rule). Pairs with a Cloudflare cache rule (added separately) making these paths eligible. Tests assert the global endpoints are public+s-maxage and the feed boundary (default/topic public; following/prefs/exclude private). 227 pytest. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-12 04:34:11 -04:00
thejayman77	c4ea329f9b	Candidate rename hardening (Codex): pending-only + length cap Two small server-side tweaks so the endpoint matches the UI policy: - Rename is refused (409) for promoted/rejected candidates — they're settled history; the UI already hides Rename for them, now the server enforces it too. - Name is capped at 160 chars before save, so an accidental pasted paragraph can't wreck the queue layout. Tests extended: 300-char name truncates to 160; renaming a promoted candidate → 409. 225 pytest + 11 vitest green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-11 21:55:38 -04:00
thejayman77	070b40584e	Candidates: inline rename (fix a name typo without reject + re-add) A staged candidate could only be renamed by rejecting and re-adding it, which churns the queue and discards the preview just to fix a typo. Add an inline Rename on each candidate: a "Rename" pill swaps the name for an input (Enter saves · Esc cancels), POST /api/admin/candidates/{id}/rename → sources.rename_candidate(). Empty clears the name (promote then derives one from the feed host). Preview is preserved; the fixed name carries into promotion. Tests: test_candidate_rename (rename in place keeps preview, promotes with the new name, gated + 404). 225 pytest + 11 vitest green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-11 21:39:13 -04:00
thejayman77	3afc1ed37e	Sources hardening (Codex audit): promote-time dedup, postJSON timeout, host-only feed_key Three follow-ups from Codex's audit of the deep-preview/search/dedup work: - Promote-time duplicate guard: promote_candidate() now re-checks find_existing_feed() and raises DuplicateFeedError → 409, so an old/CLI/direct-DB candidate or a race can't bypass the add-time check and silently overwrite a live source's settings via upsert. (sources scanned first, so a real source collision wins over the candidate matching itself.) - postJSON/putJSON/delJSON gain opt-in {timeout} (AbortController, default none so other calls are unchanged); deep preview uses 120s and surfaces a calm "timed out" message instead of pinning the button on "Deep-checking…" if the LAN model stalls. - feed_key() now lowercases the host only, not the whole URL — paths/queries can be case-significant; scheme/www/trailing-slash/host-case still collapse. Tests: test_candidate_deep_preview_and_dedup extended — promote succeeds once, then a re-promote of the same candidate is refused 409. 224 pytest + 11 vitest. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-11 21:31:39 -04:00
thejayman77	e1ac19351e	Sources: LLM deep-preview, source search, duplicate-add guard Three admin Sources upgrades: - Deep preview: a per-candidate "🔬 Deep preview" button runs the REAL classifier on an 8-item sample (the same model that judges live articles), versus the fast keyword heuristic the add/Re-preview path uses. Preview now carries `classified`, surfaced as a "model-checked" vs "quick estimate" badge — so the acceptance % is no longer ambiguously heuristic. conn is released during the ~30-60s model pass; postJSON has no client timeout. - Search: free-text box over the sources table (name / category / feed URL / homepage), folded into the existing status filter, with a live match count and empty state. Makes "is this already added?" a glance. - Duplicate-add guard: sources.find_existing_feed() + feed_key() normalize scheme/www/trailing-slash/case, so re-adding a feed that's already a live source or a queued candidate is refused with a 409 naming where it lives (DB already enforced exact-URL uniqueness; this catches the near-miss variants and overwrite-on-promote footgun). Tests: test_candidate_deep_preview_and_dedup (deep flag wires the model + uses the small sample; exact/www/slash/case variants all 409). 224 pytest + 11 vitest green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-11 21:19:15 -04:00
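
Commits e1ac19351e and 3afc1ed37e describe `feed_key()` as collapsing scheme, `www.`, trailing slash, and host case, while leaving path and query case alone. The following is a minimal sketch of that behavior, not the repository's code: the shape of the returned key and the handling of ports and query strings are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def feed_key(url: str) -> str:
    """Illustrative near-duplicate key for a feed URL (hypothetical implementation).

    Collapses scheme, a leading "www.", host case, and trailing slashes.
    Path and query are kept case-sensitive, as the later commit requires.
    """
    parts = urlsplit(url.strip())
    # urlsplit().hostname is already lowercased; the scheme is simply dropped.
    host = parts.hostname or ""
    if host.startswith("www."):
        host = host[len("www."):]
    if parts.port is not None:
        # Assumption: an explicit port is treated as part of the identity.
        host = f"{host}:{parts.port}"
    path = parts.path.rstrip("/")
    # Assumption: the query string is kept verbatim.
    query = f"?{parts.query}" if parts.query else ""
    return f"{host}{path}{query}"


if __name__ == "__main__":
    variants = [
        "https://www.Example.com/Feed/",
        "http://example.com/Feed",
        "HTTPS://EXAMPLE.COM/Feed/",
    ]
    # All three variants share one key, so a duplicate-add guard could refuse them.
    print({feed_key(v) for v in variants})
    # A path that differs only by case stays distinct.
    print(feed_key("https://example.com/feed") == feed_key("https://example.com/Feed"))
```

A guard such as the `find_existing_feed()` mentioned in the commit could compare this key for the incoming URL against the keys of live sources and queued candidates, then answer 409 on a match. How the real function stores or indexes keys is not stated in the log.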

177 Commits