46 Commits

Author SHA1 Message Date
thejayman77 f416e13700 analytics: honest engagement metric — Engaged readers vs Recorded visits (Codex)
Admin now shows two numbers:
- Recorded visits: the existing raw count (one daily 'visit' beacon; still includes
  UA-spoofing bots that slip past the UA filter).
- Engaged readers: distinct visitor-day with DELIBERATE activity — either the new
  gesture-gated 'engaged' beacon (fires once/day only after ~8s visible AND a real
  scroll/pointer/key/touch) or a deliberate action (source_click, full_story, share,
  replace_used, paywall_replace, not_today/less_like_this/hide_topic, game start/
  complete/share). Explicitly EXCLUDES auto-fired visit/summary_viewed/open, replace_none,
  and game *_arrival (a share-loop landing, not engagement).

armEngaged() in analytics.js (wired in the global layout) + a mirrored vanilla-JS beacon
on the server-rendered /a/<id> share pages. 'engaged' added to the event allowlist and
fired with article_id=0 so the uniqueness constraint dedups it per day. queries.admin_stats
gains engaged_today/d7/d30. Bots are doubly excluded (UA filter at the beacon + the
gesture gate). Tests cover the metric (engaged + deliberate counted; visit/summary/arrival
not). 447 backend + 36 frontend tests pass.
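
A minimal sketch of how the "Engaged readers" number could be counted, under stated assumptions: the `events(visitor, day, event)` table, the stored event names, and the game-event suffixes are hypothetical stand-ins, not the actual schema behind queries.admin_stats.

```python
import sqlite3

# Deliberate actions named in the commit message; exact stored names are assumptions.
DELIBERATE = (
    "engaged", "source_click", "full_story", "share", "replace_used",
    "paywall_replace", "not_today", "less_like_this", "hide_topic",
)
# Game funnel events that count; <game>_arrival is deliberately left out.
GAME_SUFFIXES = ("_started", "_completed", "_shared")


def is_engaging(event: str) -> bool:
    """True for the gesture-gated beacon or a deliberate action."""
    return event in DELIBERATE or event.endswith(GAME_SUFFIXES)


def engaged_readers(conn: sqlite3.Connection, since_day: str) -> int:
    """Count distinct (visitor, day) pairs with at least one engaging event."""
    rows = conn.execute(
        "SELECT visitor, day, event FROM events WHERE day >= ?", (since_day,)
    )
    return len({(visitor, day) for visitor, day, event in rows if is_engaging(event)})
```

Auto-fired events such as visit, summary_viewed, open, and replace_none match neither list, so they never add a visitor-day to the count.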

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 14:07:24 -04:00
thejayman77 8a7606e20d images: fix two fetcher bugs + add source-level image-rights policy (Codex)
Fetcher (the two remaining bugs Codex found):
- Real redirects are now followed. _NoRedirect makes urllib RAISE HTTPError on 3xx, so
  the old status-branch was dead code (mocked tests masked it). Handle 301/302/303/307/308
  HTTPError as redirects (re-validate the destination); classify 4xx≠429 as PERMANENT
  (negative-cached), 429/5xx/network as transient. Real-opener redirect + 404/5xx tests.
- The megapixel ceiling is now enforced: explicit `w*h > _MAX_PIXELS` check BEFORE load()
  (Pillow only warns at MAX_IMAGE_PIXELS). Test with a lowered ceiling.
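
The two fetcher rules above can be sketched as pure functions. This is an illustration only: `classify_status` and `exceeds_pixel_ceiling` are hypothetical names, and the fallthrough for unexpected codes is an assumption the commit does not state.

```python
REDIRECT_CODES = {301, 302, 303, 307, 308}


def classify_status(code):
    """Classify a non-2xx outcome; code=None stands for a network error."""
    if code is None:
        return "transient"
    if code in REDIRECT_CODES:
        return "redirect"          # caller re-validates the destination
    if code == 429 or code >= 500:
        return "transient"         # worth retrying later
    if 400 <= code < 500:
        return "permanent"         # negative-cache, do not retry
    return "transient"             # assumption: anything unexpected is retried


def exceeds_pixel_ceiling(width, height, max_pixels):
    """Explicit w*h check meant to run on the header size before decoding."""
    return width * height > max_pixels
```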

Image-rights policy (per Codex + owner decision — only cache what's cleared):
- sources.image_policy: 'cache' (re-host a downscaled copy — license/permission/PD only),
  'remote' (hotlink the publisher's image — the conservative DEFAULT), 'none' (no image).
- newsimg.display_url resolves the display URL per policy; applied in Article.from_row so
  feed/brief/history return the right URL, and in share.py (og/twitter still reference the
  publisher's own image, never re-hosted). warm() + /api/img both gated on 'cache'.
- Frontend uses the server-resolved image_url (reverted the hardcoded /api/img); the
  graceful retry covers remote hotlinks too. Admin: per-source image-policy selector +
  POST /api/admin/sources/{id}/image-policy. Default 'remote' → nothing re-hosted until
  a source is explicitly cleared.
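
A rough sketch of the per-policy resolution described above. The real newsimg.display_url signature is not shown in this log, so the parameters and the `/api/img?u=` URL shape here are assumptions for illustration.

```python
from urllib.parse import quote


def display_url(policy, remote_url, cache_path="/api/img"):
    """Resolve the image URL a client should render for one article."""
    if not remote_url or policy == "none":
        return None                      # no image at all
    if policy == "cache":
        # Hypothetical URL shape for the re-hosted, downscaled copy.
        return f"{cache_path}?u={quote(remote_url, safe='')}"
    # 'remote' (and, as an assumption, any unknown value) hotlinks the publisher.
    return remote_url
```

Falling back to the hotlink for unrecognized values keeps the conservative default: nothing is re-hosted unless a source is explicitly set to 'cache'.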

445 backend + 36 frontend tests pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 14:01:11 -04:00
thejayman77 d98cec9ded admin: read/unread triage for load errors (unread by default, mark read/all)
The load-error log had no way to clear reviewed entries. Add a read_at column to
client_errors and a read/unread model mirroring the feedback inbox:
- GET /api/admin/client-errors?show=unread|read|all (default unread; returns id+read)
- POST /api/admin/client-errors/read-all  (mark all unread read)
- POST /api/admin/client-errors/{id}/read {read: bool}  (per-row toggle)
Headline stat is now "Unread load errors" (admin_stats.client_errors.unread), so the
red badge clears as you triage. Admin UI: Unread/Read/All tabs, a "Mark all read"
button, and a per-row ✓/↩ toggle; reading an entry drops it from the default view.
14-day auto-prune still bounds the table. Tests cover filter, toggle, mark-all,
404, and gating.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-29 10:38:22 -04:00
thejayman77 c600145ba5 news: close the remaining no-paywall bypass paths (Codex audit)
queries.feed was the main chokepoint, but several discovery paths have their own
SQL. Apply the shared source exclusion to all of them so "no paywalls" is truly
site-wide:
- briefs.build_daily_brief: EXCLUDE paywalled candidates (was: demote) — never
  stored in a new brief.
- queries.brief: stored-brief retrieval (covers /today + /api/brief) filters the
  paywalled source.
- digest.digest_items + followed_digest_items: the morning email + "from what you
  follow" omit paywalled sources.
- sitemap(): paywalled article pages excluded from the sitemap.
All reuse queries.paywalled_source_ids (admin override still wins).

Regression tests (test_paywall_exclusion.py): never stored in a new brief; /today
+ digest omit it; followed-source email omits it; Saved retains it; 'free'
override restores eligibility. 423 backend tests green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 17:22:52 -04:00
thejayman77 0d21231597 news: hard-exclude paywalled sources from the feed + brief (no unreadable news)
Per Jay: don't surface stories people can't read without paying — it's off-brand
("no paywalls") and pointless. Paywalled is source-level (domain rule, admin-
overridable): just 3 sources today (Nature, New Scientist, MIT Tech Review),
~5.4% of accepted articles.

- queries.paywalled_source_ids(conn): live source set (admin override wins).
- queries.feed gains include_paywalled=False (default) → adds `a.source_id NOT IN
  (…)`. One chokepoint covers Latest/tags/sources/moods/topics/search/since AND
  the brief top-up. Source-level + SQL → paging stays exact, no frontend change.
- brief(): filter the cached/home pool by the same rule; replacement already
  avoids paywalled and now rides the feed exclusion too.
- Dropped the now-moot "paywalled below readable" demotion sort.
- Saved/history keep showing items you saved (their own queries, not excluded).
- test_source_paywall_override updated: paywalled source → excluded from the feed
  (was: shown with a badge); 'free' override → returns, no badge. 418 tests green.
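
The single-chokepoint idea can be sketched with sqlite3 as below. The table layout and the `paywalled` column are simplifying assumptions; the real paywalled_source_ids combines the domain rule with the admin override.

```python
import sqlite3


def paywalled_source_ids(conn):
    """Live set of paywalled source ids (simplified to one boolean column)."""
    return [row[0] for row in conn.execute("SELECT id FROM sources WHERE paywalled = 1")]


def feed(conn, include_paywalled=False, limit=20, offset=0):
    """Newest-first feed; paywalled sources are excluded in SQL so paging stays exact."""
    sql = "SELECT a.id, a.title FROM articles a"
    params = []
    if not include_paywalled:
        ids = paywalled_source_ids(conn)
        if ids:
            marks = ",".join("?" * len(ids))
            sql += f" WHERE a.source_id NOT IN ({marks})"
            params.extend(ids)
    sql += " ORDER BY a.id DESC LIMIT ? OFFSET ?"
    params += [limit, offset]
    return conn.execute(sql, params).fetchall()
```

Filtering before LIMIT/OFFSET is what keeps page sizes stable; dropping rows after the query would leave short pages.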

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 17:10:00 -04:00
thejayman77 dc23277b38 Read-time: full-article "Full story · ~N min" badge (Option B)
Replaces the gist-based read-time with the SOURCE article's full read time — the
contrast that sells the gist ("calm 1-min version here; ~10 min for the deep dive").

- goodnews/readtime.py: word_count_from_html (strips script/style/nav/header/
  footer/form/button/aside furniture before counting) + source_read_minutes
  (~225 wpm, 200-word floor, None when extraction looks failed/too thin).
- articles.source_words + read_checked_at columns (count only, never the body;
  fits the privacy posture). Idempotent migration.
- enrich.fetch_source_words + enrich_read_times: a bounded, retry-guarded cycle
  step (mirrors the image enrichers) that counts words for recent accepted
  articles. Only ever writes a real count; never overwrites good with zero. Wired
  into the cycle after recent-image enrichment.
- queries: source_words flows through _ARTICLE_COLUMNS; api exposes
  source_read_minutes on Article (null when unknown).
- home3: News card shows "Full story · ~N min", hidden entirely when null (no
  misleading "1 min").
- Tests: furniture stripping, threshold/rounding, enrich idempotency + no
  zero-overwrite, API null handling. 412 backend.
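
A hedged sketch of the two readtime helpers using only the standard library. The function names come from the commit; the parser class, the rounding rule, and reading "200-word floor" as "below 200 words means extraction was too thin" are assumptions.

```python
from html.parser import HTMLParser

FURNITURE = {"script", "style", "nav", "header", "footer", "form", "button", "aside"}


class _WordCounter(HTMLParser):
    """Counts words in text that is not nested inside a furniture element."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # how many furniture elements we are currently inside
        self.words = 0

    def handle_starttag(self, tag, attrs):
        if tag in FURNITURE:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in FURNITURE and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0:
            self.words += len(data.split())


def word_count_from_html(html):
    parser = _WordCounter()
    parser.feed(html)
    parser.close()
    return parser.words


def source_read_minutes(words, wpm=225, floor=200):
    """Minutes at ~225 wpm, or None when the count is missing or too thin."""
    if not words or words < floor:
        return None
    return max(1, round(words / wpm))
```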

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 08:09:00 -04:00
thejayman77 3486f3102a Scope dial v2: Nearby / Region / Country / World radius on the homepage
Codex-approved evolution: the reader controls the "emotional radius" of the landing.

- Census-region "Regional" grain (geo.region_of / region_states). Scope-aware tiering
  (queries.home_tiers): closest->widest lead, confidence-gated on state + region, never
  a hard filter — blends outward so the set is always full. 'world' = the global brief.
- queries.home_brief takes a scope; /api/brief gains a scope param (nearby|region|
  country|world). Country-only / non-US homes collapse to country.
- Homepage dial replaces the 2-button toggle: adaptive stops (4 with a US state, else
  Country/World), persisted scope, "Good news closest first" framing. Concrete, soft
  section labels (Around New Jersey / Across the Northeast / Across the US / Around the
  world) so the reader sees the dial worked.

Backend 366 + frontend tests green. (Latest feed still on v1 local-first; aligning it
to the dial is the immediate follow-up.)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 21:59:32 -04:00
thejayman77 d2a6293a13 Local-first Brief: the landing leads with good news from your home
Per the owner's call (overrides the earlier "Brief sacred" stance): when a home is
set, the homepage opens with local good news first, not global. This is the hook —
you land and see awesome stories from YOUR corner first.

- queries.home_brief: local-first highlights (high/medium-confidence near, blended
  out to country then world so it's always a full, strong set), preferring already-
  summarized stories so the calm read stays rich. Recent window, ranked within tier.
- /api/brief gains a `home` param: private/no-store when set; over-fetches + caps so
  dismissal/boundary filtering never thins it; falls back to global top-up if needed.
- Landing UI: a Local <-> Global toggle ("📍 Near you / 🌍 Everywhere") when a home
  is set, the calm picker invite when not (dismissible), and Change. Default leads
  local; one tap back to the global brief. No home set => exactly today's behavior.

Backend + frontend tests green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 21:36:18 -04:00
thejayman77 2239549799 Closer to Home: gate "Near you" on high/medium confidence (both modes)
Codex polish before deploy: anything elevated as Near you / Close to home must have
geo_confidence in (high, medium) — the feature's promise is relevance. Country-only
mode now gates "near" too; since it has no "country" tier, the "world" scope is
widened to absorb low-confidence home-country stories so they surface there instead
of vanishing between tiers (the same edge-case class, fixed). State mode unchanged.

364 tests green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 20:29:31 -04:00
thejayman77 e7e8f5515e Geo Stage 4 (server): home-aware feed sectioning (Near you / country / world)
Completes the server side of "Closer to Home". /api/feed gains a `home` param
('US' or 'US-NY'); when set the response is private (like prefs) and sectioned:

- Near you (+ Elsewhere in your country when a state is set) is a ONE-TIME lead
  block on page 0; the world is the paginated body. next_offset tells the client
  where to continue, so the lead block never skews world paging.
- Thin tiers fold down (MIN_TIER=3) so a header is never shown empty (lead, don't trap).
- State match counts only on high/medium geo confidence; the "country" tier excludes
  exactly what went to "near", so a low-confidence home-state story still surfaces
  (it doesn't vanish between tiers — caught + tested).
- Items carry a `section` tag; paywalled sort is now within-section. No home => exact
  prior behavior (section null, default/edge-cached feed unchanged), Brief untouched.

364 tests green. Frontend next: Home picker + sectioned feed rendering.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 19:35:22 -04:00
thejayman77 ad4e88c8f2 Geo Stage 4 (data layer): geo on feed responses + home-scope query filters
Foundation for "Closer to Home" (server-side, Codex-approved). No behavior change
yet — geo_scope defaults None, so the default/edge-cached feed is identical.

- queries.feed now returns each article's geo (breadth, confidence, and ISO-coded
  places) via a LEFT JOIN + places subquery. Article.from_row parses geo_places
  into [{country, state}]. Brief query doesn't select geo, so the Brief stays bare.
- queries.feed gains home-scope filters (home_country/home_state/geo_scope =
  near|country|world): STATE match only counts on high/medium geo confidence;
  untagged articles fall to 'world' so nothing is lost during backfill.

Next: API composition (home param + near/country/world sectioning with soft/blended
headers + a next_offset pagination model) and the Home picker UI. 360 tests green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 19:30:43 -04:00
thejayman77 59ff48ae90 Game share-loop: instrument funnel, deep-link shares, /play metadata
Sharpen the existing daily-game share loop into something measurable (per Codex's
"instrument what you have, then feed people into it" plan), ahead of a Show HN launch.

Analytics:
- Per-game funnel events <game>_{arrival,started,completed,shared} (article_id=0).
  arrival = landed via a shared link (utm_source=game_share); started = first move
  (guess/find/flip); completed = solved/cleared/Full Bloom; shared = on share success.
- trackVisit() moved into the global layout so direct /play landings count; the
  server-rendered /a/ share page now creates a visitor token + sends a daily visit
  beacon (first-time /a/-only visitors were previously dropped).
- Admin "Games funnel" panel: arrivals / engaged / completed / shared, per game.

Sharing:
- Memory Match gains a Share button (it was the only game without one).
- All shares deep-link to the exact game+variant with a full https:// URL +
  utm_source=game_share (gameShareUrl helper), instead of a bare /play.
- "shared" is counted only after navigator.share()/clipboard.writeText() succeeds.

/play social metadata:
- /play was serving the homepage's canonical/OG tags (static SPA, ssr=false). A
  postbuild script now patches the head of build/play.html to /play's own
  canonical/title/description/OG and fails the build if the homepage tags drift.
  Caddy try_files now serves {path}.html so /play is served from the patched file
  (snapshot in deploy/caddy/).

Tests: backend 352, frontend 27.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 16:22:06 -04:00
thejayman77 89c0fbe1f6 Sync repo to deployed state: SEO recovery, Publishing Desk, Play games, emoji picker
The deploy pipeline runs from the working tree, so a wave of shipped features
had never been committed. This snapshots git to what's actually running.

SEO impression recovery (live + verified):
- Duplicate /a/{id} pages now 301-redirect to their canonical twin instead of
  returning 404 (a hard 404 silently dropped already-indexed URLs and tanked
  impressions).
- Dedup representative selection reworked: accepted/serveable -> established
  rep (URL stability) -> quality score, so an accepted page never retires to a
  rejected rep and an indexed canonical doesn't churn when a newer twin arrives.
- HEAD /a/{id} returns the same status as GET (api_route GET+HEAD) instead of
  falling through to the static mount and 404ing.
- `dedup --force-recluster`: cycle-locked, model-free re-cluster to re-apply the
  policy to the existing corpus (shared cycle_lock context manager).
- CLI honors GOODNEWS_DB for its default --db (was silently ignored).
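
The representative-selection order in the second bullet (accepted/serveable, then the established rep, then quality score) can be read as a lexicographic sort key. This is only a sketch of that reading; the dict keys are hypothetical and the real dedup code may weigh things differently.

```python
def pick_representative(cluster):
    """Pick the cluster member that should be the canonical page."""
    return max(
        cluster,
        key=lambda a: (
            bool(a["accepted"]),        # an accepted page never loses to a rejected one
            bool(a["is_current_rep"]),  # URL stability: keep the established rep
            a["quality"],               # tie-break on quality score
        ),
    )
```

Because tuples compare left to right, a newer twin with a higher quality score cannot displace an accepted, already-established representative.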

Publishing Desk (admin tool to post highlights to X via Web Intents):
- publishing.py queue/rank/handle-resolution; admin UI; full searchable emoji
  picker (bundled data, no CDN) for the blurb editor.

Play games + site:
- Bloom (word-wheel), Memory Match, daily ritual set, Zen Den (dev-gated).
- English-only language gate; source prospecting; paywall + dedup hardening.

Tests: full suite green (349). Ignores tightened (node_modules, data/*.db).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 11:32:27 -04:00
thejayman77 2dbe73430c Sources: per-source paywall override (3-state) — fix domain-rule mis-flags
The Articles inspector revealed paywall is domain-coarse: nytimes.com is flagged,
so NY Times Learning's free Word-of-the-Day inherits 🔒 — and that flag isn't
cosmetic, it deprioritizes the content in feed sort + lead selection. Add a
per-source override so admins can correct it after inspecting.

- sources.paywall_override: NULL (domain rule) | 'free' | 'paywalled'.
- paywall.py: keep low-level is_paywalled(url) (domain); add is_paywalled_for_source
  (url, override) for the EFFECTIVE decision — never patched the domain helper
  globally (per Codex), so "domain says X" stays distinguishable from "overridden".
- Threaded everywhere ranking/UI touches paywall, via src.paywall_override on the
  shared _ARTICLE_COLUMNS + the source-aware helper: feed sort, /api/since, replace,
  lead selection, Article badge, brief composition (briefs.py), digest, source_health
  (table 🔒), the Articles inspector, and the review/attention check — so ranking and
  UI always agree.
- Endpoint POST /api/admin/sources/{id}/paywall {override}; admin UI: a select in the
  inspector header (Use domain rule / Treat as free / Treat as paywalled) + the basis
  ("ON (domain)" / "OFF (override)"), optimistic so the panel stays open.

Test: domain rule → paywalled in table+inspector+feed badge; 'free' → off in all
three; validation 422 + 404. 242 pytest + 11 vitest.
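
A small sketch of the split between the domain-level helper and the effective, source-aware decision. The function names follow the commit; the domain set and the host matching are illustrative assumptions.

```python
from urllib.parse import urlparse

PAYWALLED_DOMAINS = {"nytimes.com"}  # illustrative; the real list lives in paywall.py


def is_paywalled(url):
    """Low-level domain rule: what the domain list says, nothing else."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in PAYWALLED_DOMAINS)


def is_paywalled_for_source(url, override):
    """Effective decision: an explicit override wins, NULL falls back to the domain rule."""
    if override == "free":
        return False
    if override == "paywalled":
        return True
    return is_paywalled(url)
```

Keeping `is_paywalled` untouched is what lets the admin UI show the basis, for example "ON (domain)" versus "OFF (override)".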

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-12 22:10:44 -04:00
thejayman77 ddcfab3a11 Admin: source Articles inspector (verify metrics against real evidence)
New per-row "Articles" button on the Sources table expands a read-only inline
panel of the source's ACTUAL ingested articles — so the automated metrics
(paywall/image/acceptance/duplicate) can be verified against evidence instead of
trusted blind. Distinct from "Check" (which re-samples the LIVE feed for
would-pass quality); this shows what's already in the DB, which is what the table
metrics are computed from.

- Backend: GET /api/admin/sources/{id}/articles?filter=&limit=&offset= (admin,
  read-only). queries.source_articles + source_articles_summary — per article:
  title, url, date, accepted, reason (the "why"), topic/flavor, paywalled
  (domain rule), has_image, duplicate. Summary = counts + source-level paywall
  rule.
- Frontend: expandable panel with a summary header ("27 ingested · 18 accepted
  · … · paywall rule: ON (domain)"), filter chips (All/Accepted/Rejected/No
  image/Duplicates), compact rows with title→link + badges + reason, Load more.

So "100% paywall" or "0% images" becomes clickable evidence: open two articles
to tell a real paywall from a mis-flagged domain, or a true image gap from an
enrichment failure. Test: test_source_articles_inspector. 241 pytest + 11 vitest.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-12 21:37:51 -04:00
thejayman77 628cc5722c Reliability: slow ≠ failed — SW nav timeout, slow-boot telemetry, de-bot stats
Root cause of the intermittent white screen: the shell HTML is no-cache
(cf-cache-status: DYNAMIC), so every page-open does a synchronous round-trip
to the residential origin before any pixel renders — and the SW's network-first
navigation only fell back to the cached shell on REJECTION, never on slowness.
A stalled fetch meant staring at white with a perfectly good shell in cache.
The boot seatbelt couldn't see it either: it lives inside the HTML that hadn't
arrived yet, so slow boots left no telemetry.

- service-worker: race navigation fetch vs 2.5s grace timer. Network wins →
  fresh HTML as before; timer/5xx/failure → cached shell instantly, network
  response still refreshes the cache in the background. Safe due to the 14-day
  immutable-chunk grace window. Caps the white screen at ~2.5s for repeat
  visitors on any network.
- app.html: beacon `boot-slow: Nms (html Nms) on 4g` when mount takes >4s —
  the "white screen, then it loaded" glitches finally leave a trace, with
  HTML-arrival timing to separate slow-origin from slow-JS.
- admin: bot UAs (HeadlessChrome/bot/spider/crawl/…) excluded from the
  headline "Load errors today" count — throttled crawlers trip the 10s boot
  check routinely (the one recorded error was HeadlessChrome on X11, not a
  phone). Bots stay visible in the list, tagged + dimmed.

Tests: telemetry test extended for bot flag + filtered counts. 223 pytest +
11 vitest green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 19:23:33 -04:00
thejayman77 61f575ba6d Observability + warming guardrails (Codex)
* client_error details, not just a count: new client_errors table + POST
  /api/client-error (reason/path/user-agent/time) + GET /api/admin/client-errors.
  The boot-seatbelt beacon now sends the reason + path (once per page); the admin
  Overview lists the recent errors so we can tell chunk vs SW vs API vs JS — the
  truth meter for the next day as the new SW propagates.
* Deploy warming now also hits the shell, routes (/play /account /admin), SW,
  version.json, word lists, and icons/logo/font — not just immutable chunks.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 12:31:32 -04:00
thejayman77 9e387a0a09 Boot-failure seatbelt: no future crash becomes a silent white screen
Per Codex. A branded recovery card in app.html shows if the app hasn't mounted
in 7s, or on a pre-mount JS error/unhandledrejection — with a "Refresh Upbeat
Bytes" button. A chunk/preload failure (vite:preloadError) reloads once
(sessionStorage-guarded). +layout calls window.__ubBooted() on mount to clear
the card + timer. A pre-mount failure also fires a tiny anonymous client_error
beacon; the admin Overview now shows "Load errors today" (red if >0) so we can
see if blank-risk is happening in the wild.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 12:10:46 -04:00
thejayman77 d0fb153e46 "Since you last visited" cue + PWA install (add to home screen)
Two calm returning-reader features.

Since-last-visit (Highlights companion, not a nav lane — per Codex):
* queries.feed gains a `since` filter; GET /api/since?ts= returns the count +
  a few accepted/non-dup/visible articles discovered since the reader's last
  visit (boundary-respecting; invalid/future ts → 0, no error).
* Home stores last_seen in localStorage (reads prev, then stamps now); on
  Highlights, a gentle "Since you were last here, N new calm reads came in"
  note with a "See what's new" reveal of a compact inline section. Dismissible.
  No badges, no unread counts, no "missed" language.

PWA:
* Real PNG icons (192/512 + full-bleed maskable) rasterized from favicon.svg;
  manifest fixed (azure theme to match the brand, PNG icons); apple-touch-icon.
* Minimal service worker: precache the app shell, always-fresh API + /a/ pages.
* Gentle, dismissible install banner (beforeinstallprompt → Install; iOS → the
  Share → Add to Home Screen hint). Never nags.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 20:38:12 -04:00
thejayman77 d8e246b4ff Follow source/topic — account-backed personalization (v1)
Per Codex — turn accounts into a real reason to return, without an algorithmic
feed. Durable interests (sources + tags), not moods.

* DB: user_follows (user_id, kind source|tag, value, unique).
* queries.feed gains follow_sources/follow_tags → the Following feed is
  "articles from a followed source OR carrying a followed tag", still respecting
  calm filters/boundaries.
* API: GET/POST/DELETE /api/follows (sign-in required; source ids validated);
  /api/feed?following=true resolves the user's follows (anon → empty, not error).
* Frontend: follows store (followKeys + toggleFollow, mirrors savedIds); a
  Follow button on source + tag/topic views; a "Following" lane in the nav with
  a tailored empty state; a Following management section in Account (unfollow).

Digest "From what you follow" deferred to v2 (brief stays first).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 17:34:46 -04:00
thejayman77 eacf91225a Sources table: Media column (image coverage % + paywall marker)
Per Codex — make the table more decision-ready from data we already have.
Paywall is a domain-level hint, so it's a per-source flag (not a meaningful
rate): show image-coverage % plus a 🔒 marker for subscription domains in one
compact "Media" column (tooltip spells it out). source_health gains a
`paywalled` flag (is_paywalled on homepage/feed); also added to sources.csv.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 14:58:19 -04:00
thejayman77 1cd7f1d89a Admin CSV export (sources snapshot + audience time-series)
Per Codex v1 — boring-in-the-best-way: inspect/archive operational data outside
the app. Admin-gated, Python csv module, text/csv + attachment disposition.

* GET /api/admin/export/sources.csv — current-state snapshot per source: name,
  feed/homepage, status, visible, served/accepted/total, acceptance/duplicate/
  accepted-dup/image-coverage %, last success/error, retry-after, review.
* GET /api/admin/export/audience.csv?days= — summary block (visitors, returning,
  accounts, feedback, shares) + a blank line + the daily visits/opens series;
  range applies to audience, sources is a snapshot.
* source_health now also returns feed_url/homepage. Small download links on the
  Sources + Audience tabs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 13:05:09 -04:00
thejayman77 26014297f4 Attention: long rate-limit item scans active sources only
Per Codex: a paused/retired source with a future retry_after_at shouldn't nag
'rate-limited for 12h+' — it's intentionally out of polling. Scope long_rest to
active (matching the other operational items). Test: paused/retired rate-limited
sources stay quiet.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 12:09:17 -04:00
thejayman77 d2e2b303ac Attention strip: richer source-health items (stale/reject/dup/thin/rate-limit)
Per Codex — make the Overview strip diagnostic without making the operator hunt
through tables. Aggregated (one calm line per condition with a count), volume-
gated, conservative thresholds:

* Stale: active+visible source, last success > 10 days ago (warn).
* High rejection: >=20 ingested, acceptance < 25% (info).
* High duplicate: >=10 accepted, accepted-dup > 50% (info).
* Thin images: >=10 served, per-source image coverage < 25% (info).
* Long rate-limit: retry_after_at more than 12h out (info).

source_health gains a per-source images count + image_coverage. _attention takes
an optional now (for tests). Existing site-wide items (global image coverage,
thin brief, unread feedback) unchanged.
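
The five thresholds above, sketched as one aggregation pass. The dict keys are hypothetical stand-ins for whatever source_health returns, and `attention_counts` is an illustrative name rather than the actual _attention helper.

```python
from datetime import datetime, timedelta, timezone


def attention_counts(sources, now):
    """Return {condition: number of sources}, one entry per condition that fires."""
    counts = {}

    def bump(key):
        counts[key] = counts.get(key, 0) + 1

    for s in sources:
        last_ok = s.get("last_success_at")
        if s["active"] and s["visible"] and last_ok and now - last_ok > timedelta(days=10):
            bump("stale")
        if s["total"] >= 20 and s["accepted"] / s["total"] < 0.25:
            bump("high_rejection")
        if s["accepted"] >= 10 and (s["accepted"] - s["served"]) / s["accepted"] > 0.50:
            bump("high_duplicate")
        if s["served"] >= 10 and s["images"] / s["served"] < 0.25:
            bump("thin_images")
        retry = s.get("retry_after_at")
        if retry and retry - now > timedelta(hours=12):
            bump("long_rate_limit")
    return counts
```

The volume gates (>=20 ingested, >=10 accepted, >=10 served) run before each division, so a brand-new source with zero rows never divides by zero or trips a rate check.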

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 11:50:17 -04:00
thejayman77 01de5a3ef0 source_health: next_due_at = later of streak-backoff and retry_after_at
Per Codex: the Next poll column computed only the streak-backoff time, so a
rate-limited source could show an earlier Next poll than the real gate (which
also requires retry_after_at <= now). Take the later of the two in the Python
post-process so the admin table agrees with due_source_rows.
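
In sketch form, "the later of the two" is a max over whichever timestamps are present. The function and parameter names are assumptions; the real post-process lives inside source_health.

```python
from datetime import datetime, timedelta, timezone


def next_due_at(backoff_due, retry_after_at):
    """Later of the streak-backoff time and the Retry-After gate; None if neither is set."""
    candidates = [t for t in (backoff_due, retry_after_at) if t is not None]
    return max(candidates) if candidates else None
```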

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 11:45:54 -04:00
thejayman77 38abc26ddd Honor Retry-After on HTTP 429 (polite rest, not a failure)
Per Codex's spec — a publisher saying "slow down" shouldn't make a feed look
broken, but repeated 429s stay visible via last_success_at / stale-source.

* Schema: sources.retry_after_at (nullable) + migration.
* feeds.parse_retry_after: delta-seconds OR HTTP-date → UTC stamp; ignores
  invalid/negative/past; caps at now + MAX_BACKOFF_MINUTES.
* fetch_feed raises RateLimited (carrying the parsed time) on a 429.
* poll_source: on 429 set retry_after_at + last_error, status='rate_limited',
  and do NOT increment consecutive_failures; on success clear retry_after_at;
  non-429 failures unchanged.
* due_source_rows requires BOTH the streak backoff elapsed AND retry_after_at
  passed (i.e. the later of the two).
* Admin: source_health returns retry_after_at; status reads
  "rate-limited · rests until …" rather than "failed/resting".

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 10:47:40 -04:00
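
A standard-library sketch of the Retry-After parsing rules from the commit above (38abc26ddd). The value of MAX_BACKOFF_MINUTES and the exact signature of feeds.parse_retry_after are assumptions.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

MAX_BACKOFF_MINUTES = 24 * 60  # assumed value; the real constant is not shown


def parse_retry_after(value, now):
    """Turn a Retry-After header into a UTC datetime, or None if unusable."""
    if not value:
        return None
    value = value.strip()
    cap = timedelta(minutes=MAX_BACKOFF_MINUTES)
    if value.isascii() and value.isdigit():
        # delta-seconds form; clamp early so huge values cannot overflow timedelta
        seconds = min(int(value), int(cap.total_seconds()))
        when = now + timedelta(seconds=seconds)
    else:
        try:
            when = parsedate_to_datetime(value)  # HTTP-date form
        except (TypeError, ValueError):
            return None  # invalid, including negative numbers like "-5"
        if when.tzinfo is None:
            when = when.replace(tzinfo=timezone.utc)
        when = when.astimezone(timezone.utc)
    if when <= now:
        return None  # zero or past: nothing to wait for
    return min(when, now + cap)
```

`now` is expected to be a timezone-aware UTC datetime, which also makes the function easy to test with a fixed clock.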
thejayman77 35aaeece6d Fix status/active mirror drift in upsert_sources (pre Promote-candidate)
Per Codex: upsert_sources() wrote `active` but not `status`, so a candidate
promoted inactive (the pipeline default) became active=0 + status='active' —
the exact mirror drift Phase 1 set out to avoid (scheduler won't poll, admin UI
shows "active"). Now derive status from an explicit value or from active, mirror
active off status, and write both columns together (insert + conflict update).
Test: promote_candidate(active=False) → status='paused', active=0.
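
The mirror rule can be sketched as one helper that always returns both columns together. `resolve_status` is a hypothetical name, and defaulting `active` to True is an assumption.

```python
VALID_STATUSES = ("active", "paused", "retired")


def resolve_status(status=None, active=True):
    """Return (status, active) so the two columns can never disagree."""
    if status is None:
        # No explicit status: derive it from the legacy boolean.
        status = "active" if active else "paused"
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    # active is always mirrored off status, never taken from the caller.
    return status, 1 if status == "active" else 0
```

Writing the returned pair in both the insert and the conflict update is what prevents the active=0 + status='active' drift described above.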

Also fix stale source_health docstring (now includes retired).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 10:12:26 -04:00
thejayman77 9ed817c051 Source Retire lifecycle (Phase 1: status + content_visible, active mirrored)
Per Codex's plan — introduce a lifecycle without a risky "change the source of
truth everywhere" moment.

* Schema: sources.status (active|paused|retired) + content_visible; migration
  backfills status from active (active=1→active, else paused), content_visible=1.
* `active` is kept as a SYNCED MIRROR: status active→active=1, paused/retired→0,
  so the scheduler/CLI/legacy code keep working unchanged.
* Retire stops polling but keeps articles visible (non-destructive). Hiding is a
  separate, reversible lever: content_visible=0 drops a source's articles from
  the public feed + brief (read AND build), behind a confirm. Personal saved/
  history are untouched.
* API: /sources/{id}/status (validates, mirrors active) + /visibility, replacing
  /active. source_health returns status + content_visible.
* Admin: status column (active/paused/retired + "hidden"), Retired filter,
  Pause/Resume · Retire/Restore · Hide/Show actions.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 09:58:15 -04:00
thejayman77 9deca522b4 Sources: accepted-duplicate % (curation-quality signal)
Per Codex's optional note: alongside the ingest-wide duplicate_rate, expose
accepted_dup_rate — of what a source got ACCEPTED, how much was a duplicate of
already-served content (accepted_total − served). Nearly free (derived from
existing counts); surfaced as a tooltip on the Dup column so the table stays
uncluttered.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 15:47:51 -04:00
thejayman77 84bc5b0267 Source management console: pause/resume, flag/clear, decision metrics
Turn the Sources tab into a real management console (per Codex):

* source_health now lists ALL sources (active + paused) with backing metrics:
  served / accepted_total / total_articles / duplicates + acceptance & duplicate
  rates + review_reason, alongside last success/attempt, next poll, failures.
* Admin endpoints (gated, 404 on missing): POST sources/{id}/active (pause/
  resume) and /review (flag/clear with reason).
* Pausing only stops future polling — the feed query has no active filter, so a
  paused source's accepted articles stay live.
* Frontend: metric table + Paused filter + per-row Pause/Resume & Flag/Clear
  (optimistic, revert on failure). Attention 'resting' now scoped to active.

Retire/Delete intentionally deferred (distinct lifecycle state, later).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 14:04:40 -04:00
thejayman77 ecaca35977 Feedback inbox: read/unread + delete
Make the admin Feedback section a real inbox.

* DB: feedback.read_at column (schema + idempotent migration).
* API: feedback list returns read_at; POST /api/admin/feedback/{id}/read
  {read} toggles it; DELETE /api/admin/feedback/{id} removes a message
  (both admin-gated). admin_stats gains feedback_unread; the Attention strip
  and the tab badge now count UNREAD, not total.
* Frontend: unread messages are highlighted with an accent rail + dot; an
  Unread filter joins the category chips; each message has Mark read/unread
  and Delete (confirm), with optimistic updates that revert on failure.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 13:33:24 -04:00
thejayman77 13722f04a8 Admin polish: section fallback, live-scoped coverage, dual source status
Per Codex audit:
* Unknown ?section= values now clamp to Overview, so the page never renders the
  tabs with an empty body.
* Summary/image coverage counts join through articles+scores and require
  accepted=1 AND duplicate_of IS NULL, so percentages stay ≤100% and honest as
  rejected/duplicate rows accrue summaries over time.
* A source that's both resting and flagged now shows "⚠ resting · review"
  rather than hiding the review flag behind the resting state.
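
The live-scoped coverage count could look roughly like this. The schema here
(articles.duplicate_of, scores.accepted, a summaries table keyed by article_id)
is an assumption made for the sketch, and summary_coverage is a hypothetical
helper, not the real query:

```python
import sqlite3

def summary_coverage(conn):
    """Count live articles and how many of them have a summary."""
    live, with_summary = conn.execute(
        """
        SELECT COUNT(DISTINCT a.id), COUNT(DISTINCT su.article_id)
        FROM articles a
        JOIN scores sc ON sc.article_id = a.id
        LEFT JOIN summaries su ON su.article_id = a.id
        WHERE sc.accepted = 1 AND a.duplicate_of IS NULL
        """
    ).fetchone()
    # Numerator and denominator share one WHERE clause, so pct cannot exceed 100.
    pct = 100.0 * with_summary / live if live else 0.0
    return live, with_summary, pct

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE articles (id INTEGER PRIMARY KEY, duplicate_of INTEGER);
CREATE TABLE scores (article_id INTEGER, accepted INTEGER);
CREATE TABLE summaries (article_id INTEGER);
-- 1 and 2 are live; 3 is rejected; 4 is a duplicate of 1.
INSERT INTO articles VALUES (1, NULL), (2, NULL), (3, NULL), (4, 1);
INSERT INTO scores VALUES (1, 1), (2, 1), (3, 0), (4, 1);
-- Summaries on rejected/duplicate rows (3, 4) must not inflate coverage.
INSERT INTO summaries VALUES (1), (3), (4);
""")
print(summary_coverage(conn))
```

Counting summaries without the join would give 3 summaries against 2 live
articles, which is the >100% case the fix avoids.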

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 13:19:15 -04:00
thejayman77 575f562ad5 Admin: tabbed operator console (Overview/Content/Sources/Audience/Feedback)
Reshape the long single-page dashboard into a sectioned console (one route,
?section= tabs, sticky subnav) focused on "what needs my attention" first.

* Overview: an "Attention Needed" strip (soft amber/blue, never alarming red)
  derived from the same data — sources resting/flagged, image coverage <70%,
  thin brief, recent feedback — plus at-a-glance pulse cards.
* Content: corpus health + image/summary coverage (with_image,
  summaries_with_image, brief image coverage, 24h image misses) + top opened /
  topics / tags.
* Sources: filterable table (All/Healthy/Resting/Flagged) — served, last
  success, next poll, failure streak, status — instead of a card pile.
* Audience: visitors, retention, accounts, funnel, sharing, daily trend.
* Feedback: inbox with category filter, newest first, quick mailto reply.

Backend: content_stats gains added_7d + image-coverage fields; source_health
gains review_flag; admin_stats adds attention[] + feedback_7d.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 13:03:23 -04:00
thejayman77 38889f76e5 Source feeds: click a source to see its publication feed
Click a source name on any card → a feed of just that source's articles,
newest-first, still accepted / non-duplicate / boundary-filtered (the calm
promise isn't bypassed). A natural way to follow a publication's feel.

* queries.feed + /api/feed: source_id filter; Article output gains source_id.
* Frontend: source label is a button → transient 'source:<id>' view (like
  'tag:<slug>'), rendered in the feed grid with Load more, header = source name.
* Ad-hoc, not a pinned lane. Foundation for a future source page (metadata) +
  Follow; shareable /source/<slug> route and source_view analytics come then.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 08:30:33 -04:00
thejayman77 c25e14ed6a Add a permanent "Latest" lane beside "Highlights"
Restructure the nav around two permanent lanes, then the reader's chosen ones:
"Highlights" (the curated daily brief — formerly "Today") and "Latest" (the
freshest accepted stories, newest-first). Now that the gate is tight, a
chronological "incoming" feed is safe to expose.

* feed(): new sort="latest" (pure recency) alongside the default best-first
  rank; /api/feed exposes sort=ranked|latest (validated). Still accepted-only
  and boundary-respecting either way.
* lanes.py: two pinned lanes (Highlights + Latest) instead of one.
* Home: "Latest" view + "Load more" pagination for every feed view (offset-
  paged, de-duped). Mobile bottom bar gains a Latest tab.
* LanePicker shows both pinned lanes; nav rail renders them first.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 15:56:48 -04:00
thejayman77 d87347b032 Dashboard: content + source-health; per-viewer local dates
* Date fix: introduce GOODNEWS_TZ (goodnews/localtime.py) so the brief's "today"
  rolls over in a pinned zone (Eastern) instead of UTC — robust to host-clock
  resets. The home page now formats the brief's date in each VISITOR's local
  timezone (from its UTC freshness stamp), so nobody ever sees "tomorrow."

* Admin "Content served": articles live, fresh (7d), ingested (24h), summaries,
  active sources, today's brief size — queries.content_stats().

* Admin "Source health": per active source, the failure streak, last error,
  accepted contribution, and computed next-poll time (so backoff / "resting
  until" is visible), via queries.source_health() reusing the feeds backoff
  math. Failing sources sort to the top; times render in the viewer's zone.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 19:34:22 +00:00
thejayman77 427210ac3e User feedback + expanded privacy-respecting admin stats
Feedback:
- feedback table; POST /api/feedback (anonymous-ok, optional category/email,
  honeypot + per-day flood cap) stores + emails the admin; GET /api/admin/feedback.
- Shared feedback store + FeedbackModal; a speech-bubble opens it from the desktop
  header, the mobile top bar (logo moves left), the footer, and /account. Feedback
  section in /admin.

Stats (additive, same privacy model — no IP/UA/referrer/raw terms):
- Event vocab: summary_viewed (fired on /a load), full_story (card → source),
  not_today/less_like_this/hide_topic, replace_used/replace_none, paywall_replace,
  paywalled_source_open. Card title/image opens /a (no double-count); history
  records via keepalive so it survives the nav.
- Dashboard: Accounts card (counts only), reading funnel (summary→source rate),
  emotional-mix & friction, paywall, returning-visitor buckets. (Health metrics
  deferred to a future monitoring dashboard.) 131 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 12:58:49 +00:00
thejayman77 cfde4e22db Summary briefing layer: Today pre-summarized, /a is the canonical read
Make summaries the core reading experience (summary-first, source-forward):
- Cycle pre-warms summaries for Today's 7 (idempotent → only new ones hit the LLM).
- /api/brief items carry their cached summary; Today cards (hero + tiles) show it
  inline, so Today reads as a calm briefing.
- Card title/image now open the /a summary page (the canonical artifact), with a
  visible "Full story" link straight to the source on every card (the escape hatch).
- /a gains related-grouping chips + a Copy-link/share control.
- Tighten the summary prompt: original, factual, no quotations / no close paraphrase.
Long tail stays lazy+cached. No article bodies stored. 129 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-04 19:48:32 +00:00
thejayman77 762f121320 Admin step B: stats endpoint + /admin dashboard
- users.is_admin (+ migration); admin = is_admin OR email in GOODNEWS_ADMIN_EMAILS
  (normalized). is_admin exposed on /api/auth/me. Server-authorized GET
  /api/admin/stats (403 for non-admins).
- queries.admin_stats: visitors (today/7d/30d), returning vs one-and-done, top
  opened articles, popular groupings + topics (derived from article_id at query
  time), share breakdown, daily opens/visits trend — all aggregate, no PII.
- /admin page (gated, redirects non-admins): stat cards, CSS bar lists, a daily
  trend; "Admin dashboard" link on /account for admins. 129 tests pass.
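
A sketch of the admin rule above, assuming GOODNEWS_ADMIN_EMAILS is a
comma-separated list and that "normalized" means trimmed and lowercased. Both
helper names are hypothetical:

```python
import os
from typing import Optional, Set

def admin_emails(raw: Optional[str] = None) -> Set[str]:
    """Parse the allowlist; assumes a comma-separated env var."""
    if raw is None:
        raw = os.environ.get("GOODNEWS_ADMIN_EMAILS", "")
    return {e.strip().lower() for e in raw.split(",") if e.strip()}

def is_admin(user_is_admin: bool, email: Optional[str], allowlist: Set[str]) -> bool:
    """admin = users.is_admin OR normalized email in the allowlist."""
    if user_is_admin:
        return True
    if not email:
        return False
    return email.strip().lower() in allowlist
```

The endpoint would then return 403 whenever is_admin(...) is False.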

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 18:25:46 +00:00
thejayman77 409bb11444 Accounts Phase 3: save articles, account history, device import
- API (auth-required): GET/POST/DELETE /api/saved (+/api/saved/ids), GET/POST
  /api/history, POST /api/import — all FK-safe (skip ids that no longer exist).
  queries.saved/saved_ids/history reuse the feed article shape.
- Frontend: reactive savedIds store (SvelteSet) + optimistic toggleSave; a Save
  control on cards for signed-in users; a "Saved" view (You sheet) with its own
  empty state; newly-seen items mirror to account history (cross-device); and a
  one-time import folds this device's anonymous history into the account on first
  sign-in. Anonymous browsing unchanged. 115 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 12:56:31 +00:00
thejayman77 a47a1504c8 Phase B1: multi-tag groupings model (backend)
Three-layer organization: primary topic (one per article, for ranking and
brief balance) + grouping tags (1-4 per article from a controlled vocabulary,
the organic "wandering" axis) + tonal flavor.

- taxonomy: add technology + learning topics; 4 calm tag families
  (Discovery & Wonder, People & Kindness, Solutions & Progress, Mind & Craft)
  defined in code, not the DB; ALLOWED_TAGS union + coerce_tags validation.
- db: article_tags(article_id, tag) join table + tag index.
- llm: tags added to the classifier json_schema (enum-constrained, maxItems 4)
  and system prompt; normalize_scores coerces tags; upsert_article_score
  replaces a row's tags atomically on every (re)classification.
- queries: feed gains a tag filter and exposes tags via group_concat; tag_counts.
- api: Article.tags, feed tag param, and /api/families with per-tag counts.
- tests: coerce/normalize/upsert/tag-filter/reclassify-replace/tag_counts +
  /api/families. 99 passing.
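
Roughly what the coerce_tags validation could look like. The family names come
from this commit; the tags inside each family and the exact coercion rules
(lowercase, de-dupe, cap at 4) are assumptions for illustration:

```python
# Family names are from the commit; the tags in each family are hypothetical.
TAG_FAMILIES = {
    "Discovery & Wonder": {"space", "nature"},
    "People & Kindness": {"community", "generosity"},
    "Solutions & Progress": {"clean-energy", "health"},
    "Mind & Craft": {"art", "craft"},
}
ALLOWED_TAGS = frozenset().union(*TAG_FAMILIES.values())
MAX_TAGS = 4  # mirrors maxItems 4 in the classifier schema

def coerce_tags(raw):
    """Keep only known tags, in order, de-duplicated, at most MAX_TAGS."""
    if not isinstance(raw, (list, tuple)):
        return []
    out = []
    for item in raw:
        if not isinstance(item, str):
            continue
        tag = item.strip().lower()
        if tag in ALLOWED_TAGS and tag not in out:
            out.append(tag)
        if len(out) == MAX_TAGS:
            break
    return out
```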

Corpus reclassify (re-tag + new primary topics) runs separately against the
local LLM. Frontend (B2) pairs with this; the live site is unchanged until then.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 18:35:25 +00:00
thejayman77 68a401eed6 Fresh server data overrides a pinned brief; pin holds otherwise
Per the agreed model: the brief is server-authoritative and a client Replace is
a soft override that yields when genuinely new data arrives.
- build_daily_brief is now idempotent: if the composed selection is unchanged it
  leaves the brief (and its created_at) alone, so the timer's 15-min rebuilds are
  no-ops when no new data landed.
- /api/brief exposes generated_at (the brief's created_at = a content-change
  stamp). The client pins its view against generated_at and keeps it across plain
  refreshes, but drops it and shows the fresh server brief when generated_at
  advances. Missed stories remain in the mood feeds.
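
The model can be sketched as two small functions. The real client is the
frontend and the real rebuild lives in build_daily_brief; the dict shapes and
the names rebuild_brief / choose_view are assumptions for illustration:

```python
from datetime import datetime, timezone

def rebuild_brief(current, selection, now):
    """Idempotent rebuild: an unchanged selection leaves the brief untouched."""
    if current is not None and current["article_ids"] == list(selection):
        return current  # no-op, generated_at does not move
    return {"article_ids": list(selection), "generated_at": now}

def choose_view(pinned, server):
    """Keep the client's pinned view until the server stamp advances."""
    if pinned is not None and server["generated_at"] <= pinned["generated_at"]:
        return pinned
    return server

t0 = datetime(2026, 5, 31, 13, 0, tzinfo=timezone.utc)
t1 = datetime(2026, 5, 31, 13, 15, tzinfo=timezone.utc)

brief = rebuild_brief(None, [1, 2, 3], t0)
# The reader replaces one story; the pin records the stamp it was made against.
pinned = {"article_ids": [1, 2, 9], "generated_at": brief["generated_at"]}

unchanged = rebuild_brief(brief, [1, 2, 3], t1)  # timer rebuild, no new data
fresh = rebuild_brief(brief, [1, 2, 4], t1)      # new data landed
```

With these values, choose_view(pinned, unchanged) keeps the pin, while
choose_view(pinned, fresh) yields to the server brief.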

Tests: idempotent rebuild (no-op vs content change) — 93 total.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 14:00:08 +00:00
thejayman77 5601022cf7 Build the SvelteKit frontend: calm home with mood modes
- New frontend/ SvelteKit static SPA (Svelte 5), served by FastAPI from
  frontend/build (falls back to the legacy page if unbuilt).
- Calm design system: cream/sage palette, serif headlines, generous space,
  no urgency colors, gentle motion (respects prefers-reduced-motion).
- Home screen: mood-mode nav (Today/Wonder/People Helping/Solutions/Light
  Only/Grounded), the daily brief as a hero + remaining four, browsable mood
  lanes, an explicit calm end-state, inline Not today / Less like this / Hide
  affordances, and device-local Calm Filters mirroring goodnews/filters.py.
- Backend: moods.py + GET /api/moods (single source of truth for the modes);
  FilterPrefs gains max_cortisol/max_ragebait ceilings (for Light Only).
- Push categorical filters (include/mute topics+flavors, ceilings) into SQL in
  queries.feed so low-ranked-but-matching items (e.g. discovery for Wonder)
  are not truncated by ranking; only avoid-terms stay a Python pass.
- PWA manifest + icon (installable; offline deferred per plan).
- Multi-stage Dockerfile builds the site then serves it from the API.
- Tests: queries.feed categorical filters (63 total). README updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 22:27:46 +00:00
thejayman77 b1530e4a4f Exclude duplicates from category counts so browse totals match the feed
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 16:01:12 +00:00
thejayman77 5d44072fca Add semantic cross-source dedup via local embeddings
- LocalModelClient.embed() calls the OpenAI-compatible /embeddings endpoint
  (local nomic model); base_url shared with chat, model via GOODNEWS_EMBED_MODEL.
- New article_embeddings table and articles.duplicate_of column (+ migration).
- dedup module: embeds missing articles, clusters near-identical stories within
  a date window by cosine similarity (pure-stdlib, vectors normalised once), and
  marks all but the highest-ranked member of each cluster as a duplicate.
- 'dedup' CLI command; cycle now runs poll -> classify -> dedup -> brief.
- Feed and brief queries hide duplicates, so a story carried by multiple
  outlets shows once.
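
A pure-stdlib sketch of the clustering step, under stated assumptions: a greedy
pass that compares each article to the keepers found so far, a hypothetical
0.9 threshold, and no date window (the real dedup module applies one). Vectors
are normalised once, so cosine similarity reduces to a dot product:

```python
import math

def normalise(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)

def mark_duplicates(articles, threshold=0.9):
    """articles: iterable of (article_id, rank, vector).

    Returns {duplicate_id: keeper_id}. Highest-ranked articles are visited
    first, so each cluster keeps its highest-ranked member.
    """
    ordered = sorted(articles, key=lambda a: a[1], reverse=True)
    units = [(aid, normalise(vec)) for aid, _rank, vec in ordered]
    keepers = []
    duplicate_of = {}
    for aid, vec in units:
        for keeper_id, keeper_vec in keepers:
            similarity = sum(a * b for a, b in zip(vec, keeper_vec))
            if similarity >= threshold:
                duplicate_of[aid] = keeper_id
                break
        else:
            keepers.append((aid, vec))  # no close match: starts a new cluster
    return duplicate_of

articles = [
    (1, 0.9, [1.0, 0.0]),   # highest rank, becomes the keeper
    (2, 0.5, [0.99, 0.1]),  # near-identical to 1
    (3, 0.7, [0.0, 1.0]),   # unrelated story
]
print(mark_duplicates(articles))
```

The returned mapping is what would be written to articles.duplicate_of.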

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 15:40:55 +00:00
thejayman77 2f4bdf2d00 Add FastAPI web/API layer and static site
- queries.py: shared read-only query helpers (feed, brief, category counts)
  returning plain dicts, used by the API and available to the CLI.
- api.py: FastAPI service with Pydantic response models (the companion-app
  contract), CORS, and endpoints for categories, feed, brief, and health;
  mounts a static site at /.
- static/index.html: minimal dependency-free site rendering the daily five
  and topic/flavor category browsing.
- 'goodnews serve' command launches uvicorn (lazy import; core CLI stays
  pure-stdlib). Web deps live behind the optional [web] extra.
- Dockerfile + .dockerignore + build-system metadata so the service installs
  and deploys cleanly, with the DB mounted as a shared volume.
- README: web/API and deployment docs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 13:51:07 +00:00