39 Commits

Author SHA1 Message Date
thejayman77 8a7606e20d images: fix two fetcher bugs + add source-level image-rights policy (Codex)
Fetcher (the two remaining bugs Codex found):
- Real redirects are now followed. _NoRedirect makes urllib RAISE HTTPError on 3xx, so
  the old status-branch was dead code (mocked tests masked it). Handle 301/302/303/307/308
  HTTPError as redirects (re-validate the destination); classify 4xx≠429 as PERMANENT
  (negative-cached), 429/5xx/network as transient. Real-opener redirect + 404/5xx tests.
- The megapixel ceiling is now enforced: explicit `w*h > _MAX_PIXELS` check BEFORE load()
  (Pillow only warns at MAX_IMAGE_PIXELS). Test with a lowered ceiling.

Image-rights policy (per Codex + owner decision — only cache what's cleared):
- sources.image_policy: 'cache' (re-host a downscaled copy — license/permission/PD only),
  'remote' (hotlink the publisher's image — the conservative DEFAULT), 'none' (no image).
- newsimg.display_url resolves the display URL per policy; applied in Article.from_row so
  feed/brief/history return the right URL, and in share.py (og/twitter still reference the
  publisher's own image, never re-hosted). warm() + /api/img both gated on 'cache'.
- Frontend uses the server-resolved image_url (reverted the hardcoded /api/img); the
  graceful retry covers remote hotlinks too. Admin: per-source image-policy selector +
  POST /api/admin/sources/{id}/image-policy. Default 'remote' → nothing re-hosted until
  a source is explicitly cleared.

445 backend + 36 frontend tests pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 14:01:11 -04:00
thejayman77 d98cec9ded admin: read/unread triage for load errors (unread by default, mark read/all)
The load-error log had no way to clear reviewed entries. Add a read_at column to
client_errors and a read/unread model mirroring the feedback inbox:
- GET /api/admin/client-errors?show=unread|read|all (default unread; returns id+read)
- POST /api/admin/client-errors/read-all  (mark all unread read)
- POST /api/admin/client-errors/{id}/read {read: bool}  (per-row toggle)
Headline stat is now "Unread load errors" (admin_stats.client_errors.unread), so the
red badge clears as you triage. Admin UI: Unread/Read/All tabs, a "Mark all read"
button, and a per-row ✓/↩ toggle; reading an entry drops it from the default view.
14-day auto-prune still bounds the table. Tests cover filter, toggle, mark-all,
404, and gating.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-29 10:38:22 -04:00
thejayman77 ed814c97b9 Daily Art engine: museum-guide blurb (grounded LLM) + extracted palette
- daily_art gains blurb + palette columns (idempotent migration).
- art._palette: Pillow median-cut to ~5 hex colors from the cached image (best-
  effort → [] on any failure). art._blurb: a warm 2-3 sentence "what you're
  looking at" note grounded in the Met catalogue (title/artist/bio/date/medium/
  classification/culture/tags). Prompt leans on context/significance and the
  title+tags for subject — explicitly NOT asserting literal composition (figure
  counts/poses) it can't see, since the model can't view the image. Markdown
  stripped from the output.
- pick_daily generates both (client optional → blurb skipped when absent); cycle
  + art CLI pass an LLM client. /api/art/today exposes blurb + palette.
- Backfilled the last 3 days on host (Veteran / Magnolia Vase / Bierstadt).
- scripts/art_blurb_palette_backfill.py for in-place backfill (no re-pick).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 20:12:54 -04:00
thejayman77 dc23277b38 Read-time: full-article "Full story · ~N min" badge (Option B)
Replaces the gist-based read-time with the SOURCE article's full read time — the
contrast that sells the gist ("calm 1-min version here; ~10 min for the deep dive").

- goodnews/readtime.py: word_count_from_html (strips script/style/nav/header/
  footer/form/button/aside furniture before counting) + source_read_minutes
  (~225 wpm, 200-word floor, None when extraction looks failed/too thin).
- articles.source_words + read_checked_at columns (count only, never the body;
  fits the privacy posture). Idempotent migration.
- enrich.fetch_source_words + enrich_read_times: a bounded, retry-guarded cycle
  step (mirrors the image enrichers) that counts words for recent accepted
  articles. Only ever writes a real count; never overwrites good with zero. Wired
  into the cycle after recent-image enrichment.
- queries: source_words flows through _ARTICLE_COLUMNS; api exposes
  source_read_minutes on Article (null when unknown).
- home3: News card shows "Full story · ~N min", hidden entirely when null (no
  misleading "1 min").
- Tests: furniture stripping, threshold/rounding, enrich idempotency + no
  zero-overwrite, API null handling. 412 backend.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 08:09:00 -04:00
thejayman77 cebbed58ab WOTD #4/#5 content quality + Editorial Asymmetric /word page (CD)
Content quality ("LLM polishes, dictionary anchors"):
- New wotd._polish: rewrites the real dictionary gloss into ONE warm plain
  sentence + two clear everyday example sentences, grounded in the real
  definition (no invented meanings). Stored in new wotd_pool/daily_wotd columns
  gloss + usage, alongside the raw definition/examples which stay the anchor.
- harvest() polishes each new word; pick_daily() lazily polishes + caches back
  any older pooled word that lacks a gloss (client threaded through run_daily).
- Admin word-add polishes on insert; re-pick passes an LLM client so quote
  meaning / word gloss fill on a forced fresh pick.
- /api/word/today now prefers gloss + usage, falling back to the raw dictionary
  def/examples when polish is absent (so it's always safe).
- db._migrate adds gloss/usage to wotd_pool + daily_wotd (idempotent ALTER).

Frontend — /word redesigned to CD's "Editorial Asymmetric": faded oversized
initial bleeding off the right, vertical part-of-speech rail, big Newsreader
word, airy definition, left-ruled italic example sentences, outline Listen
button + date. (Uses our self-hosted Newsreader/Hanken stack rather than the
mockup's Google fonts; the made-up syllable respelling is omitted since we only
have real IPA.)

Tests: _polish parse/trim/cap, harvest stores gloss/usage, pick lazy-polishes
older words, admin gloss flows through to /api/word/today. 403 backend + 27 fe.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-23 06:08:14 -04:00
thejayman77 67d4bc32cb Small joys: Quote of the Day + Word of the Day engines
- quote.py: curated public-domain quote pool (16 seeded, admin-grows), deterministic daily
  pick, lazy AI "what it means" explainer of the real quote (cached). No LLM-invented quotes.
- wotd.py: LLM proposes positive words → validated/enriched against dictionaryapi.dev (real
  definition, IPA, examples, audio) → audio clip cached to our origin (TTS fallback) →
  deterministic daily pick. Tops the pool up toward 30/day.
- db.py: quote_pool/daily_quote + wotd_pool/daily_wotd tables.
- api.py: /api/quote/today, /api/word/today, /api/word/audio/{word} (GET+HEAD).
- cli.py: cycle steps for both (under --no-joys), shared LLM client.
- tests: test_quote.py (6) + test_wotd.py (5). 393 backend tests green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 17:28:55 -04:00
thejayman77 a7da8362ab Small joys backend: shared daily framework + On This Day engine
- goodnews/daily.py: shared helpers for the daily "small joys" (http_json, date-seeded
  deterministic pick, dedup key) so each joy is a small self-contained module.
- goodnews/onthisday.py: harvest today's MM-DD from Wikimedia's On-this-day feed →
  tone-filter to good/neutral (keyword floor + optional LLM refine) → pool → deterministic
  daily pick (idempotent, respects blocked/featured) → cached row. Network/LLM before any
  DB write. Multi-source ready (source column).
- db.py: onthisday_pool + daily_onthisday tables.
- api.py: GET /api/onthisday/today (edge-cacheable).
- cli.py: cycle step (run after Daily Art; --no-joys to skip), LLM client for tone refine.
- tests/test_onthisday.py: 7 tests (filter+dedup, pick idempotent, blocked/featured,
  never-empty, empty-pool, LLM-narrow). 382 backend tests green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-22 16:51:29 -04:00
thejayman77 db967bb7fa Daily Art: Codex guardrails (atomic image, attribution/license, blocked lever)
Hardening before it runs further on the cycle:
- DB-lock/network: all HTTP (metadata + image) happens before any write; the write txn
  opens only at the brief INSERT and commits immediately. Images download to a temp file
  then atomic os.replace into cache (a reader never sees a half-written file).
- Site-timezone "daily" already used local_today() (same rhythm as the Brief) — confirmed.
- Attribution from day one: store + return title/artist/date/medium/department/credit/
  source_url/object_id/source + museum name + is_public_domain license marker + the full-
  res source URL (for a richer /art view later). UI can show: Title · Artist · The Met.
- "highlight != always beautiful": added a manual `blocked` flag on art_pool (excluded
  from picks) as the cheap curation lever; a featured override can follow.

Schema migrated (existing art tables get the new columns). 373 tests green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-21 15:28:38 -04:00
thejayman77 308516a263 Daily Art backend: curated Met pool, daily cached pick, /api/art (prototype)
The engine for the /art room (design-independent; deploy held for Codex review).

- goodnews/art.py: harvest a curated pool of public-domain HIGHLIGHT artworks from the
  Met (isHighlight+isPublicDomain+hasImages -> masterworks, never potsherds; CC0). Daily
  deterministic pick from the least-recently-shown (no soon-repeats, same for everyone),
  fetch metadata + download the image to OUR cache (data/art_cache) so the homepage never
  waits on or hotlinks the museum. Bulletproof: bad object/image falls through candidates;
  a failed day keeps the last piece (room never empty). Injectable HTTP for tests.
- Schema: art_pool + daily_art. /api/art/today (edge-cacheable) + /api/art/image/{id}
  (served from cache, immutable). CLI `art [--harvest] [--force]` + a non-fatal cycle step.
- Tests (5, mocked HTTP) + verified live against the Met: harvested 1641 works,
  picked/cached "Repose" by John White Alexander. 371 tests green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-21 14:50:20 -04:00
thejayman77 1c05554a28 Geo Stage 1-2: subject-geography model + classifier + pipeline wiring
"Closer to Home" foundation (audit greenlit by Codex). Durable geography, kept
decoupled from volatile scoring.

- Schema: article_geo (breadth/confidence/rationale/geo_version) + article_places
  (0..N ISO-coded places), separate from article_scores so re-runs/audits never
  disturb scoring or acceptance. "local" is never stored — it's relative to the
  reader; the UI computes "Near you" later.
- geo.py: LLM proposes place NAMES, code disposes to ISO codes (country alpha-2,
  US state 2-letter); region words like "Europe" can never become a country.
  'global'/placeless is first-class, not failure. Confidence calibrated so 'high'
  needs an explicit location. Geo is its OWN LLM pass, not merged into the scoring
  prompt (durable metadata, re-runnable, keeps the sensitive prompt untouched).
- store_geo replaces places (geo is re-derivable, unlike scores). tag_articles is
  idempotent by geo_version, only touches accepted non-duplicate articles.
- CLI `geo` command (cycle-locked, --limit/--reclassify) for backfill, plus a
  bounded geo step in the cycle (--geo-limit 60, --no-geo). scripts/geo_audit.py
  is the prototype audit tool.

360 tests green; live smoke tagged real articles correctly (Gaza->PS, London->GB,
placeless science->global). No UI / SEO pages yet — ranking/personalization only.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 16:56:49 -04:00
thejayman77 89c0fbe1f6 Sync repo to deployed state: SEO recovery, Publishing Desk, Play games, emoji picker
The deploy pipeline runs from the working tree, so a wave of shipped features
had never been committed. This snapshots git to what's actually running.

SEO impression recovery (live + verified):
- Duplicate /a/{id} now 301-redirect to their canonical twin instead of 404
  (a hard 404 silently dropped already-indexed URLs and tanked impressions).
- Dedup representative selection reworked: accepted/serveable -> established
  rep (URL stability) -> quality score, so an accepted page never retires to a
  rejected rep and an indexed canonical doesn't churn when a newer twin arrives.
- HEAD /a/{id} returns the same status as GET (api_route GET+HEAD) instead of
  falling through to the static mount and 404ing.
- `dedup --force-recluster`: cycle-locked, model-free re-cluster to re-apply the
  policy to the existing corpus (shared cycle_lock context manager).
- CLI honors GOODNEWS_DB for its default --db (was silently ignored).

Publishing Desk (admin tool to post highlights to X via Web Intents):
- publishing.py queue/rank/handle-resolution; admin UI; full searchable emoji
  picker (bundled data, no CDN) for the blurb editor.

Play games + site:
- Bloom (word-wheel), Memory Match, daily ritual set, Zen Den (dev-gated).
- English-only language gate; source prospecting; paywall + dedup hardening.

Tests: full suite green (349). Ignores tightened (node_modules, data/*.db).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 11:32:27 -04:00
thejayman77 2dbe73430c Sources: per-source paywall override (3-state) — fix domain-rule mis-flags
The Articles inspector revealed paywall is domain-coarse: nytimes.com is flagged,
so NY Times Learning's free Word-of-the-Day inherits 🔒 — and that flag isn't
cosmetic, it deprioritizes the content in feed sort + lead selection. Add a
per-source override so admins can correct it after inspecting.

- sources.paywall_override: NULL (domain rule) | 'free' | 'paywalled'.
- paywall.py: keep low-level is_paywalled(url) (domain); add is_paywalled_for_source
  (url, override) for the EFFECTIVE decision — never patched the domain helper
  globally (per Codex), so "domain says X" stays distinguishable from "overridden".
- Threaded everywhere ranking/UI touches paywall, via src.paywall_override on the
  shared _ARTICLE_COLUMNS + the source-aware helper: feed sort, /api/since, replace,
  lead selection, Article badge, brief composition (briefs.py), digest, source_health
  (table 🔒), the Articles inspector, and the review/attention check — so ranking and
  UI always agree.
- Endpoint POST /api/admin/sources/{id}/paywall {override}; admin UI: a select in the
  inspector header (Use domain rule / Treat as free / Treat as paywalled) + the basis
  ("ON (domain)" / "OFF (override)"), optimistic so the panel stays open.

Test: domain rule → paywalled in table+inspector+feed badge; 'free' → off in all
three; validation 422 + 404. 242 pytest + 11 vitest.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-12 22:10:44 -04:00
thejayman77 dd0df64d76 Games: cross-device sync + overlap colour-blend
Two game polish items:

- Word Search: overlapping cells now multiply-blend the crossing words' colours
  (deepening to a darker shade with readable text) instead of the newest colour
  stomping the rest — matches the new interlocking grids.

- Cross-device game-state sync (signed-in): per-puzzle progress + stats now
  follow you between devices. New game_state table; server-side merge on every
  save so two devices converge regardless of push order, tailored per game:
  * Word Search → UNION of finds (monotonic; can't un-find), earliest start,
    best completion time.
  * Word → furthest-progress wins (terminal beats in-progress; more guesses
    beats fewer) — picks one device's game whole, never splices guesses.
  Stats (streak/distribution/best) derived server-side from the synced states,
  so they're consistent instead of per-device counters. Endpoints GET/PUT
  /api/games/state + GET /api/games/stats (signed-in; size-capped). Frontend is
  local-first: games paint instantly from localStorage, then reconcile in the
  background; both game components push debounced on each move and adopt the
  merge. Conflict handling unit-tested + an API two-device convergence test.

235→ tests + 11 vitest green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-12 13:35:20 -04:00
thejayman77 2461584052 Pool admin: delete any word (tombstones + restore) + bulk import
Daily Word pool curation, full add/delete/import — no redeploys to fix tone:
- Remove ANY pool word, curated or admin-added, via a word_pool_removed
  tombstone table. Runtime pool = (static ∪ added) − removed, so even a
  baked-in word can be pulled on negative feedback. Reversible: a "Removed"
  list with one-tap Restore lifts the tombstone. Lookup now surfaces a Remove
  button when in-pool, Restore when removed.
- Import a vetted list (paste or .txt/.csv upload, read client-side): validates
  each word (alpha · 5–6 · in guess dictionary), ignores duplicates, and reports
  rejects with reasons. Re-adding/importing a removed word lifts its tombstone.
- Word Search theme delete already existed (Edit/Remove per theme) — verified.

Pool stays the clean 251/224; today's noisy LLM enrichment is discarded.
Tests: +tests/test_pool_admin.py, extended test_word_pool_admin. 222 pytest +
11 vitest green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 17:17:16 -04:00
thejayman77 f71e760847 Admin: Word Search theme authoring + tidy word-pool chips
* New "Word Search themes" panel in the Games tab: enter a theme name + words,
  with live validation (4–8 letters, alpha, deduped) and a count vs the 28 needed
  to fill all three sizes. An " Suggest a word" button asks the LLM for one
  fresh word that fits the theme. Save/edit/remove; authored themes join the daily
  fallback rotation alongside the curated ones (wordsearch_themes table). The
  system still handles word distribution across sizes + placement.
* Daily Word pool's added-word chips now scroll within a bounded area so the
  console stays tidy as the list grows.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 13:36:07 -04:00
thejayman77 61f575ba6d Observability + warming guardrails (Codex)
* client_error details, not just a count: new client_errors table + POST
  /api/client-error (reason/path/user-agent/time) + GET /api/admin/client-errors.
  The boot-seatbelt beacon now sends the reason + path (once per page); the admin
  Overview lists the recent errors so we can tell chunk vs SW vs API vs JS — the
  truth meter for the next day as the new SW propagates.
* Deploy warming now also hits the shell, routes (/play /account /admin), SW,
  version.json, word lists, and icons/logo/font — not just immutable chunks.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 12:31:32 -04:00
thejayman77 903b27fc8d Admin: Daily Word pool curation (lookup + add/remove)
First games admin tool. A "Games" tab in the operator console for the Daily Word
answer pool.
* Lookup: is a word real (in the guess dictionary), the right length (5/6), and
  already in the pool — instant as you type.
* Add: appends to the pool, enforcing the invariant (alpha · 5/6 letters · in the
  guess dict) so the daily answer is always guessable. Remove: drops admin-added
  words (curated static ones stay).
* Additions persist in a new word_pool table (survives redeploys, unlike the
  baked-in JSON); the daily picker reads static pool ∪ DB additions. Guess dicts
  shipped with the package (goodnews/data/words-5/6.json) for server-side
  validation. Admin-gated endpoints + tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 11:42:52 -04:00
thejayman77 215a5c4d64 Play hub + Daily Word game (Phase 1 of the games feature)
A calm /play space — "after the brief, a small thing to enjoy." Framework-ready
for more games (Word Search next; zen/coloring later).

* Daily Word (5 letters / 6 guesses) + Long Word (6 / 7) — same Wordle mechanic,
  Upbeat Bytes flavor (no "Wordle" in the UI). Hopeful answers; after solving, a
  one-line "why this word matters."
* LLM proposes, code disposes: answers are picked deterministically by date-seed
  from a hand-curated hopeful pool that's pre-validated ⊆ the guess dictionary
  (always typeable), avoiding recent repeats; the LLM only adds the optional
  "why" (with fallback). daily_puzzles(date, game, variant, payload) stores them
  so everyone gets the same daily; the cycle pre-generates with the "why".
* Bundled guess dictionaries (words-5/6.json, ~12.6k/22.4k) for client-side guess
  validation — never the LLM. Answer lightly obfuscated (base64) in the payload.
* Private, gentle stats (played/solved/streak, guess distribution); spoiler-free
  emoji-grid share. No leaderboard, no timer, no streak-loss drama.
* Play in the bottom nav (replacing Browse, still on the lane rail) + the header.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 16:06:20 -04:00
thejayman77 337dc3f901 Article pages: structured "Why it belongs" editorial read
Per Codex — make /a/<id> feel like Upbeat Bytes has editorial judgment, not just
a summary wrapper. Trust-building, short, not an essay.

* article_summaries gains what_happened / why_matters / why_belongs (+ migration).
* summarize.explain_article: a separate, fallback-able LLM pass producing three
  short notes (parsed from a labelled WHAT/MATTERS/BELONGS format). generate_summary
  now stores them alongside the summary, and tops up older summaries on next view.
  get_explanation returns them only when all three are present.
* API: share_page + /api/summary expose the explanation.
* share.py: renders the three-part section (accent rule) when complete; otherwise
  the single "Why it's here" reason line is the calm fallback. The page polls and
  swaps in both the summary and the section as they cache.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 20:05:26 -04:00
thejayman77 d8e246b4ff Follow source/topic — account-backed personalization (v1)
Per Codex — turn accounts into a real reason to return, without an algorithmic
feed. Durable interests (sources + tags), not moods.

* DB: user_follows (user_id, kind source|tag, value, unique).
* queries.feed gains follow_sources/follow_tags → the Following feed is
  "articles from a followed source OR carrying a followed tag", still respecting
  calm filters/boundaries.
* API: GET/POST/DELETE /api/follows (sign-in required; source ids validated);
  /api/feed?following=true resolves the user's follows (anon → empty, not error).
* Frontend: follows store (followKeys + toggleFollow, mirrors savedIds); a
  Follow button on source + tag/topic views; a "Following" lane in the nav with
  a tailored empty state; a Following management section in Account (unfollow).

Digest "From what you follow" deferred to v2 (brief stays first).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 17:34:46 -04:00
thejayman77 cf5cbb33c0 Daily digest (opt-in) + finite "you're caught up" ending
Reader-retention as ritual, not capture (Codex's framing). Opt-in calm morning
email of today's brief; the on-site twin is the finite end-of-feed nudge.

* Schema: users.digest_enabled + digest_unsub_token; digest_sends (dedupe +
  visibility). auth.get_user now returns the digest fields.
* goodnews/digest.py: build (dated calm subject, items w/ summary + "why it's
  here" + UB/source links + one-click unsubscribe, "you're caught up" sign-off)
  and send_due_digests (morning-window gated, >=4-item floor or skip quietly,
  deduped, reuses SMTP). No streaks/urgency/"you missed".
* API: /auth/me exposes digest_enabled; POST /api/account/digest toggle;
  GET /api/digest/unsubscribe (token, no login, calm confirmation page).
* CLI: cycle gains a morning-gated digest step (--no-digest) + a send-digests
  command (--force).
* Frontend: digest toggle on the Account profile; the Highlights end-cap now
  says "you're caught up — see you tomorrow" with a one-tap "Get tomorrow's
  brief by email" (signed-in → enable; anon → sign in).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 16:17:46 -04:00
thejayman77 38abc26ddd Honor Retry-After on HTTP 429 (polite rest, not a failure)
Per Codex's spec — a publisher saying "slow down" shouldn't make a feed look
broken, but repeated 429s stay visible via last_success_at / stale-source.

* Schema: sources.retry_after_at (nullable) + migration.
* feeds.parse_retry_after: delta-seconds OR HTTP-date → UTC stamp; ignores
  invalid/negative/past; caps at now + MAX_BACKOFF_MINUTES.
* fetch_feed raises RateLimited (carrying the parsed time) on a 429.
* poll_source: on 429 set retry_after_at + last_error, status='rate_limited',
  and do NOT increment consecutive_failures; on success clear retry_after_at;
  non-429 failures unchanged.
* due_source_rows requires BOTH the streak backoff elapsed AND retry_after_at
  passed (i.e. the later of the two).
* Admin: source_health returns retry_after_at; status reads
  "rate-limited · rests until …" rather than "failed/resting".

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 10:47:40 -04:00
thejayman77 9ed817c051 Source Retire lifecycle (Phase 1: status + content_visible, active mirrored)
Per Codex's plan — introduce a lifecycle without a risky "change the source of
truth everywhere" moment.

* Schema: sources.status (active|paused|retired) + content_visible; migration
  backfills status from active (active=1→active, else paused), content_visible=1.
* `active` is kept as a SYNCED MIRROR: status active→active=1, paused/retired→0,
  so the scheduler/CLI/legacy code keep working unchanged.
* Retire stops polling but keeps articles visible (non-destructive). Hiding is a
  separate, reversible lever: content_visible=0 drops a source's articles from
  the public feed + brief (read AND build), behind a confirm. Personal saved/
  history are untouched.
* API: /sources/{id}/status (validates, mirrors active) + /visibility, replacing
  /active. source_health returns status + content_visible.
* Admin: status column (active/paused/retired + "hidden"), Retired filter,
  Pause/Resume · Retire/Restore · Hide/Show actions.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 09:58:15 -04:00
thejayman77 0f8d5b555a Feedback reply: light Markdown formatting (bold / bullets / heading)
Per Codex: a constrained Markdown-ish composer rather than contenteditable.

* goodnews/markup.render_reply_html — escapes everything first, then introduces
  only a tiny whitelist (**bold**, - bullets, #/##/### headings, paragraphs,
  line breaks). No links, attributes, inline styles, or raw HTML passthrough.
* feedback_replies.message_html column (+ live migration); replies store both
  the Markdown text and the rendered HTML.
* email_send.send_feedback_reply now sends multipart text/plain + text/html
  (the sanitized render, wrapped in a trusted email template).
* Frontend: textarea + a small toolbar (Bold / • List / H) that inserts
  Markdown; the reply thread renders the server-sanitized HTML.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 15:27:46 -04:00
thejayman77 84fd61bf3f In-site feedback reply (plain-text v1)
Reply to a reader from the admin inbox instead of a mailto. Per Codex: keep v1
plain text (no rich editor — defers the user's bold/bullets ask as a fast-follow).

* DB: feedback_replies table (feedback_id, user_id, message, sent_to, sent_at),
  created on the live DB.
* email_send.send_feedback_reply: plain-text "Re: Your Upbeat Bytes feedback"
  with a quoted context block, no analytics/account details.
* API: POST /api/admin/feedback/{id}/reply — admin-gated, requires the feedback
  exists (404) and has a contact_email (400), trims+caps the message; sends via
  SMTP and only records the reply on success (502 on send failure so the UI keeps
  the draft); marks the item read. Feedback list now includes each item's replies.
* Frontend: inline composer (Send/Cancel, sending state, error keeps draft) +
  reply thread under the message; Reply only shows when there's an address,
  else "No reply address".

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 14:23:56 -04:00
thejayman77 ecaca35977 Feedback inbox: read/unread + delete
Make the admin Feedback section a real inbox.

* DB: feedback.read_at column (schema + idempotent migration).
* API: feedback list returns read_at; POST /api/admin/feedback/{id}/read
  {read} toggles it; DELETE /api/admin/feedback/{id} removes a message
  (both admin-gated). admin_stats gains feedback_unread; the Attention strip
  and the tab badge now count UNREAD, not total.
* Frontend: unread messages are highlighted with an accent rail + dot; an
  Unread filter joins the category chips; each message has Mark read/unread
  and Delete (confirm), with optimistic updates that revert on failure.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 13:33:24 -04:00
thejayman77 427210ac3e User feedback + expanded privacy-respecting admin stats
Feedback:
- feedback table; POST /api/feedback (anonymous-ok, optional category/email,
  honeypot + per-day flood cap) stores + emails the admin; GET /api/admin/feedback.
- Shared feedback store + FeedbackModal; a speech-bubble opens it from the desktop
  header, the mobile top bar (logo moves left), the footer, and /account. Feedback
  section in /admin.

Stats (additive, same privacy model — no IP/UA/referrer/raw terms):
- Event vocab: summary_viewed (fired on /a load), full_story (card → source),
  not_today/less_like_this/hide_topic, replace_used/replace_none, paywall_replace,
  paywalled_source_open. Card title/image opens /a (no double-count); history
  records via keepalive so it survives the nav.
- Dashboard: Accounts card (counts only), reading funnel (summary→source rate),
  emotional-mix & friction, paywall, returning-visitor buckets. (Health metrics
  deferred to a future monitoring dashboard.) 131 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 12:58:49 +00:00
thejayman77 762f121320 Admin step B: stats endpoint + /admin dashboard
- users.is_admin (+ migration); admin = is_admin OR email in GOODNEWS_ADMIN_EMAILS
  (normalized). is_admin exposed on /api/auth/me. Server-authorized GET
  /api/admin/stats (403 for non-admins).
- queries.admin_stats: visitors (today/7d/30d), returning vs one-and-done, top
  opened articles, popular groupings + topics (derived from article_id at query
  time), share breakdown, daily opens/visits trend — all aggregate, no PII.
- /admin page (gated, redirects non-admins): stat cards, CSS bar lists, a daily
  trend; "Admin dashboard" link on /account for admins. 129 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 18:25:46 +00:00
thejayman77 1a778e1334 Admin step A: privacy-respecting first-party event logging
- events table (kind, article_id, visitor_hash, day) with a UNIQUE key that dedups
  to one row per visitor-day — caps volume and makes counts mean distinct
  visitor-days. NO ip/ua/referrer/url. Groupings derived from article_id at query
  time, never stored.
- POST /api/events (public): whitelisted kinds (visit/open/share_ub/copy_source/
  native_share/source_click); visitor token hashed server-side (never raw).
- Frontend analytics.js: random localStorage visitor token; track() via sendBeacon;
  visit once/day; open on article click; share_ub/copy_source/native_share from the
  share menu; /a landing pages fire source_click. 127 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 18:21:49 +00:00
thejayman77 1d71575982 Share pages: lazy, cached, our-own-words article summaries
The /a/<id> page now carries an original short summary so it stands on its own,
without republishing the publisher's article:
- summarize.py: transient SSRF-guarded fetch of the article text → local LLM
  writes a 2-4 sentence ORIGINAL summary (our words). Cached in article_summaries
  forever; we store only our summary, never the body. Generated lazily (only for
  shared/viewed articles), de-duped so concurrent hits don't double-generate.
- /a serves cached-or-pending; when pending it shows a calm "summary on its way,
  read at {source}" note and self-polls /api/summary/<id>, swapping the summary
  in the moment it's ready (never blocks the page on the batch-tier LLM).
- Share menu warms generation on open so recipients usually get the rich version.
- Container reaches the arbiter at arbiter:8080 over caddy_web (LLM env added to
  the API container). 124 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 18:08:40 +00:00
thejayman77 15728c3bcb User avatar (Google picture), avatar in mobile You tab, /account page
- Capture the Google profile picture (picture claim) into users.avatar_url; an
  Avatar component shows it, falling back to the initial. Used in the desktop
  header and the mobile "You" tab (which now shows the user when signed in).
- Move account/settings to its own route /account (robust + scrolls to top),
  reached by the desktop avatar and the mobile You tab; drop the inline "You"
  sheet. AccountPanel gains a Sign out action; the page links to Saved/History/
  Boundaries via home intent params (?view= / ?open=).
- db: users.avatar_url (schema + idempotent migration). 118 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 14:41:43 +00:00
thejayman77 6a514aa56b Accounts Phase 1 foundation: schema + WAL, auth core, email sender
Groundwork for self-hosted accounts (magic link + Google later), no third parties.

- db: account tables (users, identities, login_tokens, sessions, saved_articles,
  user_history, user_prefs); identities link multiple sign-in methods to one user
  by verified email. connect() now enables WAL + busy_timeout so the API can write
  account data alongside the host ingestion cycle.
- auth.py: users/identities (find-or-create + link), single-use magic-link tokens,
  opaque sessions — all secrets stored only as SHA-256 hashes.
- email_send.py: minimal STARTTLS SMTP sender + the magic-link email.

Secrets (SMTP, Google, session) live in the API container's env_file, not git.
API endpoints + sign-in UI come next. 105 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 01:02:24 +00:00
thejayman77 a47a1504c8 Phase B1: multi-tag groupings model (backend)
Three-layer organization: primary topic (one per article, for ranking and
brief balance) + grouping tags (1-4 per article from a controlled vocabulary,
the organic "wandering" axis) + tonal flavor.

- taxonomy: add technology + learning topics; 4 calm tag families
  (Discovery & Wonder, People & Kindness, Solutions & Progress, Mind & Craft)
  defined in code, not the DB; ALLOWED_TAGS union + coerce_tags validation.
- db: article_tags(article_id, tag) join table + tag index.
- llm: tags added to the classifier json_schema (enum-constrained, maxItems 4)
  and system prompt; normalize_scores coerces tags; upsert_article_score
  replaces a row's tags atomically on every (re)classification.
- queries: feed gains a tag filter and exposes tags via group_concat; tag_counts.
- api: Article.tags, feed tag param, and /api/families with per-tag counts.
- tests: coerce/normalize/upsert/tag-filter/reclassify-replace/tag_counts +
  /api/families. 99 passing.

Corpus reclassify (re-tag + new primary topics) runs separately against the
local LLM. Frontend (B2) pairs with this; the live site is unchanged until then.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 18:35:25 +00:00
thejayman77 9e8eddf46d Bounded hero-image enrichment (og:image for brief items only)
The grid stays typographic; the hero is the one intentional visual slot. At
brief-build time we fetch a hero-quality image for the daily five that lack one:
- enrich.py reads ONLY a page's <head> og:image/twitter:image and stores just
  the URL (never the body).
- SSRF-guarded: http(s) only, 6s timeout, 300KB cap, <=3 manual redirects each
  re-validated, and hosts rejected if any resolved address is private, loopback,
  link-local, multicast, reserved, or unspecified.
- image_checked_at column caches success AND failure, so an article is never
  retried forever.
- Wired into build-brief and cycle (brief items only, only if image missing and
  unchecked). Everything else stays metadata-only.
- Verified live: today's five all carry images (feed + enriched).

Tests: og:image parser, head-only scope, IP guard across internal ranges, and
enrich success + failure-caching (85 total).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 12:37:41 +00:00
thejayman77 1e190c5e88 Advisory source health: review flags, never auto-deactivate
- Add source health columns (last_success_at, last_error_at, last_error,
  consecutive_failures, review_flag, review_reason) via SCHEMA + migration.
- poll_source maintains them: success resets the failure streak and records the
  success time; failure increments it and stores the latest error.
- review_sources() flags active sources that are stale, repeatedly failing,
  low-acceptance, duplicate-heavy, or doom-skewed (high cortisol/ragebait) over
  a recent window. It is purely advisory: it sets review_flag/review_reason and
  never changes the active column (human stays in the loop), clearing the flag
  when a source recovers.
- CLI review-sources; cycle runs it as a final step (--no-review to skip);
  source-report shows a review line for flagged feeds.
- Tests: healthy/failing/stale/low-acceptance/recovery and never-deactivates.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 20:28:35 +00:00
thejayman77 aa4125ddec Supervised source candidates: stage, list, promote, reject
- New source_candidates staging table (status suggested/quarantined/rejected/
  promoted, preview_json snapshot) so untrusted/suggested feeds stay out of the
  real ingestion path until reviewed.
- sources.py: save_candidate (re-preview never revives a curator's rejection),
  list_candidates, reject_candidate, promote_candidate (copies into sources,
  inactive by default — active on approval; never automatic).
- CLI: suggest-source / list-candidates / promote-candidate / reject-candidate.
- API: read-only GET /api/candidates (writes stay CLI-only — no unauthenticated
  public write surface yet).
- Fix deprecated ElementTree truth-value test in _parse_rss.
- Tests: candidate lifecycle (save/list/promote/reject, status preservation,
  name derivation) — 51 total.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 19:52:40 +00:00
thejayman77 5d44072fca Add semantic cross-source dedup via local embeddings
- LocalModelClient.embed() calls the OpenAI-compatible /embeddings endpoint
  (local nomic model); base_url shared with chat, model via GOODNEWS_EMBED_MODEL.
- New article_embeddings table and articles.duplicate_of column (+ migration).
- dedup module: embeds missing articles, clusters near-identical stories within
  a date window by cosine similarity (pure-stdlib, vectors normalised once), and
  marks all but the highest-ranked member of each cluster as a duplicate.
- 'dedup' CLI command; cycle now runs poll -> classify -> dedup -> brief.
- Feed and brief queries hide duplicates, so a story carried by multiple
  outlets shows once.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 15:40:55 +00:00
thejayman77 38057d0354 Add topic/flavor categorization and category browsing
- New taxonomy module: single source of truth for 6 topics x 5 flavors,
  shared by the LLM response schema (enum-constrained) and validation.
- Classifier now assigns one topic + one flavor per article; json_schema
  enums force valid values, with coercion as a safety net.
- article_scores gains topic/flavor columns via an idempotent migration.
- New 'list-category' command to browse by topic and/or flavor, ranked by
  composite score.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 11:21:53 +00:00
thejayman77 068073423f Initial commit: goodNews constructive-news ingestion prototype
Local-first RSS/Atom ingestion pipeline with metadata-only storage,
heuristic + local-LLM scoring, and daily brief builder.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 00:48:26 +00:00