Fetcher (the two remaining bugs Codex found):
- Real redirects are now followed. _NoRedirect makes urllib RAISE HTTPError on 3xx, so
the old status-branch was dead code (mocked tests masked it). Handle 301/302/303/307/308
HTTPError as redirects (re-validate the destination); classify 4xx≠429 as PERMANENT
(negative-cached), 429/5xx/network as transient. Real-opener redirect + 404/5xx tests.
- The megapixel ceiling is now enforced: explicit `w*h > _MAX_PIXELS` check BEFORE load()
(Pillow only warns at MAX_IMAGE_PIXELS). Test with a lowered ceiling.
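The status handling and the pixel ceiling above can be sketched roughly as follows. This is an illustration only: `classify_status`, `Outcome`, and `exceeds_pixel_ceiling` are hypothetical names, not the fetcher's actual API, and the real code works on urllib's HTTPError rather than bare integers.

```python
from enum import Enum

# Status codes treated as redirects (taken from the commit text).
REDIRECT_CODES = {301, 302, 303, 307, 308}


class Outcome(Enum):
    REDIRECT = "redirect"    # follow, after re-validating the destination
    PERMANENT = "permanent"  # negative-cache, do not retry
    TRANSIENT = "transient"  # retry later
    OTHER = "other"          # anything not covered by the rules above


def classify_status(code: int) -> Outcome:
    """Map an HTTP status code to the fetcher's retry decision."""
    if code in REDIRECT_CODES:
        return Outcome.REDIRECT
    if code == 429 or 500 <= code <= 599:
        return Outcome.TRANSIENT
    if 400 <= code <= 499:
        return Outcome.PERMANENT
    return Outcome.OTHER


def exceeds_pixel_ceiling(width: int, height: int, max_pixels: int) -> bool:
    """Explicit w*h check, meant to run on the header size before decoding."""
    return width * height > max_pixels
```

The pixel check is kept as a pure function so a test can pass a lowered ceiling without building a huge image.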
Image-rights policy (per Codex + owner decision — only cache what's cleared):
- sources.image_policy: 'cache' (re-host a downscaled copy — license/permission/PD only),
'remote' (hotlink the publisher's image — the conservative DEFAULT), 'none' (no image).
- newsimg.display_url resolves the display URL per policy; applied in Article.from_row so
feed/brief/history return the right URL, and in share.py (og/twitter still reference the
publisher's own image, never re-hosted). warm() + /api/img both gated on 'cache'.
- Frontend uses the server-resolved image_url (reverted the hardcoded /api/img); the
graceful retry covers remote hotlinks too. Admin: per-source image-policy selector +
POST /api/admin/sources/{id}/image-policy. Default 'remote' → nothing re-hosted until
a source is explicitly cleared.
445 backend + 36 frontend tests pass.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The load-error log had no way to clear reviewed entries. Add a read_at column to
client_errors and a read/unread model mirroring the feedback inbox:
- GET /api/admin/client-errors?show=unread|read|all (default unread; returns id+read)
- POST /api/admin/client-errors/read-all (mark all unread read)
- POST /api/admin/client-errors/{id}/read {read: bool} (per-row toggle)
Headline stat is now "Unread load errors" (admin_stats.client_errors.unread), so the
red badge clears as you triage. Admin UI: Unread/Read/All tabs, a "Mark all read"
button, and a per-row ✓/↩ toggle; reading an entry drops it from the default view.
14-day auto-prune still bounds the table. Tests cover filter, toggle, mark-all,
404, and gating.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- daily_art gains blurb + palette columns (idempotent migration).
- art._palette: Pillow median-cut to ~5 hex colors from the cached image (best-
effort → [] on any failure). art._blurb: a warm 2-3 sentence "what you're
looking at" note grounded in the Met catalogue (title/artist/bio/date/medium/
classification/culture/tags). Prompt leans on context/significance and the
title+tags for subject — explicitly NOT asserting literal composition (figure
counts/poses) it can't see, since the model can't view the image. Markdown
stripped from the output.
- pick_daily generates both (client optional → blurb skipped when absent); cycle
+ art CLI pass an LLM client. /api/art/today exposes blurb + palette.
- Backfilled the last 3 days on host (Veteran / Magnolia Vase / Bierstadt).
- scripts/art_blurb_palette_backfill.py for in-place backfill (no re-pick).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replaces the gist-based read-time with the SOURCE article's full read time — the
contrast that sells the gist ("calm 1-min version here; ~10 min for the deep dive").
- goodnews/readtime.py: word_count_from_html (strips script/style/nav/header/
footer/form/button/aside furniture before counting) + source_read_minutes
(~225 wpm, 200-word floor, None when extraction appears to have failed or the text is too thin).
- articles.source_words + read_checked_at columns (count only, never the body;
fits the privacy posture). Idempotent migration.
- enrich.fetch_source_words + enrich_read_times: a bounded, retry-guarded cycle
step (mirrors the image enrichers) that counts words for recent accepted
articles. Only ever writes a real count; never overwrites good with zero. Wired
into the cycle after recent-image enrichment.
- queries: source_words flows through _ARTICLE_COLUMNS; api exposes
source_read_minutes on Article (null when unknown).
- home3: News card shows "Full story · ~N min", hidden entirely when null (no
misleading "1 min").
- Tests: furniture stripping, threshold/rounding, enrich idempotency + no
zero-overwrite, API null handling. 412 backend.
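A minimal sketch of the two read-time helpers named above, assuming a stdlib HTMLParser approach and round-to-nearest minutes. The function names, the 225 wpm rate, and the 200-word floor come from this commit; the parser class and the exact rounding rule are assumptions.

```python
from html.parser import HTMLParser
from typing import Optional

# Page "furniture" whose text should not count toward the article length.
FURNITURE = {"script", "style", "nav", "header", "footer", "form", "button", "aside"}
WORDS_PER_MINUTE = 225
MIN_WORDS = 200  # below this, extraction is treated as failed or too thin


class _WordCounter(HTMLParser):
    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0  # > 0 while inside any furniture element
        self.words = 0

    def handle_starttag(self, tag, attrs):
        if tag in FURNITURE:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in FURNITURE and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.words += len(data.split())


def word_count_from_html(html_text: str) -> int:
    """Count whitespace-separated words outside furniture elements."""
    counter = _WordCounter()
    counter.feed(html_text)
    counter.close()
    return counter.words


def source_read_minutes(words: Optional[int]) -> Optional[int]:
    """Return whole minutes at ~225 wpm, or None when the count is unusable."""
    if words is None or words < MIN_WORDS:
        return None
    return max(1, round(words / WORDS_PER_MINUTE))
```

Returning None rather than a small number is what lets the card hide the line entirely instead of showing a misleading value.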
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Content quality ("LLM polishes, dictionary anchors"):
- New wotd._polish: rewrites the real dictionary gloss into ONE warm plain
sentence + two clear everyday example sentences, grounded in the real
definition (no invented meanings). Stored in new wotd_pool/daily_wotd columns
gloss + usage, alongside the raw definition/examples which stay the anchor.
- harvest() polishes each new word; pick_daily() lazily polishes + caches back
any older pooled word that lacks a gloss (client threaded through run_daily).
- Admin word-add polishes on insert; re-pick passes an LLM client so quote
meaning / word gloss fill on a forced fresh pick.
- /api/word/today now prefers gloss + usage, falling back to the raw dictionary
def/examples when polish is absent (so it's always safe).
- db._migrate adds gloss/usage to wotd_pool + daily_wotd (idempotent ALTER).
Frontend — /word redesigned to CD's "Editorial Asymmetric": faded oversized
initial bleeding off the right, vertical part-of-speech rail, big Newsreader
word, airy definition, left-ruled italic example sentences, outline Listen
button + date. (Uses our self-hosted Newsreader/Hanken stack rather than the
mockup's Google fonts; the made-up syllable respelling is omitted since we only
have real IPA.)
Tests: _polish parse/trim/cap, harvest stores gloss/usage, pick lazy-polishes
older words, admin gloss flows through to /api/word/today. 403 backend + 27 fe.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Hardening before it runs further on the cycle:
- DB-lock/network: all HTTP (metadata + image) happens before any write; the write txn
opens only at the brief INSERT and commits immediately. Images download to a temp file
then atomic os.replace into cache (a reader never sees a half-written file).
- Site-timezone "daily" already used local_today() (same rhythm as the Brief) — confirmed.
- Attribution from day one: store + return title/artist/date/medium/department/credit/
source_url/object_id/source + museum name + is_public_domain license marker + the full-
res source URL (for a richer /art view later). UI can show: Title · Artist · The Met.
- "highlight != always beautiful": added a manual `blocked` flag on art_pool (excluded
from picks) as the cheap curation lever; a featured override can follow.
Schema migrated (existing art tables get the new columns). 373 tests green.
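The temp-file-then-replace step can be sketched like this. `write_atomically` is a hypothetical helper name; the only parts taken from the commit are the temp file and the atomic os.replace into the cache directory.

```python
import os
import tempfile


def write_atomically(dest_path: str, data: bytes) -> None:
    """Write data to dest_path so readers see the old file or the new one, never a partial one."""
    directory = os.path.dirname(os.path.abspath(dest_path))
    # The temp file lives in the destination directory so os.replace stays
    # on one filesystem, which is what makes the rename atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, dest_path)
    except BaseException:
        # Clean up the partial temp file; the destination is untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```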
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The engine for the /art room (design-independent; deploy held for Codex review).
- goodnews/art.py: harvest a curated pool of public-domain HIGHLIGHT artworks from the
Met (isHighlight+isPublicDomain+hasImages -> masterworks, never potsherds; CC0). Daily
deterministic pick from the least-recently-shown (no near-term repeats, same for everyone),
fetch metadata + download the image to OUR cache (data/art_cache) so the homepage never
waits on or hotlinks the museum. Resilient: a bad object or image falls through to the next candidate;
a failed day keeps the last piece (room never empty). Injectable HTTP for tests.
- Schema: art_pool + daily_art. /api/art/today (edge-cacheable) + /api/art/image/{id}
(served from cache, immutable). CLI `art [--harvest] [--force]` + a non-fatal cycle step.
- Tests (5, mocked HTTP) + verified live against the Met: harvested 1641 works,
picked/cached "Repose" by John White Alexander. 371 tests green.
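One way to read the "least-recently-shown, fall through on failure" pick is sketched below. The function shape, the tuple layout, and the `is_usable` callback are assumptions for illustration; the real pick_daily also fetches metadata and caches the image.

```python
from typing import Callable, Iterable, Optional, Tuple

# (object_id, last_shown) where last_shown is an ISO date string or None (never shown).
PoolRow = Tuple[int, Optional[str]]


def pick_daily(pool: Iterable[PoolRow], is_usable: Callable[[int], bool]) -> Optional[int]:
    """Return the first usable object, ordered by least-recently-shown.

    Never-shown rows sort first (empty string), ties break on object_id,
    so every caller gets the same answer for the same pool.
    """
    ordered = sorted(pool, key=lambda row: (row[1] or "", row[0]))
    for object_id, _last_shown in ordered:
        if is_usable(object_id):
            return object_id
    # Nothing usable today: the caller keeps yesterday's piece.
    return None
```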
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
"Closer to Home" foundation (audit greenlit by Codex). Durable geography, kept
decoupled from volatile scoring.
- Schema: article_geo (breadth/confidence/rationale/geo_version) + article_places
(0..N ISO-coded places), separate from article_scores so re-runs/audits never
disturb scoring or acceptance. "local" is never stored — it's relative to the
reader; the UI computes "Near you" later.
- geo.py: LLM proposes place NAMES, code disposes to ISO codes (country alpha-2,
US state 2-letter); region words like "Europe" can never become a country.
'global'/placeless is first-class, not failure. Confidence calibrated so 'high'
needs an explicit location. Geo is its OWN LLM pass, not merged into the scoring
prompt (durable metadata, re-runnable, keeps the sensitive prompt untouched).
- store_geo replaces places (geo is re-derivable, unlike scores). tag_articles is
idempotent by geo_version, only touches accepted non-duplicate articles.
- CLI `geo` command (cycle-locked, --limit/--reclassify) for backfill, plus a
bounded geo step in the cycle (--geo-limit 60, --no-geo). scripts/geo_audit.py
is the prototype audit tool.
360 tests green; live smoke tagged real articles correctly (Gaza->PS, London->GB,
placeless science->global). No UI / SEO pages yet — ranking/personalization only.
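The "LLM proposes names, code disposes to ISO codes" split might look like the sketch below. The lookup tables here are tiny illustrative stand-ins and `resolve_place` / `resolve_places` are hypothetical names; the real geo.py tables and function names are not shown in this log.

```python
from typing import List, Optional, Tuple

# Illustrative subsets only; a real table would cover all ISO 3166-1 alpha-2 codes.
COUNTRY_CODES = {"france": "FR", "united kingdom": "GB", "japan": "JP", "kenya": "KE"}
US_STATES = {"california": "CA", "texas": "TX"}
# Region words the model may propose but that must never resolve to a country.
REGION_WORDS = {"europe", "asia", "africa", "middle east", "latin america"}

Place = Tuple[str, str]  # (kind, code)


def resolve_place(name: str) -> Optional[Place]:
    """Map a proposed place NAME to a coded place, or None if it is not codable."""
    key = " ".join(name.lower().split())
    if key in REGION_WORDS:
        return None
    if key in COUNTRY_CODES:
        return ("country", COUNTRY_CODES[key])
    if key in US_STATES:
        return ("us_state", US_STATES[key])
    return None


def resolve_places(names: List[str]) -> List[Place]:
    """Resolve and de-duplicate, keeping first-seen order. An empty result is valid (placeless)."""
    resolved: List[Place] = []
    for name in names:
        place = resolve_place(name)
        if place is not None and place not in resolved:
            resolved.append(place)
    return resolved
```

Because the model only supplies names, an unexpected string can at worst resolve to nothing; it cannot introduce an invalid code.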
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The deploy pipeline runs from the working tree, so a wave of shipped features
had never been committed. This snapshots git to what's actually running.
SEO impression recovery (live + verified):
- Duplicate /a/{id} pages now 301-redirect to their canonical twin instead of returning 404
(a hard 404 silently dropped already-indexed URLs and tanked impressions).
- Dedup representative selection reworked: accepted/serveable -> established
rep (URL stability) -> quality score, so an accepted page never retires to a
rejected rep and an indexed canonical doesn't churn when a newer twin arrives.
- HEAD /a/{id} returns the same status as GET (api_route GET+HEAD) instead of
falling through to the static mount and 404ing.
- `dedup --force-recluster`: cycle-locked, model-free re-cluster to re-apply the
policy to the existing corpus (shared cycle_lock context manager).
- CLI honors GOODNEWS_DB for its default --db (was silently ignored).
Publishing Desk (admin tool to post highlights to X via Web Intents):
- publishing.py queue/rank/handle-resolution; admin UI; full searchable emoji
picker (bundled data, no CDN) for the blurb editor.
Play games + site:
- Bloom (word-wheel), Memory Match, daily ritual set, Zen Den (dev-gated).
- English-only language gate; source prospecting; paywall + dedup hardening.
Tests: full suite green (349). Ignores tightened (node_modules, data/*.db).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The Articles inspector revealed paywall is domain-coarse: nytimes.com is flagged,
so NY Times Learning's free Word-of-the-Day inherits 🔒 — and that flag isn't
cosmetic, it deprioritizes the content in feed sort + lead selection. Add a
per-source override so admins can correct it after inspecting.
- sources.paywall_override: NULL (domain rule) | 'free' | 'paywalled'.
- paywall.py: keep low-level is_paywalled(url) (domain); add is_paywalled_for_source
(url, override) for the EFFECTIVE decision — never patched the domain helper
globally (per Codex), so "domain says X" stays distinguishable from "overridden".
- Threaded everywhere ranking/UI touches paywall, via src.paywall_override on the
shared _ARTICLE_COLUMNS + the source-aware helper: feed sort, /api/since, replace,
lead selection, Article badge, brief composition (briefs.py), digest, source_health
(table 🔒), the Articles inspector, and the review/attention check — so ranking and
UI always agree.
- Endpoint POST /api/admin/sources/{id}/paywall {override}; admin UI: a select in the
inspector header (Use domain rule / Treat as free / Treat as paywalled) + the basis
("ON (domain)" / "OFF (override)"), optimistic so the panel stays open.
Test: domain rule → paywalled in table+inspector+feed badge; 'free' → off in all
three; validation 422 + 404. 242 pytest + 11 vitest.
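A minimal sketch of the two-level paywall decision. The function names and override values come from this commit; the domain set and the suffix-matching rule are assumptions.

```python
from typing import Optional
from urllib.parse import urlsplit

# Illustrative domain list; the real rule set is not shown in this log.
PAYWALLED_DOMAINS = {"nytimes.com"}


def is_paywalled(url: str) -> bool:
    """Low-level domain rule: what the domain alone says."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in PAYWALLED_DOMAINS)


def is_paywalled_for_source(url: str, override: Optional[str]) -> bool:
    """Effective decision: a per-source override wins, otherwise the domain rule applies."""
    if override == "free":
        return False
    if override == "paywalled":
        return True
    return is_paywalled(url)  # override is NULL/None: use the domain rule
```

Keeping `is_paywalled` unchanged is what lets the admin UI show the basis of the decision ("domain" versus "override").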
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Two game polish items:
- Word Search: overlapping cells now multiply-blend the crossing words' colours
(deepening to a darker shade with readable text) instead of the newest colour
stomping the rest — matches the new interlocking grids.
- Cross-device game-state sync (signed-in): per-puzzle progress + stats now
follow you between devices. New game_state table; server-side merge on every
save so two devices converge regardless of push order, tailored per game:
* Word Search → UNION of finds (monotonic; can't un-find), earliest start,
best completion time.
* Word → furthest-progress wins (terminal beats in-progress; more guesses
beats fewer) — picks one device's game whole, never splices guesses.
Stats (streak/distribution/best) derived server-side from the synced states,
so they're consistent instead of per-device counters. Endpoints GET/PUT
/api/games/state + GET /api/games/stats (signed-in; size-capped). Frontend is
local-first: games paint instantly from localStorage, then reconcile in the
background; both game components push debounced on each move and adopt the
merge. Conflict handling unit-tested + an API two-device convergence test.
235 tests + 11 vitest green.
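The per-game merge rules could be sketched as below. The state shapes (dict keys such as `found`, `started_at`, `completed_secs`, `finished`, `guesses`) are assumptions; only the rules themselves come from the commit. The point of the sketch is that both merges give the same result regardless of argument order, which is what lets two devices converge.

```python
from typing import Any, Dict


def merge_word_search(a: Dict[str, Any], b: Dict[str, Any]) -> Dict[str, Any]:
    """Union of finds, earliest start, best (lowest) completion time."""
    times = [t for t in (a.get("completed_secs"), b.get("completed_secs")) if t is not None]
    return {
        "found": set(a["found"]) | set(b["found"]),
        "started_at": min(a["started_at"], b["started_at"]),
        "completed_secs": min(times) if times else None,
    }


def _progress(state: Dict[str, Any]):
    # Terminal beats in-progress, then more guesses beats fewer.
    # The guesses themselves are a final tie-break so the choice is order-independent.
    return (1 if state["finished"] else 0, len(state["guesses"]), tuple(state["guesses"]))


def merge_word(a: Dict[str, Any], b: Dict[str, Any]) -> Dict[str, Any]:
    """Pick one device's game whole; never splice guesses from both."""
    return max(a, b, key=_progress)
```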
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Daily Word pool curation, full add/delete/import — no redeploys to fix tone:
- Remove ANY pool word, curated or admin-added, via a word_pool_removed
tombstone table. Runtime pool = (static ∪ added) − removed, so even a
baked-in word can be pulled on negative feedback. Reversible: a "Removed"
list with one-tap Restore lifts the tombstone. Lookup now surfaces a Remove
button when in-pool, Restore when removed.
- Import a vetted list (paste or .txt/.csv upload, read client-side): validates
each word (alpha · 5–6 · in guess dictionary), ignores duplicates, and reports
rejects with reasons. Re-adding/importing a removed word lifts its tombstone.
- Word Search theme delete already existed (Edit/Remove per theme) — verified.
Pool stays the clean 251/224; today's noisy LLM enrichment is discarded.
Tests: +tests/test_pool_admin.py, extended test_word_pool_admin. 222 pytest +
11 vitest green.
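The runtime pool formula and the import validation can be sketched as plain set operations. `runtime_pool` and `validate_word` are hypothetical names, and the reject reasons are illustrative wording.

```python
from typing import Iterable, Optional, Set


def runtime_pool(static: Iterable[str], added: Iterable[str], removed: Iterable[str]) -> Set[str]:
    """Runtime pool = (static ∪ added) − removed; a tombstone can pull even a baked-in word."""
    return (set(static) | set(added)) - set(removed)


def validate_word(word: str, guess_dictionary: Set[str]) -> Optional[str]:
    """Return None when the word is acceptable, else a short reject reason."""
    w = word.strip().lower()
    if not w.isalpha():
        return "not alphabetic"
    if len(w) not in (5, 6):
        return "must be 5 or 6 letters"
    if w not in guess_dictionary:
        return "not in guess dictionary"
    return None
```

Restoring a word is just deleting its tombstone, so the static list never needs to change.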
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* New "Word Search themes" panel in the Games tab: enter a theme name + words,
with live validation (4–8 letters, alpha, deduped) and a count vs the 28 needed
to fill all three sizes. An "✨ Suggest a word" button asks the LLM for one
fresh word that fits the theme. Save/edit/remove; authored themes join the daily
fallback rotation alongside the curated ones (wordsearch_themes table). The
system still handles word distribution across sizes + placement.
* Daily Word pool's added-word chips now scroll within a bounded area so the
console stays tidy as the list grows.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* client_error details, not just a count: new client_errors table + POST
/api/client-error (reason/path/user-agent/time) + GET /api/admin/client-errors.
The boot-seatbelt beacon now sends the reason + path (once per page); the admin
Overview lists the recent errors so we can tell chunk vs SW vs API vs JS — the
truth meter for the next day as the new SW propagates.
* Deploy warming now also hits the shell, routes (/play /account /admin), SW,
version.json, word lists, and icons/logo/font — not just immutable chunks.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
First games admin tool. A "Games" tab in the operator console for the Daily Word
answer pool.
* Lookup: is a word real (in the guess dictionary), the right length (5/6), and
already in the pool — instant as you type.
* Add: appends to the pool, enforcing the invariant (alpha · 5/6 letters · in the
guess dict) so the daily answer is always guessable. Remove: drops admin-added
words (curated static ones stay).
* Additions persist in a new word_pool table (survives redeploys, unlike the
baked-in JSON); the daily picker reads static pool ∪ DB additions. Guess dicts
shipped with the package (goodnews/data/words-5/6.json) for server-side
validation. Admin-gated endpoints + tests.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A calm /play space — "after the brief, a small thing to enjoy." Framework-ready
for more games (Word Search next; zen/coloring later).
* Daily Word (5 letters / 6 guesses) + Long Word (6 / 7) — same Wordle mechanic,
Upbeat Bytes flavor (no "Wordle" in the UI). Hopeful answers; after solving, a
one-line "why this word matters."
* LLM proposes, code disposes: answers are picked deterministically by date-seed
from a hand-curated hopeful pool that's pre-validated ⊆ the guess dictionary
(always typeable), avoiding recent repeats; the LLM only adds the optional
"why" (with fallback). daily_puzzles(date, game, variant, payload) stores them
so everyone gets the same daily; the cycle pre-generates with the "why".
* Bundled guess dictionaries (words-5/6.json, ~12.6k/22.4k) for client-side guess
validation — never the LLM. Answer lightly obfuscated (base64) in the payload.
* Private, gentle stats (played/solved/streak, guess distribution); spoiler-free
emoji-grid share. No leaderboard, no timer, no streak-loss drama.
* Play in the bottom nav (replacing Browse, still on the lane rail) + the header.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Per Codex — make /a/<id> feel like Upbeat Bytes has editorial judgment, not just
a summary wrapper. Trust-building, short, not an essay.
* article_summaries gains what_happened / why_matters / why_belongs (+ migration).
* summarize.explain_article: a separate, fallback-able LLM pass producing three
short notes (parsed from a labelled WHAT/MATTERS/BELONGS format). generate_summary
now stores them alongside the summary, and tops up older summaries on next view.
get_explanation returns them only when all three are present.
* API: share_page + /api/summary expose the explanation.
* share.py: renders the three-part section (accent rule) when complete; otherwise
the single "Why it's here" reason line is the calm fallback. The page polls and
swaps in both the summary and the section as they cache.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Per Codex — turn accounts into a real reason to return, without an algorithmic
feed. Durable interests (sources + tags), not moods.
* DB: user_follows (user_id, kind source|tag, value, unique).
* queries.feed gains follow_sources/follow_tags → the Following feed is
"articles from a followed source OR carrying a followed tag", still respecting
calm filters/boundaries.
* API: GET/POST/DELETE /api/follows (sign-in required; source ids validated);
/api/feed?following=true resolves the user's follows (anon → empty, not error).
* Frontend: follows store (followKeys + toggleFollow, mirrors savedIds); a
Follow button on source + tag/topic views; a "Following" lane in the nav with
a tailored empty state; a Following management section in Account (unfollow).
Digest "From what you follow" deferred to v2 (brief stays first).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Per Codex's spec — a publisher saying "slow down" shouldn't make a feed look
broken, but repeated 429s stay visible via last_success_at / stale-source.
* Schema: sources.retry_after_at (nullable) + migration.
* feeds.parse_retry_after: delta-seconds OR HTTP-date → UTC stamp; ignores
invalid/negative/past; caps at now + MAX_BACKOFF_MINUTES.
* fetch_feed raises RateLimited (carrying the parsed time) on a 429.
* poll_source: on 429 set retry_after_at + last_error, status='rate_limited',
and do NOT increment consecutive_failures; on success clear retry_after_at;
non-429 failures unchanged.
* due_source_rows requires BOTH the streak backoff elapsed AND retry_after_at
passed (i.e. the later of the two).
* Admin: source_health returns retry_after_at; status reads
"rate-limited · rests until …" rather than "failed/resting".
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
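A sketch of the Retry-After parsing described in the rate-limit commit above, assuming a 60-minute MAX_BACKOFF_MINUTES (the real value is not stated) and an injectable `now` for testing. The `next_due` helper is a hypothetical name for the "later of the two" rule.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Optional

MAX_BACKOFF_MINUTES = 60  # assumed value for illustration


def parse_retry_after(value: Optional[str], now: Optional[datetime] = None) -> Optional[datetime]:
    """Parse delta-seconds or an HTTP-date into a capped UTC timestamp, or None."""
    if now is None:
        now = datetime.now(timezone.utc)
    cap = timedelta(minutes=MAX_BACKOFF_MINUTES)
    text = (value or "").strip()
    if text.isascii() and text.isdigit():
        # Delta-seconds form; clamp before building the timedelta to avoid overflow.
        stamp = now + timedelta(seconds=min(int(text), int(cap.total_seconds())))
    else:
        try:
            stamp = parsedate_to_datetime(text)
        except (TypeError, ValueError):
            return None  # invalid or negative input
        if stamp.tzinfo is None:
            stamp = stamp.replace(tzinfo=timezone.utc)
        stamp = stamp.astimezone(timezone.utc)
    if stamp <= now:
        return None  # already past: nothing to wait for
    return min(stamp, now + cap)


def next_due(streak_backoff_until: Optional[datetime], retry_after_at: Optional[datetime]) -> Optional[datetime]:
    """A source is due only after BOTH have passed, i.e. at the later of the two."""
    stamps = [s for s in (streak_backoff_until, retry_after_at) if s is not None]
    return max(stamps) if stamps else None
```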
Per Codex's plan — introduce a lifecycle without a risky "change the source of
truth everywhere" moment.
* Schema: sources.status (active|paused|retired) + content_visible; migration
backfills status from active (active=1→active, else paused), content_visible=1.
* `active` is kept as a SYNCED MIRROR: status active→active=1, paused/retired→0,
so the scheduler/CLI/legacy code keep working unchanged.
* Retire stops polling but keeps articles visible (non-destructive). Hiding is a
separate, reversible lever: content_visible=0 drops a source's articles from
the public feed + brief (read AND build), behind a confirm. Personal saved/
history are untouched.
* API: /sources/{id}/status (validates, mirrors active) + /visibility, replacing
/active. source_health returns status + content_visible.
* Admin: status column (active/paused/retired + "hidden"), Retired filter,
Pause/Resume · Retire/Restore · Hide/Show actions.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Per Codex: a constrained Markdown-ish composer rather than contenteditable.
* goodnews/markup.render_reply_html — escapes everything first, then introduces
only a tiny whitelist (**bold**, - bullets, #/##/### headings, paragraphs,
line breaks). No links, attributes, inline styles, or raw HTML passthrough.
* feedback_replies.message_html column (+ live migration); replies store both
the Markdown text and the rendered HTML.
* email_send.send_feedback_reply now sends multipart text/plain + text/html
(the sanitized render, wrapped in a trusted email template).
* Frontend: textarea + a small toolbar (Bold / • List / H) that inserts
Markdown; the reply thread renders the server-sanitized HTML.
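The escape-first approach can be sketched as follows. The function name matches the commit, but the regexes and block handling are assumptions and the real renderer may differ in details such as nesting.

```python
import html
import re

_BOLD = re.compile(r"\*\*(.+?)\*\*")
_HEADING = re.compile(r"^(#{1,3}) (.+)$")


def _inline(text: str) -> str:
    # The only inline construct allowed: **bold**.
    return _BOLD.sub(r"<strong>\1</strong>", text)


def render_reply_html(text: str) -> str:
    """Escape everything first, then introduce a tiny whitelist of tags."""
    escaped = html.escape(text, quote=True)  # no raw HTML survives this step
    out = []
    for block in re.split(r"\n\s*\n", escaped.strip()):
        lines = [ln.strip() for ln in block.split("\n") if ln.strip()]
        if not lines:
            continue
        if all(ln.startswith("- ") for ln in lines):
            items = "".join(f"<li>{_inline(ln[2:])}</li>" for ln in lines)
            out.append(f"<ul>{items}</ul>")
            continue
        heading = _HEADING.match(lines[0]) if len(lines) == 1 else None
        if heading:
            level = len(heading.group(1))
            out.append(f"<h{level}>{_inline(heading.group(2))}</h{level}>")
            continue
        # Plain paragraph; single newlines become line breaks.
        out.append("<p>" + "<br>".join(_inline(ln) for ln in lines) + "</p>")
    return "\n".join(out)
```

Because escaping happens before any tag is added, the only markup in the output is what this function itself emits.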
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Reply to a reader from the admin inbox instead of a mailto. Per Codex: keep v1
plain text (no rich editor — defers the user's bold/bullets ask as a fast-follow).
* DB: feedback_replies table (feedback_id, user_id, message, sent_to, sent_at),
created on the live DB.
* email_send.send_feedback_reply: plain-text "Re: Your Upbeat Bytes feedback"
with a quoted context block, no analytics/account details.
* API: POST /api/admin/feedback/{id}/reply — admin-gated, requires the feedback
exists (404) and has a contact_email (400), trims+caps the message; sends via
SMTP and only records the reply on success (502 on send failure so the UI keeps
the draft); marks the item read. Feedback list now includes each item's replies.
* Frontend: inline composer (Send/Cancel, sending state, error keeps draft) +
reply thread under the message; Reply only shows when there's an address,
else "No reply address".
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Make the admin Feedback section a real inbox.
* DB: feedback.read_at column (schema + idempotent migration).
* API: feedback list returns read_at; POST /api/admin/feedback/{id}/read
{read} toggles it; DELETE /api/admin/feedback/{id} removes a message
(both admin-gated). admin_stats gains feedback_unread; the Attention strip
and the tab badge now count UNREAD, not total.
* Frontend: unread messages are highlighted with an accent rail + dot; an
Unread filter joins the category chips; each message has Mark read/unread
and Delete (confirm), with optimistic updates that revert on failure.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Feedback:
- feedback table; POST /api/feedback (anonymous-ok, optional category/email,
honeypot + per-day flood cap) stores + emails the admin; GET /api/admin/feedback.
- Shared feedback store + FeedbackModal; a speech-bubble opens it from the desktop
header, the mobile top bar (logo moves left), the footer, and /account. Feedback
section in /admin.
Stats (additive, same privacy model — no IP/UA/referrer/raw terms):
- Event vocab: summary_viewed (fired on /a load), full_story (card → source),
not_today/less_like_this/hide_topic, replace_used/replace_none, paywall_replace,
paywalled_source_open. Card title/image opens /a (no double-count); history
records via keepalive so it survives the nav.
- Dashboard: Accounts card (counts only), reading funnel (summary→source rate),
emotional-mix & friction, paywall, returning-visitor buckets. (Health metrics
deferred to a future monitoring dashboard.) 131 tests pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- users.is_admin (+ migration); admin = is_admin OR email in GOODNEWS_ADMIN_EMAILS
(normalized). is_admin exposed on /api/auth/me. Server-authorized GET
/api/admin/stats (403 for non-admins).
- queries.admin_stats: visitors (today/7d/30d), returning vs one-and-done, top
opened articles, popular groupings + topics (derived from article_id at query
time), share breakdown, daily opens/visits trend — all aggregate, no PII.
- /admin page (gated, redirects non-admins): stat cards, CSS bar lists, a daily
trend; "Admin dashboard" link on /account for admins. 129 tests pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- events table (kind, article_id, visitor_hash, day) with a UNIQUE key that dedups
to one row per visitor-day — caps volume and makes counts mean distinct
visitor-days. NO ip/ua/referrer/url. Groupings derived from article_id at query
time, never stored.
- POST /api/events (public): whitelisted kinds (visit/open/share_ub/copy_source/
native_share/source_click); visitor token hashed server-side (never raw).
- Frontend analytics.js: random localStorage visitor token; track() via sendBeacon;
visit once/day; open on article click; share_ub/copy_source/native_share from the
share menu; /a landing pages fire source_click. 127 tests pass.
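A minimal sketch of the visitor-day dedup using SQLite's UNIQUE constraint with INSERT OR IGNORE. The column names and event kinds come from this commit; the server-side secret, the `0` placeholder for events without an article, and the helper names are assumptions.

```python
import hashlib
import sqlite3

# article_id defaults to 0 rather than NULL because SQLite treats NULLs as
# distinct in UNIQUE constraints, which would defeat the dedup (assumption).
SCHEMA = """
CREATE TABLE IF NOT EXISTS events (
    kind TEXT NOT NULL,
    article_id INTEGER NOT NULL DEFAULT 0,
    visitor_hash TEXT NOT NULL,
    day TEXT NOT NULL,
    UNIQUE (kind, article_id, visitor_hash, day)
)
"""
ALLOWED_KINDS = {"visit", "open", "share_ub", "copy_source", "native_share", "source_click"}


def hash_visitor(token: str, secret: str) -> str:
    """Hash the client token server-side so the raw token is never stored."""
    return hashlib.sha256((secret + ":" + token).encode("utf-8")).hexdigest()


def record_event(conn, kind, article_id, token, day, secret) -> bool:
    """Insert one row per (kind, article, visitor, day); return True if it was new."""
    if kind not in ALLOWED_KINDS:
        return False
    cur = conn.execute(
        "INSERT OR IGNORE INTO events (kind, article_id, visitor_hash, day) VALUES (?, ?, ?, ?)",
        (kind, article_id or 0, hash_visitor(token, secret), day),
    )
    conn.commit()
    return cur.rowcount == 1
```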
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The /a/<id> page now carries an original short summary so it stands on its own,
without republishing the publisher's article:
- summarize.py: transient SSRF-guarded fetch of the article text → local LLM
writes a 2-4 sentence ORIGINAL summary (our words). Cached in article_summaries
forever; we store only our summary, never the body. Generated lazily (only for
shared/viewed articles), de-duped so concurrent hits don't double-generate.
- /a serves cached-or-pending; when pending it shows a calm "summary on its way,
read at {source}" note and self-polls /api/summary/<id>, swapping the summary
in the moment it's ready (never blocks the page on the batch-tier LLM).
- Share menu warms generation on open so recipients usually get the rich version.
- Container reaches the arbiter at arbiter:8080 over caddy_web (LLM env added to
the API container). 124 tests pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Capture the Google profile picture (picture claim) into users.avatar_url; an
Avatar component shows it, falling back to the initial. Used in the desktop
header and the mobile "You" tab (which now shows the user when signed in).
- Move account/settings to its own route /account (robust + scrolls to top),
reached by the desktop avatar and the mobile You tab; drop the inline "You"
sheet. AccountPanel gains a Sign out action; the page links to Saved/History/
Boundaries via home intent params (?view= / ?open=).
- db: users.avatar_url (schema + idempotent migration). 118 tests pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Groundwork for self-hosted accounts (magic link + Google later), no third parties.
- db: account tables (users, identities, login_tokens, sessions, saved_articles,
user_history, user_prefs); identities link multiple sign-in methods to one user
by verified email. connect() now enables WAL + busy_timeout so the API can write
account data alongside the host ingestion cycle.
- auth.py: users/identities (find-or-create + link), single-use magic-link tokens,
opaque sessions — all secrets stored only as SHA-256 hashes.
- email_send.py: minimal STARTTLS SMTP sender + the magic-link email.
Secrets (SMTP, Google, session) live in the API container's env_file, not git.
API endpoints + sign-in UI come next. 105 tests pass.
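Storing only SHA-256 hashes of secrets could look like the sketch below. `new_token` and `token_matches` are hypothetical names; auth.py's real functions are not shown in this log.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def new_token() -> Tuple[str, str]:
    """Return (raw_token, stored_hash); only the hash is persisted."""
    raw = secrets.token_urlsafe(32)
    return raw, hashlib.sha256(raw.encode("utf-8")).hexdigest()


def token_matches(presented: str, stored_hash: str) -> bool:
    """Hash the presented token and compare in constant time."""
    candidate = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_hash)
```

The raw token goes into the magic link or session cookie; a database leak then exposes only hashes, which cannot be replayed as tokens.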
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Three-layer organization: primary topic (one per article, for ranking and
brief balance) + grouping tags (1-4 per article from a controlled vocabulary,
the organic "wandering" axis) + tonal flavor.
- taxonomy: add technology + learning topics; 4 calm tag families
(Discovery & Wonder, People & Kindness, Solutions & Progress, Mind & Craft)
defined in code, not the DB; ALLOWED_TAGS union + coerce_tags validation.
- db: article_tags(article_id, tag) join table + tag index.
- llm: tags added to the classifier json_schema (enum-constrained, maxItems 4)
and system prompt; normalize_scores coerces tags; upsert_article_score
replaces a row's tags atomically on every (re)classification.
- queries: feed gains a tag filter and exposes tags via group_concat; tag_counts.
- api: Article.tags, feed tag param, and /api/families with per-tag counts.
- tests: coerce/normalize/upsert/tag-filter/reclassify-replace/tag_counts +
/api/families. 99 passing.
Corpus reclassify (re-tag + new primary topics) runs separately against the
local LLM. Frontend (B2) pairs with this; the live site is unchanged until then.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The grid stays typographic; the hero is the one intentional visual slot. At
brief-build time we fetch a hero-quality image for the daily five that lack one:
- enrich.py reads ONLY a page's <head> og:image/twitter:image and stores just
the URL (never the body).
- SSRF-guarded: http(s) only, 6s timeout, 300KB cap, <=3 manual redirects each
re-validated, and hosts rejected if any resolved address is private, loopback,
link-local, multicast, reserved, or unspecified.
- image_checked_at column caches success AND failure, so an article is never
retried forever.
- Wired into build-brief and cycle (brief items only, only if image missing and
unchecked). Everything else stays metadata-only.
- Verified live: today's five all carry images (feed + enriched).
Tests: og:image parser, head-only scope, IP guard across internal ranges, and
enrich success + failure-caching (85 total).
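The address check in the SSRF guard maps directly onto the stdlib ipaddress module. A sketch, with hypothetical function names, that takes already-resolved addresses so it can be tested without DNS:

```python
import ipaddress
from typing import Iterable


def is_blocked_address(address: str) -> bool:
    """True for private, loopback, link-local, multicast, reserved, or unspecified addresses."""
    ip = ipaddress.ip_address(address)
    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def host_is_safe(resolved_addresses: Iterable[str]) -> bool:
    """Reject the host if ANY resolved address is blocked, or if nothing resolved."""
    addresses = list(resolved_addresses)
    return bool(addresses) and not any(is_blocked_address(a) for a in addresses)
```

Checking every resolved address, not just the first, closes the gap where a hostname resolves to both a public and an internal address.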
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Add source health columns (last_success_at, last_error_at, last_error,
consecutive_failures, review_flag, review_reason) via SCHEMA + migration.
- poll_source maintains them: success resets the failure streak and records the
success time; failure increments it and stores the latest error.
- review_sources() flags active sources that are stale, repeatedly failing,
low-acceptance, duplicate-heavy, or doom-skewed (high cortisol/ragebait) over
a recent window. It is purely advisory: it sets review_flag/review_reason and
never changes the active column (human stays in the loop), clearing the flag
when a source recovers.
- CLI review-sources; cycle runs it as a final step (--no-review to skip);
source-report shows a review line for flagged feeds.
- Tests: healthy/failing/stale/low-acceptance/recovery and never-deactivates.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- New source_candidates staging table (status suggested/quarantined/rejected/
promoted, preview_json snapshot) so untrusted/suggested feeds stay out of the
real ingestion path until reviewed.
- sources.py: save_candidate (re-preview never revives a curator's rejection),
list_candidates, reject_candidate, promote_candidate (copies into sources,
inactive by default — active on approval; never automatic).
- CLI: suggest-source / list-candidates / promote-candidate / reject-candidate.
- API: read-only GET /api/candidates (writes stay CLI-only — no unauthenticated
public write surface yet).
- Fix deprecated ElementTree truth-value test in _parse_rss.
- Tests: candidate lifecycle (save/list/promote/reject, status preservation,
name derivation) — 51 total.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- LocalModelClient.embed() calls the OpenAI-compatible /embeddings endpoint
(local nomic model); base_url shared with chat, model via GOODNEWS_EMBED_MODEL.
- New article_embeddings table and articles.duplicate_of column (+ migration).
- dedup module: embeds missing articles, clusters near-identical stories within
a date window by cosine similarity (pure-stdlib, vectors normalised once), and
marks all but the highest-ranked member of each cluster as a duplicate.
- 'dedup' CLI command; cycle now runs poll -> classify -> dedup -> brief.
- Feed and brief queries hide duplicates, so a story carried by multiple
outlets shows once.
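A pure-stdlib sketch of the clustering step, assuming a greedy pass in rank order and a 0.95 similarity threshold (both assumptions). The date window is left out for brevity, and `mark_duplicates` is a hypothetical name.

```python
import math
from typing import Dict, List, Sequence, Tuple

# (article_id, rank_score, embedding)
Article = Tuple[int, float, Sequence[float]]


def normalise(vec: Sequence[float]) -> List[float]:
    """Scale to unit length once, so cosine similarity becomes a plain dot product."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def mark_duplicates(articles: Sequence[Article], threshold: float = 0.95) -> Dict[int, int]:
    """Return {duplicate_id: representative_id}.

    Articles are visited highest-ranked first, so each cluster's representative
    is its highest-ranked member and every later near-match is marked as a duplicate.
    """
    ranked = sorted(articles, key=lambda a: a[1], reverse=True)
    representatives: List[Tuple[int, List[float]]] = []
    duplicate_of: Dict[int, int] = {}
    for article_id, _score, vec in ranked:
        unit = normalise(vec)
        for rep_id, rep_unit in representatives:
            if sum(x * y for x, y in zip(unit, rep_unit)) >= threshold:
                duplicate_of[article_id] = rep_id
                break
        else:
            representatives.append((article_id, unit))
    return duplicate_of
```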
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- New taxonomy module: single source of truth for 6 topics x 5 flavors,
shared by the LLM response schema (enum-constrained) and validation.
- Classifier now assigns one topic + one flavor per article; json_schema
enums force valid values, with coercion as a safety net.
- article_scores gains topic/flavor columns via an idempotent migration.
- New 'list-category' command to browse by topic and/or flavor, ranked by
composite score.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>