The deploy pipeline runs from the working tree, so a wave of shipped features
had never been committed. This snapshots git to what's actually running.
SEO impression recovery (live + verified):
- Duplicate /a/{id} pages now 301-redirect to their canonical twin instead of
returning 404 (a hard 404 silently dropped already-indexed URLs and tanked
impressions).
- Dedup representative selection reworked. Candidates are now ranked by
accepted/serveable first, then established rep (URL stability), then quality
score, so an accepted page never retires to a rejected rep and an indexed
canonical doesn't churn when a newer twin arrives.
- HEAD /a/{id} returns the same status as GET (api_route GET+HEAD) instead of
falling through to the static mount and 404ing.
- `dedup --force-recluster`: cycle-locked, model-free re-cluster to re-apply the
policy to the existing corpus (shared cycle_lock context manager).
- CLI honors GOODNEWS_DB for its default --db (was silently ignored).
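The representative-selection order above can be read as a lexicographic sort key. A minimal sketch, assuming a hypothetical `Article` record whose field names (`accepted`, `is_established_rep`, `quality`) are illustrative and not taken from the codebase:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Article:
    # Hypothetical fields, for illustration only.
    id: int
    accepted: bool            # accepted/serveable
    is_established_rep: bool  # already the cluster's canonical URL
    quality: float            # quality score, higher is better


def pick_representative(cluster: list[Article]) -> Article:
    """Return the cluster member that should serve as the canonical page."""
    # Tuples compare left to right: accepted first, then established, then quality.
    return max(cluster, key=lambda a: (a.accepted, a.is_established_rep, a.quality))


cluster = [
    Article(1, accepted=True, is_established_rep=True, quality=0.6),
    Article(2, accepted=True, is_established_rep=False, quality=0.9),  # newer twin
    Article(3, accepted=False, is_established_rep=False, quality=1.0),
]
rep = pick_representative(cluster)
```

Here the established rep (id 1) keeps its place even though the newer twin scores higher, and the rejected article can never win while an accepted one exists.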
Publishing Desk (admin tool to post highlights to X via Web Intents):
- publishing.py queue/rank/handle-resolution; admin UI; full searchable emoji
picker (bundled data, no CDN) for the blurb editor.
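A Web Intent is a plain link that opens the compose window pre-filled, so nothing is posted until the admin confirms it. A rough sketch of building such a link, assuming the classic `twitter.com/intent/tweet` endpoint with `text` and `url` parameters; `build_intent_url` is a hypothetical helper, not a function from publishing.py:

```python
from urllib.parse import urlencode

# Assumption: the classic tweet Web Intent endpoint; the project may use a different host.
INTENT_BASE = "https://twitter.com/intent/tweet"


def build_intent_url(blurb: str, article_url: str) -> str:
    """Return a link that opens the compose window pre-filled with the blurb."""
    # urlencode percent-encodes emoji and punctuation, so the blurb survives intact.
    return f"{INTENT_BASE}?{urlencode({'text': blurb, 'url': article_url})}"


link = build_intent_url("Reef recovery beats forecasts", "https://example.org/a/42")
```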
Play games + site:
- Bloom (word-wheel), Memory Match, daily ritual set, Zen Den (dev-gated).
- English-only language gate; source prospecting; paywall + dedup hardening.
Tests: full suite green (349). Ignores tightened (node_modules, data/*.db).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Per Codex's spec — a publisher saying "slow down" shouldn't make a feed look
broken, but repeated 429s stay visible via last_success_at / stale-source.
* Schema: sources.retry_after_at (nullable) + migration.
* feeds.parse_retry_after: delta-seconds OR HTTP-date → UTC stamp; ignores
invalid/negative/past; caps at now + MAX_BACKOFF_MINUTES.
* fetch_feed raises RateLimited (carrying the parsed time) on a 429.
* poll_source: on 429 set retry_after_at + last_error, status='rate_limited',
and do NOT increment consecutive_failures; on success clear retry_after_at;
non-429 failures unchanged.
* due_source_rows requires BOTH the streak backoff elapsed AND retry_after_at
passed (i.e. the later of the two).
* Admin: source_health returns retry_after_at; status reads
"rate-limited · rests until …" rather than "failed/resting".
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
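The Retry-After parsing described in the rate-limit change above (delta-seconds or HTTP-date, ignoring unusable values, capped) could look roughly like this. It is a sketch, not the project's feeds.parse_retry_after: the signature and the value of `MAX_BACKOFF_MINUTES` are assumptions, and `now` is expected to be timezone-aware UTC.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Optional

MAX_BACKOFF_MINUTES = 360  # assumed value; the real cap is not stated above


def parse_retry_after(value: Optional[str], now: datetime) -> Optional[datetime]:
    """Turn a Retry-After header into a UTC datetime, or None if unusable."""
    if not value:
        return None
    value = value.strip()
    cap = timedelta(minutes=MAX_BACKOFF_MINUTES)
    if value.isascii() and value.isdigit():
        # delta-seconds form, e.g. "120"; clamp early so huge numbers cannot overflow
        seconds = min(int(value), int(cap.total_seconds()))
        when = now + timedelta(seconds=seconds)
    else:
        try:
            # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
            when = parsedate_to_datetime(value)
        except (TypeError, ValueError):
            return None  # negative numbers and garbage end up here
        if when.tzinfo is None:
            when = when.replace(tzinfo=timezone.utc)
        when = when.astimezone(timezone.utc)
    if when <= now:
        return None  # zero or already past: nothing to wait for
    return min(when, now + cap)
```

Passing `now` in explicitly keeps the function deterministic in tests.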
Bring the supervised source-candidate flow into Sources (Codex's v1 scope), so
adding feeds no longer needs the CLI.
* feeds.safe_fetch_feed: SSRF-safe fetch for UNTRUSTED (admin-pasted) URLs —
http(s) only, every redirect hop re-validated via enrich._host_is_public,
body size-capped, bounded redirects, no cookies. preview_feed gains a
`fetcher` param; the API path passes safe_fetch_feed (NOT the raw fetch_feed
used for already-vetted polling).
* API (admin-gated): GET /candidates; POST /candidates (suggest+preview, gated
before the outbound fetch, no DB conn held during network); /{id}/preview
(explicit re-preview); /{id}/promote (paused by default, returns the new
source + updated candidate); /{id}/reject. A rejection is recorded on the
candidate row only.
* Admin Sources tab: "Add a source" field + a candidate queue showing the
preview (pass rate, recent count, example headlines) with Promote (as paused,
or Activate immediately) / Re-preview / Reject.
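The per-hop validation above can be sketched as a gate applied to the first URL and again to each redirect target. This is an illustration only: `host_is_public` and `url_is_fetchable` are hypothetical stand-ins, and the real enrich._host_is_public may apply different rules.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def host_is_public(host: str) -> bool:
    """True only if every address the host resolves to is globally routable."""
    try:
        infos = socket.getaddrinfo(host, None)
    except (socket.gaierror, UnicodeError):
        return False
    # Strip any IPv6 zone id ("fe80::1%eth0") before parsing the address.
    addrs = {info[4][0].split("%")[0] for info in infos}
    return bool(addrs) and all(ipaddress.ip_address(a).is_global for a in addrs)


def url_is_fetchable(url: str) -> bool:
    """Gate for the initial URL and for every redirect hop."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    return host_is_public(parts.hostname)
```

A check like this still leaves a gap between resolving a name and connecting to it, so a hardened fetcher would also connect to the address it validated.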
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Make "no blurry images" sustainable, not a one-off cleanup. RSS feed thumbnails
(~44% were ~90px) were stored at ingest and upscaled to mush, so new articles
would reintroduce them. Now image_url is filled ONLY by the quality-gated
og:image enrichment:
* insert_article no longer stores the feed image (was canonicalize_url(item...)).
* enrich_recent_images(): the cycle fetches a quality og:image for the newest
accepted, imageless articles each run (bounded), keeping Latest photo-rich.
* Brief + on-open enrichment unchanged.
Net: every stored image is a validated, ≥450px og:image; the rest are clean
placeholders.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Pre-traffic cleanup from an audit:
* Scheduler: poll_due_sources now keys on the last *attempt* (success or
failure), not the last success, and scales the wait by the consecutive-
failure streak (capped at a day). A failing feed (e.g. Phys.org's HTTP 429s)
used to be retried every cycle because it had no successful run; it now backs
off and recovers on its own. Extracted due_source_rows() + tests.
* FK hygiene: deleting a daily_brief is supposed to cascade to its items, but
SQLite enforces foreign keys only on connections that enable the pragma.
connect() already sets it, so the cascade is correct going forward; added a
regression test. (Orphaned items + Phys.org settings were cleaned directly on
the live DB.)
* a11y: modal/drawer dialogs are now focusable (tabindex), close on Escape
(window) and on backdrop click via a target check (dropping the inner
stopPropagation handlers). Build is warning-free.
* tests: conftest points any un-mocked LLM client at a closed port with a 1s
timeout, so an accidental real call fails fast instead of hanging the suite.
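The foreign-key point above is easy to reproduce in isolation. A small sketch, assuming a hypothetical `brief_items` child table (the real schema is not shown here) and a `connect()` helper that mirrors the described behaviour:

```python
import sqlite3


def connect(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    # The setting is per-connection and is not stored in the database file,
    # so every new connection has to enable it again.
    conn.execute("PRAGMA foreign_keys = ON")
    return conn


conn = connect()
conn.executescript(
    """
    CREATE TABLE daily_brief (id INTEGER PRIMARY KEY);
    CREATE TABLE brief_items (
        id INTEGER PRIMARY KEY,
        brief_id INTEGER NOT NULL REFERENCES daily_brief(id) ON DELETE CASCADE
    );
    INSERT INTO daily_brief (id) VALUES (1);
    INSERT INTO brief_items (brief_id) VALUES (1), (1);
    """
)
conn.execute("DELETE FROM daily_brief WHERE id = 1")
# With the pragma on, deleting the brief removes its items as well.
remaining = conn.execute("SELECT COUNT(*) FROM brief_items").fetchone()[0]
```

A regression test along these lines would assert that `remaining` is zero after the delete.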
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Per the calm north star (images support reading, never become a stimulation
layer; metadata-only stays the posture):
- Image-less cards are now designed, not missing: secondary cards are text-first
(no empty media band), and an image-less hero becomes a fully typographic lead
with a faint topic wordmark behind it (CSS attr(data-topic)). No big empty
image space is ever reserved.
- Opportunistic extraction: parse the first <img src> from a feed's
content/description HTML when present, canonicalized — never fetching the
article page. Applies to new ingests (existing rows keep their current image).
- Held by deliberate choice: og:image page enrichment, stock/AI imagery, and any
image-coverage requirement for sources.
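The opportunistic extraction above needs only the feed's own HTML fragment. A minimal sketch using the standard-library parser; `first_image_src` is a hypothetical name, and the canonicalization step mentioned above is left out:

```python
from html.parser import HTMLParser
from typing import Optional


class _FirstImg(HTMLParser):
    def __init__(self) -> None:
        super().__init__()
        self.src: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Keep only the first <img> that actually carries a src.
        if tag == "img" and self.src is None:
            self.src = dict(attrs).get("src") or None


def first_image_src(html: Optional[str]) -> Optional[str]:
    """Return the first <img src> in a feed's HTML fragment, or None."""
    if not html:
        return None
    parser = _FirstImg()
    parser.feed(html)
    return parser.src
```

No request is made at any point, so the article page is never fetched.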
Tests: feed HTML image extraction (72 total).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Add source health columns (last_success_at, last_error_at, last_error,
consecutive_failures, review_flag, review_reason) via SCHEMA + migration.
- poll_source maintains them: success resets the failure streak and records the
success time; failure increments it and stores the latest error.
- review_sources() flags active sources that are stale, repeatedly failing,
low-acceptance, duplicate-heavy, or doom-skewed (high cortisol/ragebait) over
a recent window. It is purely advisory: it sets review_flag/review_reason and
never changes the active column (human stays in the loop), clearing the flag
when a source recovers.
- CLI review-sources; cycle runs it as a final step (--no-review to skip);
source-report shows a review line for flagged feeds.
- Tests: healthy/failing/stale/low-acceptance/recovery and never-deactivates.
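The advisory nature of the review can be made concrete: the update it produces contains only the review columns. A sketch under stated assumptions, where `SourceStats`, the thresholds, and both function names are hypothetical, and only three of the listed checks are shown:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SourceStats:
    # Hypothetical per-source numbers over the recent window.
    active: bool
    hours_since_success: Optional[float]
    consecutive_failures: int
    acceptance_rate: float


def review_reason(s: SourceStats) -> Optional[str]:
    """Return a human-readable reason to flag the source, or None if healthy."""
    reasons = []
    if s.hours_since_success is None or s.hours_since_success > 72:  # assumed threshold
        reasons.append("stale")
    if s.consecutive_failures >= 5:  # assumed threshold
        reasons.append("repeatedly failing")
    if s.acceptance_rate < 0.2:  # assumed threshold
        reasons.append("low acceptance")
    return ", ".join(reasons) or None


def apply_review(s: SourceStats) -> dict:
    """Build the column update: only the review fields, never `active`."""
    reason = review_reason(s)
    return {"review_flag": int(reason is not None), "review_reason": reason}
```

Because a healthy source yields a cleared flag, recovery needs no separate code path.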
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- New source_candidates staging table (status suggested/quarantined/rejected/
promoted, preview_json snapshot) so untrusted/suggested feeds stay out of the
real ingestion path until reviewed.
- sources.py: save_candidate (re-preview never revives a curator's rejection),
list_candidates, reject_candidate, promote_candidate (copies into sources,
inactive by default; activation requires explicit approval and is never
automatic).
- CLI: suggest-source / list-candidates / promote-candidate / reject-candidate.
- API: read-only GET /api/candidates (writes stay CLI-only — no unauthenticated
public write surface yet).
- Fix deprecated ElementTree truth-value test in _parse_rss.
- Tests: candidate lifecycle (save/list/promote/reject, status preservation,
name derivation) — 51 total.
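The status-preservation rule above (a re-preview refreshes the snapshot but never revives a rejection) might be sketched as follows. The table layout is an assumption reduced to three columns, and this `save_candidate` is an illustrative stand-in rather than the function in sources.py:

```python
import json
import sqlite3


def save_candidate(conn: sqlite3.Connection, url: str, preview: dict) -> str:
    """Insert or refresh a candidate; return its status after the save."""
    row = conn.execute(
        "SELECT status FROM source_candidates WHERE url = ?", (url,)
    ).fetchone()
    snapshot = json.dumps(preview)
    if row is None:
        conn.execute(
            "INSERT INTO source_candidates (url, status, preview_json) "
            "VALUES (?, 'suggested', ?)",
            (url, snapshot),
        )
        return "suggested"
    # Re-preview: refresh the snapshot but leave the curator's status alone.
    conn.execute(
        "UPDATE source_candidates SET preview_json = ? WHERE url = ?", (snapshot, url)
    )
    return row[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE source_candidates (url TEXT PRIMARY KEY, status TEXT, preview_json TEXT)"
)
save_candidate(conn, "https://example.org/feed", {"pass_rate": 0.4})
conn.execute("UPDATE source_candidates SET status = 'rejected'")  # curator rejects it
status = save_candidate(conn, "https://example.org/feed", {"pass_rate": 0.9})
```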
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- feeds.preview_feed(): fetch + score a sample WITHOUT persisting; returns
freshness, acceptance rate, cortisol/ragebait/PR averages, and example
accepted/rejected items. With an LLM client it also returns topic/flavor mix
and the model's (accurate) acceptance view.
- CLI 'preview-source URL [--sample] [--classify]'.
- API 'GET /api/source-preview?url=&sample=&classify=' with an http(s)-only
guard (SSRF note left for go-public hardening).
- Site 'Suggest a source' panel with Quick check (heuristic, instant) and Deep
check (model, accurate), rendered DOM-safely.
- Tests: network-free preview_feed tests via monkeypatched fetch (45 total).
- README documents the command, endpoint, and updated roadmap.
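The idea of scoring a sample without persisting it, and of testing that without a network, can be sketched with an injected fetcher. Everything here is hypothetical: `preview_sample`, its parameters, and the stub data do not mirror the real preview_feed signature.

```python
from typing import Callable, Iterable


def preview_sample(
    url: str,
    fetch: Callable[[str], Iterable[dict]],
    accept: Callable[[dict], bool],
    sample: int = 20,
) -> dict:
    """Score up to `sample` items from a feed and return a summary; stores nothing."""
    items = list(fetch(url))[:sample]
    accepted = [it for it in items if accept(it)]
    rejected = [it for it in items if not accept(it)]
    rate = len(accepted) / len(items) if items else 0.0
    return {
        "sampled": len(items),
        "acceptance_rate": rate,
        "example_accepted": [it["title"] for it in accepted[:3]],
        "example_rejected": [it["title"] for it in rejected[:3]],
    }


# A stub fetcher stands in for the network, as a monkeypatched fetch would in a test.
def fake_fetch(url: str) -> list:
    return [{"title": "Library reopens"}, {"title": "OUTRAGE erupts"}, {"title": "Park restored"}]


report = preview_sample(
    "https://example.org/feed", fake_fetch, lambda it: "OUTRAGE" not in it["title"]
)
```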
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- poll_due_sources(): polls only sources whose last successful poll is older
than their poll_interval_minutes (or never polled), finally giving that
config field meaning.
- classify gains only_unclassified to spend the LLM solely on new (heuristic)
articles, so a frequent scheduled run stays cheap.
- 'cycle' command runs poll-due -> classify-new -> rebuild-today's-brief, with
each step non-fatal so a down model endpoint or empty day never aborts it.
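Non-fatal steps can be expressed as a loop that records each failure and moves on. A minimal sketch, where `run_cycle` and the step callables are hypothetical stand-ins for the real command:

```python
import logging
from typing import Callable, Dict, List, Tuple

log = logging.getLogger("cycle")


def run_cycle(steps: List[Tuple[str, Callable[[], object]]]) -> Dict[str, str]:
    """Run every step in order; record failures instead of aborting."""
    outcome: Dict[str, str] = {}
    for name, step in steps:
        try:
            step()
            outcome[name] = "ok"
        except Exception as exc:  # deliberately broad: one bad step must not stop the rest
            log.warning("step %s failed: %s", name, exc)
            outcome[name] = f"failed: {exc}"
    return outcome


def classify_new() -> None:
    raise ConnectionError("model endpoint down")  # simulated outage


result = run_cycle([
    ("poll-due", lambda: None),
    ("classify-new", classify_new),
    ("rebuild-brief", lambda: None),
])
```

With the model endpoint down, the brief is still rebuilt from whatever is already classified.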
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>