The deploy pipeline runs from the working tree, so a wave of shipped features
had never been committed. This snapshots git to what's actually running.
SEO impression recovery (live + verified):
- Duplicate /a/{id} now 301-redirect to their canonical twin instead of 404
(a hard 404 silently dropped already-indexed URLs and tanked impressions).
- Dedup representative selection reworked: accepted/serveable -> established
rep (URL stability) -> quality score, so an accepted page never retires to a
rejected rep and an indexed canonical doesn't churn when a newer twin arrives.
- HEAD /a/{id} returns the same status as GET (api_route GET+HEAD) instead of
falling through to the static mount and 404ing.
- `dedup --force-recluster`: cycle-locked, model-free re-cluster to re-apply the
policy to the existing corpus (shared cycle_lock context manager).
- CLI honors GOODNEWS_DB for its default --db (was silently ignored).
Publishing Desk (admin tool to post highlights to X via Web Intents):
- publishing.py queue/rank/handle-resolution; admin UI; full searchable emoji
picker (bundled data, no CDN) for the blurb editor.
Play games + site:
- Bloom (word-wheel), Memory Match, daily ritual set, Zen Den (dev-gated).
- English-only language gate; source prospecting; paywall + dedup hardening.
Tests: full suite green (349). Ignores tightened (node_modules, data/*.db).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The Articles inspector revealed paywall is domain-coarse: nytimes.com is flagged,
so NY Times Learning's free Word-of-the-Day inherits 🔒 — and that flag isn't
cosmetic, it deprioritizes the content in feed sort + lead selection. Add a
per-source override so admins can correct it after inspecting.
- sources.paywall_override: NULL (domain rule) | 'free' | 'paywalled'.
- paywall.py: keep low-level is_paywalled(url) (domain); add is_paywalled_for_source
(url, override) for the EFFECTIVE decision — never patched the domain helper
globally (per Codex), so "domain says X" stays distinguishable from "overridden".
- Threaded everywhere ranking/UI touches paywall, via src.paywall_override on the
shared _ARTICLE_COLUMNS + the source-aware helper: feed sort, /api/since, replace,
lead selection, Article badge, brief composition (briefs.py), digest, source_health
(table 🔒), the Articles inspector, and the review/attention check — so ranking and
UI always agree.
- Endpoint POST /api/admin/sources/{id}/paywall {override}; admin UI: a select in the
inspector header (Use domain rule / Treat as free / Treat as paywalled) + the basis
("ON (domain)" / "OFF (override)"), optimistic so the panel stays open.
Test: domain rule → paywalled in table+inspector+feed badge; 'free' → off in all
three; validation 422 + 404. 242 pytest + 11 vitest.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A staged candidate could only be renamed by rejecting and re-adding it, which
churns the queue and discards the preview just to fix a typo. Add an inline
Rename on each candidate: a "Rename" pill swaps the name for an input
(Enter saves · Esc cancels), POST /api/admin/candidates/{id}/rename →
sources.rename_candidate(). Empty clears the name (promote then derives one
from the feed host). Preview is preserved; the fixed name carries into promotion.
Tests: test_candidate_rename (rename in place keeps preview, promotes with the
new name, gated + 404). 225 pytest + 11 vitest green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Three follow-ups from Codex's audit of the deep-preview/search/dedup work:
- Promote-time duplicate guard: promote_candidate() now re-checks
find_existing_feed() and raises DuplicateFeedError → 409, so an
old/CLI/direct-DB candidate or a race can't bypass the add-time check and
silently overwrite a live source's settings via upsert. (sources scanned
first, so a real source collision wins over the candidate matching itself.)
- postJSON/putJSON/delJSON gain opt-in {timeout} (AbortController, default
none so other calls are unchanged); deep preview uses 120s and surfaces a
calm "timed out" message instead of pinning the button on "Deep-checking…"
if the LAN model stalls.
- feed_key() now lowercases the host only, not the whole URL — paths/queries
can be case-significant; scheme/www/trailing-slash/host-case still collapse.
Tests: test_candidate_deep_preview_and_dedup extended — promote succeeds once,
then a re-promote of the same candidate is refused 409. 224 pytest + 11 vitest.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Three admin Sources upgrades:
- Deep preview: a per-candidate "🔬 Deep preview" button runs the REAL
classifier on an 8-item sample (the same model that judges live articles),
versus the fast keyword heuristic the add/Re-preview path uses. Preview now
carries `classified`, surfaced as a "model-checked" vs "quick estimate"
badge — so the acceptance % is no longer ambiguously heuristic. conn is
released during the ~30-60s model pass; postJSON has no client timeout.
- Search: free-text box over the sources table (name / category / feed URL /
homepage), folded into the existing status filter, with a live match count
and empty state. Makes "is this already added?" a glance.
- Duplicate-add guard: sources.find_existing_feed() + feed_key() normalize
scheme/www/trailing-slash/case, so re-adding a feed that's already a live
source or a queued candidate is refused with a 409 naming where it lives
(DB already enforced exact-URL uniqueness; this catches the near-miss
variants and overwrite-on-promote footgun).
Tests: test_candidate_deep_preview_and_dedup (deep flag wires the model +
uses the small sample; exact/www/slash/case variants all 409). 224 pytest +
11 vitest green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Per Codex: upsert_sources() wrote `active` but not `status`, so a candidate
promoted inactive (the pipeline default) became active=0 + status='active' —
the exact mirror drift Phase 1 set out to avoid (scheduler won't poll, admin UI
shows "active"). Now derive status from an explicit value or from active, mirror
active off status, and write both columns together (insert + conflict update).
Test: promote_candidate(active=False) → status='paused', active=0.
Also fix stale source_health docstring (now includes retired).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- ArticleCard: derive safeHref from article.url and reset image-failure state
when the article changes, so in-place replacements re-evaluate correctly
(clears the Svelte capture warning; build is warning-free again).
- Downweight paywalled stories below readable ones (stable sort) when composing
the daily five and in feed results — the brief now leads readable and rarely
hands over a locked door.
- review_sources gains a 'paywall-heavy' advisory flag (Nature, New Scientist
flag at 100%); never auto-deactivates.
- New Scientist/Nature kept active but no longer reach the daily five; they
remain browsable with the label + Replace.
- Tests: brief readability preference + paywall-heavy flag (79 total).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Add source health columns (last_success_at, last_error_at, last_error,
consecutive_failures, review_flag, review_reason) via SCHEMA + migration.
- poll_source maintains them: success resets the failure streak and records the
success time; failure increments it and stores the latest error.
- review_sources() flags active sources that are stale, repeatedly failing,
low-acceptance, duplicate-heavy, or doom-skewed (high cortisol/ragebait) over
a recent window. It is purely advisory: it sets review_flag/review_reason and
never changes the active column (human stays in the loop), clearing the flag
when a source recovers.
- CLI review-sources; cycle runs it as a final step (--no-review to skip);
source-report shows a review line for flagged feeds.
- Tests: healthy/failing/stale/low-acceptance/recovery and never-deactivates.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- New source_candidates staging table (status suggested/quarantined/rejected/
promoted, preview_json snapshot) so untrusted/suggested feeds stay out of the
real ingestion path until reviewed.
- sources.py: save_candidate (re-preview never revives a curator's rejection),
list_candidates, reject_candidate, promote_candidate (copies into sources,
inactive by default — active on approval; never automatic).
- CLI: suggest-source / list-candidates / promote-candidate / reject-candidate.
- API: read-only GET /api/candidates (writes stay CLI-only — no unauthenticated
public write surface yet).
- Fix deprecated ElementTree truth-value test in _parse_rss.
- Tests: candidate lifecycle (save/list/promote/reject, status preservation,
name derivation) — 51 total.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>