11 Commits

Author SHA1 Message Date
thejayman77 89c0fbe1f6 Sync repo to deployed state: SEO recovery, Publishing Desk, Play games, emoji picker
The deploy pipeline runs from the working tree, so a wave of shipped features
had never been committed. This snapshots git to what's actually running.

SEO impression recovery (live + verified):
- Duplicate /a/{id} pages now 301-redirect to their canonical twin instead of
  returning 404 (a hard 404 silently dropped already-indexed URLs and tanked
  impressions).
- Dedup representative selection reworked: accepted/serveable -> established
  rep (URL stability) -> quality score, so an accepted page never retires to a
  rejected rep and an indexed canonical doesn't churn when a newer twin arrives.
- HEAD /a/{id} returns the same status as GET (api_route GET+HEAD) instead of
  falling through to the static mount and 404ing.
- `dedup --force-recluster`: cycle-locked, model-free re-cluster to re-apply the
  policy to the existing corpus (shared cycle_lock context manager).
- CLI honors GOODNEWS_DB for its default --db (was silently ignored).

Publishing Desk (admin tool to post highlights to X via Web Intents):
- publishing.py queue/rank/handle-resolution; admin UI; full searchable emoji
  picker (bundled data, no CDN) for the blurb editor.

Play games + site:
- Bloom (word-wheel), Memory Match, daily ritual set, Zen Den (dev-gated).
- English-only language gate; source prospecting; paywall + dedup hardening.

Tests: full suite green (349). Ignores tightened (node_modules, data/*.db).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 11:32:27 -04:00
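
Note on 89c0fbe1f6: the representative ordering (accepted/serveable, then established rep, then quality score) can be read as a lexicographic sort key. The sketch below is one possible reading, not the project's code; the `Page` dataclass, its field names, and `pick_representative` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Page:
    # Hypothetical fields for illustration; not the project's real schema.
    id: int
    accepted: bool        # accepted/serveable
    is_current_rep: bool  # already the cluster's established representative
    quality: float        # quality score, higher is better


def pick_representative(cluster: List[Page]) -> Page:
    """Pick a cluster's canonical page by the three-tier policy."""
    # Tuples compare left to right, so acceptance outranks rep stability,
    # which in turn outranks the quality score.
    return max(cluster, key=lambda p: (p.accepted, p.is_current_rep, p.quality))


cluster = [
    Page(id=1, accepted=True, is_current_rep=True, quality=0.60),
    Page(id=2, accepted=True, is_current_rep=False, quality=0.90),  # newer twin
    Page(id=3, accepted=False, is_current_rep=False, quality=0.99),
]
rep = pick_representative(cluster)
```

Under this ordering the established page 1 stays canonical even though the newer twin scores higher, and a rejected page can only win when no accepted page exists in the cluster.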
thejayman77 38abc26ddd Honor Retry-After on HTTP 429 (polite rest, not a failure)
Per Codex's spec — a publisher saying "slow down" shouldn't make a feed look
broken, but repeated 429s stay visible via last_success_at / stale-source.

* Schema: sources.retry_after_at (nullable) + migration.
* feeds.parse_retry_after: delta-seconds OR HTTP-date → UTC stamp; ignores
  invalid/negative/past; caps at now + MAX_BACKOFF_MINUTES.
* fetch_feed raises RateLimited (carrying the parsed time) on a 429.
* poll_source: on 429 set retry_after_at + last_error, status='rate_limited',
  and do NOT increment consecutive_failures; on success clear retry_after_at;
  non-429 failures unchanged.
* due_source_rows requires BOTH the streak backoff elapsed AND retry_after_at
  passed (i.e. the later of the two).
* Admin: source_health returns retry_after_at; status reads
  "rate-limited · rests until …" rather than "failed/resting".

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 10:47:40 -04:00
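
Note on 38abc26ddd: a minimal sketch of the Retry-After parsing rules listed above (delta-seconds or HTTP-date, invalid/negative/past values ignored, result capped). The value of `MAX_BACKOFF_MINUTES` and the `now` parameter are assumptions for illustration; the commit does not state them.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Optional

MAX_BACKOFF_MINUTES = 60  # assumed value; the commit does not state the cap


def parse_retry_after(value: Optional[str],
                      now: Optional[datetime] = None) -> Optional[datetime]:
    """Turn a Retry-After header into a UTC timestamp, or None if unusable."""
    now = now or datetime.now(timezone.utc)
    if not value:
        return None
    value = value.strip()
    cap = timedelta(minutes=MAX_BACKOFF_MINUTES)
    if value.isascii() and value.isdigit():
        # delta-seconds form, e.g. "120"; clamp first so huge values cannot
        # overflow timedelta.
        seconds = min(int(value), int(cap.total_seconds()))
        when = now + timedelta(seconds=seconds)
    else:
        # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
        try:
            when = parsedate_to_datetime(value)
        except (TypeError, ValueError):
            return None  # invalid, including negative numbers like "-5"
        if when.tzinfo is None:
            when = when.replace(tzinfo=timezone.utc)
        when = when.astimezone(timezone.utc)
    if when <= now:
        return None  # past or zero delay: nothing to wait for
    return min(when, now + cap)
```

Passing `now` explicitly keeps the function deterministic in tests; a caller in production would normally omit it.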
thejayman77 1a8d1b3bf1 Promote-candidate UI: add-a-source pipeline in the admin console
Bring the supervised source-candidate flow into Sources (Codex's v1 scope), so
adding feeds no longer needs the CLI.

* feeds.safe_fetch_feed: SSRF-safe fetch for UNTRUSTED (admin-pasted) URLs —
  http(s) only, every redirect hop re-validated via enrich._host_is_public,
  body size-capped, bounded redirects, no cookies. preview_feed gains a
  `fetcher` param; the API path passes safe_fetch_feed (NOT the raw fetch_feed
  used for already-vetted polling).
* API (admin-gated): GET /candidates; POST /candidates (suggest+preview, gated
  before the outbound fetch, no DB conn held during network); /{id}/preview
  (explicit re-preview); /{id}/promote (paused by default, returns the new
  source + updated candidate); /{id}/reject. rejected stays on candidates only.
* Admin Sources tab: "Add a source" field + a candidate queue showing the
  preview (pass rate, recent count, example headlines) with Promote (as paused,
  or Activate immediately) / Re-preview / Reject.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 10:28:00 -04:00
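
Note on 1a8d1b3bf1: the per-hop validation in safe_fetch_feed relies on a public-host check. The sketch below shows one way such a check could look; it is not the body of enrich._host_is_public, and `url_is_fetchable` and the injectable `resolve` parameter are hypothetical names added so the check can be exercised without DNS.

```python
import ipaddress
import socket
from typing import Callable, Iterable
from urllib.parse import urlsplit


def _resolve(host: str) -> Iterable[str]:
    """Resolve a hostname to its IP address strings (IPv4 and IPv6)."""
    return {info[4][0] for info in socket.getaddrinfo(host, None)}


def url_is_fetchable(url: str,
                     resolve: Callable[[str], Iterable[str]] = _resolve) -> bool:
    """Allow only http(s) URLs whose host resolves solely to public addresses."""
    try:
        parts = urlsplit(url)
        host = parts.hostname
    except ValueError:
        return False  # malformed URL, e.g. an unterminated IPv6 literal
    if parts.scheme not in ("http", "https") or not host:
        return False
    try:
        addresses = [ipaddress.ip_address(a) for a in resolve(host)]
    except (OSError, ValueError):
        return False  # unresolvable host or unparsable address: refuse
    # Every resolved address must be globally routable; a single private,
    # loopback, or link-local answer is enough to refuse the fetch.
    return bool(addresses) and all(a.is_global for a in addresses)
```

To match the "every redirect hop re-validated" rule, a fetcher would disable automatic redirects and call a check like this on each Location target before following it.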
thejayman77 50dc2167cd Durable image quality: stop trusting feed thumbnails; cycle enriches Latest
Make "no blurry images" sustainable, not a one-off cleanup. RSS feed thumbnails
(~44% were ~90px) were stored at ingest and upscaled to mush, so new articles
would reintroduce them. Now image_url is filled ONLY by the quality-gated
og:image enrichment:

* insert_article no longer stores the feed image (was canonicalize_url(item...)).
* enrich_recent_images(): the cycle fetches a quality og:image for the newest
  accepted, imageless articles each run (bounded), keeping Latest photo-rich.
* Brief + on-open enrichment unchanged.

Net: every stored image is a validated, ≥450px og:image; the rest are clean
placeholders.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 15:55:57 -04:00
thejayman77 452e5a3fe4 Hardening pass: scheduler backoff, FK cascade, a11y, test safety net
Pre-traffic cleanup from an audit:

* Scheduler: poll_due_sources now keys on the last *attempt* (success or
  failure), not the last success, and scales the wait by the consecutive-
  failure streak (capped at a day). A failing feed (e.g. Phys.org's HTTP 429s)
  used to be retried every cycle because it had no successful run; it now backs
  off and recovers on its own. Extracted due_source_rows() + tests.

* FK hygiene: deleting a daily_brief is supposed to cascade to its items, but
  SQLite enforces foreign keys per connection (the pragma is off unless a
  connection turns it on). connect() already sets the pragma, so the cascade
  is correct going forward; added a regression test.
  (Orphaned items + Phys.org settings were cleaned directly on the live DB.)

* a11y: modal/drawer dialogs are now focusable (tabindex), close on Escape
  (window) and on backdrop click via a target check (dropping the inner
  stopPropagation handlers). Build is warning-free.

* tests: conftest points any un-mocked LLM client at a closed port with a 1s
  timeout, so an accidental real call fails fast instead of hanging the suite.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 19:18:18 +00:00
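
Note on 452e5a3fe4: a sketch of the scheduler rule (key on the last attempt, scale the wait by the failure streak, cap at a day). The commit does not say how the wait scales, so the linear factor here is an assumption, and `next_due` / `is_due` are hypothetical helper names rather than the extracted due_source_rows().

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_WAIT = timedelta(days=1)  # "capped at a day"


def next_due(last_attempt: Optional[datetime],
             interval_minutes: int,
             consecutive_failures: int) -> Optional[datetime]:
    """Return when a source may be polled again, or None if due immediately."""
    if last_attempt is None:
        return None  # never attempted: due now
    # Assumed scaling: linear in the streak (1x, 2x, 3x, ...). The commit only
    # says the wait scales with the streak, not by which function.
    factor = 1 + max(consecutive_failures, 0)
    wait = min(timedelta(minutes=interval_minutes) * factor, MAX_WAIT)
    return last_attempt + wait


def is_due(now: datetime,
           last_attempt: Optional[datetime],
           interval_minutes: int,
           consecutive_failures: int) -> bool:
    """True when the backoff window since the last attempt has elapsed."""
    due_at = next_due(last_attempt, interval_minutes, consecutive_failures)
    return due_at is None or now >= due_at
```

Because the window is measured from the last attempt rather than the last success, a feed that has never succeeded still backs off instead of being retried every cycle.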
thejayman77 f46fee1197 Typographic-first imagery + opportunistic feed-HTML image extraction
Per the calm north star (images support reading, never become a stimulation
layer; metadata-only stays the posture):
- Image-less cards are now designed, not missing: secondary cards are text-first
  (no empty media band), and an image-less hero becomes a fully typographic lead
  with a faint topic wordmark behind it (CSS attr(data-topic)). No big empty
  image space is ever reserved.
- Opportunistic extraction: parse the first <img src> from a feed's
  content/description HTML when present, canonicalized — never fetching the
  article page. Applies to new ingests (existing rows keep their current image).
- Held by deliberate choice: og:image page enrichment, stock/AI imagery, and any
  image-coverage requirement for sources.

Tests: feed HTML image extraction (72 total).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 23:59:36 +00:00
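
Note on f46fee1197: pulling the first <img src> out of a feed item's content/description HTML needs only the standard library. This is a minimal sketch, not the project's extractor; `first_image_src` is a hypothetical name, and the canonicalization step the commit mentions is left out.

```python
from html.parser import HTMLParser
from typing import Optional


class _FirstImage(HTMLParser):
    """Remember the src of the first <img> that has a non-empty one."""

    def __init__(self) -> None:
        super().__init__()
        self.src: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing <img ... /> via handle_startendtag.
        if tag == "img" and self.src is None:
            src = (dict(attrs).get("src") or "").strip()
            if src:
                self.src = src


def first_image_src(html: Optional[str]) -> Optional[str]:
    """Return the first image URL found in an HTML fragment, else None."""
    if not html:
        return None
    parser = _FirstImage()
    parser.feed(html)
    parser.close()
    return parser.src
```

Only the markup already present in the feed is inspected, so no request to the article page is made, in line with the metadata-only posture.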
thejayman77 1e190c5e88 Advisory source health: review flags, never auto-deactivate
- Add source health columns (last_success_at, last_error_at, last_error,
  consecutive_failures, review_flag, review_reason) via SCHEMA + migration.
- poll_source maintains them: success resets the failure streak and records the
  success time; failure increments it and stores the latest error.
- review_sources() flags active sources that are stale, repeatedly failing,
  low-acceptance, duplicate-heavy, or doom-skewed (high cortisol/ragebait) over
  a recent window. It is purely advisory: it sets review_flag/review_reason and
  never changes the active column (human stays in the loop), clearing the flag
  when a source recovers.
- CLI review-sources; cycle runs it as a final step (--no-review to skip);
  source-report shows a review line for flagged feeds.
- Tests: healthy/failing/stale/low-acceptance/recovery and never-deactivates.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 20:28:35 +00:00
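
Note on 1e190c5e88: the advisory contract (set review_flag/review_reason, never touch active, clear on recovery) can be illustrated as below. All thresholds, the `SourceStats` shape, and the function names are assumptions made for this sketch, and the doom-skew check is omitted.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed thresholds; the commit does not state the real ones.
STALE_HOURS = 72
MAX_FAILURES = 5
MIN_ACCEPTANCE = 0.10
MAX_DUPLICATE_RATE = 0.50


@dataclass
class SourceStats:
    # Hypothetical per-source summary over a recent window.
    hours_since_success: Optional[float]
    consecutive_failures: int
    acceptance_rate: float
    duplicate_rate: float


def review_reasons(s: SourceStats) -> List[str]:
    """List every reason a source deserves a human look (empty if healthy)."""
    reasons = []
    if s.hours_since_success is None or s.hours_since_success > STALE_HOURS:
        reasons.append("stale")
    if s.consecutive_failures >= MAX_FAILURES:
        reasons.append("repeatedly failing")
    if s.acceptance_rate < MIN_ACCEPTANCE:
        reasons.append("low acceptance")
    if s.duplicate_rate > MAX_DUPLICATE_RATE:
        reasons.append("duplicate-heavy")
    return reasons


def apply_review(source: dict, stats: SourceStats) -> dict:
    """Return a copy with only the review columns changed; 'active' is untouched."""
    reasons = review_reasons(stats)
    updated = dict(source)
    updated["review_flag"] = 1 if reasons else 0  # cleared when healthy again
    updated["review_reason"] = ", ".join(reasons) or None
    return updated
```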
thejayman77 aa4125ddec Supervised source candidates: stage, list, promote, reject
- New source_candidates staging table (status suggested/quarantined/rejected/
  promoted, preview_json snapshot) so untrusted/suggested feeds stay out of the
  real ingestion path until reviewed.
- sources.py: save_candidate (re-preview never revives a curator's rejection),
  list_candidates, reject_candidate, promote_candidate (copies into sources,
  inactive by default; activated only on approval, never automatically).
- CLI: suggest-source / list-candidates / promote-candidate / reject-candidate.
- API: read-only GET /api/candidates (writes stay CLI-only — no unauthenticated
  public write surface yet).
- Fix deprecated ElementTree truth-value test in _parse_rss.
- Tests: candidate lifecycle (save/list/promote/reject, status preservation,
  name derivation) — 51 total.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 19:52:40 +00:00
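
Note on aa4125ddec: a sketch of the "re-preview never revives a curator's rejection" rule against an in-memory SQLite table. The table here is a cut-down stand-in for source_candidates (only url, status, preview_json), and this simplified version keeps every existing status on re-preview, which is stricter than the commit requires.

```python
import json
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory DB with a simplified candidates table (assumed shape)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE source_candidates ("
        " url TEXT PRIMARY KEY,"
        " status TEXT NOT NULL DEFAULT 'suggested',"
        " preview_json TEXT)"
    )
    return conn


def save_candidate(conn: sqlite3.Connection, url: str, preview: dict) -> str:
    """Insert or refresh a candidate and return its resulting status."""
    snapshot = json.dumps(preview)
    row = conn.execute(
        "SELECT status FROM source_candidates WHERE url = ?", (url,)
    ).fetchone()
    if row is None:
        conn.execute(
            "INSERT INTO source_candidates (url, status, preview_json)"
            " VALUES (?, 'suggested', ?)",
            (url, snapshot),
        )
        return "suggested"
    # Refresh the snapshot only; the stored status (e.g. 'rejected') is kept.
    conn.execute(
        "UPDATE source_candidates SET preview_json = ? WHERE url = ?",
        (snapshot, url),
    )
    return row[0]


def reject_candidate(conn: sqlite3.Connection, url: str) -> None:
    """Record a curator's rejection."""
    conn.execute(
        "UPDATE source_candidates SET status = 'rejected' WHERE url = ?", (url,)
    )
```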
thejayman77 95195daff8 Track 3: read-only source preview (vet a feed before adding)
- feeds.preview_feed(): fetch + score a sample WITHOUT persisting; returns
  freshness, acceptance rate, cortisol/ragebait/PR averages, and example
  accepted/rejected items. With an LLM client it also returns topic/flavor mix
  and the model's (accurate) acceptance view.
- CLI 'preview-source URL [--sample] [--classify]'.
- API 'GET /api/source-preview?url=&sample=&classify=' with an http(s)-only
  guard (SSRF note left for go-public hardening).
- Site 'Suggest a source' panel with Quick check (heuristic, instant) and Deep
  check (model, accurate), rendered DOM-safely.
- Tests: network-free preview_feed tests via monkeypatched fetch (45 total).
- README documents the command, endpoint, and updated roadmap.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 19:37:34 +00:00
thejayman77 2414fd3ccb Add interval-aware polling and a 'cycle' command for scheduling
- poll_due_sources(): polls only sources that have never been polled or whose
  last successful poll is older than their poll_interval_minutes, finally
  giving that config field meaning.
- classify gains only_unclassified to spend the LLM solely on new (heuristic)
  articles, so a frequent scheduled run stays cheap.
- 'cycle' command runs poll-due -> classify-new -> rebuild-today's-brief, with
  each step non-fatal so a down model endpoint or empty day never aborts it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 14:13:00 +00:00
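
Note on 2414fd3ccb: the "each step non-fatal" behaviour of 'cycle' amounts to isolating every step in its own try/except. A minimal sketch follows; `run_cycle` and the step names are hypothetical, not the CLI's actual code.

```python
import logging
from typing import Callable, Dict, Iterable, Tuple

log = logging.getLogger("cycle")


def run_cycle(steps: Iterable[Tuple[str, Callable[[], object]]]) -> Dict[str, str]:
    """Run each step in order; record failures instead of aborting the cycle."""
    results: Dict[str, str] = {}
    for name, step in steps:
        try:
            step()
            results[name] = "ok"
        except Exception as exc:  # deliberately broad: no step may stop the run
            log.warning("cycle step %s failed: %s", name, exc)
            results[name] = f"failed: {exc}"
    return results


ran = []


def poll_due() -> None:
    ran.append("poll-due")


def classify_new() -> None:
    # Stand-in for a model endpoint that is down.
    raise ConnectionError("model endpoint down")


def rebuild_brief() -> None:
    ran.append("rebuild-brief")


results = run_cycle([
    ("poll-due", poll_due),
    ("classify-new", classify_new),
    ("rebuild-brief", rebuild_brief),
])
```

Here the classify step fails, yet the brief is still rebuilt and the failure is reported in `results` rather than raised.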
thejayman77 068073423f Initial commit: goodNews constructive-news ingestion prototype
Local-first RSS/Atom ingestion pipeline with metadata-only storage,
heuristic + local-LLM scoring, and daily brief builder.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 00:48:26 +00:00