6 Commits

Author SHA1 Message Date
thejayman77 008364e922 Why-it-belongs: top-up requires all three fields (idempotency fix)
Per Codex: generate_summary treated why_belongs alone as a complete explanation,
but get_explanation requires all three — so a partial older row (e.g. only
why_belongs) would never top up and the page would fall back forever. Now the
fully-cached check requires summary + what_happened + why_matters + why_belongs.
Test covers the partial-row top-up.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 20:10:27 -04:00
thejayman77 337dc3f901 Article pages: structured "Why it belongs" editorial read
Per Codex — make /a/<id> feel like Upbeat Bytes has editorial judgment, not just
a summary wrapper. Trust-building, short, not an essay.

* article_summaries gains what_happened / why_matters / why_belongs (+ migration).
* summarize.explain_article: a separate, fallback-able LLM pass producing three
  short notes (parsed from a labelled WHAT/MATTERS/BELONGS format). generate_summary
  now stores them alongside the summary, and tops up older summaries on next view.
  get_explanation returns them only when all three are present.
* API: share_page + /api/summary expose the explanation.
* share.py: renders the three-part section (accent rule) when complete; otherwise
  the single "Why it's here" reason line is the calm fallback. The page polls and
  swaps in both the summary and the section as they cache.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 20:05:26 -04:00
thejayman77 403749e26f Images phase 1: attention-triggered og:image coverage
Tie image enrichment to attention (per review): when an article earns a summary
(i.e. a reader reached it), best-effort fetch a real og:image if it lacks one —
never blanket-fetch every ingested article. Adds:

* enrich_article_image() — single-article fetch, leaves existing images alone,
  retries an imageless article only after 7 days, stamps image_checked_at.
* generate_summary() calls it after caching (wrapped; never breaks summaries).
* enrich_summarized_images() + `goodnews enrich-images` CLI — slow background
  backfill of already-summarized, accepted, imageless articles.
* Quality gate: extend the generic-image skip list with data:/tracking-pixel/
  spacer markers (on top of the existing logo/placeholder + unbranded-BBC logic).

This is coverage only; display (editorial rhythm, tile treatment) comes next.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 12:30:11 -04:00
thejayman77 cfde4e22db Summary briefing layer: Today pre-summarized, /a is the canonical read
Make summaries the core reading experience (summary-first, source-forward):
- Cycle pre-warms summaries for Today's 7 (idempotent → only new ones hit the LLM).
- /api/brief items carry their cached summary; Today cards (hero + tiles) show it
  inline, so Today reads as a calm briefing.
- Card title/image now open the /a summary page (the canonical artifact), with a
  visible "Full story" link straight to the source on every card (the escape hatch).
- /a gains related-grouping chips + a Copy-link/share control.
- Tighten the summary prompt: original, factual, no quotations / no close paraphrase.
Long tail stays lazy+cached. No article bodies stored. 129 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-04 19:48:32 +00:00
thejayman77 ab5caada0b Fix summary LLM call: use raw chat text, not classifier-JSON parsing
client._chat() JSON-parses every response (for the classifier), so the plain-text
summary was rejected ("model did not return JSON") even though the model returned
a perfect summary. Split out _raw_content() and add chat_text() for free-form
output; summaries use it. _chat keeps parsing for classification.
2026-06-03 18:12:20 +00:00
thejayman77 1d71575982 Share pages: lazy, cached, our-own-words article summaries
The /a/<id> page now carries an original short summary so it stands on its own,
without republishing the publisher's article:
- summarize.py: transient SSRF-guarded fetch of the article text → local LLM
  writes a 2-4 sentence ORIGINAL summary (our words). Cached in article_summaries
  forever; we store only our summary, never the body. Generated lazily (only for
  shared/viewed articles), de-duped so concurrent hits don't double-generate.
- /a serves cached-or-pending; when pending it shows a calm "summary on its way,
  read at {source}" note and self-polls /api/summary/<id>, swapping the summary
  in the moment it's ready (never blocks the page on the batch-tier LLM).
- Share menu warms generation on open so recipients usually get the rich version.
- Container reaches the arbiter at arbiter:8080 over caddy_web (LLM env added to
  the API container). 124 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 18:08:40 +00:00