5 Commits

Author SHA1 Message Date
thejayman77 acbc06a9e5 Use BBC's clean image variant (cpsprodpb) instead of the branded one
BBC's og:image comes from the "branded_news" CDN path with a "BBC NEWS" logo
baked into the picture (shows as "…EWS" once the hero crops it). The identical
photo is served under "cpsprodpb" with no logo, so rewrite branded_news →
cpsprodpb. Best of both: full-resolution hero, no burned-in branding. Re-enriched
recent briefs so live images swap over. 99 tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 07:51:51 +00:00
thejayman77 2145622b59 Stop rejecting BBC's branded_news images (the blurry-hero bug)
og:image extraction rejected any URL containing "branded_news" as a generic share
image, but that's BBC's normal CDN path for real article photos. So every BBC hero
fell back to the 240px RSS thumbnail (blurry when shown large). Drop that marker;
keep the genuine placeholder markers (facebook-default, og-default, etc.). Updated
the test to assert BBC branded_news paths pass through. 99 tests pass.

(One-time: cleared image_checked_at on the 57 previously-checked articles and
re-enriched recent briefs so existing thumbnails upgrade to og:images.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 07:47:08 +00:00
thejayman77 7e1dfd5b3c Reject branded/generic share images; hero prefers a clean illustrated story
- og:image enrichment now skips branded/generic share images (BBC 'branded_news'
  with its burned-in logo, NPR 'facebook-default', etc.) and keeps the first
  real article image — so no competitor logo lands on our hero. Cleared the few
  already-stored branded URLs so they re-enrich.
- Hero selection now prefers a gentle + readable story that also HAS a (clean)
  image, falling back to gentle-readable, then gentle. The lead is visual when
  possible, typographic otherwise — never branded.

(The '7 cards' report was a stale browser cache: the brief stores 7 and the
built JS requests 7; a hard refresh shows all seven.)

Tests: branded/generic image rejection (87 total).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 13:03:20 +00:00
thejayman77 d8d665ee35 Crisp hero (prefer og:image), 7-card Highlights, no-recycle Replace + session History
- Hero blur fix: brief enrichment now prefers a page's og:image even when a
  feed thumbnail exists (feed thumbs are often tiny; the hero is shown large).
  Verified: BBC hero upgrades to the 1024px share image, ScienceDaily to 1920px.
- Today is now 'Highlights from Today' — hero + 6 (brief size 7), which also
  makes the secondary grid a balanced 3+3 instead of an orphaned 3+1.
- Replace now excludes every article seen this session (a client-side seen-set),
  so it never cycles back to something already shown.
- New session History panel (this tab only, no account): lists everything seen,
  including swapped-away stories, so they stay recoverable. Persistent
  history/favorites are tabled for sign-in later.

Tests: og:image upgrade of an existing feed image (86 total).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 12:56:57 +00:00
thejayman77 9e8eddf46d Bounded hero-image enrichment (og:image for brief items only)
The grid stays typographic; the hero is the one intentional visual slot. At
brief-build time we fetch a hero-quality image for the daily five that lack one:
- enrich.py reads ONLY a page's <head> og:image/twitter:image and stores just
  the URL (never the body).
- SSRF-guarded: http(s) only, 6s timeout, 300KB cap, <=3 manual redirects each
  re-validated, and hosts rejected if any resolved address is private, loopback,
  link-local, multicast, reserved, or unspecified.
- image_checked_at column caches success AND failure, so an article is never
  retried forever.
- Wired into build-brief and cycle (brief items only, only if image missing and
  unchecked). Everything else stays metadata-only.
- Verified live: today's five all carry images (feed + enriched).

Tests: og:image parser, head-only scope, IP guard across internal ranges, and
enrich success + failure-caching (85 total).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 12:37:41 +00:00