Per the logo + brand: the name is upbeatBytes (camelCase). Swept all user-facing
strings — titles/og:site_name/og:title, logo alt text, share pages (share.py),
emails (email_send), classifier prompt (llm), digest/unsubscribe (api), PWA
manifest, game share text, sign-in, the SPA shell + patch-static-heads (play
title) — plus README/publish.sh and the email test fixture. (SMTP From env was
already upbeatBytes.) Domains (upbeatbytes.com) unchanged. Tests: 425 backend +
36 frontend green.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The deploy pipeline runs from the working tree, so a wave of shipped features
had never been committed. This snapshots git to what's actually running.
SEO impression recovery (live + verified):
- Duplicate /a/{id} pages now 301-redirect to their canonical twin instead of
returning 404 (a hard 404 silently dropped already-indexed URLs and tanked
impressions).
- Dedup representative selection reworked: accepted/serveable -> established
rep (URL stability) -> quality score, so an accepted page never retires to a
rejected rep and an indexed canonical doesn't churn when a newer twin arrives.
- HEAD /a/{id} returns the same status as GET (api_route GET+HEAD) instead of
falling through to the static mount and 404ing.
- `dedup --force-recluster`: cycle-locked, model-free re-cluster to re-apply the
policy to the existing corpus (shared cycle_lock context manager).
- CLI honors GOODNEWS_DB for its default --db (was silently ignored).
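A minimal sketch of the representative ordering described above (accepted ->
established rep -> quality score). The Member type, its field names, and
pick_representative are hypothetical and only illustrate the priority order;
the real dedup module may be structured differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    # Hypothetical shape of one article inside a duplicate cluster.
    article_id: int
    accepted: bool        # serveable under the current gate
    is_current_rep: bool  # already the cluster's canonical URL
    quality: float        # composite quality score


def pick_representative(members):
    """Return the member that should stay canonical for a cluster."""
    # Tuples compare left to right, so acceptance dominates URL stability,
    # and URL stability dominates the quality score.
    return max(members, key=lambda m: (m.accepted, m.is_current_rep, m.quality))
```

With this ordering an accepted, already-indexed representative keeps its place
when a newer twin with a higher score arrives, and a rejected article can only
win if no accepted member exists.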
Publishing Desk (admin tool to post highlights to X via Web Intents):
- publishing.py queue/rank/handle-resolution; admin UI; full searchable emoji
picker (bundled data, no CDN) for the blurb editor.
Play games + site:
- Bloom (word-wheel), Memory Match, daily ritual set, Zen Den (dev-gated).
- English-only language gate; source prospecting; paywall + dedup hardening.
Tests: full suite green (349). Ignores tightened (node_modules, data/*.db).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Codex review: the body-horror boundary was directionally right but a hair too
broad — black-hole/cosmology, lunar-regolith engineering hazards, and a
microplastics measurement-methodology piece were rejected on dramatic vocabulary
alone (cortisol 4–6). Add scoring guidance: score cortisol by the reader's
personal/visceral/public-health threat, not by dramatic words or subject
grandeur. Distant astronomy, equipment hazards, geological forces, scientific
self-correction, natural-history mechanisms, predator–prey biology, and
historical discoveries are LOW cortisol (0–3) even when worded "deadly"/"lethal".
Reserve high cortisol for disease, contamination, outbreak, parasites, violence,
or immediate suffering.
Verified: black hole / moon / microplastics now accept (cortisol 1–2);
parasite (8), Ebola (6), hantavirus outbreak (6) still reject.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The flesh-eating-parasite story slipped through as "calm public-health
monitoring" — the gate had no body-horror class and let "informative/public
health" rescue a viscerally alarming subject. Two fixes:
* Reject visceral-threat hooks (outbreaks, parasites, infestations,
contamination, recalls, poisonings, "flesh-eating" infections) even when
calmly framed as monitoring/surveillance/awareness/public health — judge the
reader's gut, not the prose. Keep genuine health wins (treatments, recovery,
prevention, wellbeing): the line is the hook, not the topic.
* A high cortisol_score is disqualifying on its own — anxiety outweighs how
informative or constructive a piece is.
Verified: 3 flesh-eating-parasite variants now REJECT (cortisol 8) while calm
health/wellness (diabetes treatment, sleep tips, green-space study) still pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Tighten the gate's AI handling per review: accept practical/beneficial/creative/
scientific/humane/bounded AI stories; reject AI framed around loss of control,
cognitive decline, job/surveillance/existential panic, child/social-harm panic,
"falling behind" productivity anxiety, or arms-race framing. Verified: the gate
now rejects MIT TR's "lose control of our brains" + "flood of AI lawsuits"
(both previously accepted).
Shift the acceptance bar from "must be uplifting" to "will a reader finish this
calm or a little better, never worse." Keep neutral-but-absorbing (discoveries,
explainers, clever builds, useful insight), and reject anxiety-inducing content —
especially the comparison traps (inferior/behind/FOMO/hustle/status). Scores still
back the verdict. Lets us pull from mainstream sources and filter, rather than
relying on niche good-news outlets.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
client._chat() JSON-parses every response (for the classifier), so the plain-text
summary was rejected ("model did not return JSON") even though the model returned
a perfect summary. Split out _raw_content() and add chat_text() for free-form
output; summaries use it. _chat keeps parsing for classification.
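A rough sketch of the split, assuming the client wraps a single transport call.
SketchClient and the transport callable are hypothetical stand-ins; only the
method names _raw_content, _chat, and chat_text come from this change.

```python
import json


class SketchClient:
    """Hypothetical stand-in for the real client; only the split is shown."""

    def __init__(self, transport):
        # transport: callable(messages) -> str, the raw model text (assumed).
        self._transport = transport

    def _raw_content(self, messages):
        # Single place that talks to the model and returns unparsed text.
        return self._transport(messages)

    def _chat(self, messages):
        # Classification path: the reply must parse as JSON.
        content = self._raw_content(messages)
        try:
            return json.loads(content)
        except json.JSONDecodeError as exc:
            raise ValueError("model did not return JSON") from exc

    def chat_text(self, messages):
        # Free-form path (summaries): return the text itself, trimmed.
        return self._raw_content(messages).strip()
```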
Three-layer organization: primary topic (one per article, for ranking and
brief balance) + grouping tags (1-4 per article from a controlled vocabulary,
the organic "wandering" axis) + tonal flavor.
- taxonomy: add technology + learning topics; 4 calm tag families
(Discovery & Wonder, People & Kindness, Solutions & Progress, Mind & Craft)
defined in code, not the DB; ALLOWED_TAGS union + coerce_tags validation.
- db: article_tags(article_id, tag) join table + tag index.
- llm: tags added to the classifier json_schema (enum-constrained, maxItems 4)
and system prompt; normalize_scores coerces tags; upsert_article_score
replaces a row's tags atomically on every (re)classification.
- queries: feed gains a tag filter and exposes tags via group_concat; tag_counts.
- api: Article.tags, feed tag param, and /api/families with per-tag counts.
- tests: coerce/normalize/upsert/tag-filter/reclassify-replace/tag_counts +
/api/families. 99 passing.
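One way the ALLOWED_TAGS union and coerce_tags validation could look. The four
family names come from this change; the individual tags, the MAX_TAGS constant,
and the exact cleaning rules are illustrative assumptions.

```python
# Family names are from the commit; the tags inside each are placeholders.
TAG_FAMILIES = {
    "Discovery & Wonder": ("space", "nature"),
    "People & Kindness": ("community", "generosity"),
    "Solutions & Progress": ("clean-energy", "medicine"),
    "Mind & Craft": ("learning", "making"),
}
ALLOWED_TAGS = frozenset(t for tags in TAG_FAMILIES.values() for t in tags)
MAX_TAGS = 4  # mirrors maxItems in the classifier schema


def coerce_tags(raw):
    """Keep known tags only: lowercased, de-duplicated, in order, capped."""
    if not isinstance(raw, (list, tuple)):
        return []
    out = []
    for item in raw:
        if not isinstance(item, str):
            continue
        tag = item.strip().lower()
        if tag in ALLOWED_TAGS and tag not in out:
            out.append(tag)
        if len(out) == MAX_TAGS:
            break
    return out
```

The enum in the json_schema should already prevent unknown tags; the coercion
is a safety net for servers that fall back to a looser response format.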
Corpus reclassify (re-tag + new primary topics) runs separately against the
local LLM. Frontend (B2) pairs with this; the live site is unchanged until then.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Add pytest suite (34 tests) covering scoring thresholds, dedup clustering +
representative selection + time window, brief source/category diversity,
avoid-term phrase matching, and text canonicalization/truncation.
- Rewrite _select_diverse with an explicit, tested contract (best-first, one
per source, backfill, then inject a second category by evicting the
lowest-ranked pick).
- classify_articles now returns attempted/succeeded/skipped (ClassifyReport) so
silent model failures are visible in both the cycle and classify output.
- Fix clean_text truncation to stay within max_len (ellipsis no longer
overshoots).
- New filters.py: canonical FilterPrefs shape (include/mute topics+flavors,
avoid_terms, pauses) and a pure word/phrase-boundary matching engine that
seeds Calm Filters. Not yet wired into the API.
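A sketch of the _select_diverse contract listed above (best-first, one per
source, backfill, then inject a second category). The Item type and the
select_diverse name are hypothetical; the tested implementation may differ in
details such as tie-breaking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    # Hypothetical candidate shape for the brief.
    title: str
    source: str
    category: str
    score: float


def select_diverse(items, limit):
    ranked = sorted(items, key=lambda i: i.score, reverse=True)  # best-first
    picked = []  # indices into ranked
    seen_sources = set()
    # Pass 1: at most one item per source.
    for idx, item in enumerate(ranked):
        if len(picked) == limit:
            break
        if item.source not in seen_sources:
            picked.append(idx)
            seen_sources.add(item.source)
    # Pass 2: backfill with the best remaining items if sources ran out.
    for idx in range(len(ranked)):
        if len(picked) == limit:
            break
        if idx not in picked:
            picked.append(idx)
    picked.sort()  # best-first again; the last index is the lowest-ranked pick
    # Pass 3: if every pick shares one category, evict the lowest-ranked pick
    # in favour of the best item from any other category.
    if len(picked) > 1 and len({ranked[i].category for i in picked}) == 1:
        only = ranked[picked[0]].category
        alt = next((i for i, it in enumerate(ranked) if it.category != only), None)
        if alt is not None:
            picked[-1] = alt
            picked.sort()
    return [ranked[i] for i in picked]
```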
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- cycle now prints per-article classify progress (flushed) so the long step is
clearly alive rather than appearing hung.
- An exclusive flock guards the cycle so a manual run and the systemd timer (or
two timer ticks) cannot overlap and contend on the database and model.
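A minimal sketch of such a guard using the standard-library fcntl module
(POSIX only). The cycle_lock signature, the lock-file path handling, and the
fail-fast behaviour are assumptions; the real guard may block or exit quietly
instead of raising.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def cycle_lock(path):
    """Hold an exclusive, non-blocking flock for the duration of the block."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            # Someone else (manual run or timer tick) already holds the lock.
            raise RuntimeError("another cycle is already running") from None
        yield
    finally:
        os.close(fd)  # closing the descriptor releases the lock
```

Because the kernel drops the lock when the descriptor closes, a crashed cycle
cannot leave a stale lock behind the way a pid file can.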
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- LocalModelClient.embed() calls the OpenAI-compatible /embeddings endpoint
(local nomic model); base_url shared with chat, model via GOODNEWS_EMBED_MODEL.
- New article_embeddings table and articles.duplicate_of column (+ migration).
- dedup module: embeds missing articles, clusters near-identical stories within
a date window by cosine similarity (pure-stdlib, vectors normalised once), and
marks all but the highest-ranked member of each cluster as a duplicate.
- 'dedup' CLI command; cycle now runs poll -> classify -> dedup -> brief.
- Feed and brief queries hide duplicates, so a story carried by multiple
outlets shows once.
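A pure-stdlib sketch of the clustering step. The greedy best-ranked-first
strategy, the 0.9 threshold, the 3-day window, and the dict shape of an article
are all assumptions chosen for illustration, not the values or algorithm the
dedup module necessarily uses.

```python
import math


def normalise(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def mark_duplicates(articles, threshold=0.9, window_days=3):
    """articles: dicts with id, day (int), rank (float), vec (list of floats).

    Returns {duplicate_id: representative_id}.
    """
    unit = {a["id"]: normalise(a["vec"]) for a in articles}  # normalise once
    reps, duplicate_of = [], {}
    # Best-ranked first: each article joins the first representative it
    # matches, or becomes a representative itself.
    for art in sorted(articles, key=lambda a: a["rank"], reverse=True):
        for rep in reps:
            close_in_time = abs(art["day"] - rep["day"]) <= window_days
            # The dot product of two unit vectors is their cosine similarity.
            sim = sum(x * y for x, y in zip(unit[art["id"]], unit[rep["id"]]))
            if close_in_time and sim >= threshold:
                duplicate_of[art["id"]] = rep["id"]
                break
        else:
            reps.append(art)
    return duplicate_of
```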
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- poll_due_sources(): polls only sources whose last successful poll is older
than their poll_interval_minutes (or never polled), finally giving that
config field meaning.
- classify gains only_unclassified to spend the LLM solely on new (heuristic)
articles, so a frequent scheduled run stays cheap.
- 'cycle' command runs poll-due -> classify-new -> rebuild-today's-brief, with
each step non-fatal so a down model endpoint or empty day never aborts it.
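A small sketch of the two ideas above: the "due" check and the non-fatal step
runner. The function names is_due and run_cycle, and the result shape, are
hypothetical; only poll_interval_minutes and the step order come from this
change.

```python
from datetime import timedelta


def is_due(last_success, interval_minutes, now):
    """Due if never polled successfully, or the last success is too old."""
    if last_success is None:
        return True
    return now - last_success > timedelta(minutes=interval_minutes)


def run_cycle(steps):
    """Run (name, callable) steps in order; record failures, never abort."""
    results = {}
    for name, step in steps:
        try:
            results[name] = ("ok", step())
        except Exception as exc:  # deliberately broad: one step must not stop the rest
            results[name] = ("failed", repr(exc))
    return results
```

A down model endpoint then shows up as a failed classify entry while the brief
is still rebuilt from whatever is already scored.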
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- New taxonomy module: single source of truth for 6 topics x 5 flavors,
shared by the LLM response schema (enum-constrained) and validation.
- Classifier now assigns one topic + one flavor per article; json_schema
enums force valid values, with coercion as a safety net.
- article_scores gains topic/flavor columns via an idempotent migration.
- New 'list-category' command to browse by topic and/or flavor, ranked by
composite score.
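A sketch of how one taxonomy module can feed both the response schema and the
validation fallback. The topic and flavor names below are placeholders (the
commit does not list them), and response_schema, coerce, and the default
arguments are hypothetical names.

```python
# Placeholder values: the real 6 topics and 5 flavors are defined in the
# taxonomy module and are not reproduced here.
TOPICS = ("science", "nature", "health", "community", "culture", "environment")
FLAVORS = ("wonder", "kindness", "progress", "calm", "delight")


def response_schema():
    """JSON Schema fragment whose enums are derived from the taxonomy."""
    return {
        "type": "object",
        "properties": {
            "topic": {"type": "string", "enum": list(TOPICS)},
            "flavor": {"type": "string", "enum": list(FLAVORS)},
        },
        "required": ["topic", "flavor"],
        "additionalProperties": False,
    }


def coerce(value, allowed, default):
    """Safety net: return a valid member of allowed, else the default."""
    if isinstance(value, str) and value.strip().lower() in allowed:
        return value.strip().lower()
    return default
```

Deriving the enum lists from the same tuples used by coerce keeps the schema
and the validator from drifting apart when a topic is added.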
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Use json_schema structured output (newer LM Studio rejects json_object),
escalating through json_schema -> json_object -> text and pinning the
first format the server accepts to avoid wasted round-trips.
- Make per-article failures non-fatal and commit incrementally so a single
timeout no longer discards the whole batch.
- Raise default timeout to 180s (configurable via GOODNEWS_LLM_TIMEOUT) for
larger local reasoning models.
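The escalate-then-pin behaviour can be sketched as follows. FormatNegotiator
and the send callable are hypothetical; the sketch assumes the transport raises
ValueError when the server rejects a response format, which may not match how
the real client detects rejection.

```python
class FormatNegotiator:
    """Try response formats in order and remember the first one that works."""

    FORMATS = ("json_schema", "json_object", "text")

    def __init__(self, send):
        # send: callable(fmt) -> str; raises ValueError if fmt is rejected
        # by the server (assumed error signalling).
        self._send = send
        self._pinned = None

    def request(self):
        if self._pinned is not None:
            # A format already worked once: skip the escalation round-trips.
            return self._send(self._pinned)
        last_error = None
        for fmt in self.FORMATS:
            try:
                reply = self._send(fmt)
            except ValueError as exc:
                last_error = exc
                continue
            self._pinned = fmt
            return reply
        raise RuntimeError("server rejected every response format") from last_error
```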
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>