upbeatBytes

Author	SHA1	Message	Date
thejayman77	667b1a82c3	brand: standardize "Upbeat Bytes" → "upbeatBytes" everywhere Per the logo + brand: the name is upbeatBytes (camelCase). Swept all user-facing strings — titles/og:site_name/og:title, logo alt text, share pages (share.py), emails (email_send), classifier prompt (llm), digest/unsubscribe (api), PWA manifest, game share text, sign-in, the SPA shell + patch-static-heads (play title) — plus README/publish.sh and the email test fixture. (SMTP From env was already upbeatBytes.) Domains (upbeatbytes.com) unchanged. 425 BE + 36 FE green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-28 20:01:20 -04:00
thejayman77	89c0fbe1f6	Sync repo to deployed state: SEO recovery, Publishing Desk, Play games, emoji picker The deploy pipeline runs from the working tree, so a wave of shipped features had never been committed. This snapshots git to what's actually running. SEO impression recovery (live + verified): - Duplicate /a/{id} now 301-redirect to their canonical twin instead of 404 (a hard 404 silently dropped already-indexed URLs and tanked impressions). - Dedup representative selection reworked: accepted/serveable -> established rep (URL stability) -> quality score, so an accepted page never retires to a rejected rep and an indexed canonical doesn't churn when a newer twin arrives. - HEAD /a/{id} returns the same status as GET (api_route GET+HEAD) instead of falling through to the static mount and 404ing. - `dedup --force-recluster`: cycle-locked, model-free re-cluster to re-apply the policy to the existing corpus (shared cycle_lock context manager). - CLI honors GOODNEWS_DB for its default --db (was silently ignored). Publishing Desk (admin tool to post highlights to X via Web Intents): - publishing.py queue/rank/handle-resolution; admin UI; full searchable emoji picker (bundled data, no CDN) for the blurb editor. Play games + site: - Bloom (word-wheel), Memory Match, daily ritual set, Zen Den (dev-gated). - English-only language gate; source prospecting; paywall + dedup hardening. Tests: full suite green (349). Ignores tightened (node_modules, data/*.db). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-18 11:32:27 -04:00
thejayman77	9813af40ed	Classifier: don't over-score cortisol for abstract/distant science Codex review: the body-horror boundary was directionally right but a hair too broad — black-hole/cosmology, lunar-regolith engineering hazards, and a microplastics measurement-methodology piece were rejected on dramatic vocabulary alone (cortisol 4–6). Add scoring guidance: score cortisol by the reader's personal/visceral/public-health threat, not by dramatic words or subject grandeur. Distant astronomy, equipment hazards, geological forces, scientific self-correction, natural-history mechanisms, predator–prey biology, and historical discoveries are LOW cortisol (0–3) even when worded "deadly"/"lethal". Reserve high cortisol for disease, contamination, outbreak, parasites, violence, or immediate suffering. Verified: black hole / moon / microplastics now accept (cortisol 1–2); parasite (8), Ebola (6), hantavirus outbreak (6) still reject. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-07 12:06:18 -04:00
thejayman77	e7610d2889	Classifier: reject body-horror / disease-threat; anxiety outweighs informative The flesh-eating-parasite story slipped through as "calm public-health monitoring" — the gate had no body-horror class and let "informative/public health" rescue a viscerally alarming subject. Two fixes: * Reject visceral-threat hooks (outbreaks, parasites, infestations, contamination, recalls, poisonings, "flesh-eating" infections) even when calmly framed as monitoring/surveillance/awareness/public health — judge the reader's gut, not the prose. Keep genuine health wins (treatments, recovery, prevention, wellbeing): the line is the hook, not the topic. * A high cortisol_score is disqualifying on its own — anxiety outweighs how informative or constructive a piece is. Verified: 3 flesh-eating-parasite variants now REJECT (cortisol 8) while calm health/wellness (diabetes treatment, sleep tips, green-space study) still pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-07 11:42:17 -04:00
thejayman77	8653a46fd4	Classifier: explicit "no AI dread" boundary Tighten the gate's AI handling per review: accept practical/beneficial/creative/ scientific/humane/bounded AI stories; reject AI framed around loss of control, cognitive decline, job/surveillance/existential panic, child/social-harm panic, "falling behind" productivity anxiety, or arms-race. Verified: MIT TR now rejects "lose control of our brains" + "flood of AI lawsuits" (both previously accepted).	2026-06-06 14:07:31 +00:00
thejayman77	a36b1a098e	Retune classifier gate: calm/non-anxiety, absorbing-allowed Shift the acceptance bar from "must be uplifting" to "will a reader finish this calm or a little better, never worse." Keep neutral-but-absorbing (discoveries, explainers, clever builds, useful insight), and reject anxiety-inducing content — especially the comparison traps (inferior/behind/FOMO/hustle/status). Scores still back the verdict. Lets us pull from mainstream sources and filter, rather than relying on niche good-news outlets. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 02:03:24 +00:00
thejayman77	ab5caada0b	Fix summary LLM call: use raw chat text, not classifier-JSON parsing client._chat() JSON-parses every response (for the classifier), so the plain-text summary was rejected ("model did not return JSON") even though the model returned a perfect summary. Split out _raw_content() and add chat_text() for free-form output; summaries use it. _chat keeps parsing for classification.	2026-06-03 18:12:20 +00:00
thejayman77	a47a1504c8	Phase B1: multi-tag groupings model (backend) Three-layer organization: primary topic (one per article, for ranking and brief balance) + grouping tags (1-4 per article from a controlled vocabulary, the organic "wandering" axis) + tonal flavor. - taxonomy: add technology + learning topics; 4 calm tag families (Discovery & Wonder, People & Kindness, Solutions & Progress, Mind & Craft) defined in code, not the DB; ALLOWED_TAGS union + coerce_tags validation. - db: article_tags(article_id, tag) join table + tag index. - llm: tags added to the classifier json_schema (enum-constrained, maxItems 4) and system prompt; normalize_scores coerces tags; upsert_article_score replaces a row's tags atomically on every (re)classification. - queries: feed gains a tag filter and exposes tags via group_concat; tag_counts. - api: Article.tags, feed tag param, and /api/families with per-tag counts. - tests: coerce/normalize/upsert/tag-filter/reclassify-replace/tag_counts + /api/families. 99 passing. Corpus reclassify (re-tag + new primary topics) runs separately against the local LLM. Frontend (B2) pairs with this; the live site is unchanged until then. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-01 18:35:25 +00:00
thejayman77	9cdcda5e02	Durability pass: tests, clearer diversity/classify behavior, Calm Filters foundation - Add pytest suite (34 tests) covering scoring thresholds, dedup clustering + representative selection + time window, brief source/category diversity, avoid-term phrase matching, and text canonicalization/truncation. - Rewrite _select_diverse with an explicit, tested contract (best-first, one per source, backfill, then inject a second category by evicting the lowest-ranked pick). - classify_articles now returns attempted/succeeded/skipped (ClassifyReport) so silent model failures are visible in both the cycle and classify output. - Fix clean_text truncation to stay within max_len (ellipsis no longer overshoots). - New filters.py: canonical FilterPrefs shape (include/mute topics+flavors, avoid_terms, pauses) and pure word/phrase-boundary matching engine seeding Calm Filters. Not yet wired into the API. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 19:07:31 +00:00
thejayman77	470e9ecbf8	Make cycle show classify progress and prevent overlapping runs - cycle now prints per-article classify progress (flushed) so the long step is clearly alive rather than appearing hung. - An exclusive flock guards the cycle so a manual run and the systemd timer (or two timer ticks) cannot overlap and contend on the database and model. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 16:15:03 +00:00
thejayman77	5d44072fca	Add semantic cross-source dedup via local embeddings - LocalModelClient.embed() calls the OpenAI-compatible /embeddings endpoint (local nomic model); base_url shared with chat, model via GOODNEWS_EMBED_MODEL. - New article_embeddings table and articles.duplicate_of column (+ migration). - dedup module: embeds missing articles, clusters near-identical stories within a date window by cosine similarity (pure-stdlib, vectors normalised once), and marks all but the highest-ranked member of each cluster as a duplicate. - 'dedup' CLI command; cycle now runs poll -> classify -> dedup -> brief. - Feed and brief queries hide duplicates, so a story carried by multiple outlets shows once. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 15:40:55 +00:00
thejayman77	2414fd3ccb	Add interval-aware polling and a 'cycle' command for scheduling - poll_due_sources(): polls only sources whose last successful poll is older than their poll_interval_minutes (or never polled), finally giving that config field meaning. - classify gains only_unclassified to spend the LLM solely on new (heuristic) articles, so a frequent scheduled run stays cheap. - 'cycle' command runs poll-due -> classify-new -> rebuild-today's-brief, with each step non-fatal so a down model endpoint or empty day never aborts it. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 14:13:00 +00:00
thejayman77	38057d0354	Add topic/flavor categorization and category browsing - New taxonomy module: single source of truth for 6 topics x 5 flavors, shared by the LLM response schema (enum-constrained) and validation. - Classifier now assigns one topic + one flavor per article; json_schema enums force valid values, with coercion as a safety net. - article_scores gains topic/flavor columns via an idempotent migration. - New 'list-category' command to browse by topic and/or flavor, ranked by composite score. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 11:21:53 +00:00
thejayman77	f4842ed100	Fix LLM classify for newer OpenAI-compatible servers - Use json_schema structured output (newer LM Studio rejects json_object), escalating through json_schema -> json_object -> text and pinning the first format the server accepts to avoid wasted round-trips. - Make per-article failures non-fatal and commit incrementally so a single timeout no longer discards the whole batch. - Raise default timeout to 180s (configurable via GOODNEWS_LLM_TIMEOUT) for larger local reasoning models. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 01:21:05 +00:00
thejayman77	068073423f	Initial commit: goodNews constructive-news ingestion prototype Local-first RSS/Atom ingestion pipeline with metadata-only storage, heuristic + local-LLM scoring, and daily brief builder. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 00:48:26 +00:00
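
Commit 5d44072fca describes clustering near-identical stories by cosine similarity in pure stdlib, with vectors normalised once, and marking all but the highest-ranked member of each cluster as a duplicate. A minimal sketch of that idea follows. The function names, the (article_id, rank, embedding) tuple layout, the greedy strategy, and the 0.9 threshold are all assumptions made for illustration, not the repository's code, and the date window mentioned in the commit is left out for brevity.

```python
import math


def normalise(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def cluster_duplicates(articles, threshold=0.9):
    """Return {duplicate_id: representative_id} using greedy clustering.

    `articles` is a list of (article_id, rank, embedding) tuples. This layout
    and the default threshold are hypothetical.
    """
    # Normalise each vector once, then visit the best-ranked articles first so
    # that every cluster's representative is its highest-ranked member.
    prepared = sorted(
        ((aid, rank, normalise(emb)) for aid, rank, emb in articles),
        key=lambda item: item[1],
        reverse=True,
    )
    representatives = []  # (article_id, unit_vector) pairs
    duplicate_of = {}
    for aid, _rank, vec in prepared:
        for rep_id, rep_vec in representatives:
            # Dot product of two unit vectors is their cosine similarity.
            similarity = sum(a * b for a, b in zip(vec, rep_vec))
            if similarity >= threshold:
                duplicate_of[aid] = rep_id
                break
        else:
            # No existing representative is close enough: start a new cluster.
            representatives.append((aid, vec))
    return duplicate_of


articles = [
    (1, 0.5, [1.0, 0.0]),
    (2, 0.9, [2.0, 0.1]),
    (3, 0.7, [0.0, 1.0]),
]
print(cluster_duplicates(articles))
```

Articles 1 and 2 point in nearly the same direction, so the lower-ranked article 1 is marked as a duplicate of article 2, while article 3 stays in a cluster of its own.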

15 Commits
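
Commit 2414fd3ccb describes interval-aware polling: a source is polled only when its last successful poll is older than its poll_interval_minutes, or when it has never been polled. The rule can be sketched as below. The helper name due_sources and the dictionary keys other than poll_interval_minutes are hypothetical stand-ins chosen for this example, not the repository's poll_due_sources() implementation.

```python
from datetime import datetime, timedelta, timezone


def due_sources(sources, now):
    """Return the names of sources that should be polled at `now`.

    `sources` is a list of dicts with assumed keys "name",
    "poll_interval_minutes" and "last_success" (a datetime or None).
    """
    due = []
    for source in sources:
        last = source["last_success"]
        interval = timedelta(minutes=source["poll_interval_minutes"])
        # Due when never polled, or when the last success is older than the interval.
        if last is None or now - last > interval:
            due.append(source["name"])
    return due


now = datetime(2026, 5, 30, 14, 0, tzinfo=timezone.utc)
sources = [
    {"name": "fresh", "poll_interval_minutes": 60, "last_success": now - timedelta(minutes=10)},
    {"name": "stale", "poll_interval_minutes": 60, "last_success": now - timedelta(minutes=90)},
    {"name": "new", "poll_interval_minutes": 60, "last_success": None},
]
print(due_sources(sources, now))
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the rule deterministic and easy to test.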