thejayman77 ba0838dd94 home3: rename the small-joys OTD card "A good thing today" → "On this day"
Names the ritual (this date in history) rather than describing it; matches the
/onthisday page + engine. Hero tag becomes "{year} in history" (the old
"ON THIS DAY" there was now redundant with the eyebrow).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 11:54:41 -04:00

Upbeat Bytes

Calm, constructive news — local-first ingestion, scoring, and a daily brief. (The Python package and CLI are named goodnews for historical reasons; the product is Upbeat Bytes, at upbeatbytes.com.)

The first milestone is intentionally small: collect public RSS/Atom metadata, dedupe it, store short source-provided snippets, and attach early reason-coded heuristic scores. It does not store full article bodies.

Commands

From this directory:

python3 -m goodnews init-db
python3 -m goodnews import-sources
python3 -m goodnews poll --limit 3
python3 -m goodnews rescore
python3 -m goodnews check-llm --base-url http://127.0.0.1:1234/v1 --model gpt-oss
python3 -m goodnews classify --limit 10 --base-url http://127.0.0.1:1234/v1 --model gpt-oss
python3 -m goodnews dedup --base-url http://127.0.0.1:1234/v1
python3 -m goodnews check-feeds
python3 -m goodnews preview-source https://example.com/feed/ --classify
python3 -m goodnews suggest-source https://example.com/feed/ --name "Example" --classify
python3 -m goodnews list-candidates
python3 -m goodnews promote-candidate 1        # copies into sources (inactive by default)
python3 -m goodnews reject-candidate 1
python3 -m goodnews review-sources             # advisory health flags (never deactivates)
python3 -m goodnews build-brief --date 2026-05-27 --replace
python3 -m goodnews show-brief
python3 -m goodnews list-recent --limit 10
python3 -m goodnews list-recent --accepted-only --limit 10
python3 -m goodnews list-category --topic animals --flavor discovery
python3 -m goodnews list-category --topic environment --flavor solution
python3 -m goodnews source-report
python3 -m goodnews list-runs

The SQLite database lives at:

data/goodnews.sqlite3

Sources live at:

config/sources.toml

Categories

When classified by the local model, each article is tagged with one topic and one flavor, allowing browsable category feeds (e.g. "feel-good animals", "environment solutions") via list-category:

  • Topics: science, environment, health, community, culture, animals
  • Flavors: breakthrough, discovery, solution, feelgood, perspective

The allowed values live in goodnews/taxonomy.py. The accept/reject gate is kept deliberately broad ("not dreary"); ranking and category filters do the curation.

Deduplication

Two layers:

  • Exact: a URL hash UNIQUE constraint drops the literal same link at ingest.
  • Semantic: dedup embeds each article's title+snippet with the local embedding model, clusters near-identical stories within a few-day window (cosine similarity), and marks all but the highest-ranked in each cluster as duplicate_of the representative. Feed and brief queries hide duplicates, so the same story carried by several outlets appears once. This runs as part of cycle, so the scheduler keeps the corpus deduped automatically.

Stored Article Data

For each article, the database stores:

  • source
  • canonical URL
  • title
  • short RSS/Atom description or summary
  • author, if present
  • published timestamp, if present
  • image URL, if present
  • language, if present
  • hashes used for dedupe
  • heuristic scores and reason codes

Web / API

The optional web extra adds a FastAPI service and a small static site that consumes it. The same JSON API backs both the website and any future companion app; its auto-generated OpenAPI docs at /docs are the shared contract.

pip install -e '.[web]'          # or: .venv/bin/pip install -e '.[web]'
python3 -m goodnews serve                  # http://127.0.0.1:8000
python3 -m goodnews serve --host 0.0.0.0   # expose on the network

Endpoints:

  • GET / — the static site (daily five + topic/flavor browsing)
  • GET /healthz — liveness + scored-article count
  • GET /api/categories — the topic/flavor taxonomy
  • GET /api/moods — mood modes (the humane front door: Today, Wonder, People Helping, Solutions, Light Only, Grounded)
  • GET /api/category-counts — article counts per topic/flavor
  • GET /api/feed?topic=&flavor=&limit=&offset= — ranked, filtered articles
  • GET /api/brief?date=&limit= — a daily brief (latest if no date)
  • GET /api/brief-dates — available brief dates
  • GET /api/source-preview?url=&classify= — read-only scored sample of a feed (vet before adding)
  • GET /api/candidates?status= — staged source candidates (read-only; curation is CLI-only for now)
  • GET /docs — interactive OpenAPI documentation

The ingestion CLI stays pure-stdlib; only the web extra pulls in FastAPI/uvicorn, so the two halves can be deployed and upgraded independently.

Development (hot-reload, no rebuild dance)

Two terminals, and you stop having to npm run build + restart + hard-refresh:

# 1. API with auto-reload on backend code changes
python3 -m goodnews serve --host 0.0.0.0 --port 8000 --reload

# 2. Vite dev server with hot module reload (proxies /api -> :8000)
cd frontend && npm run dev      # serves on http://<host>:5173

Visit :5173 while developing — frontend edits hot-reload instantly, and the --reload API restarts itself on backend edits. Only build (npm run build) + serve on :8000 when you want to check the production-served form.

Frontend

The site is a SvelteKit static SPA in frontend/ (calm editorial design, mood-mode navigation, the daily brief as a hero, browsable lanes, inline Calm Filters, PWA manifest). It consumes the JSON API above, so the website and a future companion app share one contract. Build it once and FastAPI serves the output:

cd frontend && npm install && npm run build   # -> frontend/build
cd .. && python3 -m goodnews serve             # serves frontend/build at /

If frontend/build is absent, the server falls back to the legacy single-page harness in goodnews/static/. The Docker image builds the frontend automatically (multi-stage), so deployment is just docker build.

The secondary grid is intentionally typographic (no thumbnails). The hero is the one image slot: at brief-build time, enrich.py fetches a hero-quality image for the daily five that lack one — reading only a page's <head> og:image / twitter:image and storing just the URL (never the body). It's SSRF-guarded (http(s) only, short timeout, byte cap, capped redirects, and rejects hosts that resolve to private/loopback/link-local/multicast addresses), and failures are cached so an article is never retried. Everywhere else stays metadata-only.

Calm Filters

Personal, device-local controls so a reader can stay informed without subjects they'd rather not see right now. Preferences live in the browser (localStorage), are sent to the read endpoints as a prefs JSON query param, and are applied identically to the feed, the brief, and the category counts so the numbers always match what's shown. The canonical shape (goodnews/filters.py):

{
  "include_topics": [], "include_flavors": [],
  "mute_topics": [], "mute_flavors": [],
  "avoid_terms": ["election", "stock market"],
  "pauses": [{"kind": "topic", "value": "health", "until": "2026-06-02T00:00:00Z"}]
}

The site surfaces a humane ladder rather than a settings panel of dread:

  • Not today → pause that article's topic for 24h.
  • Less like this → ease off that flavor for ~3 days.
  • Always hide … → a standing mute (undoable in the Calm filters panel).

Avoid-terms match whole words/phrases (case- and punctuation-insensitive, no substring surprises like "pan" matching "pandemic"). The brief is filtered down for MVP (no refill from outside the stored brief). No accounts; the same prefs object is the clean migration path to server-side, multi-user preferences later.

Deployment

The database is never baked into the image — the API and the ingestion CLI share one SQLite file via a mounted volume. Run ingestion (poll, classify, build-brief) on a schedule against the same file.

docker build -t goodnews .
docker run -p 8000:8000 -v /srv/goodnews/data:/data goodnews

GOODNEWS_DB controls the database path (defaults to data/goodnews.sqlite3). Put a reverse proxy (Caddy/nginx) in front for TLS once a domain is attached.

Production (upbeatbytes.com)

Live deployment splits the two halves:

  • Static site served by Caddy from /home/jay/srv/sites/upbeatbytes.
  • API runs as a read-only container (/home/jay/srv/upbeatbytes/compose.yaml) on Caddy's caddy_web network; Caddy proxies /api/*, /healthz, /docs to it. It mounts the host database (written by the ingestion timer) and only reads it.
  • Ingestion stays on the host goodnews.timer systemd unit.

Redeploy after changes with deploy/publish.sh (builds the frontend, syncs it to the live folder, rebuilds the API container, reloads Caddy).

Scheduling

A single idempotent command runs the whole pipeline and is safe to invoke as often as you like — it only polls sources that are due (per each source's poll_interval_minutes), only classifies articles the model hasn't seen, and rebuilds the current day's brief:

python3 -m goodnews cycle                 # poll due -> classify -> dedup -> brief -> review flags
python3 -m goodnews cycle --force         # poll every active source regardless of interval
python3 -m goodnews cycle --no-classify   # skip the LLM step (e.g. model box offline)

A systemd timer runs it every 15 minutes. Unit files live in deploy/:

sudo install -d /etc/goodnews
sudo install -m 644 deploy/goodnews.env.example /etc/goodnews/goodnews.env  # then edit
sudo install -m 644 deploy/goodnews.service deploy/goodnews.timer /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now goodnews.timer

systemctl list-timers goodnews.timer          # when it next runs
journalctl -u goodnews.service -f             # watch cycle output

/etc/goodnews/goodnews.env supplies GOODNEWS_LLM_BASE_URL, GOODNEWS_LLM_MODEL, and GOODNEWS_DB to the scheduled run. The timer uses Persistent=true, so a run missed while the machine was off is caught up on the next boot.

Next Steps

Done so far: RSS/Atom ingestion with exact + semantic dedup, heuristic + local-LLM classification with topic/flavor tagging, the daily brief, the FastAPI web/API layer and site, scheduled cycle via systemd, a pytest suite, and device-local Calm Filters.

Still ahead:

  1. Supervised source pipeline — preview + staging are done: suggest-source previews a feed and stages it in the source_candidates table (status suggested/quarantined/rejected/promoted); promote-candidate copies it into sources (inactive by default — active on approval); promotion is never automatic. Advisory health is done too: review-sources (also run at the end of cycle) flags stale, failing, low-acceptance, duplicate-heavy, or doom-skewed feeds for human review — it never deactivates anything. Still ahead: an authenticated POST surface so the website can accept public suggestions once accounts exist.
  2. Learned "Less like this" weighting — replace the interim flavor-pause with real preference down-ranking.
  3. Corpus rebalancing — add calm/feelgood sources (currently science-heavy).
  4. Retention/pruning — soft-delete + time-window indexes as the corpus grows toward ~10k articles (don't rush; not yet needed).
  5. Go-public hardening — TLS via a reverse proxy, then a domain.

Local Model Configuration

The classify command expects an OpenAI-compatible local chat-completions server.

You can pass settings directly:

python3 -m goodnews classify --base-url http://127.0.0.1:1234/v1 --model gpt-oss --limit 10

Or use environment variables:

export GOODNEWS_LLM_BASE_URL=http://127.0.0.1:1234/v1
export GOODNEWS_LLM_MODEL=gpt-oss
python3 -m goodnews classify --limit 10

classify rewrites the current score/reason row for selected candidates. rescore can restore the fast heuristic scores.

S
Description
No description provided
Readme 49 MiB
Languages
Python 59.1%
Svelte 31.7%
JavaScript 6.9%
HTML 1.8%
Shell 0.2%
Other 0.2%