- deploy/goodnews.service: oneshot unit running 'goodnews cycle' with a generous TimeoutStartSec so long classify runs are not killed. - deploy/goodnews.timer: every 15 min, Persistent=true to catch missed runs. - deploy/goodnews.env.example: LLM endpoint + DB path for the scheduled run. - README: scheduling/install docs. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
goodNews
Local-first constructive news ingestion prototype.
The first milestone is intentionally small: collect public RSS/Atom metadata, dedupe it, store short source-provided snippets, and attach early reason-coded heuristic scores. It does not store full article bodies.
Commands
From this directory:
python3 -m goodnews init-db
python3 -m goodnews import-sources
python3 -m goodnews poll --limit 3
python3 -m goodnews rescore
python3 -m goodnews check-llm --base-url http://127.0.0.1:1234/v1 --model gpt-oss
python3 -m goodnews classify --limit 10 --base-url http://127.0.0.1:1234/v1 --model gpt-oss
python3 -m goodnews build-brief --date 2026-05-27 --replace
python3 -m goodnews show-brief
python3 -m goodnews list-recent --limit 10
python3 -m goodnews list-recent --accepted-only --limit 10
python3 -m goodnews list-category --topic animals --flavor discovery
python3 -m goodnews list-category --topic environment --flavor solution
python3 -m goodnews source-report
python3 -m goodnews list-runs
The SQLite database lives at:
data/goodnews.sqlite3
Sources live at:
config/sources.toml
Categories
When classified by the local model, each article is tagged with one topic
and one flavor, allowing browsable category feeds (e.g. "feel-good animals",
"environment solutions") via list-category:
- Topics: science, environment, health, community, culture, animals
- Flavors: breakthrough, discovery, solution, feelgood, perspective
The allowed values live in goodnews/taxonomy.py. The accept/reject gate is kept
deliberately broad ("not dreary"); ranking and category filters do the curation.
Stored Article Data
For each article, the database stores:
- source
- canonical URL
- title
- short RSS/Atom description or summary
- author, if present
- published timestamp, if present
- image URL, if present
- language, if present
- hashes used for dedupe
- heuristic scores and reason codes
Web / API
The optional web extra adds a FastAPI service and a small static site that
consumes it. The same JSON API backs both the website and any future companion
app; its auto-generated OpenAPI docs at /docs are the shared contract.
pip install -e '.[web]' # or: .venv/bin/pip install -e '.[web]'
python3 -m goodnews serve # http://127.0.0.1:8000
python3 -m goodnews serve --host 0.0.0.0 # expose on the network
Endpoints:
GET /— the static site (daily five + topic/flavor browsing)GET /healthz— liveness + scored-article countGET /api/categories— the topic/flavor taxonomyGET /api/category-counts— article counts per topic/flavorGET /api/feed?topic=&flavor=&limit=&offset=— ranked, filtered articlesGET /api/brief?date=&limit=— a daily brief (latest if no date)GET /api/brief-dates— available brief datesGET /docs— interactive OpenAPI documentation
The ingestion CLI stays pure-stdlib; only the web extra pulls in FastAPI/uvicorn,
so the two halves can be deployed and upgraded independently.
Deployment
The database is never baked into the image — the API and the ingestion CLI share
one SQLite file via a mounted volume. Run ingestion (poll, classify,
build-brief) on a schedule against the same file.
docker build -t goodnews .
docker run -p 8000:8000 -v /srv/goodnews/data:/data goodnews
GOODNEWS_DB controls the database path (defaults to data/goodnews.sqlite3).
Put a reverse proxy (Caddy/nginx) in front for TLS once a domain is attached.
Scheduling
A single idempotent command runs the whole pipeline and is safe to invoke as
often as you like — it only polls sources that are due (per each source's
poll_interval_minutes), only classifies articles the model hasn't seen, and
rebuilds the current day's brief:
python3 -m goodnews cycle # poll due -> classify new -> rebuild today's brief
python3 -m goodnews cycle --force # poll every active source regardless of interval
python3 -m goodnews cycle --no-classify # skip the LLM step (e.g. model box offline)
A systemd timer runs it every 15 minutes. Unit files live in deploy/:
sudo install -d /etc/goodnews
sudo install -m 644 deploy/goodnews.env.example /etc/goodnews/goodnews.env # then edit
sudo install -m 644 deploy/goodnews.service deploy/goodnews.timer /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now goodnews.timer
systemctl list-timers goodnews.timer # when it next runs
journalctl -u goodnews.service -f # watch cycle output
/etc/goodnews/goodnews.env supplies GOODNEWS_LLM_BASE_URL, GOODNEWS_LLM_MODEL,
and GOODNEWS_DB to the scheduled run. The timer uses Persistent=true, so a
run missed while the machine was off is caught up on the next boot.
Next Steps
- Run the poller for a few days and inspect which sources produce useful candidates.
- Add source-level quality notes and deactivate noisy feeds.
- Replace or supplement
heuristic-v0with a local model classifier. - Add a daily brief builder that selects 5 items using scores and source diversity.
- Add a small web/API layer once the ingest data looks trustworthy.
Local Model Configuration
The classify command expects an OpenAI-compatible local chat-completions server.
You can pass settings directly:
python3 -m goodnews classify --base-url http://127.0.0.1:1234/v1 --model gpt-oss --limit 10
Or use environment variables:
export GOODNEWS_LLM_BASE_URL=http://127.0.0.1:1234/v1
export GOODNEWS_LLM_MODEL=gpt-oss
python3 -m goodnews classify --limit 10
classify rewrites the current score/reason row for selected candidates. rescore can restore the fast heuristic scores.