# goodNews Local-first constructive news ingestion prototype. The first milestone is intentionally small: collect public RSS/Atom metadata, dedupe it, store short source-provided snippets, and attach early reason-coded heuristic scores. It does not store full article bodies. ## Commands From this directory: ```bash python3 -m goodnews init-db python3 -m goodnews import-sources python3 -m goodnews poll --limit 3 python3 -m goodnews rescore python3 -m goodnews check-llm --base-url http://127.0.0.1:1234/v1 --model gpt-oss python3 -m goodnews classify --limit 10 --base-url http://127.0.0.1:1234/v1 --model gpt-oss python3 -m goodnews build-brief --date 2026-05-27 --replace python3 -m goodnews show-brief python3 -m goodnews list-recent --limit 10 python3 -m goodnews list-recent --accepted-only --limit 10 python3 -m goodnews list-category --topic animals --flavor discovery python3 -m goodnews list-category --topic environment --flavor solution python3 -m goodnews source-report python3 -m goodnews list-runs ``` The SQLite database lives at: ```txt data/goodnews.sqlite3 ``` Sources live at: ```txt config/sources.toml ``` ## Categories When classified by the local model, each article is tagged with one **topic** and one **flavor**, allowing browsable category feeds (e.g. "feel-good animals", "environment solutions") via `list-category`: - **Topics:** science, environment, health, community, culture, animals - **Flavors:** breakthrough, discovery, solution, feelgood, perspective The allowed values live in `goodnews/taxonomy.py`. The accept/reject gate is kept deliberately broad ("not dreary"); ranking and category filters do the curation. ## Stored Article Data For each article, the database stores: - source - canonical URL - title - short RSS/Atom description or summary - author, if present - published timestamp, if present - image URL, if present - language, if present - hashes used for dedupe - heuristic scores and reason codes ## Web / API The optional `web` extra adds a FastAPI service and a small static site that consumes it. The same JSON API backs both the website and any future companion app; its auto-generated OpenAPI docs at `/docs` are the shared contract. ```bash pip install -e '.[web]' # or: .venv/bin/pip install -e '.[web]' python3 -m goodnews serve # http://127.0.0.1:8000 python3 -m goodnews serve --host 0.0.0.0 # expose on the network ``` Endpoints: - `GET /` — the static site (daily five + topic/flavor browsing) - `GET /healthz` — liveness + scored-article count - `GET /api/categories` — the topic/flavor taxonomy - `GET /api/category-counts` — article counts per topic/flavor - `GET /api/feed?topic=&flavor=&limit=&offset=` — ranked, filtered articles - `GET /api/brief?date=&limit=` — a daily brief (latest if no date) - `GET /api/brief-dates` — available brief dates - `GET /docs` — interactive OpenAPI documentation The ingestion CLI stays pure-stdlib; only the `web` extra pulls in FastAPI/uvicorn, so the two halves can be deployed and upgraded independently. ## Deployment The database is never baked into the image — the API and the ingestion CLI share one SQLite file via a mounted volume. Run ingestion (`poll`, `classify`, `build-brief`) on a schedule against the same file. ```bash docker build -t goodnews . docker run -p 8000:8000 -v /srv/goodnews/data:/data goodnews ``` `GOODNEWS_DB` controls the database path (defaults to `data/goodnews.sqlite3`). Put a reverse proxy (Caddy/nginx) in front for TLS once a domain is attached. ## Next Steps 1. Run the poller for a few days and inspect which sources produce useful candidates. 2. Add source-level quality notes and deactivate noisy feeds. 3. Replace or supplement `heuristic-v0` with a local model classifier. 4. Add a daily brief builder that selects 5 items using scores and source diversity. 5. Add a small web/API layer once the ingest data looks trustworthy. ## Local Model Configuration The `classify` command expects an OpenAI-compatible local chat-completions server. You can pass settings directly: ```bash python3 -m goodnews classify --base-url http://127.0.0.1:1234/v1 --model gpt-oss --limit 10 ``` Or use environment variables: ```bash export GOODNEWS_LLM_BASE_URL=http://127.0.0.1:1234/v1 export GOODNEWS_LLM_MODEL=gpt-oss python3 -m goodnews classify --limit 10 ``` `classify` rewrites the current score/reason row for selected candidates. `rescore` can restore the fast heuristic scores.