2f4bdf2d00
- queries.py: shared read-only query helpers (feed, brief, category counts) returning plain dicts, used by the API and available to the CLI. - api.py: FastAPI service with Pydantic response models (the companion-app contract), CORS, and endpoints for categories, feed, brief, and health; mounts a static site at /. - static/index.html: minimal dependency-free site rendering the daily five and topic/flavor category browsing. - 'goodnews serve' command launches uvicorn (lazy import; core CLI stays pure-stdlib). Web deps live behind the optional [web] extra. - Dockerfile + .dockerignore + build-system metadata so the service installs and deploys cleanly, with the DB mounted as a shared volume. - README: web/API and deployment docs. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
134 lines
4.4 KiB
Markdown
134 lines
4.4 KiB
Markdown
# goodNews
|
|
|
|
Local-first constructive news ingestion prototype.
|
|
|
|
The first milestone is intentionally small: collect public RSS/Atom metadata, dedupe it, store short source-provided snippets, and attach early reason-coded heuristic scores. It does not store full article bodies.
|
|
|
|
## Commands
|
|
|
|
From this directory:
|
|
|
|
```bash
|
|
python3 -m goodnews init-db
|
|
python3 -m goodnews import-sources
|
|
python3 -m goodnews poll --limit 3
|
|
python3 -m goodnews rescore
|
|
python3 -m goodnews check-llm --base-url http://127.0.0.1:1234/v1 --model gpt-oss
|
|
python3 -m goodnews classify --limit 10 --base-url http://127.0.0.1:1234/v1 --model gpt-oss
|
|
python3 -m goodnews build-brief --date 2026-05-27 --replace
|
|
python3 -m goodnews show-brief
|
|
python3 -m goodnews list-recent --limit 10
|
|
python3 -m goodnews list-recent --accepted-only --limit 10
|
|
python3 -m goodnews list-category --topic animals --flavor discovery
|
|
python3 -m goodnews list-category --topic environment --flavor solution
|
|
python3 -m goodnews source-report
|
|
python3 -m goodnews list-runs
|
|
```
|
|
|
|
The SQLite database lives at:
|
|
|
|
```txt
|
|
data/goodnews.sqlite3
|
|
```
|
|
|
|
Sources live at:
|
|
|
|
```txt
|
|
config/sources.toml
|
|
```
|
|
|
|
## Categories
|
|
|
|
When classified by the local model, each article is tagged with one **topic**
|
|
and one **flavor**, allowing browsable category feeds (e.g. "feel-good animals",
|
|
"environment solutions") via `list-category`:
|
|
|
|
- **Topics:** science, environment, health, community, culture, animals
|
|
- **Flavors:** breakthrough, discovery, solution, feelgood, perspective
|
|
|
|
The allowed values live in `goodnews/taxonomy.py`. The accept/reject gate is kept
|
|
deliberately broad ("not dreary"); ranking and category filters do the curation.
|
|
|
|
## Stored Article Data
|
|
|
|
For each article, the database stores:
|
|
|
|
- source
|
|
- canonical URL
|
|
- title
|
|
- short RSS/Atom description or summary
|
|
- author, if present
|
|
- published timestamp, if present
|
|
- image URL, if present
|
|
- language, if present
|
|
- hashes used for dedupe
|
|
- heuristic scores and reason codes
|
|
|
|
## Web / API
|
|
|
|
The optional `web` extra adds a FastAPI service and a small static site that
|
|
consumes it. The same JSON API backs both the website and any future companion
|
|
app; its auto-generated OpenAPI docs at `/docs` are the shared contract.
|
|
|
|
```bash
|
|
pip install -e '.[web]' # or: .venv/bin/pip install -e '.[web]'
|
|
python3 -m goodnews serve # http://127.0.0.1:8000
|
|
python3 -m goodnews serve --host 0.0.0.0 # expose on the network
|
|
```
|
|
|
|
Endpoints:
|
|
|
|
- `GET /` — the static site (daily five + topic/flavor browsing)
|
|
- `GET /healthz` — liveness + scored-article count
|
|
- `GET /api/categories` — the topic/flavor taxonomy
|
|
- `GET /api/category-counts` — article counts per topic/flavor
|
|
- `GET /api/feed?topic=&flavor=&limit=&offset=` — ranked, filtered articles
|
|
- `GET /api/brief?date=&limit=` — a daily brief (latest if no date)
|
|
- `GET /api/brief-dates` — available brief dates
|
|
- `GET /docs` — interactive OpenAPI documentation
|
|
|
|
The ingestion CLI stays pure-stdlib; only the `web` extra pulls in FastAPI/uvicorn,
|
|
so the two halves can be deployed and upgraded independently.
|
|
|
|
## Deployment
|
|
|
|
The database is never baked into the image — the API and the ingestion CLI share
|
|
one SQLite file via a mounted volume. Run ingestion (`poll`, `classify`,
|
|
`build-brief`) on a schedule against the same file.
|
|
|
|
```bash
|
|
docker build -t goodnews .
|
|
docker run -p 8000:8000 -v /srv/goodnews/data:/data goodnews
|
|
```
|
|
|
|
`GOODNEWS_DB` controls the database path (defaults to `data/goodnews.sqlite3`).
|
|
Put a reverse proxy (Caddy/nginx) in front for TLS once a domain is attached.
|
|
|
|
## Next Steps
|
|
|
|
1. Run the poller for a few days and inspect which sources produce useful candidates.
|
|
2. Add source-level quality notes and deactivate noisy feeds.
|
|
3. Replace or supplement `heuristic-v0` with a local model classifier.
|
|
4. Add a daily brief builder that selects 5 items using scores and source diversity.
|
|
5. Add a small web/API layer once the ingest data looks trustworthy.
|
|
|
|
## Local Model Configuration
|
|
|
|
The `classify` command expects an OpenAI-compatible local chat-completions server.
|
|
|
|
You can pass settings directly:
|
|
|
|
```bash
|
|
python3 -m goodnews classify --base-url http://127.0.0.1:1234/v1 --model gpt-oss --limit 10
|
|
```
|
|
|
|
Or use environment variables:
|
|
|
|
```bash
|
|
export GOODNEWS_LLM_BASE_URL=http://127.0.0.1:1234/v1
|
|
export GOODNEWS_LLM_MODEL=gpt-oss
|
|
python3 -m goodnews classify --limit 10
|
|
```
|
|
|
|
`classify` rewrites the current score/reason row for selected candidates. `rescore` can restore the fast heuristic scores.
|