Add systemd timer deployment for scheduled ingestion cycle

- deploy/goodnews.service: oneshot unit running 'goodnews cycle' with a
  generous TimeoutStartSec so long classify runs are not killed.
- deploy/goodnews.timer: every 15 min, Persistent=true to catch missed runs.
- deploy/goodnews.env.example: LLM endpoint + DB path for the scheduled run.
- README: scheduling/install docs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
jay
2026-05-30 14:28:30 +00:00
parent 2414fd3ccb
commit cef272a8fc
4 changed files with 72 additions and 0 deletions
@@ -104,6 +104,36 @@ docker run -p 8000:8000 -v /srv/goodnews/data:/data goodnews
`GOODNEWS_DB` controls the database path (defaults to `data/goodnews.sqlite3`).
Put a reverse proxy (Caddy/nginx) in front for TLS once a domain is attached.
## Scheduling
A single idempotent command runs the whole pipeline and is safe to invoke as
often as you like. It polls only the sources that are *due* (per each source's
`poll_interval_minutes`), classifies only articles the model hasn't seen, and
rebuilds the current day's brief:
```bash
python3 -m goodnews cycle # poll due -> classify new -> rebuild today's brief
python3 -m goodnews cycle --force # poll every active source regardless of interval
python3 -m goodnews cycle --no-classify # skip the LLM step (e.g. model box offline)
```
A systemd timer runs it every 15 minutes. Unit files live in `deploy/`:
```bash
sudo install -d /etc/goodnews
sudo install -m 644 deploy/goodnews.env.example /etc/goodnews/goodnews.env # then edit
sudo install -m 644 deploy/goodnews.service deploy/goodnews.timer /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now goodnews.timer
systemctl list-timers goodnews.timer # when it next runs
journalctl -u goodnews.service -f # watch cycle output
```
`/etc/goodnews/goodnews.env` supplies `GOODNEWS_LLM_BASE_URL`, `GOODNEWS_LLM_MODEL`,
and `GOODNEWS_DB` to the scheduled run. The timer uses `Persistent=true`, so a
run missed while the machine was off is caught up on the next boot.
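For orientation, here is a rough sketch of what the two units could look like. It is not a copy of the files in `deploy/`, which remain authoritative. The `/opt/goodnews` checkout path, the `/usr/bin/python3` interpreter path, the `1h` timeout, and the `OnCalendar=*:0/15` expression are assumptions chosen for illustration. Only `Type=oneshot`, the 15-minute cadence, `Persistent=true`, and the env file location come from the description above. The snippet writes the sketch to a scratch directory so nothing under `/etc` is touched:

```bash
# Sketch only: the real units ship in deploy/ and may differ.
# Hypothetical values: /opt/goodnews, /usr/bin/python3, TimeoutStartSec=1h.
sketch_dir="$(mktemp -d)"

# Oneshot service: runs a single cycle and exits.
cat > "$sketch_dir/goodnews.service" <<'EOF'
[Unit]
Description=goodnews ingestion cycle

[Service]
Type=oneshot
EnvironmentFile=/etc/goodnews/goodnews.env
WorkingDirectory=/opt/goodnews
ExecStart=/usr/bin/python3 -m goodnews cycle
TimeoutStartSec=1h
EOF

# Timer: fires at minutes 0, 15, 30 and 45 of every hour and, because of
# Persistent=true, triggers once on activation if a run was missed.
cat > "$sketch_dir/goodnews.timer" <<'EOF'
[Unit]
Description=Run the goodnews cycle every 15 minutes

[Timer]
OnCalendar=*:0/15
Persistent=true

[Install]
WantedBy=timers.target
EOF

echo "sketch units written to $sketch_dir"
```

A timer activates the service of the same name by default, so `goodnews.timer` needs no explicit `Unit=` line. `Persistent=true` only has an effect on `OnCalendar=` timers, which is why the sketch uses a calendar expression rather than a monotonic interval such as `OnUnitActiveSec=`.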
## Next Steps
1. Run the poller for a few days and inspect which sources produce useful candidates.