Add systemd timer deployment for scheduled ingestion cycle

- deploy/goodnews.service: oneshot unit running 'goodnews cycle' with TimeoutStartSec=1200, so a long classify run gets up to 20 minutes before systemd stops it.
- deploy/goodnews.timer: first run 2 min after boot, then every 15 min. Persistent=true is set, but it only takes effect on OnCalendar= timers.
- deploy/goodnews.env.example: LLM endpoint + DB path for the scheduled run.
- README: scheduling/install docs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -104,6 +104,36 @@ docker run -p 8000:8000 -v /srv/goodnews/data:/data goodnews
`GOODNEWS_DB` controls the database path (defaults to `data/goodnews.sqlite3`).
Put a reverse proxy (Caddy/nginx) in front for TLS once a domain is attached.

## Scheduling

A single idempotent command runs the whole pipeline and is safe to invoke as
often as you like: it only polls sources that are *due* (per each source's
`poll_interval_minutes`), only classifies articles the model hasn't seen, and
rebuilds the current day's brief:

```bash
python3 -m goodnews cycle                 # poll due -> classify new -> rebuild today's brief
python3 -m goodnews cycle --force         # poll every active source regardless of interval
python3 -m goodnews cycle --no-classify   # skip the LLM step (e.g. model box offline)
```
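
The "due" check described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: `Source`, `last_polled_at`, `is_due`, and `due_sources` are hypothetical names, and only `poll_interval_minutes` and the `--force` behavior come from the text. The sketch also ignores whether a source is active.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Source:
    # Hypothetical shape; only poll_interval_minutes is named in the README.
    name: str
    poll_interval_minutes: int
    last_polled_at: Optional[datetime] = None


def is_due(source: Source, now: datetime) -> bool:
    """A source is due if it was never polled or its interval has elapsed."""
    if source.last_polled_at is None:
        return True
    interval = timedelta(minutes=source.poll_interval_minutes)
    return now - source.last_polled_at >= interval


def due_sources(sources, now, force=False):
    """Return the sources to poll on this tick; force mirrors `cycle --force`."""
    return [s for s in sources if force or is_due(s, now)]
```

Because the selection depends only on stored timestamps, running the command more often than a source's interval simply skips that source, which is what makes a fixed 15-minute tick safe.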

A systemd timer runs it every 15 minutes. Unit files live in `deploy/`:

```bash
sudo install -d /etc/goodnews
sudo install -m 644 deploy/goodnews.env.example /etc/goodnews/goodnews.env   # then edit
sudo install -m 644 deploy/goodnews.service deploy/goodnews.timer /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now goodnews.timer

systemctl list-timers goodnews.timer      # when it next runs
journalctl -u goodnews.service -f         # watch cycle output
```

`/etc/goodnews/goodnews.env` supplies `GOODNEWS_LLM_BASE_URL`, `GOODNEWS_LLM_MODEL`,
and `GOODNEWS_DB` to the scheduled run. After a reboot, `OnBootSec=2min` triggers the
first cycle; `Persistent=true` only affects `OnCalendar=` timers and is a no-op here.
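
systemd passes the variables from `EnvironmentFile=` into the process environment, so the cycle can read them like any other environment variable. A minimal sketch of that lookup, assuming a hypothetical `CycleConfig` holder and `load_config` helper (neither is confirmed to exist in the project); the `GOODNEWS_DB` default is the one documented above, and the LLM settings are left unset when absent because the source does not state their defaults:

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class CycleConfig:
    # Hypothetical container for the three documented settings.
    db_path: str
    llm_base_url: Optional[str]
    llm_model: Optional[str]


def load_config(env: Mapping[str, str] = os.environ) -> CycleConfig:
    """Read the GOODNEWS_* settings from an environment-like mapping."""
    return CycleConfig(
        # Documented default: data/goodnews.sqlite3
        db_path=env.get("GOODNEWS_DB", "data/goodnews.sqlite3"),
        # No default is documented for these, so None means "not configured".
        llm_base_url=env.get("GOODNEWS_LLM_BASE_URL"),
        llm_model=env.get("GOODNEWS_LLM_MODEL"),
    )
```

Taking the mapping as a parameter keeps the helper easy to exercise without touching the real environment, for example `load_config({"GOODNEWS_DB": "/tmp/test.sqlite3"})`.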

## Next Steps

1. Run the poller for a few days and inspect which sources produce useful candidates.

@@ -0,0 +1,6 @@
# Copy to /etc/goodnews/goodnews.env and adjust. Read by goodnews.service.
# These let the scheduled cycle reach the local model and the shared database.

GOODNEWS_LLM_BASE_URL=http://192.168.50.100:1234/v1
GOODNEWS_LLM_MODEL=qwen/qwen3-14b
GOODNEWS_DB=/home/jay/goodNews/data/goodnews.sqlite3

@@ -0,0 +1,21 @@
[Unit]
Description=goodNews ingestion cycle (poll due sources, classify, rebuild brief)
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
User=jay
Group=jay
WorkingDirectory=/home/jay/goodNews
# Optional config (LLM endpoint, DB path). The leading '-' makes it non-fatal
# if the file is absent.
EnvironmentFile=-/etc/goodnews/goodnews.env
ExecStart=/home/jay/goodNews/.venv/bin/python -m goodnews cycle
# A cycle may classify dozens of articles through a local model. Type=oneshot
# has no start timeout by default, so cap a hung run at 20 minutes.
TimeoutStartSec=1200
Nice=10

[Install]
WantedBy=multi-user.target

@@ -0,0 +1,15 @@
[Unit]
Description=Run the goodNews ingestion cycle periodically

[Timer]
# First run shortly after boot, then every 15 minutes. Per-source
# poll_interval_minutes still governs which feeds are actually hit each tick.
OnBootSec=2min
OnUnitActiveSec=15min
# No effect with the triggers above: Persistent= applies only to OnCalendar= timers.
Persistent=true
AccuracySec=1min
Unit=goodnews.service

[Install]
WantedBy=timers.target