Add systemd timer deployment for scheduled ingestion cycle

- deploy/goodnews.service: oneshot unit running 'goodnews cycle' with a generous
  TimeoutStartSec so long classify runs are not killed.
- deploy/goodnews.timer: every 15 min, Persistent=true to catch missed runs.
- deploy/goodnews.env.example: LLM endpoint + DB path for the scheduled run.
- README: scheduling/install docs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -104,6 +104,36 @@ docker run -p 8000:8000 -v /srv/goodnews/data:/data goodnews
 `GOODNEWS_DB` controls the database path (defaults to `data/goodnews.sqlite3`).
 Put a reverse proxy (Caddy/nginx) in front for TLS once a domain is attached.
 
+## Scheduling
+
+A single idempotent command runs the whole pipeline and is safe to invoke as
+often as you like: it only polls sources that are *due* (per each source's
+`poll_interval_minutes`), only classifies articles the model hasn't seen, and
+rebuilds the current day's brief:
+
+```bash
+python3 -m goodnews cycle # poll due -> classify new -> rebuild today's brief
+python3 -m goodnews cycle --force # poll every active source regardless of interval
+python3 -m goodnews cycle --no-classify # skip the LLM step (e.g. model box offline)
+```
+
+A systemd timer runs it every 15 minutes. Unit files live in `deploy/`:
+
+```bash
+sudo install -d /etc/goodnews
+sudo install -m 644 deploy/goodnews.env.example /etc/goodnews/goodnews.env # then edit
+sudo install -m 644 deploy/goodnews.service deploy/goodnews.timer /etc/systemd/system/
+sudo systemctl daemon-reload
+sudo systemctl enable --now goodnews.timer
+
+systemctl list-timers goodnews.timer # when it next runs
+journalctl -u goodnews.service -f # watch cycle output
+```
+
+`/etc/goodnews/goodnews.env` supplies `GOODNEWS_LLM_BASE_URL`, `GOODNEWS_LLM_MODEL`,
+and `GOODNEWS_DB` to the scheduled run. `OnBootSec=2min` starts a cycle shortly after
+every boot; `Persistent=true` only takes effect on `OnCalendar=` timers.
+
 ## Next Steps
 
 1. Run the poller for a few days and inspect which sources produce useful candidates.
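Review note on the README hunk above: the claim that `cycle` is safe to run as often as you like rests on the "due" check. The cycle code is not part of this commit, so the following is only a minimal sketch of what such a check might look like. `Source`, `last_polled_at`, `is_due`, and `due_sources` are hypothetical names; only `poll_interval_minutes` and the `--force` behaviour come from the README text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Source:
    # Hypothetical shape: only poll_interval_minutes is named in the README.
    name: str
    poll_interval_minutes: int
    last_polled_at: Optional[datetime] = None  # None means never polled


def is_due(source: Source, now: datetime, force: bool = False) -> bool:
    """Return True if the source should be polled on this cycle."""
    if force or source.last_polled_at is None:
        return True
    elapsed = now - source.last_polled_at
    return elapsed >= timedelta(minutes=source.poll_interval_minutes)


def due_sources(sources: List[Source], now: datetime, force: bool = False) -> List[Source]:
    """Filter to the sources whose interval has elapsed (or all, if forced)."""
    return [s for s in sources if is_due(s, now, force)]
```

Under this assumption, running the command twice in quick succession polls nothing the second time, because every source's `last_polled_at` is fresh.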
@@ -0,0 +1,6 @@
+# Copy to /etc/goodnews/goodnews.env and adjust. Read by goodnews.service.
+# These let the scheduled cycle reach the local model and the shared database.
+
+GOODNEWS_LLM_BASE_URL=http://192.168.50.100:1234/v1
+GOODNEWS_LLM_MODEL=qwen/qwen3-14b
+GOODNEWS_DB=/home/jay/goodNews/data/goodnews.sqlite3
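Review note on the env file above: before installing an edited copy, it can help to confirm that it still defines the three variables the README lists. The sketch below is a deliberately simplified reader for plain `KEY=VALUE` lines. It is not systemd's parser, which also handles quoting and line continuations. `parse_env_file`, `missing_keys`, and `EXPECTED_KEYS` are names made up for illustration.

```python
from typing import Dict, List

# The three variables the README says the env file supplies.
EXPECTED_KEYS = ("GOODNEWS_LLM_BASE_URL", "GOODNEWS_LLM_MODEL", "GOODNEWS_DB")


def parse_env_file(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and '#' comments."""
    env: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not an assignment; ignore it in this sketch
        env[key.strip()] = value.strip()
    return env


def missing_keys(env: Dict[str, str]) -> List[str]:
    """Return the expected keys that are absent or empty."""
    return [k for k in EXPECTED_KEYS if not env.get(k)]
```

For example, `missing_keys(parse_env_file(open("/etc/goodnews/goodnews.env").read()))` would return an empty list for an unmodified copy of the example file.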
@@ -0,0 +1,21 @@
+[Unit]
+Description=goodNews ingestion cycle (poll due sources, classify, rebuild brief)
+After=network-online.target
+Wants=network-online.target
+
+[Service]
+Type=oneshot
+User=jay
+Group=jay
+WorkingDirectory=/home/jay/goodNews
+# Optional config (LLM endpoint, DB path). The leading '-' makes it non-fatal
+# if the file is absent.
+EnvironmentFile=-/etc/goodnews/goodnews.env
+ExecStart=/home/jay/goodNews/.venv/bin/python -m goodnews cycle
+# A cycle may classify dozens of articles through a local model. Type=oneshot
+# has no start timeout by default; this caps a hung run at 20 minutes.
+TimeoutStartSec=1200
+Nice=10
+
+[Install]
+WantedBy=multi-user.target
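Review note on the service unit above: because `EnvironmentFile=` is optional here, the cycle has to cope with any of the three variables being unset. The application's actual config code is not in this commit, so this is only a sketch of one way it could resolve settings. `Settings` and `load_settings` are hypothetical names, and the only default taken from the source is the README's `data/goodnews.sqlite3`.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

DEFAULT_DB = "data/goodnews.sqlite3"  # default documented in the README


@dataclass(frozen=True)
class Settings:
    db_path: str
    llm_base_url: Optional[str]  # None when the variable is unset
    llm_model: Optional[str]     # None when the variable is unset


def load_settings(environ: Mapping[str, str] = os.environ) -> Settings:
    """Read the GOODNEWS_* variables, falling back to the documented DB default."""
    return Settings(
        db_path=environ.get("GOODNEWS_DB", DEFAULT_DB),
        llm_base_url=environ.get("GOODNEWS_LLM_BASE_URL"),
        llm_model=environ.get("GOODNEWS_LLM_MODEL"),
    )
```

If the application opens the default path as-is, it resolves against the process's working directory, which the unit sets to `/home/jay/goodNews` via `WorkingDirectory=`.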
@@ -0,0 +1,15 @@
+[Unit]
+Description=Run the goodNews ingestion cycle periodically
+
+[Timer]
+# First run shortly after boot, then every 15 minutes. Per-source
+# poll_interval_minutes still governs which feeds are actually hit each tick.
+OnBootSec=2min
+OnUnitActiveSec=15min
+# No effect here: Persistent= applies only to OnCalendar= timers (OnBootSec covers boot).
+Persistent=true
+AccuracySec=1min
+Unit=goodnews.service
+
+[Install]
+WantedBy=timers.target
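Review note on the timer above: the 15-minute tick and each source's `poll_interval_minutes` interact, and a small simulation makes the effect visible. The sketch assumes ticks land exactly 15 minutes apart and that a source is polled whenever at least its interval has elapsed since its last poll. Both are simplifications, and `polled_ticks` is a hypothetical helper, not part of the project.

```python
from typing import List, Optional


def polled_ticks(poll_interval_minutes: int, tick_minutes: int = 15, ticks: int = 9) -> List[int]:
    """Return the indexes of the timer ticks on which a source would be polled."""
    polled: List[int] = []
    last_polled: Optional[int] = None  # minutes since the first tick
    for i in range(ticks):
        now = i * tick_minutes
        if last_polled is None or now - last_polled >= poll_interval_minutes:
            polled.append(i)
            last_polled = now
    return polled


print(polled_ticks(60))  # [0, 4, 8]
print(polled_ticks(20))  # [0, 2, 4, 6, 8]
```

Under these assumptions, an interval that is not a multiple of the tick is rounded up to the next tick: a 20-minute source is effectively polled every 30 minutes.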