diff --git a/README.md b/README.md
index e15d62c..7ea9ab6 100644
--- a/README.md
+++ b/README.md
@@ -104,6 +104,36 @@ docker run -p 8000:8000 -v /srv/goodnews/data:/data goodnews
 `GOODNEWS_DB` controls the database path (defaults to `data/goodnews.sqlite3`).
 Put a reverse proxy (Caddy/nginx) in front for TLS once a domain is attached.
 
+## Scheduling
+
+A single idempotent command runs the whole pipeline and is safe to invoke as
+often as you like. It only polls sources that are *due* (per each source's
+`poll_interval_minutes`), only classifies articles the model hasn't seen, and
+rebuilds the current day's brief:
+
+```bash
+python3 -m goodnews cycle                 # poll due -> classify new -> rebuild today's brief
+python3 -m goodnews cycle --force         # poll every active source regardless of interval
+python3 -m goodnews cycle --no-classify   # skip the LLM step (e.g. model box offline)
+```
+
+A systemd timer runs it every 15 minutes. Unit files live in `deploy/`:
+
+```bash
+sudo install -d /etc/goodnews
+sudo install -m 644 deploy/goodnews.env.example /etc/goodnews/goodnews.env   # then edit
+sudo install -m 644 deploy/goodnews.service deploy/goodnews.timer /etc/systemd/system/
+sudo systemctl daemon-reload
+sudo systemctl enable --now goodnews.timer
+
+systemctl list-timers goodnews.timer      # when it next runs
+journalctl -u goodnews.service -f         # watch cycle output
+```
+
+`/etc/goodnews/goodnews.env` supplies `GOODNEWS_LLM_BASE_URL`, `GOODNEWS_LLM_MODEL`,
+and `GOODNEWS_DB` to the scheduled run. The timer's `OnBootSec=2min` starts a cycle
+shortly after boot, so a run missed while the machine was off is caught up.
+
 ## Next Steps
 
 1. Run the poller for a few days and inspect which sources produce useful candidates.
diff --git a/deploy/goodnews.env.example b/deploy/goodnews.env.example
new file mode 100644
index 0000000..e0c0027
--- /dev/null
+++ b/deploy/goodnews.env.example
@@ -0,0 +1,6 @@
+# Copy to /etc/goodnews/goodnews.env and adjust. Read by goodnews.service.
+# These let the scheduled cycle reach the local model and the shared database.
+
+GOODNEWS_LLM_BASE_URL=http://192.168.50.100:1234/v1
+GOODNEWS_LLM_MODEL=qwen/qwen3-14b
+GOODNEWS_DB=/home/jay/goodNews/data/goodnews.sqlite3
diff --git a/deploy/goodnews.service b/deploy/goodnews.service
new file mode 100644
index 0000000..2d3daf0
--- /dev/null
+++ b/deploy/goodnews.service
@@ -0,0 +1,21 @@
+[Unit]
+Description=goodNews ingestion cycle (poll due sources, classify, rebuild brief)
+After=network-online.target
+Wants=network-online.target
+
+[Service]
+Type=oneshot
+User=jay
+Group=jay
+WorkingDirectory=/home/jay/goodNews
+# Optional config (LLM endpoint, DB path). The leading '-' makes it non-fatal
+# if the file is absent.
+EnvironmentFile=-/etc/goodnews/goodnews.env
+ExecStart=/home/jay/goodNews/.venv/bin/python -m goodnews cycle
+# A cycle may classify dozens of articles through a local model. Type=oneshot has
+# no start timeout by default; this cap lets systemd kill a run that hangs.
+TimeoutStartSec=1200
+Nice=10
+
+[Install]
+WantedBy=multi-user.target
diff --git a/deploy/goodnews.timer b/deploy/goodnews.timer
new file mode 100644
index 0000000..f8a7e9c
--- /dev/null
+++ b/deploy/goodnews.timer
@@ -0,0 +1,15 @@
+[Unit]
+Description=Run the goodNews ingestion cycle periodically
+
+[Timer]
+# First run shortly after boot, then every 15 minutes. Per-source
+# poll_interval_minutes still governs which feeds are actually hit each tick.
+OnBootSec=2min
+OnUnitActiveSec=15min
+# Note: Persistent= only takes effect with OnCalendar=; catch-up after downtime comes from OnBootSec=.
+Persistent=true
+AccuracySec=1min
+Unit=goodnews.service
+
+[Install]
+WantedBy=timers.target
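
Reviewer note: the README text in this diff says `cycle` polls only sources that are *due* per `poll_interval_minutes`, and that `--force` polls every active source. The selection code itself is not part of this diff. The following is only a minimal sketch of what such a check might look like. The `Source` record, its `last_polled_at` and `active` fields, and the `due_sources` helper are hypothetical names chosen for illustration, not the project's actual API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class Source:
    # Hypothetical record; only poll_interval_minutes is named in the README.
    name: str
    poll_interval_minutes: int
    last_polled_at: Optional[datetime] = None  # None means never polled
    active: bool = True


def due_sources(sources: List[Source], now: datetime, force: bool = False) -> List[Source]:
    """Return the active sources that should be polled on this tick."""
    due = []
    for src in sources:
        if not src.active:
            continue  # inactive sources are skipped even with force=True
        if force or src.last_polled_at is None:
            due.append(src)
            continue
        interval = timedelta(minutes=src.poll_interval_minutes)
        if now - src.last_polled_at >= interval:
            due.append(src)
    return due


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    feeds = [
        Source("hourly", 60, now - timedelta(minutes=45)),
        Source("quarter-hourly", 15, now - timedelta(minutes=15)),
        Source("new", 30),
    ]
    print([s.name for s in due_sources(feeds, now)])
    print([s.name for s in due_sources(feeds, now, force=True)])
```

Under this sketch, a timer tick that arrives before a source's interval has elapsed simply skips that source, which is why running the command more often than any single interval would be harmless.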