diff --git a/docs/images-and-visitor-metrics.md b/docs/images-and-visitor-metrics.md
index b2803c3..7d18c81 100644
--- a/docs/images-and-visitor-metrics.md
+++ b/docs/images-and-visitor-metrics.md
@@ -43,6 +43,9 @@ universal thumbnail exemption. So re-hosting waits on a per-source rights basis.
   (`w*h > _MAX_PIXELS`) **before** decode. Originals are never retained.
 - Bounded: hard size cap (default 1 GB, `GOODNEWS_IMG_CACHE_CAP`) with LRU eviction; `.fail` markers
   swept after `_FAIL_TTL_S`. `data/img_cache/` is gitignored (runtime data).
+- **Revocation:** when a source leaves `cache` (set to `remote`/`none` in admin), the endpoint
+  calls `newsimg.purge_source()` to delete that source's re-hosted copies **immediately** — they
+  don't linger on disk. (Setting *to* `cache` just flips the flag; the cycle warms it.)
 
 ## Visitor metrics — Recorded visits vs Engaged readers
 
@@ -59,4 +62,15 @@ A JS-capable bot can trip the visit beacon, so the admin shows two numbers:
 - **Never** counts auto-fired `visit`/`summary_viewed`/`open`, `replace_none`, or game `*_arrival`.
 - Defined by `queries.ENGAGED_EVENT_KINDS`; surfaced as `visitors.engaged_today/d7/d30`.
 
+**Warm-up caveat:** the `engaged` beacon began **2026-06-30**, so rolling windows fill over time —
+a low `engaged_d7`/`engaged_d30` is partly warm-up, NOT proof the gap to recorded visits was all
+bots. Compare `d7` after a full week, `d30` after thirty days. (Admin shows this note inline.)
+
 Privacy unchanged: only a salted `visitor_hash` is stored (no IP, no raw token, no fingerprint).
+
+### Optional (not done) — homepage hero referrer
+For `remote` images, article cards and the share page use ``, so
+the publisher CDN doesn't get the referring URL. The homepage hero (`.news-plate`) is a CSS
+`background-image`, which can't carry that policy, so it leaks the referrer (not the IP — that's
+unavoidable for any remote image). Converting the hero to a real ``
+would make it consistent. Deferred pending an owner decision (touches the cover/contain hero rendering).
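
Review note on the warm-up caveat: the rule "compare `d7` after a full week, `d30` after thirty days" can be checked mechanically. The sketch below is only an illustration of that rule, not code from this repository. `ENGAGED_BEACON_START`, `window_is_warm`, and `warm_windows` are hypothetical names, and it assumes a window counts as full once that many days have elapsed since the 2026-06-30 start date.

```python
from datetime import date

# Start date taken from the warm-up caveat; the constant name is hypothetical.
ENGAGED_BEACON_START = date(2026, 6, 30)


def window_is_warm(today: date, window_days: int,
                   start: date = ENGAGED_BEACON_START) -> bool:
    """Return True once at least `window_days` days have elapsed since `start`.

    Assumption: "after a full week" means seven elapsed days, so the 7-day
    window is first considered full on 2026-07-07.
    """
    elapsed_days = (today - start).days
    return elapsed_days >= window_days


def warm_windows(today: date) -> dict:
    """Report which rolling windows are safe to compare against recorded visits."""
    return {
        "engaged_d7": window_is_warm(today, 7),
        "engaged_d30": window_is_warm(today, 30),
    }
```

For example, `warm_windows(date(2026, 7, 10))` reports `engaged_d7` as warm and `engaged_d30` as still filling, which matches the intent of the inline admin note.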