Sources: per-source paywall override (3-state) — fix domain-rule mis-flags

The Articles inspector revealed that paywall detection is domain-coarse:
nytimes.com is flagged, so NY Times Learning's free Word-of-the-Day inherits 🔒.
That flag isn't cosmetic: it deprioritizes the content in feed sort and lead
selection. Add a per-source override so admins can correct it after inspecting.

- sources.paywall_override: NULL (domain rule) | 'free' | 'paywalled'.
- paywall.py: keep the low-level is_paywalled(url) (domain rule); add
  is_paywalled_for_source(url, override) for the EFFECTIVE decision. The domain
  helper is deliberately not patched globally (per Codex), so "domain says X"
  stays distinguishable from "overridden".
- Threaded everywhere ranking/UI touches paywall, via src.paywall_override on the
  shared _ARTICLE_COLUMNS + the source-aware helper: feed sort, /api/since, replace,
  lead selection, Article badge, brief composition (briefs.py), digest, source_health
  (table 🔒), the Articles inspector, and the review/attention check — so ranking and
  UI always agree.
- Endpoint: POST /api/admin/sources/{id}/paywall {override}. Admin UI: a select in the
  inspector header (Use domain rule / Treat as free / Treat as paywalled) plus the basis
  label ("ON (domain)" / "OFF (override)"), updated optimistically so the panel stays open.
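
The 3-state resolution described in the bullets above can be sketched roughly as
follows. This is a minimal illustration, not the code from paywall.py: the
PAYWALLED_DOMAINS set, the subdomain matching, and the paywall_basis helper are
assumptions made for the example; only the names is_paywalled and
is_paywalled_for_source and the override values come from this commit.

```python
from typing import Optional
from urllib.parse import urlparse

# Assumption: a hypothetical domain list; the real rule set is not shown in this commit.
PAYWALLED_DOMAINS = {"nytimes.com"}


def is_paywalled(url: str) -> bool:
    """Low-level domain rule: True if the host is a listed domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in PAYWALLED_DOMAINS)


def is_paywalled_for_source(url: str, override: Optional[str]) -> bool:
    """Effective decision: an explicit override wins, None falls back to the domain rule."""
    if override == "free":
        return False
    if override == "paywalled":
        return True
    return is_paywalled(url)


def paywall_basis(url: str, override: Optional[str]) -> str:
    """Hypothetical helper: builds a basis label such as 'ON (domain)' or 'OFF (override)'."""
    state = "ON" if is_paywalled_for_source(url, override) else "OFF"
    return f"{state} ({'override' if override else 'domain'})"
```

Because is_paywalled stays untouched, a caller can still report what the domain
rule alone says next to the effective value, which is what the inspector summary
does with paywall_domain and paywalled.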

Test: domain rule → paywalled in table+inspector+feed badge; 'free' → off in all
three; validation 422 + 404. 242 pytest + 11 vitest.
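
The validation behaviour the test exercises (422 for an unknown override value,
404 for an unknown source) could look roughly like this framework-free sketch.
The set_paywall_override function, the in-memory sources dict, and the error
messages are hypothetical stand-ins for the real endpoint and database; the
order of the two checks is also an assumption.

```python
from typing import Dict, Optional, Tuple

# The three states of sources.paywall_override described in this commit.
ALLOWED_OVERRIDES = (None, "free", "paywalled")


def set_paywall_override(
    sources: Dict[int, dict], source_id: int, override: Optional[str]
) -> Tuple[int, dict]:
    """Return (status_code, body) the way the endpoint is described as responding."""
    if override not in ALLOWED_OVERRIDES:
        # Reject anything outside the three documented states.
        return 422, {"detail": "override must be null, 'free', or 'paywalled'"}
    if source_id not in sources:
        return 404, {"detail": "unknown source"}
    sources[source_id]["paywall_override"] = override
    return 200, {"override": override}
```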

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
jay
2026-06-12 22:10:44 -04:00
parent 7279b18fdc
commit 2dbe73430c
9 changed files with 130 additions and 28 deletions
@@ -518,3 +518,36 @@ def test_source_articles_inspector(tmp_path, monkeypatch):
    assert tc.get("/api/admin/sources/1/articles?filter=rejected").json()["articles"] == []
    assert len(tc.get("/api/admin/sources/1/articles?filter=no_image").json()["articles"]) == 1
    assert tc.get("/api/admin/sources/999/articles").status_code == 404  # unknown source


def test_source_paywall_override(tmp_path, monkeypatch):
    import sqlite3, os
    app, api = _make(tmp_path, monkeypatch, admin_email="boss@x.com")
    c = sqlite3.connect(os.environ["GOODNEWS_DB"])
    c.execute("INSERT INTO sources (id,name,feed_url,homepage_url,trust_score,content_visible) "
              "VALUES (2,'NYT Learning','http://x/f','https://www.nytimes.com/section/learning',5,1)")
    c.execute("INSERT INTO articles (id,source_id,canonical_url,title,url_hash) "
              "VALUES (2,2,'https://www.nytimes.com/learning/word-of-the-day','WOTD','h2')")
    c.execute("INSERT INTO article_scores (article_id,accepted,topic) VALUES (2,1,'culture')")
    c.commit(); c.close()
    tc = _signin(app, api, "boss@x.com")

    def feed_badge():
        return next(a for a in tc.get("/api/feed?source_id=2").json()["items"] if a["id"] == 2)["paywalled"]

    # domain rule: nytimes.com → paywalled in table, inspector, AND feed badge (all agree)
    assert _src(tc, 2)["paywalled"] is True
    assert tc.get("/api/admin/sources/2/articles").json()["summary"]["paywalled"] is True
    assert feed_badge() is True
    # override 'free' (the NYT Learning fix) → effective OFF everywhere
    assert tc.post("/api/admin/sources/2/paywall", json={"override": "free"}).json()["override"] == "free"
    assert _src(tc, 2)["paywalled"] is False
    summ = tc.get("/api/admin/sources/2/articles").json()["summary"]
    assert summ["paywalled"] is False and summ["paywall_domain"] is True and summ["paywall_override"] == "free"
    assert feed_badge() is False  # ranking/badge now agree it's free
    # back to domain rule, and the 'paywalled' override
    assert tc.post("/api/admin/sources/2/paywall", json={"override": None}).json()["override"] is None
    assert _src(tc, 2)["paywalled"] is True
    # validation + 404
    assert tc.post("/api/admin/sources/2/paywall", json={"override": "bogus"}).status_code == 422
    assert tc.post("/api/admin/sources/999/paywall", json={"override": "free"}).status_code == 404