Admin: source Articles inspector (verify metrics against real evidence)

New per-row "Articles" button on the Sources table expands a read-only inline
panel of the source's ACTUAL ingested articles — so the automated metrics
(paywall/image/acceptance/duplicate) can be verified against evidence instead of
trusted blind. Distinct from "Check" (which re-samples the LIVE feed for
would-pass quality); this shows what's already in the DB, which is what the table
metrics are computed from.

- Backend: GET /api/admin/sources/{id}/articles?filter=&limit=&offset= (admin,
  read-only). queries.source_articles + source_articles_summary — per article:
  title, url, date, accepted, reason (the "why"), topic/flavor, paywalled
  (domain rule), has_image, duplicate. Summary = counts + source-level paywall
  rule.
- Frontend: expandable panel with a summary header ("27 ingested · 18 accepted
  · … · paywall rule: ON (domain)"), filter chips (All/Accepted/Rejected/No
  image/Duplicates), compact rows with title→link + badges + reason, Load more.
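
The paging contract of the endpoint above (`limit`/`offset`, a `summary` sent only with the first page, and `has_more` as the stop signal) can be sketched from the caller's side. This is a minimal sketch, not the project's client: `fetch_all` and `fake_backend` are hypothetical names, and the fake backend only mimics the response shape (`articles`, `summary`, `has_more`) that this commit returns.

```python
from typing import Callable, List, Optional, Tuple


def fetch_all(fetch_page: Callable[[int, int], dict],
              limit: int = 25) -> Tuple[List[dict], Optional[dict]]:
    """Walk the inspector page by page until the server reports no more rows."""
    articles: List[dict] = []
    summary: Optional[dict] = None
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        if page.get("summary") is not None:  # only present on the first page
            summary = page["summary"]
        articles.extend(page["articles"])
        offset += len(page["articles"])
        # Stop on an explicit "no more", or on an empty page as a safety net.
        if not page["has_more"] or not page["articles"]:
            break
    return articles, summary


def fake_backend(rows: List[dict]) -> Callable[[int, int], dict]:
    """Hypothetical stand-in for the HTTP call; mirrors the response shape only."""
    def fetch_page(limit: int, offset: int) -> dict:
        chunk = rows[offset:offset + limit]
        return {
            "articles": chunk,
            "summary": {"total": len(rows)} if offset == 0 else None,
            "has_more": len(chunk) == limit,  # a full page suggests more may follow
        }
    return fetch_page
```

Because `has_more` is inferred from a full page, a source whose article count is an exact multiple of `limit` produces one extra, empty page; the loop above tolerates that.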

So "100% paywall" or "0% images" becomes clickable evidence: open two articles
to tell a real paywall from a mis-flagged domain, or a true image gap from an
enrichment failure. Test: test_source_articles_inspector. 241 pytest + 11 vitest.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
jay
2026-06-12 21:37:51 -04:00
parent 64339aafb0
commit ddcfab3a11
9 changed files with 445 additions and 61 deletions
+38 -7
@@ -193,7 +193,7 @@
{#each Array(length) as _, c (c)} {#each Array(length) as _, c (c)}
{@const ch = g ? g[c] : (r === guesses.length ? current[c] : '')} {@const ch = g ? g[c] : (r === guesses.length ? current[c] : '')}
<div class="tile {cs ? cs[c] : ''}" class:filled={!!ch} <div class="tile {cs ? cs[c] : ''}" class:filled={!!ch}
style={cs ? `animation-delay:${c * 0.08}s` : ''}>{(ch || '').toUpperCase()}</div> style={cs ? `--d:${c * 0.08}s` : ''}>{(ch || '').toUpperCase()}</div>
{/each} {/each}
</div> </div>
{/each} {/each}
@@ -253,22 +253,53 @@
aspect-ratio: 1; display: flex; align-items: center; justify-content: center; aspect-ratio: 1; display: flex; align-items: center; justify-content: center;
border: 2px solid var(--line); border-radius: 8px; font-family: var(--label); border: 2px solid var(--line); border-radius: 8px; font-family: var(--label);
font-weight: 700; font-size: 1.5rem; color: var(--ink); text-transform: uppercase; font-weight: 700; font-size: 1.5rem; color: var(--ink); text-transform: uppercase;
background: var(--surface); background: var(--surface); position: relative; overflow: hidden;
} }
.tile.filled { border-color: #b7c0cb; } .tile.filled { border-color: #b7c0cb; }
.tile.correct { background: #4a9d6e; border-color: #4a9d6e; color: #fff; } /* Judged tiles set like glazed enamel: a soft top-light gradient over the
.tile.present { background: #d8b24a; border-color: #d8b24a; color: #fff; } colour, an inner bevel, and a little lift off the board. Pending tiles stay
.tile.absent { background: #9aa6b2; border-color: #9aa6b2; color: #fff; } flat on purpose — depth marks the moment a letter is settled. */
.tile.correct, .tile.present, .tile.absent { color: #fff; text-shadow: 0 1px 2px rgba(20, 30, 25, 0.22); }
.tile.correct {
background: linear-gradient(180deg, rgba(255, 255, 255, 0.2), rgba(255, 255, 255, 0) 45%),
linear-gradient(165deg, #56ac7c, #4a9d6e 55%, #3e8a5e);
border-color: #3e8a5e;
box-shadow: inset 0 1px 0 rgba(255, 255, 255, 0.32), inset 0 -3px 5px rgba(22, 64, 42, 0.2),
0 2px 5px rgba(58, 125, 86, 0.28);
}
.tile.present {
background: linear-gradient(180deg, rgba(255, 255, 255, 0.2), rgba(255, 255, 255, 0) 45%),
linear-gradient(165deg, #e2c163, #d8b24a 55%, #c29c38);
border-color: #c29c38;
box-shadow: inset 0 1px 0 rgba(255, 255, 255, 0.35), inset 0 -3px 5px rgba(122, 92, 22, 0.2),
0 2px 5px rgba(184, 148, 58, 0.28);
}
.tile.absent {
background: linear-gradient(180deg, rgba(255, 255, 255, 0.16), rgba(255, 255, 255, 0) 45%),
linear-gradient(165deg, #a7b2bd, #9aa6b2 55%, #87939f);
border-color: #87939f;
box-shadow: inset 0 1px 0 rgba(255, 255, 255, 0.26), inset 0 -3px 5px rgba(50, 60, 70, 0.16),
0 2px 4px rgba(110, 122, 134, 0.24);
}
/* One quiet glint sweeps each tile just after its flip lands — once, not a loop. */
.tile.correct::after, .tile.present::after, .tile.absent::after {
content: ''; position: absolute; inset: 0; pointer-events: none;
background: linear-gradient(115deg, transparent 38%, rgba(255, 255, 255, 0.38) 50%, transparent 62%);
transform: translateX(-130%);
animation: sheen 0.65s ease-out both;
animation-delay: calc(var(--d, 0s) + 0.32s);
}
/* Juice: a tile pops as you type; the row reveals with a staggered bounce when /* Juice: a tile pops as you type; the row reveals with a staggered bounce when
you submit; the row shakes on an invalid word. */ you submit; the row shakes on an invalid word. */
.tile.filled:not(.correct):not(.present):not(.absent) { animation: pop 0.13s ease; } .tile.filled:not(.correct):not(.present):not(.absent) { animation: pop 0.13s ease; }
.tile.correct, .tile.present, .tile.absent { animation: reveal 0.34s ease both; } .tile.correct, .tile.present, .tile.absent { animation: reveal 0.34s ease both; animation-delay: var(--d, 0s); }
.row.shake { animation: shake 0.4s ease; } .row.shake { animation: shake 0.4s ease; }
@keyframes pop { 0% { transform: scale(1); } 45% { transform: scale(1.09); } 100% { transform: scale(1); } } @keyframes pop { 0% { transform: scale(1); } 45% { transform: scale(1.09); } 100% { transform: scale(1); } }
@keyframes reveal { 0% { transform: scale(0.5); opacity: 0.3; } 55% { transform: scale(1.12); } 100% { transform: scale(1); opacity: 1; } } @keyframes reveal { 0% { transform: scale(0.5); opacity: 0.3; } 55% { transform: scale(1.12); } 100% { transform: scale(1); opacity: 1; } }
@keyframes sheen { to { transform: translateX(130%); } }
@keyframes shake { 0%, 100% { transform: translateX(0); } 20% { transform: translateX(-7px); } 40% { transform: translateX(7px); } 60% { transform: translateX(-5px); } 80% { transform: translateX(5px); } } @keyframes shake { 0%, 100% { transform: translateX(0); } 20% { transform: translateX(-7px); } 40% { transform: translateX(7px); } 60% { transform: translateX(-5px); } 80% { transform: translateX(5px); } }
@media (prefers-reduced-motion: reduce) { .tile, .row { animation: none !important; } } @media (prefers-reduced-motion: reduce) { .tile, .tile::after, .row { animation: none !important; } }
.flash { .flash {
text-align: center; background: var(--ink); color: #fff; border-radius: 8px; text-align: center; background: var(--ink); color: #fff; border-radius: 8px;
padding: 7px 14px; width: fit-content; margin: 0 auto 12px; font-size: 0.86rem; padding: 7px 14px; width: fit-content; margin: 0 auto 12px; font-size: 0.86rem;
@@ -16,7 +16,8 @@
let foundWords = $state([]); // {word, cells:[[r,c]], ci} let foundWords = $state([]); // {word, cells:[[r,c]], ci}
let sel = $state([]); // current selection cells let sel = $state([]); // current selection cells
let selecting = false; let selecting = false;
let startTime = 0; let playedMs = 0; // accumulated ACTIVE play time (closed segments)
let segStart = 0; // wall-clock start of the open segment (0 = paused)
let resultMs = $state(0); let resultMs = $state(0);
let best = $state(0); let best = $state(0);
let loading = $state(true); let loading = $state(true);
@@ -63,16 +64,44 @@
return m; return m;
}); });
// --- the clock counts ACTIVE play only -----------------------------------
// Wall-clock timing made "finish in one sitting" feel mandatory — the
// opposite of calm. The clock runs only while the puzzle is on screen
// (tab visible, window focused, game unfinished); stepping away pauses it,
// coming back resumes it, and several sittings simply add up.
function playedNow() { return playedMs + (segStart ? Date.now() - segStart : 0); }
function pauseClock(save = true) {
if (!segStart) return;
playedMs += Date.now() - segStart; segStart = 0;
if (save) persist(); // don't lose the segment if the tab dies
}
function resumeClock() {
if (!segStart && !loading && status === 'playing' && !document.hidden) segStart = Date.now();
}
$effect(() => {
const onVis = () => (document.hidden ? pauseClock() : resumeClock());
const onAway = () => pauseClock();
document.addEventListener('visibilitychange', onVis);
window.addEventListener('pagehide', onAway);
window.addEventListener('blur', onAway);
window.addEventListener('focus', onVis);
return () => {
document.removeEventListener('visibilitychange', onVis);
window.removeEventListener('pagehide', onAway);
window.removeEventListener('blur', onAway);
window.removeEventListener('focus', onVis);
};
});
async function load() { async function load() {
const seq = ++loadSeq; // stale-load guard for rapid size switches const seq = ++loadSeq; // stale-load guard for rapid size switches
loading = true; ready = false; loading = true; ready = false;
foundWords = []; sel = []; resultMs = 0; startTime = 0; foundWords = []; sel = []; resultMs = 0; playedMs = 0; segStart = 0;
try { try {
const p = await getJSON('/api/puzzle/wordsearch?variant=' + size); const p = await getJSON('/api/puzzle/wordsearch?variant=' + size);
if (seq !== loadSeq) return; // a newer size was selected — abandon if (seq !== loadSeq) return; // a newer size was selected — abandon
theme = p.theme; words = p.words; grid = p.grid; date = p.date; theme = p.theme; words = p.words; grid = p.grid; date = p.date;
restore(); restore();
if (!startTime) startTime = Date.now();
try { best = JSON.parse(localStorage.getItem(bestKey) || '0'); } catch { best = 0; } try { best = JSON.parse(localStorage.getItem(bestKey) || '0'); } catch { best = 0; }
} catch { } catch {
if (seq !== loadSeq) return; if (seq !== loadSeq) return;
@@ -80,6 +109,7 @@
} }
if (seq !== loadSeq) return; if (seq !== loadSeq) return;
loading = false; loading = false;
resumeClock();
requestAnimationFrame(() => (ready = true)); requestAnimationFrame(() => (ready = true));
// Reconcile with the server in the background (signed-in only): pull any // Reconcile with the server in the background (signed-in only): pull any
// progress from another device, and pull the cross-device best time. // progress from another device, and pull the cross-device best time.
@@ -103,14 +133,14 @@
const s = JSON.parse(localStorage.getItem(stateKey) || 'null'); const s = JSON.parse(localStorage.getItem(stateKey) || 'null');
if (s && Array.isArray(s.foundWords)) { if (s && Array.isArray(s.foundWords)) {
foundWords = validFinds(s.foundWords); foundWords = validFinds(s.foundWords);
startTime = s.startTime || 0; playedMs = s.played || 0; // pre-"active clock" saves restart at 0:00 — the kind direction
resultMs = foundWords.length === words.length ? (s.ms || 0) : 0; resultMs = foundWords.length === words.length ? (s.ms || 0) : 0;
} }
} catch { /* ignore */ } } catch { /* ignore */ }
onstatus?.(summary()); onstatus?.(summary());
} }
function persist() { function persist() {
try { localStorage.setItem(stateKey, JSON.stringify({ foundWords, startTime, ms: resultMs, status })); } try { localStorage.setItem(stateKey, JSON.stringify({ foundWords, played: playedNow(), ms: resultMs, status })); }
catch { /* ignore */ } catch { /* ignore */ }
onstatus?.(summary()); onstatus?.(summary());
} }
@@ -125,13 +155,17 @@
} }
// renumber colours by find order so overlap blends stay consistent // renumber colours by find order so overlap blends stay consistent
foundWords = foundWords.map((fw, i) => ({ ...fw, ci: i % PALETTE.length })); foundWords = foundWords.map((fw, i) => ({ ...fw, ci: i % PALETTE.length }));
if (merged.startTime && (!startTime || merged.startTime < startTime)) startTime = merged.startTime; // another device may have accumulated more active time — credit the larger
if ((merged.played || 0) > playedNow()) {
playedMs = merged.played;
if (segStart) segStart = Date.now();
}
if (foundWords.length === words.length && merged.ms) resultMs = Math.min(resultMs || merged.ms, merged.ms); if (foundWords.length === words.length && merged.ms) resultMs = Math.min(resultMs || merged.ms, merged.ms);
persist(); persist();
} }
async function syncNow() { async function syncNow() {
const d = date, sz = size; // pin against a size switch mid-flight const d = date, sz = size; // pin against a size switch mid-flight
const merged = await pushGameState('wordsearch', sz, d, { foundWords, startTime, ms: resultMs }); const merged = await pushGameState('wordsearch', sz, d, { foundWords, played: playedNow(), ms: resultMs });
if (d === date && sz === size) adopt(merged); // ignore if the user switched away if (d === date && sz === size) adopt(merged); // ignore if the user switched away
} }
function syncSoon() { clearTimeout(syncTimer); syncTimer = setTimeout(syncNow, 1200); } function syncSoon() { clearTimeout(syncTimer); syncTimer = setTimeout(syncNow, 1200); }
@@ -144,7 +178,7 @@
function down(e) { function down(e) {
if (status === 'done') return; if (status === 'done') return;
selecting = true; selecting = true;
if (!startTime) startTime = Date.now(); resumeClock(); // safety net if a focus event was missed
sel = [cellAt(e)]; sel = [cellAt(e)];
gridEl.setPointerCapture?.(e.pointerId); gridEl.setPointerCapture?.(e.pointerId);
e.preventDefault(); e.preventDefault();
@@ -171,7 +205,8 @@
} }
function finish() { function finish() {
resultMs = startTime ? Date.now() - startTime : 0; pauseClock(false); // close the open segment; persist follows in evaluate()
resultMs = playedMs;
if (resultMs && (!best || resultMs < best)) { if (resultMs && (!best || resultMs < best)) {
best = resultMs; best = resultMs;
try { localStorage.setItem(bestKey, JSON.stringify(best)); } catch { /* ignore */ } try { localStorage.setItem(bestKey, JSON.stringify(best)); } catch { /* ignore */ }
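
The active-play clock in the word search diff above keeps a total of closed segments (`playedMs`) plus one optional open segment (`segStart`). A rough Python port of that bookkeeping is sketched below, assuming timestamps are passed in explicitly so the behaviour is deterministic; the class name `ActiveClock` is hypothetical and `None` stands in for the `0 = paused` sentinel used in the Svelte code.

```python
from typing import Optional


class ActiveClock:
    """Accumulates only the time between resume() and pause() calls."""

    def __init__(self, played_ms: int = 0) -> None:
        self.played_ms = played_ms            # sum of closed segments
        self.seg_start: Optional[int] = None  # start of the open segment, None = paused

    def resume(self, now_ms: int) -> None:
        # Resuming an already running clock must not restart the open segment.
        if self.seg_start is None:
            self.seg_start = now_ms

    def pause(self, now_ms: int) -> None:
        # Close the open segment, if any, and bank its duration.
        if self.seg_start is not None:
            self.played_ms += now_ms - self.seg_start
            self.seg_start = None

    def played_now(self, now_ms: int) -> int:
        # Banked time plus whatever the open segment has accrued so far.
        open_part = now_ms - self.seg_start if self.seg_start is not None else 0
        return self.played_ms + open_part
```

Time spent between a `pause` and the next `resume` never enters the total, which is why several sittings simply add up.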
+87
@@ -263,6 +263,27 @@
} }
function dismissCheck(s) { s._check = null; s._checkErr = ''; } function dismissCheck(s) { s._check = null; s._checkErr = ''; }
// --- Source article inspector: the real articles behind the metrics ---
async function toggleArticles(s) {
if (s._showArts) { s._showArts = false; return; }
s._showArts = true;
if (!s._arts) await loadArticles(s, 'all', true);
}
async function loadArticles(s, filter, reset) {
s._artBusy = true; s._artErr = '';
if (reset) { s._artFilter = filter; s._artOffset = 0; s._arts = []; }
try {
const q = `filter=${s._artFilter}&limit=25&offset=${s._artOffset}`;
const r = await getJSON(`/api/admin/sources/${s.id}/articles?${q}`);
s._arts = reset ? r.articles : [...(s._arts || []), ...r.articles];
if (r.summary) s._artSummary = r.summary;
s._artMore = r.has_more;
s._artOffset += r.articles.length;
} catch (e) { s._artErr = e?.message || 'Could not load articles.'; }
finally { s._artBusy = false; }
}
const ART_FILTERS = [['all', 'All'], ['accepted', 'Accepted'], ['rejected', 'Rejected'], ['no_image', 'No image'], ['duplicates', 'Duplicates']];
// --- Source candidates: supervised "add a source" pipeline --- // --- Source candidates: supervised "add a source" pipeline ---
let candidates = $state([]); let candidates = $state([]);
let newFeedUrl = $state(''); let newFeedUrl = $state('');
@@ -677,6 +698,7 @@
<button class="act" onclick={() => (s.review_flag ? clearReview(s) : openFlag(s))}>{s.review_flag ? 'Clear' : 'Flag'}</button> <button class="act" onclick={() => (s.review_flag ? clearReview(s) : openFlag(s))}>{s.review_flag ? 'Clear' : 'Flag'}</button>
<button class="act" onclick={() => toggleVisible(s)}>{s.content_visible ? 'Hide' : 'Show'}</button> <button class="act" onclick={() => toggleVisible(s)}>{s.content_visible ? 'Hide' : 'Show'}</button>
<button class="act" title="Read-only spot-check of the live feed" onclick={() => checkSource(s)} disabled={s._checking}>{s._checking ? 'Checking…' : 'Check'}</button> <button class="act" title="Read-only spot-check of the live feed" onclick={() => checkSource(s)} disabled={s._checking}>{s._checking ? 'Checking…' : 'Check'}</button>
<button class="act" title="Inspect this source's real ingested articles" onclick={() => toggleArticles(s)}>{s._showArts ? 'Hide' : 'Articles'}</button>
</td> </td>
</tr> </tr>
{#if s._checking || s._check || s._checkErr} {#if s._checking || s._check || s._checkErr}
@@ -698,6 +720,49 @@
</td> </td>
</tr> </tr>
{/if} {/if}
{#if s._showArts}
<tr class="artrow">
<td colspan="10">
{#if s._artSummary}
<div class="artsum">
<strong>{s._artSummary.total}</strong> ingested · {s._artSummary.accepted} accepted ·
{s._artSummary.rejected} rejected · {s._artSummary.no_image} no image ·
{s._artSummary.duplicates} dup ·
paywall rule: <span class="pwrule" class:on={s._artSummary.paywalled}>{s._artSummary.paywalled ? 'ON (domain)' : 'off'}</span>
</div>
{/if}
<div class="artfilters">
{#each ART_FILTERS as [key, label] (key)}
<button class="chip sm" class:on={(s._artFilter || 'all') === key} onclick={() => loadArticles(s, key, true)}>{label}</button>
{/each}
<button class="act" onclick={() => (s._showArts = false)}>close</button>
</div>
{#if s._artErr}<p class="cerr">{s._artErr}</p>{/if}
{#if s._arts?.length}
<ul class="artlist">
{#each s._arts as a (a.id)}
<li>
<div class="art-row">
<a class="art-title" href={a.url} target="_blank" rel="noopener">{a.title}</a>
{#if a.accepted === 1}<span class="badge ok">accepted</span>{:else if a.accepted === 0}<span class="badge no">rejected</span>{/if}
{#if a.paywalled}<span class="pw" title="domain paywall rule">🔒</span>{/if}
{#if !a.has_image}<span class="art-flag" title="no image extracted">no img</span>{/if}
{#if a.duplicate}<span class="art-flag" title="marked duplicate">dup</span>{/if}
{#if a.topic}<span class="art-cat">{a.topic}</span>{/if}
<span class="art-when">{a.published_at ? fdate(a.published_at) : ''}</span>
</div>
{#if a.reason}<div class="art-reason">{a.reason}</div>{/if}
</li>
{/each}
</ul>
{#if s._artMore}<button class="act more" onclick={() => loadArticles(s, s._artFilter, false)} disabled={s._artBusy}>{s._artBusy ? 'Loading…' : 'Load more'}</button>{/if}
{:else if !s._artBusy}
<p class="muted small">No articles{s._artFilter && s._artFilter !== 'all' ? ' match this filter' : ' yet'}.</p>
{/if}
{#if s._artBusy && !s._arts?.length}<p class="muted small">Loading articles…</p>{/if}
</td>
</tr>
{/if}
{:else} {:else}
<tr><td colspan="10" class="srcempty">{srcSearch.trim() ? `No sources match “${srcSearch.trim()}.` : 'No sources in this view.'}</td></tr> <tr><td colspan="10" class="srcempty">{srcSearch.trim() ? `No sources match “${srcSearch.trim()}.` : 'No sources in this view.'}</td></tr>
{/each} {/each}
@@ -1188,6 +1253,28 @@
.chkex { margin-top: 5px; color: var(--ink); } .chkex { margin-top: 5px; color: var(--ink); }
.chkex .chklbl { color: var(--muted); } .chkex .chklbl { color: var(--muted); }
.chkex.chkrej { color: var(--muted); } .chkex.chkrej { color: var(--muted); }
/* Source article inspector */
.srctable tr.artrow td { background: var(--bg); font-size: 0.84rem; padding: 10px 12px; }
.artsum { color: var(--ink); margin-bottom: 8px; }
.artsum .pwrule { color: var(--muted); font-weight: 600; }
.artsum .pwrule.on { color: #9a3b3b; }
.artfilters { display: flex; gap: 6px; flex-wrap: wrap; align-items: center; margin-bottom: 8px; }
.chip.sm { font-size: 0.74rem; padding: 3px 10px; }
.artlist { list-style: none; margin: 0; padding: 0; display: flex; flex-direction: column; gap: 7px; max-height: 360px; overflow-y: auto; }
.artlist li { border-bottom: 1px solid var(--line); padding-bottom: 6px; }
.art-row { display: flex; align-items: center; gap: 8px; flex-wrap: wrap; }
.art-title { color: var(--accent-deep); font-weight: 600; text-decoration: none; }
.art-title:hover { text-decoration: underline; }
.art-row .badge { font-size: 0.66rem; font-weight: 700; text-transform: uppercase; letter-spacing: 0.04em; padding: 1px 7px; border-radius: 999px; }
.badge.ok { background: #e3efe4; color: #3f7048; }
.badge.no { background: #f3e0e0; color: #9a3b3b; }
.art-row .pw { font-size: 0.78rem; }
.art-flag { font-size: 0.7rem; color: var(--muted); border: 1px solid var(--line); border-radius: 999px; padding: 0 7px; }
.art-cat { font-size: 0.72rem; color: var(--muted); text-transform: capitalize; }
.art-when { font-size: 0.72rem; color: var(--muted); margin-left: auto; white-space: nowrap; }
.art-reason { font-size: 0.76rem; color: var(--muted); font-style: italic; margin-top: 2px; }
.act.more { margin-top: 8px; }
.srctable .rowactions { white-space: nowrap; } .srctable .rowactions { white-space: nowrap; }
.srctable .rowactions .act { .srctable .rowactions .act {
background: none; border: 1px solid var(--line); color: var(--accent-deep); background: none; border: 1px solid var(--line); color: var(--accent-deep);
+18
@@ -1146,6 +1146,24 @@ def create_app() -> FastAPI:
url = src["feed_url"] url = src["feed_url"]
return _preview_or_502(url) # safe fetch, no DB connection held return _preview_or_502(url) # safe fetch, no DB connection held
@app.get("/api/admin/sources/{sid}/articles")
def admin_source_articles(sid: int, request: Request, filter: str = "all",
limit: int = 25, offset: int = 0) -> dict:
# Read-only inspector: the REAL ingested articles behind a source's metrics,
# so paywall/image/acceptance/duplicate signals can be verified against evidence.
limit = max(1, min(int(limit), 100))
offset = max(0, int(offset))
with get_conn() as conn:
_require_admin(conn, request)
if not conn.execute("SELECT 1 FROM sources WHERE id = ?", (sid,)).fetchone():
raise HTTPException(status_code=404, detail="source not found")
arts = queries.source_articles(conn, sid, filter, limit, offset)
return {
"articles": arts,
"summary": queries.source_articles_summary(conn, sid) if offset == 0 else None,
"has_more": len(arts) == limit,
}
# --- Source candidates (supervised add-a-source pipeline) ---------------- # --- Source candidates (supervised add-a-source pipeline) ----------------
def _candidate_dict(row) -> dict: def _candidate_dict(row) -> dict:
+79 -34
@@ -518,42 +518,79 @@ def generate_wordsearch_puzzle(conn: sqlite3.Connection, date: str, client=None)
return json.loads(row["payload_json"]) return json.loads(row["payload_json"])
_WS_CROSS_TARGET = 0.5 # aim: about half the placements cross an existing word
def _zone(r: int, c: int, size: int) -> tuple[int, int]:
"""Which quadrant a cell falls in — coarse occupancy used to spread words."""
return (r * 2 // size, c * 2 // size)
def _place_words(words: list[str], size: int, seed: int) -> tuple[list[list[str | None]], list[tuple[str, list[tuple[int, int]]]]]:
"""Core placement (date-seeded, deterministic). Returns the letter grid (None
where unfilled) and [(word, cells)] for every word genuinely placed.
Interlock is a TARGET, not a side effect: each word either (a) must cross an
already-placed word when crossings are running below ~half of placements
or (b) anchors in open ground. Both modes steer toward the least crowded /
least developed quadrant, so crossings attach to lonely words at the edges of
structure rather than thickening one knot, and anchors spread across the
board. All valid spots are enumerated (the grid is tiny) earlier random
sampling kept missing the rare crossing spots, which is why grids came out
as disconnected "clean" words."""
rng = random.Random(seed)
grid: list[list[str | None]] = [[None] * size for _ in range(size)]
zone_fill = {(zr, zc): 0 for zr in (0, 1) for zc in (0, 1)}
placements: list[tuple[str, list[tuple[int, int]]]] = []
crossed = 0
for word in sorted(words, key=len, reverse=True):
n = len(word)
if n > size:
continue
cands = [] # (overlap, cells) over every legal placement
for dr, dc in _DIRS:
for r0 in range(size):
for c0 in range(size):
if not (0 <= r0 + dr * (n - 1) < size and 0 <= c0 + dc * (n - 1) < size):
continue
cells = [(r0 + dr * i, c0 + dc * i) for i in range(n)]
if not all(grid[r][c] in (None, word[i]) for i, (r, c) in enumerate(cells)):
continue
cands.append((sum(1 for i, (r, c) in enumerate(cells) if grid[r][c] == word[i]), cells))
if not cands:
continue
crossing = [t for t in cands if t[0] > 0]
want_cross = bool(crossing) and crossed < _WS_CROSS_TARGET * len(placements)
scored = [] # (score, overlap, cells)
for overlap, cells in crossing if want_cross else cands:
crowd = _neighbour_fill(grid, cells, size)
zload = sum(zone_fill[_zone(r, c, size)] for r, c in cells) // n
# Crossing mode rewards extra overlaps; anchor mode is overlap-neutral
# (crowding already steers it to open ground).
scored.append(((overlap * 4 if want_cross else 0) - 2 * crowd - zload, overlap, cells))
scored.sort(key=lambda t: t[0], reverse=True)
top = [t for t in scored if t[0] >= scored[0][0] - 1] # near-best: variety without losing intent
_, overlap, cells = rng.choice(top)
for i, (r, c) in enumerate(cells):
if grid[r][c] is None:
grid[r][c] = word[i]
zone_fill[_zone(r, c, size)] += 1
placements.append((word, cells))
if overlap:
crossed += 1
return grid, placements
def _build_grid(words: list[str], size: int, seed: int) -> tuple[list[str], list[str]]: def _build_grid(words: list[str], size: int, seed: int) -> tuple[list[str], list[str]]:
"""Place words in a size×size grid (date-seeded, deterministic) and fill the """Place words in a size×size grid (date-seeded, deterministic) and fill the
rest. Returns (rows, placed_words). Every returned word is genuinely placed.""" rest. Returns (rows, placed_words). Every returned word is genuinely placed."""
rng = random.Random(seed) grid, placements = _place_words(words, size, seed)
grid: list[list[str | None]] = [[None] * size for _ in range(size)] rng = random.Random(_seed(str(seed), "fill"))
placed = []
for word in sorted(words, key=len, reverse=True):
if len(word) > size:
continue
# Gather valid placements and SCORE them: reward crossing an existing word
# (so the grid interlocks like a real puzzle) but penalise crowding, so
# words spread across the board instead of all clustering around the ones
# placed first. Pick at random among the best ~20% to keep organic variety.
scored = [] # (score, cells)
for _ in range(400):
dr, dc = rng.choice(_DIRS)
r0, c0 = rng.randrange(size), rng.randrange(size)
cells = [(r0 + dr * i, c0 + dc * i) for i in range(len(word))]
if any(not (0 <= r < size and 0 <= c < size) for r, c in cells):
continue
if not all(grid[r][c] in (None, word[i]) for i, (r, c) in enumerate(cells)):
continue
overlap = sum(1 for i, (r, c) in enumerate(cells) if grid[r][c] == word[i])
scored.append((overlap * 4 - _neighbour_fill(grid, cells, size), cells))
if not scored:
continue
scored.sort(key=lambda t: t[0], reverse=True)
_, cells = rng.choice(scored[: max(1, len(scored) // 5)])
for i, (r, c) in enumerate(cells):
grid[r][c] = word[i]
placed.append(word)
for r in range(size): for r in range(size):
for c in range(size): for c in range(size):
if grid[r][c] is None: if grid[r][c] is None:
grid[r][c] = chr(65 + rng.randrange(26)) grid[r][c] = chr(65 + rng.randrange(26))
return ["".join(row) for row in grid], placed return ["".join(row) for row in grid], [w for w, _ in placements]
# --- Cross-device game state sync ------------------------------------------- # --- Cross-device game state sync -------------------------------------------
@@ -562,17 +599,18 @@ def _build_grid(words: list[str], size: int, seed: int) -> tuple[list[str], list
def _merge_wordsearch(a: dict, b: dict) -> dict: def _merge_wordsearch(a: dict, b: dict) -> dict:
"""Union the found words (a find is monotonic — you can't un-find one, so the """Union the found words (a find is monotonic — you can't un-find one, so the
union is always correct), keep the earliest start and the best (min) time.""" union is always correct), credit the most ACTIVE play time either device has
banked (max; the clock only runs while the puzzle is on screen, so wall-clock
gaps between sittings never count), and keep the best (min) finish time."""
by_word = {} by_word = {}
for fw in list(a.get("foundWords") or []) + list(b.get("foundWords") or []): for fw in list(a.get("foundWords") or []) + list(b.get("foundWords") or []):
w = fw.get("word") if isinstance(fw, dict) else None w = fw.get("word") if isinstance(fw, dict) else None
if w and w not in by_word: if w and w not in by_word:
by_word[w] = fw by_word[w] = fw
starts = [s for s in (a.get("startTime"), b.get("startTime")) if s]
times = [m for m in (a.get("ms"), b.get("ms")) if m] times = [m for m in (a.get("ms"), b.get("ms")) if m]
return { return {
"foundWords": list(by_word.values()), "foundWords": list(by_word.values()),
"startTime": min(starts) if starts else 0, "played": max(_int(a.get("played")), _int(b.get("played"))),
"ms": min(times) if times else 0, "ms": min(times) if times else 0,
} }
@@ -615,6 +653,13 @@ def _int(x) -> int:
return 0 return 0
_WS_MS_CAP = 86_400_000 # clamp client-sent timings to one day — beyond that is junk
def _ms(x) -> int:
return max(0, min(_int(x), _WS_MS_CAP))
def _sanitize_wordsearch(conn: sqlite3.Connection, variant: str, date: str, state: dict) -> dict: def _sanitize_wordsearch(conn: sqlite3.Connection, variant: str, date: str, state: dict) -> dict:
"""Trust only finds that are real for THIS puzzle: word in the day's list and """Trust only finds that are real for THIS puzzle: word in the day's list and
cells that actually spell it in the grid (validated when the puzzle exists, cells that actually spell it in the grid (validated when the puzzle exists,
@@ -656,8 +701,8 @@ def _sanitize_wordsearch(conn: sqlite3.Connection, variant: str, date: str, stat
seen.add(w) seen.add(w)
clean.append({"word": w, "cells": cells, "ci": len(clean) % 10}) clean.append({"word": w, "cells": cells, "ci": len(clean) % 10})
done = bool(words) and len(clean) == len(words) done = bool(words) and len(clean) == len(words)
return {"foundWords": clean, "startTime": _int(state.get("startTime")), return {"foundWords": clean, "played": _ms(state.get("played")),
"ms": _int(state.get("ms")) if done else 0} "ms": _ms(state.get("ms")) if done else 0}
_WORD_COLOURS = {"absent", "present", "correct"} _WORD_COLOURS = {"absent", "present", "correct"}
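
The timing rules in the sync diff above (credit the larger `played`, keep the smaller non-zero `ms`, clamp client-sent values to one day) can be illustrated in isolation. This is a sketch under assumptions: `clamp_ms` and `merge_timing` are hypothetical names, and the integer coercion is a guess at what the project's `_int` helper does, since only its last line appears in the diff.

```python
MS_CAP = 86_400_000  # one day in milliseconds, same bound as _WS_MS_CAP in the diff


def clamp_ms(x) -> int:
    """Coerce a client-sent timing to an int in [0, MS_CAP]; junk becomes 0."""
    try:
        n = int(x)
    except (TypeError, ValueError):
        n = 0
    return max(0, min(n, MS_CAP))


def merge_timing(a: dict, b: dict) -> dict:
    """Merge two devices' timings: most active time banked, best finish time."""
    finishes = [m for m in (clamp_ms(a.get("ms")), clamp_ms(b.get("ms"))) if m]
    return {
        "played": max(clamp_ms(a.get("played")), clamp_ms(b.get("played"))),
        "ms": min(finishes) if finishes else 0,  # 0 means "not finished yet"
    }
```

Taking the maximum of `played` is safe because active time only grows on each device, while the minimum of `ms` keeps the best completed run.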
+69
@@ -454,6 +454,75 @@ def _attention(content: dict, sources: list[dict], feedback_unread: int, now: da
    return items


# --- Source article inspector: the real articles behind the source metrics -----
_SRC_ART_FILTERS = {
    "accepted": "AND s.accepted = 1",
    "rejected": "AND s.accepted = 0",
    "no_image": "AND (a.image_url IS NULL OR a.image_url = '')",
    "duplicates": "AND a.duplicate_of IS NOT NULL",
}


def source_articles(conn: sqlite3.Connection, source_id: int, filter: str = "all",
                    limit: int = 25, offset: int = 0) -> list[dict]:
    """The actual ingested articles for a source, newest first, so admins can
    verify the metric (paywall/image/acceptance) against real evidence."""
    where = _SRC_ART_FILTERS.get(filter, "")
    rows = conn.execute(
        f"""
        SELECT a.id, a.title, a.canonical_url, a.published_at, a.discovered_at,
               a.image_url, a.duplicate_of,
               s.accepted, s.reason_code, s.reason_text, s.topic, s.flavor
        FROM articles a
        LEFT JOIN article_scores s ON s.article_id = a.id
        WHERE a.source_id = ? {where}
        ORDER BY COALESCE(a.published_at, a.discovered_at) DESC
        LIMIT ? OFFSET ?
        """,
        (source_id, limit, offset),
    ).fetchall()
    return [
        {
            "id": r["id"],
            "title": r["title"],
            "url": r["canonical_url"],
            "published_at": r["published_at"] or r["discovered_at"],
            "accepted": r["accepted"],
            "reason": r["reason_text"] or r["reason_code"],  # the "why" behind accept/reject
            "topic": r["topic"],
            "flavor": r["flavor"],
            "paywalled": is_paywalled(r["canonical_url"]),  # domain rule, same for the source
            "has_image": bool(r["image_url"]),
            "duplicate": r["duplicate_of"] is not None,
        }
        for r in rows
    ]


def source_articles_summary(conn: sqlite3.Connection, source_id: int) -> dict:
    """Counts behind the table metrics + the source-level paywall rule, so the
    panel header reads e.g. '120 · 96 accepted · 24 rejected · 3 no image · paywall: ON'."""
    agg = conn.execute(
        """
        SELECT COUNT(*) total,
               COALESCE(SUM(s.accepted = 1), 0) accepted,
               COALESCE(SUM(s.accepted = 0), 0) rejected,
               COALESCE(SUM(a.image_url IS NULL OR a.image_url = ''), 0) no_image,
               COALESCE(SUM(a.duplicate_of IS NOT NULL), 0) duplicates
        FROM articles a LEFT JOIN article_scores s ON s.article_id = a.id
        WHERE a.source_id = ?
        """,
        (source_id,),
    ).fetchone()
    one = conn.execute("SELECT canonical_url FROM articles WHERE source_id = ? LIMIT 1", (source_id,)).fetchone()
    return {
        "total": agg["total"], "accepted": agg["accepted"], "rejected": agg["rejected"],
        "no_image": agg["no_image"], "duplicates": agg["duplicates"],
        "paywalled": is_paywalled(one["canonical_url"]) if one else False,
    }


def admin_stats(conn: sqlite3.Connection, days: int = 30) -> dict:
    """Aggregate, non-personal usage stats for the admin dashboard."""
    since = f"-{days} days"
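The filter handling in source_articles can be tried in isolation. The sketch below is not the project's code: the two-table schema, the seeded rows and the titles() helper are assumptions made up for illustration, and only the filter fragments and the SUM(...) counting idiom are copied from the hunk above. It shows that an unknown filter name falls back to an empty clause, so only the whitelisted SQL fragments are ever interpolated, and that an article with no article_scores row is counted in total but in neither accepted nor rejected.

```python
import sqlite3

# Hypothetical minimal schema: just the columns this sketch needs.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript("""
    CREATE TABLE articles (id INTEGER PRIMARY KEY, source_id INTEGER, title TEXT,
                           image_url TEXT, duplicate_of INTEGER);
    CREATE TABLE article_scores (article_id INTEGER, accepted INTEGER);
    INSERT INTO articles VALUES (1, 1, 'kept', 'http://img', NULL),
                                (2, 1, 'dropped', NULL, NULL),
                                (3, 1, 'unscored', '', 1);
    INSERT INTO article_scores VALUES (1, 1), (2, 0);
""")

# Same fragments as _SRC_ART_FILTERS in the diff.
FILTERS = {
    "accepted": "AND s.accepted = 1",
    "rejected": "AND s.accepted = 0",
    "no_image": "AND (a.image_url IS NULL OR a.image_url = '')",
    "duplicates": "AND a.duplicate_of IS NOT NULL",
}


def titles(filter_name: str) -> list[str]:
    """Titles for source 1 under a filter; unknown names mean 'no filter'."""
    where = FILTERS.get(filter_name, "")  # user input is never interpolated
    rows = conn.execute(
        f"""
        SELECT a.title FROM articles a
        LEFT JOIN article_scores s ON s.article_id = a.id
        WHERE a.source_id = ? {where}
        ORDER BY a.id
        """,
        (1,),
    ).fetchall()
    return [r["title"] for r in rows]


# SUM over a boolean expression counts the rows where it is true; NULLs
# (the unscored article) are skipped by SUM.
summary = conn.execute(
    """
    SELECT COUNT(*) total,
           COALESCE(SUM(s.accepted = 1), 0) accepted,
           COALESCE(SUM(s.accepted = 0), 0) rejected
    FROM articles a LEFT JOIN article_scores s ON s.article_id = a.id
    WHERE a.source_id = ?
    """,
    (1,),
).fetchone()
```

With this seed data, total is 3 while accepted and rejected are 1 each, so a panel header built from these counts will not always add up when some articles are still unscored.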
+15
@@ -503,3 +503,18 @@ def test_wordsearch_theme_admin(tmp_path, monkeypatch):
    # remove
    left = tc.delete(f"/api/admin/wordsearch/themes/{tid}").json()
    assert not any(t["id"] == tid for t in left)


def test_source_articles_inspector(tmp_path, monkeypatch):
    app, api = _make(tmp_path, monkeypatch, admin_email="boss@x.com")
    assert TestClient(app).get("/api/admin/sources/1/articles").status_code == 401  # gated
    tc = _signin(app, api, "boss@x.com")
    r = tc.get("/api/admin/sources/1/articles").json()
    assert r["summary"]["total"] == 1 and r["summary"]["accepted"] == 1 and r["summary"]["no_image"] == 1
    assert len(r["articles"]) == 1
    a = r["articles"][0]
    assert a["title"] == "t1" and a["accepted"] == 1 and a["has_image"] is False and a["paywalled"] is False
    # filters resolve in SQL; rejected → none (the seeded article is accepted)
    assert tc.get("/api/admin/sources/1/articles?filter=rejected").json()["articles"] == []
    assert len(tc.get("/api/admin/sources/1/articles?filter=no_image").json()["articles"]) == 1
    assert tc.get("/api/admin/sources/999/articles").status_code == 404  # unknown source
+17 -11
@@ -17,11 +17,13 @@ def conn(tmp_path):
 # --- merge logic (the audited core) ---

 def test_merge_wordsearch_unions_finds():
-    a = {"foundWords": [{"word": "CAT", "cells": [[0, 0]], "ci": 0}], "startTime": 100, "ms": 0}
-    b = {"foundWords": [{"word": "DOG", "cells": [[1, 1]], "ci": 1}], "startTime": 50, "ms": 0}
+    a = {"foundWords": [{"word": "CAT", "cells": [[0, 0]], "ci": 0}], "played": 9000, "ms": 0}
+    b = {"foundWords": [{"word": "DOG", "cells": [[1, 1]], "ci": 1}], "played": 4000, "ms": 0}
     m = games.merge_game_state("wordsearch", a, b)
     assert {f["word"] for f in m["foundWords"]} == {"CAT", "DOG"}  # union of finds
-    assert m["startTime"] == 50  # earliest start
+    # active-time clock: the device that banked the most play time is the truth;
+    # wall-clock gaps between sittings must never inflate the timer
+    assert m["played"] == 9000


 def test_merge_wordsearch_dedupes_and_keeps_best_time():
@@ -55,12 +57,12 @@ def _find(word, row): # a shape-valid find: cells spelling the word along a row
 def test_save_converges_across_devices(conn):
     # No stored puzzle for this date → shape-only sanitize (words 4-12, cells match).
     games.save_game_state(conn, 1, "wordsearch", "small", "2026-06-12",
-                          {"foundWords": [_find("BEACH", 0)], "startTime": 100})
+                          {"foundWords": [_find("BEACH", 0)], "played": 100})
     merged = games.save_game_state(conn, 1, "wordsearch", "small", "2026-06-12",
-                                   {"foundWords": [_find("OCEAN", 1)], "startTime": 50})
+                                   {"foundWords": [_find("OCEAN", 1)], "played": 50})
     assert {f["word"] for f in merged["foundWords"]} == {"BEACH", "OCEAN"}
-    # stored state reflects the merge (order-independent)
-    assert games.load_game_state(conn, 1, "wordsearch", "small", "2026-06-12")["startTime"] == 50
+    # stored state reflects the merge (order-independent): most banked time wins
+    assert games.load_game_state(conn, 1, "wordsearch", "small", "2026-06-12")["played"] == 100


 # --- derived stats ---
@@ -96,15 +98,15 @@ def test_game_state_api_roundtrip(tmp_path, monkeypatch):
     # signed out → no sync, echoes the posted state and GET sees nothing stored
     anon = TestClient(app)
     body = {"game": "wordsearch", "variant": "small", "date": "2026-06-12",
-            "state": {"foundWords": [_find("BEACH", 0)], "startTime": 9}}
+            "state": {"foundWords": [_find("BEACH", 0)], "played": 9}}
     assert anon.put("/api/games/state", json=body).json()["state"]["foundWords"][0]["word"] == "BEACH"
     assert anon.get("/api/games/state?game=wordsearch&variant=small&date=2026-06-12").json()["state"] is None
     # signed in: push from "device A", then "device B" → server returns the union
     tc = _signin(app, api, "p@x.com")
     tc.put("/api/games/state", json=body)
-    bodyB = {**body, "state": {"foundWords": [_find("OCEAN", 1)], "startTime": 4}}
+    bodyB = {**body, "state": {"foundWords": [_find("OCEAN", 1)], "played": 4}}
     merged = tc.put("/api/games/state", json=bodyB).json()["state"]
-    assert {f["word"] for f in merged["foundWords"]} == {"BEACH", "OCEAN"} and merged["startTime"] == 4
+    assert {f["word"] for f in merged["foundWords"]} == {"BEACH", "OCEAN"} and merged["played"] == 9
     # GET returns the stored merge; unknown game → 404; bad date → 400
     got = tc.get("/api/games/state?game=wordsearch&variant=small&date=2026-06-12").json()["state"]
     assert {f["word"] for f in got["foundWords"]} == {"BEACH", "OCEAN"}
@@ -123,5 +125,9 @@ def test_sanitizers_reject_junk(conn):
         "foundWords": [_find("BEACH", 0),  # ok
                        {"word": "CAT", "cells": [[0, 0], [0, 1], [0, 2]]},  # too short (<4)
                        {"word": "OCEAN", "cells": [[1, 0], [1, 1]]}],  # cells != len
-        "ms": 12345})
+        "ms": 12345, "played": -5})
     assert [f["word"] for f in ws["foundWords"]] == ["BEACH"] and ws["ms"] == 0
+    assert ws["played"] == 0  # negative junk clamped
+    # absurd active-time claims are capped at a day
+    capped = games._sanitize_wordsearch(conn, "small", "2026-06-12", {"foundWords": [], "played": 10**12})
+    assert capped["played"] == 86_400_000
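The updated tests pin down two rules for the new "played" field: junk values are clamped into the range 0 to 86,400,000 ms, and a merge keeps the larger banked time regardless of which device saved first. The actual games.merge_game_state and games._sanitize_wordsearch bodies are not part of this diff, so the following is only a sketch of one way to satisfy those assertions; clamp_played and merge_played are hypothetical names, and returning 0 for non-numeric input is an assumption.

```python
import math

DAY_MS = 86_400_000  # 24 hours in milliseconds, the cap asserted in the tests


def clamp_played(value) -> int:
    """Hypothetical sanitizer: coerce an untrusted 'played' value into [0, DAY_MS]."""
    # bool is a subclass of int; treat it (and any non-number) as junk.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return 0
    # NaN and infinities cannot be converted with int(), so reject them first.
    if isinstance(value, float) and not math.isfinite(value):
        return 0
    return max(0, min(int(value), DAY_MS))


def merge_played(a: dict, b: dict) -> int:
    """Hypothetical merge rule: the state with the most banked play time wins."""
    return max(clamp_played(a.get("played")), clamp_played(b.get("played")))
```

Because max is commutative, the merged value does not depend on the order in which devices push their state, which is the property test_save_converges_across_devices checks.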
+78
@@ -0,0 +1,78 @@
"""Locks the word-search placement qualities players actually feel:

1. Every word gets placed (exhaustive candidate search; nothing silently dropped).
2. Grids INTERLOCK like a real puzzle (the "clean isolated words" regression).
3. Words SPREAD across the board (the "all clumped in one corner" regression).
4. Same date/seed → same grid (cross-device players must see identical puzzles).

Thresholds were calibrated against all curated themes × 12 seeds × 3 tiers
(288 grids/tier): crossing fraction averaged ~0.7 (old algorithm: ~0.3, with a
third of small grids having ZERO crossings), worst quadrant share 0.42, and all
four quadrants always held word cells. Deterministic, so no flake margin needed.
"""
import random
import statistics

from goodnews.games import _WS_FALLBACKS, WS_TIERS, _WS_ORDER, _build_grid, _place_words, _zone


def _tier_grids(tier):
    """Yield (placements, size) for every curated theme × 12 seeds in a tier."""
    t = WS_TIERS[tier]
    for _, words in _WS_FALLBACKS:
        for seed in range(12):
            rng = random.Random(seed * 1000 + 7)
            ws = list(words)
            rng.shuffle(ws)
            _, placements = _place_words(ws[: t["count"]], t["grid"], seed)
            yield placements, t["grid"]


def _cross_fraction(placements):
    """Fraction of placed words sharing at least one cell with another word."""
    owners: dict[tuple[int, int], list[str]] = {}
    for w, cells in placements:
        for cell in cells:
            owners.setdefault(cell, []).append(w)
    crossing = set()
    for ws in owners.values():
        if len(ws) > 1:
            crossing.update(ws)
    return len(crossing) / len(placements)


def test_all_words_placed():
    for tier in _WS_ORDER:
        for placements, _ in _tier_grids(tier):
            assert len(placements) == WS_TIERS[tier]["count"]


def test_grids_interlock_without_clumping():
    for tier in _WS_ORDER:
        fracs = []
        for placements, size in _tier_grids(tier):
            fracs.append(_cross_fraction(placements))
            # Spread: word cells must reach all four quadrants, and no quadrant
            # may hoard more than half of them (perfectly even would be 0.25).
            quad: dict[tuple[int, int], int] = {}
            cells = {c for _, cs in placements for c in cs}
            for r, c in cells:
                quad[_zone(r, c, size)] = quad.get(_zone(r, c, size), 0) + 1
            assert len(quad) == 4, f"{tier}: words confined to {len(quad)} quadrant(s)"
            assert max(quad.values()) / len(cells) <= 0.5, f"{tier}: clumped in one quadrant"
        # Interlock: every grid has some crossings; on average most words connect.
        assert min(fracs) >= 0.3, f"{tier}: a grid came out as disconnected clean words"
        assert 0.55 <= statistics.mean(fracs) <= 0.9, f"{tier}: avg crossing {statistics.mean(fracs):.2f}"


def test_grid_deterministic_and_honest():
    """Same inputs → byte-identical grid, and every reported word is really in it
    (forward or reversed along some line; spot-checked via placements)."""
    words = _WS_FALLBACKS[0][1][:9]
    rows1, placed1 = _build_grid(words, 11, 42)
    rows2, placed2 = _build_grid(words, 11, 42)
    assert rows1 == rows2 and placed1 == placed2
    _, placements = _place_words(words, 11, 42)
    for word, cells in placements:
        assert "".join(rows1[r][c] for r, c in cells) == word
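To make the two metrics in test_grids_interlock_without_clumping concrete, here is a small worked example on a hand-made 6×6 placement list. cross_fraction mirrors _cross_fraction from the file above. The zone function is a hypothetical stand-in for goodnews.games._zone, whose body is not shown in this diff; it is assumed here to split the board into four equal quadrants keyed by (row half, column half).

```python
def cross_fraction(placements):
    """Fraction of placed words sharing at least one cell with another word."""
    owners = {}
    for word, cells in placements:
        for cell in cells:
            owners.setdefault(cell, []).append(word)
    crossing = set()
    for words in owners.values():
        if len(words) > 1:
            crossing.update(words)
    return len(crossing) / len(placements)


def zone(r, c, size):
    """Assumed quadrant rule (not the real _zone): (row half, column half)."""
    half = size / 2
    return (int(r >= half), int(c >= half))


def quadrant_shares(placements, size):
    """Share of distinct word cells that fall in each occupied quadrant."""
    cells = {cell for _, cs in placements for cell in cs}
    counts = {}
    for r, c in cells:
        z = zone(r, c, size)
        counts[z] = counts.get(z, 0) + 1
    return {z: n / len(cells) for z, n in counts.items()}


# CAT and COD cross at (0, 0); EEL sits alone in the opposite corner.
placements = [
    ("CAT", [(0, 0), (0, 1), (0, 2)]),
    ("COD", [(0, 0), (1, 0), (2, 0)]),
    ("EEL", [(5, 3), (5, 4), (5, 5)]),
]

fraction = cross_fraction(placements)      # 2 of 3 words cross
shares = quadrant_shares(placements, 6)    # 5 of 8 cells top-left, 3 bottom-right
```

Under these assumptions the grid has a healthy crossing fraction (2/3) yet would still fail the spread checks: only two quadrants hold word cells, and one of them holds 0.625 of the cells, above the 0.5 limit. That is why the test asserts interlock and spread separately.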