89c0fbe1f6
The deploy pipeline runs from the working tree, so a wave of shipped features
had never been committed. This snapshots git to what's actually running.

SEO impression recovery (live + verified):
- Duplicate /a/{id} URLs now 301-redirect to their canonical twin instead of
returning 404 (a hard 404 silently dropped already-indexed URLs and tanked
impressions).
- Dedup representative selection reworked. Priority order is now
accepted/serveable -> established rep (URL stability) -> quality score, so an
accepted page is never retired in favor of a rejected rep and an indexed
canonical doesn't churn when a newer twin arrives.
- HEAD /a/{id} returns the same status as GET (api_route GET+HEAD) instead of
falling through to the static mount and 404ing.
- `dedup --force-recluster`: cycle-locked, model-free re-cluster to re-apply the
policy to the existing corpus (shared cycle_lock context manager).
- CLI honors GOODNEWS_DB for its default --db (was silently ignored).
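
A minimal sketch of the representative-selection order described above, assuming each cluster member carries an accepted flag, an "already the established rep" flag, and a quality score. The `Candidate` fields and `pick_representative` are hypothetical names for illustration, not the names used in the codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    # All field names are hypothetical, chosen for illustration only.
    article_id: str
    accepted: bool        # page is accepted/serveable
    is_current_rep: bool  # already the cluster's established representative
    quality: float        # quality score, higher is better


def pick_representative(cluster: list[Candidate]) -> Candidate:
    """Return the member that wins on (accepted, established, quality)."""
    # Tuples compare left to right, so each later field only breaks ties
    # among members that are equal on the earlier fields.
    return max(cluster, key=lambda c: (c.accepted, c.is_current_rep, c.quality))


cluster = [
    Candidate("old-canonical", accepted=True, is_current_rep=True, quality=0.61),
    Candidate("newer-twin", accepted=True, is_current_rep=False, quality=0.93),
    Candidate("rejected-twin", accepted=False, is_current_rep=False, quality=0.99),
]
print(pick_representative(cluster).article_id)  # old-canonical
```

Under these assumptions the established canonical keeps its URL even though a newer twin scores higher, and a rejected member cannot win while any accepted member exists.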

Publishing Desk (admin tool to post highlights to X via Web Intents):
- publishing.py queue/rank/handle-resolution; admin UI; full searchable emoji
picker (bundled data, no CDN) for the blurb editor.

Play games + site:
- Bloom (word-wheel), Memory Match, daily ritual set, Zen Den (dev-gated).
- English-only language gate; source prospecting; paywall + dedup hardening.

Tests: full suite green (349). Ignores tightened (node_modules, data/*.db).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
7.1 KiB
| Line | Status | Bucket | Source | Feed_URL | Homepage | Country/Region | Scope | Lane | Notes |
|---|---|---|---|---|---|---|---|---|---|
| 2 | VERIFIED | universities/research | MIT News – Research | https://news.mit.edu/rss/research | https://news.mit.edu | US | national | science/tech | topic feed; primary research |
| 3 | VERIFIED | universities/research | MIT News – Environment | https://news.mit.edu/rss/topic/environment | https://news.mit.edu | US | national | environment | topic feed (climate/energy) |
| 4 | VERIFIED | universities/research | UC Berkeley News | https://news.berkeley.edu/feed/ | https://news.berkeley.edu | US | national | research | broad research; geo-taggable US |
| 5 | VERIFIED | universities/research | UW News | https://www.washington.edu/news/feed/ | https://www.washington.edu/news/ | US (PNW) | national/regional | research | Univ of Washington; PNW flavor |
| 6 | VERIFIED | universities/research | Harvard Gazette | https://news.harvard.edu/gazette/feed/ | https://news.harvard.edu/gazette/ | US | national | research/health | strong health/medicine + science |
| 7 | VERIFIED | universities/research | Johns Hopkins Hub | https://hub.jhu.edu/feed/ | https://hub.jhu.edu | US | national | research/health | medical + research breadth |
| 8 | VERIFIED | gov labs/agencies | NIST News | https://www.nist.gov/news-events/news/rss.xml | https://www.nist.gov | US (gov) | national | science/tech | clean RSS (rare for gov) |
| 9 | VERIFIED | gov labs/agencies | NSF News | https://www.nsf.gov/rss/rss_www_news.xml | https://www.nsf.gov | US (gov) | national | science | funded discoveries, many fields |
| 10 | VERIFIED | science | Knowable Magazine | https://knowablemagazine.org/rss | https://knowablemagazine.org | US | global | science | explanatory, low-hype |
| 11 | VERIFIED | science | Nautilus | https://nautil.us/feed/ | https://nautil.us | US | global | science/culture | thoughtful long-form science |
| 12 | VERIFIED | science | Undark | https://undark.org/feed/ | https://undark.org | US | global | science | nonprofit (MIT KSJ) |
| 13 | VERIFIED | conservation/env | Inside Climate News | https://insideclimatenews.org/feed/ | https://insideclimatenews.org | US | national | environment/climate | Pulitzer nonprofit; solutions-leaning |
| 14 | VERIFIED | conservation/env | Canary Media | https://www.canarymedia.com/articles.rss | https://www.canarymedia.com | US | national | energy/clean-tech | clean-energy transition wins |
| 15 | VERIFIED | conservation/env | Cool Green Science (TNC) | https://blog.nature.org/feed/ | https://blog.nature.org | US/global | global | conservation | TAG ORG/advocacy (Nature Conservancy) |
| 16 | VERIFIED | conservation/env | Yale Climate Connections | https://yaleclimateconnections.org/feed/ | https://yaleclimateconnections.org | US | national | environment/climate | solutions + adaptation |
| 17 | VERIFIED | constructive/solutions | YES! Magazine | https://www.yesmagazine.org/feed | https://www.yesmagazine.org | US | national | solutions | core solutions journalism |
| 18 | VERIFIED | constructive/solutions | Christian Science Monitor – Science | https://rss.csmonitor.com/feeds/science | https://www.csmonitor.com | US | global | science | TOPIC feed (avoids CSM politics) |
| 19 | VERIFIED | regional | High Country News | https://www.hcn.org/feed | https://www.hcn.org | US – West | regional | environment/community | regional flavor for Closer To Home |
| 20 | VERIFIED | regional | Sightline Institute | https://www.sightline.org/feed/ | https://www.sightline.org | US – Pacific NW | regional | policy/sustainability | TAG ORG; PNW solutions |
| 21 | VERIFIED | community | Strong Towns | https://www.strongtowns.org/journal?format=rss | https://www.strongtowns.org | US | national | community/urbanism | TAG ORG; local-repair framing |
| 22 | VERIFIED | constructive/global | The Better India | https://www.thebetterindia.com/feed/ | https://www.thebetterindia.com | India | global (non-US) | constructive | non-US breadth; good-news native |
| 23 | VERIFIED | health | NPR Goats and Soda | https://feeds.npr.org/1039/rss.xml | https://www.npr.org/sections/goatsandsoda/ | US->global | global | health/development | global health & development |
| 24 | VERIFIED | health | NPR Shots | https://feeds.npr.org/1128/rss.xml | https://www.npr.org/sections/health-shots/ | US | national | health | CHECK overlap w/ existing 'NPR Health' |
| 25 | VERIFIED | education | Chalkbeat | https://www.chalkbeat.org/arc/outboundfeeds/rss/ | https://www.chalkbeat.org | US | national | education | education reform/wins |
| 26 | VERIFIED | education | Hechinger Report | https://hechingerreport.org/feed/ | https://hechingerreport.org | US | national | education | nonprofit education coverage |
| 27 | VERIFIED | education | EdSurge | https://www.edsurge.com/articles_rss | https://www.edsurge.com | US | national | education/ed-tech | ed-tech + learning |
| 28 | BOT-BLOCKED(403) | health institutions | Mayo Clinic News Network | https://newsnetwork.mayoclinic.org/feed/ | https://newsnetwork.mayoclinic.org | US | national | health | feed exists; 403 to bots |
| 29 | BOT-BLOCKED(403) | universities/research | Stanford News | https://news.stanford.edu/feed/ | https://news.stanford.edu | US | national | research | feed exists; 403 to bots |
| 30 | BOT-BLOCKED(403) | food/community | Civil Eats | https://civileats.com/feed/ | https://civileats.com | US | national | food-systems/community | feed exists; 403 to bots |
| 31 | DISCOVERY-PHASE | health institutions | Cleveland Clinic Newsroom | (unresolved) | https://newsroom.clevelandclinic.org | US | national | health | RSS path unresolved |
| 32 | DISCOVERY-PHASE | gov labs/agencies | NIH | (no clean RSS) | https://www.nih.gov/news-events | US (gov) | national | health | gov dropped clean RSS |
| 33 | DISCOVERY-PHASE | gov labs/agencies | NOAA | (no clean RSS) | https://www.noaa.gov | US (gov) | national | environment/climate | gov dropped clean RSS |
| 34 | DISCOVERY-PHASE | gov labs/agencies | NOAA Climate.gov | (no clean RSS) | https://www.climate.gov | US (gov) | national | climate | gov dropped clean RSS |
| 35 | DISCOVERY-PHASE | gov labs/agencies | DOE Office of Science | (no clean RSS) | https://www.energy.gov/science | US (gov) | national | energy/science | gov dropped clean RSS |
| 36 | DISCOVERY-PHASE | gov labs/agencies | NREL | (timeout/none) | https://www.nrel.gov/news | US (gov) | national | energy | no reachable RSS |
| 37 | DISCOVERY-PHASE | gov labs/agencies | USGS | (no clean RSS) | https://www.usgs.gov/news | US (gov) | national | science/environment | gov dropped clean RSS |
| 38 | DISCOVERY-PHASE | gov labs/agencies | EPA | (no clean RSS) | https://www.epa.gov/newsreleases | US (gov) | national | environment | gov dropped clean RSS |
| 39 | DISCOVERY-PHASE | research wire | EurekAlert | (RSS gated) | https://www.eurekalert.org | US/global | global | science/health | RSS gated; needs API/discovery |
| 40 | DISCOVERY-PHASE | community | Next City | (rss not a feed) | https://nextcity.org | US | national | urban-solutions | feed endpoint returns non-feed |
| 41 | DISCOVERY-PHASE | education | Edutopia | (rss not a feed) | https://www.edutopia.org | US | national | education | feed endpoint returns non-feed |
| 42 | DISCOVERY-PHASE | constructive | Fix The News | (404) | https://fixthenews.com | global | global | constructive | likely Substack feed; find URL |
| 43 | DEMOTE-WATCH | existing (paywall) | Nature News | (existing source) | https://www.nature.com/news | UK/global | global | science | 100% paywall; prefer-accessible/demote-if-persistent |
| 44 | DEMOTE-WATCH | existing (paywall) | New Scientist | (existing source) | https://www.newscientist.com | UK/global | global | science | 100% paywall |
| 45 | DEMOTE-WATCH | existing (paywall) | MIT Technology Review | (existing source) | https://www.technologyreview.com | US | global | technology | 100% paywall |
| 46 | DEMOTE-WATCH | existing (paywall) | NY Times Learning | (existing source) | https://www.nytimes.com/section/learning | US | national | education | 100% paywall |
| 47 | DEMOTE-WATCH | existing (0% access) | Guardian Science | (existing source) | https://www.theguardian.com/science | UK/global | global | science | 0% accessible in metrics |
| 48 | DEMOTE-WATCH | existing (0% access) | Guardian Environment | (existing source) | https://www.theguardian.com/environment | UK/global | global | environment | 0% accessible in metrics |
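
A minimal sketch of how the rows above might be filtered down to feeds that are ready to ingest, assuming the underlying file is a plain CSV with the column names shown in the header. The `verified_feeds` helper and the inline `SAMPLE` (three rows copied from the table) are illustrative only.

```python
import csv
import io

# Three rows copied from the table above, in the assumed CSV layout.
SAMPLE = """Status,Bucket,Source,Feed_URL,Homepage,Country/Region,Scope,Lane,Notes
VERIFIED,science,Undark,https://undark.org/feed/,https://undark.org,US,global,science,nonprofit (MIT KSJ)
BOT-BLOCKED(403),food/community,Civil Eats,https://civileats.com/feed/,https://civileats.com,US,national,food-systems/community,feed exists; 403 to bots
DISCOVERY-PHASE,constructive,Fix The News,(404),https://fixthenews.com,global,global,constructive,likely Substack feed; find URL
"""


def verified_feeds(csv_text: str) -> list[tuple[str, str]]:
    """Return (Source, Feed_URL) pairs for rows marked VERIFIED."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [
        (row["Source"], row["Feed_URL"])
        for row in rows
        # Placeholder cells such as "(404)" or "(no clean RSS)" are not URLs,
        # so require an http(s) prefix as well as the VERIFIED status.
        if row["Status"] == "VERIFIED" and row["Feed_URL"].startswith("http")
    ]


print(verified_feeds(SAMPLE))
```

Rows with BOT-BLOCKED(403), DISCOVERY-PHASE, or DEMOTE-WATCH status are left out here; how those should be handled is a policy choice this sketch does not make.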