How content is discovered, annotated, and synthesized -- and what Claude does with it.
Everything here passed a manual filter. The algorithm surfaces it; I decide whether it belongs. What gets published reflects what I think is worth thinking about -- not what performed well or what arrived most recently.
X surfaces the content. I read it, decide it's worth keeping, and bookmark it. Nothing enters the pipeline without that manual decision -- the algorithm is just the delivery mechanism.
One click with the Obsidian Reader browser extension clips the tweet and its full reply thread to my local Obsidian vault, filed under a Tweets label. The full thread context -- not just the top-level post -- is preserved as markdown.
Obsidian stores each clip as a structured markdown file in a local directory. The files accumulate there until the next scheduled ingest. Nothing is sent anywhere until Claude picks them up.
A Claude scheduled task runs each day. It calls a parser that compares files in the Obsidian Tweets directory against the existing tweets.json and outputs only entries that haven't been ingested yet. If nothing is new, it stops there.
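The diff step can be sketched in a few lines. This is a minimal sketch, not the real parser: the paths are hypothetical, and it assumes each clip's filename stem doubles as the entry id in tweets.json -- both are assumptions about the schema.

```python
import json
from pathlib import Path

def new_clips(vault_dir: Path, data_file: Path) -> list[Path]:
    """Return clip files in vault_dir not yet recorded in data_file."""
    # Ids already present in tweets.json (assumed keyed by an "id" field).
    existing = {entry["id"] for entry in json.loads(data_file.read_text())}
    # Treat the markdown filename stem as the entry id (an assumption).
    return [p for p in sorted(vault_dir.glob("*.md")) if p.stem not in existing]
```

If nothing is new, the returned list is empty and the run stops there.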
For each new tweet, Claude reads the full thread and generates four things: a commentary (2-3 sentences on why it matters through a research or policy lens), a plain-language version of the same take, a set of tags for granular filtering, and two classification fields -- sector (the subject domain) and source type (Official, Company, or Individual). Existing entries serve as labeled examples so classifications stay consistent over time.
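Concretely, one enriched entry might look like this. Every field name and value here is an illustrative assumption chosen to show the four generated pieces plus the two classification fields; the real schema may differ.

```python
# One enriched entry as it might land in tweets.json (all names assumed).
entry = {
    "id": "1845000000000000000",            # tweet id (made up)
    "commentary": "Why this matters from a research or policy lens.",
    "plain_language": "The same take, in plain terms.",
    "tags": ["evals", "interpretability"],  # granular filtering
    "sector": "AI research",                # subject domain
    "source_type": "Individual",            # Official | Company | Individual
}
```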
After ingesting new entries, Claude reads everything -- all tweets and papers -- and rewrites the Takeaways section from scratch. This is not a summary of new additions. It's a synthesis of the argument the whole collection is making at that moment. As the corpus grows and diversifies, the takeaways shift to reflect what the reading list is actually saying collectively.
Claude appends the enriched entries to tweets.json, writes the updated takeaways.json, and deploys the full site to Cloudflare Pages via Wrangler. The UI is a static site -- no server, no database. The data files are the source of truth.
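A sketch of that publish step, under stated assumptions: the paths, project name, and `deploy` guard are mine, though `wrangler pages deploy` is Wrangler's standard Pages command.

```python
import json
import subprocess
from pathlib import Path

def publish(new_entries: list[dict], takeaways: dict, site_dir: Path,
            deploy: bool = False) -> None:
    """Append enriched entries, rewrite takeaways, optionally deploy."""
    data_file = site_dir / "tweets.json"
    current = json.loads(data_file.read_text()) if data_file.exists() else []
    data_file.write_text(json.dumps(current + new_entries, indent=2))
    (site_dir / "takeaways.json").write_text(json.dumps(takeaways, indent=2))
    if deploy:  # guarded so the data steps can run without credentials
        subprocess.run(["wrangler", "pages", "deploy", str(site_dir),
                        "--project-name", "reading-list"], check=True)
```

Because the UI is static, the two JSON writes are the whole "database" update; the deploy is just a file upload.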
Each entry is tagged along three dimensions: tags for granular topic filtering, sector for the subject domain, and source type for who produced it. These drive the sort and filter controls on the main page.
Classification decisions are grounded in the existing corpus -- Claude uses prior entries as labeled examples before assigning fields to new ones. The test for sector is: what is the primary reason this is interesting? The test for source type is: is this an institution speaking officially, or a person speaking for themselves?
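One way to assemble that grounding: collect the most recent labeled entries per sector to use as few-shot examples in the classification prompt. This is a sketch of the idea, not the actual prompt code.

```python
from collections import defaultdict

def labeled_examples(entries: list[dict], per_sector: int = 2) -> dict:
    """Pick the newest few entries per sector as few-shot examples."""
    by_sector: dict[str, list[dict]] = defaultdict(list)
    for e in entries:
        by_sector[e["sector"]].append(e)
    # Keep only the newest few per sector, assuming entries are appended
    # in ingest order (an assumption).
    return {s: items[-per_sector:] for s, items in by_sector.items()}
```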
The Takeaways section is not an index of recent additions. It's Claude's attempt to answer: what is the argument this collection is making right now? Each theme spans multiple entries and is grounded by links to the specific tweets and papers that support it.
Takeaways are regenerated in full on every ingest run -- never cached, never manually edited. As the corpus grows into new domains -- capital markets, philosophy, geopolitics -- the synthesis shifts to track what the reading list is engaging with at any given time.
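A plausible shape for takeaways.json, with each theme carrying a synthesis and links back to the entries that support it. Field names and ids are illustrative assumptions, not the real file format.

```python
# Sketch of takeaways.json: themes spanning multiple supporting entries.
takeaways = {
    "themes": [
        {
            "title": "Example theme",
            "synthesis": "What the collection is arguing, in a few sentences.",
            "supports": ["tweet:1845000000000000000", "paper:example-id"],
        }
    ]
}
```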