v0.2.1
New guides for all v2 command groups.
Released 2026-06-18
v0.2.1 is a documentation release. No binary changes were made.
New guides
Six new task-oriented guides cover the command groups introduced in v0.2.0:
- Host graph and enrichment: enumerate 262M hosts, join web-graph topology, and aggregate per-host CDX statistics.
- Building a recrawl engine: seed from the CC host list, fetch live pages, and write WARC or JSONL output.
- Recrawl scheduling: score URLs by harmonic rank and change rate, then diff two CDX snapshots to find changed pages.
- Building a search index: build a BM25 inverted index from a page corpus and query it in milliseconds.
- Content signals: extract text, measure quality, and map outlinks from live pages or stored WARCs.
- API server: serve a local index over HTTP with four REST endpoints.
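For readers unfamiliar with the idea behind the search index guide, the following is a minimal, generic sketch of a BM25 inverted index in Python. It is not the tool's implementation or its command-line interface. The function names (`tokenize`, `build_index`, `search`), the whitespace tokenizer, and the parameter values `K1 = 1.2` and `B = 0.75` are assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict

# Commonly used BM25 defaults; assumed here, not taken from the tool.
K1 = 1.2
B = 0.75


def tokenize(text):
    """Hypothetical tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def build_index(docs):
    """Build term -> {doc_id: term frequency} postings plus document lengths."""
    postings = defaultdict(dict)
    doc_len = {}
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        doc_len[doc_id] = len(tokens)
        for term, tf in Counter(tokens).items():
            postings[term][doc_id] = tf
    avg_len = sum(doc_len.values()) / len(doc_len) if doc_len else 0.0
    return postings, doc_len, avg_len


def search(query, postings, doc_len, avg_len, top_k=10):
    """Score only the documents that appear in the query terms' postings."""
    n_docs = len(doc_len)
    scores = defaultdict(float)
    for term in tokenize(query):
        hits = postings.get(term)
        if not hits:
            continue
        # Non-negative IDF variant: ln(1 + (N - n + 0.5) / (n + 0.5)).
        idf = math.log(1 + (n_docs - len(hits) + 0.5) / (len(hits) + 0.5))
        for doc_id, tf in hits.items():
            # Length normalization: longer-than-average documents are penalized.
            norm = tf + K1 * (1 - B + B * doc_len[doc_id] / avg_len)
            scores[doc_id] += idf * tf * (K1 + 1) / norm
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


if __name__ == "__main__":
    pages = {
        "a": "common crawl host graph",
        "b": "search index with bm25 ranking",
        "c": "bm25 bm25 index",
    }
    index = build_index(pages)
    for doc_id, score in search("bm25 index", *index):
        print(doc_id, round(score, 3))
```

Queries are fast because only the posting lists for the query terms are visited, not the whole corpus.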
Documentation fixes
- Quick-start "Where to next" section now links to all six new guides.
- Guides index description updated to cover the full v2 command surface.