Installation
Install ccrawl from a release, with go install, or from source. DuckDB is optional.
Prebuilt binaries
Every release carries archives for Linux, macOS, and Windows on amd64 and
arm64, plus deb, rpm, and apk packages for Linux. Download, unpack, put ccrawl
on your PATH, done. The checksums.txt on each release is signed with keyless
cosign if you want to verify before running.
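The checksum half of that verification can be sketched in shell. This assumes checksums.txt uses the usual sha256sum layout (hash, then file name) and that GNU sha256sum is installed (on macOS, shasum -a 256 produces the same format). verify_archive is a hypothetical helper, not part of ccrawl, and it does not cover the cosign signature check on checksums.txt itself.

```shell
# verify_archive FILE CHECKSUMS
# Succeeds only if FILE's SHA-256 matches its entry in CHECKSUMS.
# Assumption: CHECKSUMS is in sha256sum format ("<hash>  <name>").
verify_archive() {
  file=$1
  sums=$2
  name=$(basename "$file")
  # Find the expected hash for this file name ("*name" is binary mode).
  expected=$(awk -v f="$name" '$2 == f || $2 == "*" f { print $1 }' "$sums")
  if [ -z "$expected" ]; then
    echo "no checksum entry for $name" >&2
    return 1
  fi
  # Hash the downloaded file and compare.
  actual=$(sha256sum "$file" | awk '{ print $1 }')
  [ "$expected" = "$actual" ]
}
```

Call it as verify_archive followed by the path to the downloaded archive and the path to checksums.txt; a non-zero exit status means the archive should not be trusted.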
With Go
go install github.com/tamnd/ccrawl-cli/cmd/ccrawl@latest
That puts ccrawl in $(go env GOPATH)/bin, which is ~/go/bin unless you have
changed GOPATH (if GOBIN is set, the binary goes there instead). Make sure
that directory is on your PATH.
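One way to check and fix that for the current shell session is sketched below. on_path is a hypothetical helper introduced for illustration; to make the change permanent, put the export line in your shell profile.

```shell
# on_path DIR: succeeds if DIR is one of the entries in $PATH.
on_path() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Append Go's bin directory to PATH only when it is missing.
# Skipped entirely if the go tool itself is not installed.
if command -v go >/dev/null 2>&1; then
  gobin="$(go env GOPATH)/bin"
  on_path "$gobin" || export PATH="$PATH:$gobin"
fi
```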
From source
git clone https://github.com/tamnd/ccrawl-cli
cd ccrawl-cli
make build # produces ./bin/ccrawl
./bin/ccrawl version
Optional: DuckDB
The columnar index commands (ccrawl table, ccrawl db) run SQL against the
public Parquet index. If a duckdb binary is on your PATH, ccrawl uses it to
run the queries directly. If it is not, ccrawl prints the SQL so you can paste
it into DuckDB, Athena, Spark, or Trino yourself. Either way the ccrawl binary
never links DuckDB, so the install stays small and pure Go.
Install DuckDB from duckdb.org if you want local execution. Everything else in ccrawl works without it.
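The fallback described above can be mimicked in a few lines of shell. This is an illustrative sketch of the behaviour, not ccrawl's actual code: run_or_print is a hypothetical name, and it relies only on the DuckDB CLI's -c option, which runs a single command and exits.

```shell
# run_or_print SQL
# Run SQL through a local duckdb binary when one is on PATH;
# otherwise print the SQL so it can be pasted into another engine.
run_or_print() {
  if command -v duckdb >/dev/null 2>&1; then
    duckdb -c "$1"
  else
    printf '%s\n' "$1"
  fi
}
```

Running command -v duckdb on its own is a quick way to see which of the two modes you will get.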
Requirements
- Go 1.26 or later to build. The released binary has no Go requirement.
- A duckdb binary only if you want to run columnar queries locally.
That is the whole list. No config file, no database to provision, no daemon.
Checking the install
ccrawl version
prints the version and exits. Then confirm it can reach Common Crawl:
ccrawl crawls latest
should print the newest crawl ID, for example CC-MAIN-2026-21. If you see
that, you are ready for the quick start.