
crdb-dump

A CLI tool to export and import schema definitions and data from CockroachDB in SQL, JSON, YAML, or chunked CSV formats.

Supports chunking, parallelism, resumability, diffing, manifest checksums, BYTES and UUID types, TLS auth, and dry-run safety.


🚀 Features

  • Export tables, views, sequences, and user-defined types
  • Output formats: SQL, JSON, YAML, CSV (with optional gzip)
  • Export BYTES as decode('<hex>', 'hex')
  • Handles UUIDs, TIMESTAMPS, arrays
  • Create per-table schema files or a unified schema file
  • Parallel + chunked data export with manifest and row tracking
  • Resumable COPY-based data import
  • Schema + data dry-run mode
  • Schema diffing against previous .sql
  • CLI output + logging to logs/
  • TLS certs or insecure connection supported
  • --print-connection shows full resolved DB URL (safe)
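As a sketch of the BYTES handling above: a raw byte value can be rendered as a decode('&lt;hex&gt;', 'hex') literal roughly like this (an illustrative helper, not crdb-dump's internal code):

```python
def bytes_to_sql_literal(value: bytes) -> str:
    """Render raw bytes as a CockroachDB decode('<hex>', 'hex') expression."""
    return f"decode('{value.hex()}', 'hex')"

# A 4-byte payload becomes a hex literal CockroachDB can decode on load
print(bytes_to_sql_literal(b"\xde\xad\xbe\xef"))  # decode('deadbeef', 'hex')
```

This keeps binary data round-trippable through plain-text SQL dumps.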

🔧 Installation

pip install crdb-dump

🧪 Local Testing

Run an integration test:

./test-local.sh

This performs:

  • Schema + data export (with BYTES, UUID)
  • Chunked CSV manifest creation
  • Dry-run import
  • Full schema and data reload

📋 Usage

crdb-dump export --db=mydb [options]
crdb-dump load --db=mydb --schema=<.sql> --data-dir=...

🔐 Connection Options

export CRDB_URL="cockroachdb://root@localhost:26257/defaultdb?sslmode=disable"

You can also use:

export CRDB_URL="postgresql://root@localhost:26257/defaultdb?sslmode=disable"

Alternatively, you can specify flags:

--db mydb --host localhost --certs-dir ~/certs

🏠 Export Options

crdb-dump export --db=mydb --data --data-format=csv --chunk-size=1000
Option reference:

  • --data – Enable data export
  • --data-format – Output format: csv or sql
  • --data-compress – Write .csv.gz output instead of plain .csv
  • --chunk-size – Split data into fixed-row chunks
  • --per-table – Write one file per table
  • --data-order – Order rows by a column (e.g., by id)
  • --data-order-desc – Order rows in descending order
  • --data-parallel – Export tables in parallel
  • --verify – Verify manifest SHA256 checksums
  • --print-connection – Show the resolved DB connection URL
  • --archive – Create a .tar.gz archive of the exported folder

🛬 Load Options

crdb-dump load \
  --db=mydb \
  --schema=defaultdb_schema.sql \
  --data-dir=export/defaultdb \
  --resume-log=resume.json \
  --print-connection \
  --dry-run
Option reference:

  • --schema – Load schema from a .sql file
  • --data-dir – Directory containing chunked CSVs and manifests
  • --resume-log – Resume-tracking file for chunked loads
  • --dry-run – Print the plan without executing anything
  • --include-tables – Restrict the load to specific table names
  • --exclude-tables – Skip specific table names
  • --print-connection – Print the resolved CockroachDB connection URL

📂 Output Structure

By default, output is stored under:

crdb_dump_output/<db_name>/
├── defaultdb_schema.sql
├── table_users.sql
├── users_chunk_001.csv
├── users.manifest.json
├── logins_chunk_001.csv
├── logins.manifest.json

All logs go to:

logs/crdb_dump.log

📄 Example: Full Export + Verify + Import

crdb-dump export \
  --db=defaultdb \
  --data \
  --data-format=csv \
  --chunk-size=1000 \
  --per-table \
  --verify \
  --archive \
  --print-connection

crdb-dump load \
  --db=defaultdb \
  --schema=crdb_dump_output/defaultdb/defaultdb_schema.sql \
  --data-dir=crdb_dump_output/defaultdb \
  --resume-log=resume.json \
  --print-connection

🔍 Schema Diffing

crdb-dump export \
  --db=defaultdb \
  --diff=previous_schema.sql

This prints a unified diff and writes to:

crdb_dump_output/<db_name>/<db_name>_schema.diff
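The unified diff itself is the standard format produced by tools like difflib; a sketch of what the .diff output resembles (not the tool's own code):

```python
import difflib

def schema_diff(old_sql: str, new_sql: str) -> str:
    """Produce a unified diff between two schema dumps."""
    return "".join(difflib.unified_diff(
        old_sql.splitlines(keepends=True),
        new_sql.splitlines(keepends=True),
        fromfile="previous_schema.sql",
        tofile="current_schema.sql",
    ))

print(schema_diff("CREATE TABLE t (id INT);\n",
                  "CREATE TABLE t (id INT, name STRING);\n"))
```

Lines prefixed with - and + mark schema statements that were removed or added between the two dumps.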

🤖 Test Coverage

  • pytest -m unit – runs fast unit tests
  • pytest -m integration – full Docker-based test
  • ./test-local.sh – end-to-end data roundtrip

🛠️ Developer Notes

  • Configured via pyproject.toml (PEP 621)
  • Click-based CLI
  • Tested with CRDB v25.2
  • CI runs all tests via GitHub Actions and Docker

👤 Author

Created by Virag Tripathi. Released under the MIT License.
