Extract PostgreSQL records with all related data via FK relationships
Project description
PgSlice
Bump only what you need
Python CLI tool for extracting PostgreSQL records with all related data via foreign key relationships.
Overview
pgslice extracts a specific database record and ALL its related records by following foreign key relationships bidirectionally. Perfect for:
- Reproducing production bugs locally with real data
- Creating partial database dumps for specific users/entities
- Testing with realistic data subsets
- Debugging issues that only occur with specific data states
Extract only what you need while maintaining referential integrity.
Features
- ✅ CLI-first design: Stream SQL to stdout for easy piping and scripting
- ✅ Bidirectional FK traversal: Follows relationships in both directions (forward and reverse)
- ✅ Circular relationship handling: Prevents infinite loops with visited tracking
- ✅ Multiple records: Extract multiple records in one operation
- ✅ Timeframe filtering: Filter specific tables by date ranges
- ✅ PK remapping: Auto-remaps auto-generated primary keys for clean imports
- ✅ DDL generation: Optionally include CREATE DATABASE/SCHEMA/TABLE statements for self-contained dumps
- ✅ Progress bar: Visual progress indicator for dump operations
- ✅ Schema caching: SQLite-based caching for improved performance
- ✅ Type-safe: Full type hints with mypy strict mode
- ✅ Secure: SQL injection prevention, secure password handling
Installation
From PyPI (Recommended)
# Install with pipx (isolated environment, recommended)
pipx install pgslice
# Or with pip
pip install pgslice
# Or with uv
uv tool install pgslice
From Docker Hub
# Pull the image
docker pull edraobdu/pgslice:latest
# Run pgslice
docker run --rm -it \
-v $(pwd)/dumps:/home/pgslice/.pgslice/dumps \
-e PGPASSWORD=your_password \
edraobdu/pgslice:latest \
pgslice --host your.db.host --port 5432 --user your_user --database your_db
# Pin to specific version
docker pull edraobdu/pgslice:0.1.1
# Use specific platform
docker pull --platform linux/amd64 edraobdu/pgslice:latest
From Source (Development)
See DEVELOPMENT.md for detailed development setup instructions.
Quick Start
CLI Mode
The CLI mode streams SQL to stdout by default, making it easy to pipe or redirect output:
# Basic dump to stdout (pipe to file)
PGPASSWORD=xxx pgslice --host localhost --database mydb --table users --pks 42 > user_42.sql
# Multiple records
PGPASSWORD=xxx pgslice --host localhost --database mydb --table users --pks 1,2,3 > users.sql
# Output directly to file with --output flag
pgslice --host localhost --database mydb --table users --pks 42 --output user_42.sql
# Dump by timeframe (instead of PKs) - filters main table by date range
pgslice --host localhost --database mydb --table orders \
--timeframe "created_at:2024-01-01:2024-12-31" > orders_2024.sql
# Wide mode: follow all relationships including self-referencing FKs
# Be cautious - this can result in larger datasets
pgslice --host localhost --database mydb --table customer --pks 42 --wide > customer.sql
# Keep original primary keys (no remapping)
pgslice --host localhost --database mydb --table film --pks 1 --keep-pks > film.sql
# Generate self-contained SQL with DDL statements
# Includes CREATE DATABASE/SCHEMA/TABLE statements
pgslice --host localhost --database mydb --table film --pks 1 --create-schema > film_complete.sql
# Apply truncate filter to limit related tables by date range
pgslice --host localhost --database mydb --table customer --pks 42 \
--truncate "rental:rental_date:2024-01-01:2024-12-31" > customer.sql
# Enable debug logging (writes to stderr)
pgslice --host localhost --database mydb --table users --pks 42 \
--log-level DEBUG 2>debug.log > output.sql
Schema Exploration
# List all tables in the schema
pgslice --host localhost --database mydb --tables
# Describe table structure and relationships
pgslice --host localhost --database mydb --describe users
SSH Remote Execution
Run pgslice on a remote server and capture output locally:
# Execute on remote server, save output locally
ssh remote.server.com "PGPASSWORD=xxx pgslice --host db.internal --database mydb \
--table users --pks 1 --create-schema" > local_dump.sql
# With SSH tunnel for database access
ssh -f -N -L 5433:db.internal:5432 bastion.example.com
PGPASSWORD=xxx pgslice --host localhost --port 5433 --database mydb \
--table users --pks 42 > user.sql
Interactive REPL
# Start interactive REPL
pgslice --host localhost --database mydb
pgslice> dump "film" 1 --output film_1.sql
pgslice> tables
pgslice> describe "film"
CLI vs REPL: Output Behavior
Understanding the difference between CLI and REPL modes:
CLI Mode (stdout by default)
The CLI streams SQL to stdout by default, perfect for piping and scripting:
# Streams to stdout - redirect with >
pgslice --table users --pks 42 > user_42.sql
# Or use --output flag
pgslice --table users --pks 42 --output user_42.sql
# Pipe to other commands
pgslice --table users --pks 42 | gzip > user_42.sql.gz
REPL Mode (files by default)
The REPL writes to ~/.pgslice/dumps/ by default when --output is not specified:
# In REPL: writes to ~/.pgslice/dumps/public_users_42.sql
pgslice> dump "users" 42
# Specify custom output path
pgslice> dump "users" 42 --output /path/to/user.sql
Same Operations, Different Modes
| Operation | CLI | REPL |
|---|---|---|
| List tables | pgslice --tables |
pgslice> tables |
| Describe table | pgslice --describe users |
pgslice> describe "users" |
| Dump to stdout | pgslice --table users --pks 42 |
N/A (REPL always writes to file) |
| Dump to file | pgslice --table users --pks 42 --output user.sql |
pgslice> dump "users" 42 --output user.sql |
| Dump (default) | Stdout | ~/.pgslice/dumps/public_users_42.sql |
| Multiple PKs | pgslice --table users --pks 1,2,3 |
pgslice> dump "users" 1,2,3 |
| Truncate filter | pgslice --table users --pks 42 --truncate "orders:2024-01-01:2024-12-31" |
pgslice> dump "users" 42 --truncate "orders:2024-01-01:2024-12-31" |
| Wide mode | pgslice --table users --pks 42 --wide |
pgslice> dump "users" 42 --wide |
When to Use Each Mode
Use CLI mode when:
- Piping output to other commands
- Scripting and automation
- Remote execution via SSH
- One-off dumps
Use REPL mode when:
- Exploring database schema interactively
- Running multiple dumps in a session
- You prefer persistent file output
- Testing different dump configurations
Configuration
Key environment variables (see .env.example for full reference):
| Variable | Description | Default |
|---|---|---|
DB_HOST |
Database host | localhost |
DB_PORT |
Database port | 5432 |
DB_NAME |
Database name | - |
DB_USER |
Database user | - |
DB_SCHEMA |
Schema to use | public |
PGPASSWORD |
Database password (env var only) | - |
CACHE_ENABLED |
Enable schema caching | true |
CACHE_TTL_HOURS |
Cache time-to-live | 24 |
LOG_LEVEL |
Logging level (disabled by default unless specified) | disabled |
PGSLICE_OUTPUT_DIR |
Output directory | ~/.pgslice/dumps |
Security
- ✅ Parameterized queries: All SQL uses proper parameterization
- ✅ SQL injection prevention: Identifier validation
- ✅ Secure passwords: Never logged or stored
- ✅ Read-only enforcement: Safe for production databases
Contributing
Contributions are welcome! See DEVELOPMENT.md for comprehensive development documentation including:
- Local development setup
- Code quality standards and testing guidelines
- Version management and publishing workflow
- Architecture and design patterns
Quick start for contributors:
make setup # One-time setup (installs dependencies, hooks)
make test # Run all tests
git commit # Pre-commit hooks run automatically (linting, formatting, type-checking)
For troubleshooting common development issues, see the Troubleshooting section in DEVELOPMENT.md.
License
MIT
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file pgslice-0.2.0.tar.gz.
File metadata
- Download URL: pgslice-0.2.0.tar.gz
- Upload date:
- Size: 53.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.18 {"installer":{"name":"uv","version":"0.9.18","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
8ad2ad8378b8f5cc3f4044e3a154e369f3224fafe14561af67309719dbf889a2
|
|
| MD5 |
ec5d77f2239b91146421c82f55c2615e
|
|
| BLAKE2b-256 |
7276c8999e9c8f4e16f44f67d05866308a806cef869a1cd2a6cbe10c7347e2a0
|
File details
Details for the file pgslice-0.2.0-py3-none-any.whl.
File metadata
- Download URL: pgslice-0.2.0-py3-none-any.whl
- Upload date:
- Size: 59.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.18 {"installer":{"name":"uv","version":"0.9.18","subcommand":["publish"]},"python":null,"implementation":{"name":null,"version":null},"distro":{"name":"Ubuntu","version":"24.04","id":"noble","libc":null},"system":{"name":null,"release":null},"cpu":null,"openssl_version":null,"setuptools_version":null,"rustc_version":null,"ci":true}
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
66ddab3cf58570267ebdedf922657f4f5502290de753469beed4befb025675e6
|
|
| MD5 |
bd0f3be731ea7dec32d5b4563da2c9ef
|
|
| BLAKE2b-256 |
fb15f1701019a3af9f12be456ac9c7ae8658b60962366a6bdad55a241bc7a515
|