Export SQL query results to Parquet and CSV, and upload to S3 or MinIO

sqlxport

Modular CLI tool to extract data from PostgreSQL/Redshift and export to various formats (e.g. Parquet, CSV), with optional S3 upload and Athena integration.


✅ Features

  • 🔄 Run custom SQL queries against PostgreSQL or Redshift
  • 📦 Export to Parquet or CSV (--format)
  • 🪣 Upload results to S3 or MinIO
  • 🔄 Redshift UNLOAD support
  • 🧩 Partition output by column
  • 📜 Generate Athena CREATE TABLE DDL
  • 🔍 Preview local or remote Parquet/CSV files
  • ⚙️ .env support for convenient config
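
To make the Athena DDL feature concrete, the sketch below shows the general shape of a CREATE EXTERNAL TABLE statement for Parquet data in S3. It is an illustration only, not sqlxport's own generator: the function name `athena_ddl`, the table and column names, and the bucket path are all hypothetical.

```python
def athena_ddl(table, columns, location, partitions=None):
    """Build an Athena CREATE EXTERNAL TABLE statement for Parquet files.

    Hypothetical helper for illustration; sqlxport's generated DDL may differ.
    columns / partitions: lists of (name, athena_type) pairs.
    """
    cols = ",\n".join(f"  `{name}` {typ}" for name, typ in columns)
    ddl = f"CREATE EXTERNAL TABLE IF NOT EXISTS `{table}` (\n{cols}\n)\n"
    if partitions:
        # Partition columns are declared separately from the data columns.
        parts = ", ".join(f"`{name}` {typ}" for name, typ in partitions)
        ddl += f"PARTITIONED BY ({parts})\n"
    ddl += f"STORED AS PARQUET\nLOCATION '{location}'"
    return ddl


print(athena_ddl(
    "users",
    [("id", "bigint"), ("email", "string")],
    "s3://my-bucket/users/",
    partitions=[("country", "string")],
))
```

For a partitioned table, Athena also needs the partitions registered (for example with MSCK REPAIR TABLE) before queries return rows.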

📦 Installation

# from PyPI
pip install sqlxport
# or from a source checkout
pip install .
# or editable install
pip install -e .

🚀 Usage

Basic

sqlxport run \
  --db-url postgresql://user:pass@localhost:5432/mydb \
  --query "SELECT * FROM users" \
  --output-file users.parquet \
  --format parquet
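
If you want to launch the basic export from a Python script or scheduler, one option is to assemble the same command as an argument list. This is a minimal sketch, assuming `sqlxport` is on the PATH; `build_run_command` is a hypothetical helper name, and it uses only the flags shown above.

```python
import shlex


def build_run_command(db_url, query, output_file, fmt="parquet"):
    """Return the argv list for the basic `sqlxport run` call shown above.

    Hypothetical wrapper; it only uses flags that appear in this README.
    """
    return [
        "sqlxport", "run",
        "--db-url", db_url,
        "--query", query,
        "--output-file", output_file,
        "--format", fmt,
    ]


cmd = build_run_command(
    "postgresql://user:pass@localhost:5432/mydb",
    "SELECT * FROM users",
    "users.parquet",
)
# Passing a list (not a shell string) avoids quoting problems in the query.
print(shlex.join(cmd))
# To execute it: subprocess.run(cmd, check=True)
```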

With S3 Upload

sqlxport run \
  --db-url postgresql://... \
  --query "..." \
  --output-file users.parquet \
  --s3-bucket my-bucket \
  --s3-key users.parquet \
  --s3-access-key AKIA... \
  --s3-secret-key ... \
  --s3-endpoint https://s3.amazonaws.com

Partitioned Export

sqlxport run \
  --db-url postgresql://... \
  --query "..." \
  --output-dir output/ \
  --partition-by group_column
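
As a rough picture of what partitioning by a column means, the sketch below groups rows by one column and writes one file per distinct value. The Hive-style `column=value/` directory layout, the `part-0.csv` file name, and the function `write_partitioned_csv` are assumptions made for illustration; sqlxport's actual output layout may differ.

```python
import csv
import os
import tempfile
from collections import defaultdict


def write_partitioned_csv(rows, partition_by, output_dir):
    """Write one CSV per distinct value of `partition_by`, Hive-style.

    Assumed layout: <output_dir>/<column>=<value>/part-0.csv
    (an illustration, not necessarily sqlxport's layout).
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[partition_by]].append(row)

    paths = []
    for value, group in groups.items():
        part_dir = os.path.join(output_dir, f"{partition_by}={value}")
        os.makedirs(part_dir, exist_ok=True)
        path = os.path.join(part_dir, "part-0.csv")
        # The partition value lives in the directory name, so drop the column.
        fields = [k for k in group[0] if k != partition_by]
        with open(path, "w", newline="") as fh:
            writer = csv.DictWriter(fh, fieldnames=fields, extrasaction="ignore")
            writer.writeheader()
            writer.writerows(group)
        paths.append(path)
    return sorted(paths)


rows = [
    {"id": 1, "group_column": "a"},
    {"id": 2, "group_column": "b"},
    {"id": 3, "group_column": "a"},
]
with tempfile.TemporaryDirectory() as out:
    for p in write_partitioned_csv(rows, "group_column", out):
        print(os.path.relpath(p, out))
```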

Redshift UNLOAD Mode

sqlxport run \
  --use-redshift-unload \
  --db-url redshift+psycopg2://... \
  --query "SELECT * FROM large_table" \
  --s3-output-prefix s3://bucket/unload/ \
  --iam-role arn:aws:iam::123456789012:role/MyUnloadRole
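
For context, Redshift's UNLOAD command runs the query inside the cluster and writes the result files directly to S3, so the data does not pass through the client. The sketch below composes such a statement from the same three inputs as the command above. `build_unload_statement` is a hypothetical helper, and the choice of `FORMAT AS PARQUET` is an assumption; the statement sqlxport actually issues may differ.

```python
def build_unload_statement(query, s3_prefix, iam_role):
    """Compose a Redshift UNLOAD statement that writes Parquet to S3.

    Hypothetical helper; the statement sqlxport actually issues may differ.
    """
    # UNLOAD takes the query as a quoted string, so single quotes inside
    # the query must be doubled.
    escaped = query.replace("'", "''")
    return (
        f"UNLOAD ('{escaped}')\n"
        f"TO '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        "FORMAT AS PARQUET"
    )


print(build_unload_statement(
    "SELECT * FROM large_table WHERE state = 'NV'",
    "s3://bucket/unload/",
    "arn:aws:iam::123456789012:role/MyUnloadRole",
))
```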

🧪 Running Tests

pytest -v

🧬 Environment Variables

You can set options in a .env file or through environment variables:

DB_URL=postgresql://username:password@localhost:5432/mydb
S3_BUCKET=my-bucket
S3_KEY=data/users.parquet
S3_ACCESS_KEY=...
S3_SECRET_KEY=...
S3_ENDPOINT=https://s3.amazonaws.com
IAM_ROLE=arn:aws:iam::123456789012:role/MyUnloadRole

Generate a template with:

sqlxport run --generate-env-template
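
The .env format above is plain KEY=VALUE lines. The sketch below shows one way such a file could be parsed and combined with the process environment. It is not sqlxport's loader: `parse_env` and `resolve` are hypothetical names, and the precedence (real environment variables win over .env values) is an assumption.

```python
import os


def parse_env(text):
    """Parse simple KEY=VALUE lines; skip blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve(name, dotenv, environ=os.environ):
    """Assumed precedence: real environment first, then the .env file."""
    return environ.get(name, dotenv.get(name))


sample = """
# database
DB_URL=postgresql://username:password@localhost:5432/mydb
S3_BUCKET=my-bucket
"""
config = parse_env(sample)
print(resolve("S3_BUCKET", config, environ={}))
```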

🛠 Roadmap

  • ✅ Modular format support
  • ✅ CSV support
  • ⏳ Add jsonl, xlsx formats
  • ⏳ Plugin system for custom writers/loaders
  • ⏳ SaaS mode or server-side export platform
  • ⏳ Stream output to Kafka/Kinesis

🔐 Security

  • Don't commit .env files to version control
  • Store credentials securely (e.g. in ~/.aws/credentials or a secrets vault)

👨‍💻 Author

Vahid Saber
Built with ❤️ for data engineers and developers.


📄 License

MIT License
