A command-line tool to split large SQL files into smaller chunks based on size and SQL separators.
SQL Splitter 🚀
A high-performance command-line tool designed to split massive SQL dump files into smaller, manageable chunks. Unlike simple line-based splitters, SQL Splitter respects SQL statement boundaries (using separators like ;) to ensure each chunk remains a valid SQL script.
✨ Features
- Smart Splitting: Splits files based on size while respecting SQL statement integrity.
- Compression: Optionally remove comments and empty lines to reduce chunk size.
- Configurable: Define custom separators, single-line comment markers, and multi-line comment markers.
- Fast & Efficient: Processes files line-by-line using streaming I/O, minimizing memory usage for multi-gigabyte files.
🛠 Prerequisites
- Python: 3.13 or higher.
- uv: Recommended for dependency management (fast, reliable).
🚀 Installation & Setup
We use uv for easy environment management.
- Clone the repository:
  git clone <repository-url>
  cd sql_splitter
- Sync dependencies:
  uv sync
📖 Usage Guide
Command Line Arguments
Run the tool using uv run psql-splitter.
| Argument | Long Flag | Required | Default | Description |
|---|---|---|---|---|
| -f | N/A | Yes | - | Path to the source SQL file. |
| -n | N/A | Yes | - | Number of chunks to split the file into. |
| -s | N/A | No | ; | SQL statement separator. |
| -c | N/A | No | -- | Single-line comment character. |
| -m | N/A | No | /* | Multi-line comment character start. |
| -z | N/A | No | False | Flag to compress output (removes empty lines/comments). |
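For readers who want to see how the flags in the table fit together, here is a minimal sketch of an equivalent parser. It assumes the CLI is built on Python's standard argparse module, which this README does not confirm. The function name build_parser and the dest names (file, chunks, separator, and so on) are hypothetical and chosen only for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(
        prog="psql-splitter",
        description="Split a large SQL file into smaller chunks.",
    )
    # Required arguments (no long flags are documented, so none are defined).
    parser.add_argument("-f", dest="file", required=True,
                        help="Path to the source SQL file.")
    parser.add_argument("-n", dest="chunks", type=int, required=True,
                        help="Number of chunks to split the file into.")
    # Optional arguments with the defaults listed in the table.
    parser.add_argument("-s", dest="separator", default=";",
                        help="SQL statement separator.")
    parser.add_argument("-c", dest="comment", default="--",
                        help="Single-line comment character.")
    parser.add_argument("-m", dest="multiline_comment", default="/*",
                        help="Multi-line comment character start.")
    # A boolean switch: absent means False, present means True.
    parser.add_argument("-z", dest="compress", action="store_true",
                        help="Compress output (removes empty lines/comments).")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Modelling -z as a store_true switch matches the documented default of False; the real tool may name or validate its options differently.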
Examples
Basic split into 5 chunks:
uv run psql-splitter -f big_dump.sql -n 5
Split with compression and a custom separator (single quotes stop the shell from expanding $$ to its process ID):
uv run psql-splitter -f my_data.sql -n 3 -s '$$' -z
🧑‍💻 Developer Guide
If you are a developer joining the project, here is how you can work with the codebase.
Project Structure
.
├── main.py # CLI Entry point
├── Makefile # Automated task runner
├── pyproject.toml # Project metadata and dependencies
├── src/
│ ├── splitter.py # Core splitting logic
│ └── tests/ # Unit tests
└── README.md # This file
Automation with Makefile
The project includes a Makefile for common tasks:
- Run Example: make run
  Runs the splitter on a test_dump.sql file (cleans previous chunks first).
- Run Tests: make test
  Executes the test suite using pytest.
- Cleanup: make clean
  Removes all generated .sql chunks (files matching [0-9]*.sql).
- Help: make help
  Lists available commands.
Running Tests Manually
You can also run tests directly via uv:
uv run pytest src/tests -v
📝 How it works
- Size Calculation: The tool calculates the total file size and determines a target chunk size by dividing it by -n.
- Streaming Read: It reads the input file line-by-line to handle extremely large files without filling up RAM.
- Statement Boundary: It only closes a chunk if it has exceeded the target size and the current line ends with the specified separator (-s).
- Compression Mode: When -z is enabled, the tool skips lines that are empty or start with the specified comment characters (-c or -m).
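The four steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code in src/splitter.py: the function name split_sql, its parameters, and the 1.sql, 2.sql output naming are hypothetical (the naming is only inferred from the [0-9]*.sql cleanup pattern), and the sketch treats "exceeded the target size" as "reached or exceeded".

```python
import os
from pathlib import Path


def split_sql(source, chunks, separator=";", comment="--",
              multiline="/*", compress=False, out_dir=None):
    """Hypothetical sketch: split `source` into roughly `chunks` files."""
    source = Path(source)
    out_dir = Path(out_dir) if out_dir else source.parent
    # Size Calculation: target chunk size = total size / number of chunks.
    target = max(1, os.path.getsize(source) // chunks)

    written = []          # paths of the chunk files created so far
    index, size = 1, 0    # current chunk number and its size in bytes
    out = None            # handle of the chunk currently being written
    try:
        # Streaming Read: iterate line by line, never loading the whole file.
        with open(source, "r", encoding="utf-8", newline="") as src:
            for line in src:
                stripped = line.strip()
                # Compression Mode: drop empty lines and lines that start
                # with a comment marker. Only the first line of a multi-line
                # comment matches, as in the description above.
                if compress and (not stripped
                                 or stripped.startswith((comment, multiline))):
                    continue
                if out is None:
                    path = out_dir / f"{index}.sql"  # assumed naming scheme
                    out = open(path, "w", encoding="utf-8", newline="")
                    written.append(path)
                out.write(line)
                size += len(line.encode("utf-8"))
                # Statement Boundary: close the chunk only once the target
                # is reached AND the line ends with the separator.
                if size >= target and stripped.endswith(separator):
                    out.close()
                    out = None
                    index += 1
                    size = 0
    finally:
        if out is not None:
            out.close()
    return written
```

Calling split_sql("big_dump.sql", 5) would correspond to the basic example above. Because a chunk is closed only at a statement boundary, chunks in this sketch can overshoot the target size, and the number of files produced may differ slightly from the requested count.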
📄 License