Bani
An open-source database migration engine powered by Apache Arrow.
Bani migrates schema, data, and indexes across relational databases using Apache Arrow as a universal columnar interchange format. Define migrations declaratively with BDL, programmatically with the Python SDK, or let an AI agent drive them via the MCP server.
Features
- 5 Database Connectors -- PostgreSQL, MySQL, MSSQL, Oracle, SQLite
- Apache Arrow Engine -- Columnar interchange for high-throughput batch transfers
- Declarative BDL -- Define migrations in XML or JSON, version control everything
- Python SDK -- Fluent ProjectBuilder API for programmatic migrations
- CLI -- 11 commands: run, validate, preview, init, schema inspect, and more
- MCP Server -- 10 tools for AI agents (Claude, Cursor, etc.)
- Web Dashboard -- Real-time migration monitoring with React UI
- Cross-Platform -- macOS app, Linux packages, Windows installer, Docker
Quick Start
Install
Download the installer for your platform from the releases page (macOS .dmg, Windows .exe, Linux .deb/.rpm/AppImage), or use Docker:
docker pull banilabs/bani:latest
Your First Migration
- Set up database credentials as environment variables:
export SOURCE_USER=myuser
export SOURCE_PASS=mypassword
export TARGET_USER=pguser
export TARGET_PASS=pgpassword
- Create a migration project:
bani init --source mysql --target postgresql --out my-migration.bdl
- Run the migration:
bani run my-migration.bdl
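Because the steps above rely on four environment variables, it can be useful to confirm they are set before starting a run. The following is a minimal sketch using only the Python standard library; the helper name `missing_credentials` is hypothetical and is not part of Bani.

```python
import os
from typing import Mapping

# The four variable names come from the export lines in the first step.
REQUIRED_VARS = ("SOURCE_USER", "SOURCE_PASS", "TARGET_USER", "TARGET_PASS")


def missing_credentials(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Report the result for the current shell environment.
missing = missing_credentials()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All credential variables are set.")
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise in tests.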
Python SDK
from bani.sdk import BaniProject, ProjectBuilder
project = (
    ProjectBuilder("my-migration")
    .source("mysql", host="localhost", port=3306, database="source_db",
            username_env="SOURCE_USER", password_env="SOURCE_PASS")
    .target("postgresql", host="localhost", port=5432, database="target_db",
            username_env="TARGET_USER", password_env="TARGET_PASS")
    .batch_size(100_000)
    .build()
)
result = BaniProject(project).run()
print(f"Migrated {result.total_rows_written} rows in {result.duration_seconds:.1f}s")
Or load from a BDL file:
from bani.sdk import Bani
result = Bani.load("my-migration.bdl").run()
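The builder example sets `batch_size(100_000)`. Assuming this means the maximum number of rows moved per batch (the name suggests it, but this README does not spell it out), the idea can be sketched with the standard library alone. This illustrates batching in general, not Bani's implementation, and `iter_batches` is a hypothetical helper.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

Row = TypeVar("Row")


def iter_batches(rows: Iterable[Row], batch_size: int) -> Iterator[list[Row]]:
    """Yield lists of at most batch_size rows until the input is exhausted."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(rows)
    # islice pulls the next batch_size items; an empty list ends the loop.
    while batch := list(islice(iterator, batch_size)):
        yield batch


# 250 rows with a batch size of 100 produce batches of 100, 100, and 50 rows.
sizes = [len(batch) for batch in iter_batches(range(250), 100)]
print(sizes)
```

Larger batches mean fewer round trips at the cost of more memory per batch, which is the usual trade-off when tuning a setting like this.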
MCP Server (AI Agent Integration)
Add to your Claude Desktop configuration:
{
  "mcpServers": {
    "bani": {
      "command": "bani",
      "args": ["mcp", "start"]
    }
  }
}
Then ask Claude: "Inspect the schema of my MySQL database and generate a migration to PostgreSQL."
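If the configuration file already lists other MCP servers, the `bani` entry has to be merged in rather than pasted over them. A minimal sketch of that merge follows; the `other` server in the sample data is hypothetical, and the location of the configuration file is deliberately left out because it varies by platform.

```python
import json
from typing import Any


def add_bani_server(config: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of config with the bani entry added under mcpServers."""
    servers = dict(config.get("mcpServers", {}))
    # Same command and arguments as the configuration shown above.
    servers["bani"] = {"command": "bani", "args": ["mcp", "start"]}
    return {**config, "mcpServers": servers}


# Hypothetical existing configuration with one other server registered.
existing = json.loads('{"mcpServers": {"other": {"command": "other-tool", "args": []}}}')
merged = add_bani_server(existing)
print(json.dumps(merged, indent=2))
```

The function returns a new dictionary and leaves the input untouched, so the original configuration can be kept as a backup before the merged version is written out.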
Supported Databases
| Database | Versions | Source | Sink | Driver |
|---|---|---|---|---|
| PostgreSQL | 9.6 -- 17 | Yes | Yes | psycopg 3.x |
| MySQL | 5.5 -- 8.4 | Yes | Yes | PyMySQL |
| SQL Server | 2019 -- 2022 | Yes | Yes | pyodbc / pymssql |
| Oracle | 11g -- 23c | Yes | Yes | oracledb |
| SQLite | 3.x | Yes | Yes | sqlite3 (stdlib) |
Architecture
Bani uses Apache Arrow RecordBatch as its universal data interchange format. Source connectors read database rows into Arrow batches; sink connectors write Arrow batches to the target database. With N connectors, this design needs only N type mappers (one per connector, each mapping to and from Arrow types) instead of a separate mapper for every source-target pair (N x N). It also enables high-throughput columnar transfers with minimal Python overhead.
Source DB --> Source Connector --> Arrow RecordBatch --> Sink Connector --> Target DB
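The saving from routing every type through one shared format can be shown with a small sketch. The type names below are illustrative only and are not Bani's actual mapping tables; `translate` and `mapper_counts` are hypothetical helpers written for this example.

```python
# Illustrative type names only; these are not Bani's actual mapping tables.
# Each connector maps its native types to and from a shared Arrow-style type.
TO_ARROW = {
    "mysql": {"INT": "int32", "VARCHAR": "string", "DATETIME": "timestamp"},
    "postgresql": {"integer": "int32", "text": "string", "timestamp": "timestamp"},
    "sqlite": {"INTEGER": "int64", "TEXT": "string"},
}
FROM_ARROW = {
    "mysql": {"int32": "INT", "int64": "BIGINT", "string": "TEXT", "timestamp": "DATETIME"},
    "postgresql": {"int32": "integer", "int64": "bigint", "string": "text", "timestamp": "timestamp"},
    "sqlite": {"int32": "INTEGER", "int64": "INTEGER", "string": "TEXT", "timestamp": "TEXT"},
}


def translate(source: str, target: str, native_type: str) -> str:
    """Map a source column type to a target type by way of the shared type."""
    arrow_type = TO_ARROW[source][native_type]
    return FROM_ARROW[target][arrow_type]


def mapper_counts(n: int) -> tuple[int, int]:
    """Return (per-connector mappers, pairwise mappers) for n databases."""
    return n, n * n


print(translate("mysql", "postgresql", "VARCHAR"))
print(mapper_counts(5))
```

Adding a sixth database in this scheme means writing one new pair of tables, while every existing connector keeps working unchanged.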
Key components:
- Connectors -- Pluggable source/sink pairs discovered via Python entry points
- Orchestrator -- Manages table ordering (dependency-aware), batching, parallelism, and checkpointing
- BDL Parser -- Reads XML or JSON migration definitions into a ProjectModel
- SDK -- ProjectBuilder for programmatic construction, SchemaInspector for introspection
- MCP Server -- Exposes migration tools to AI agents via the Model Context Protocol
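Dependency-aware table ordering, as listed for the Orchestrator, is commonly done with a topological sort over foreign-key references, so that referenced tables are loaded before the tables that point at them. The sketch below uses the standard library's `graphlib`; the schema is hypothetical, and whether Bani uses this exact approach is an assumption.

```python
from graphlib import TopologicalSorter

# Hypothetical schema: each table maps to the tables its foreign keys reference.
dependencies = {
    "customers": set(),
    "products": set(),
    "orders": {"customers"},
    "order_items": {"orders", "products"},
}

# static_order() yields every table after all of the tables it depends on.
load_order = list(TopologicalSorter(dependencies).static_order())
print(load_order)
```

Several valid orders exist for this schema; the only guarantee is that each table appears after its dependencies. A cycle between tables raises `graphlib.CycleError`, which a real tool would need to handle, for example by deferring constraint creation.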
Web UI
Launch the web dashboard:
bani ui
Monitor migrations in real time with progress tracking, table-level status, and error reporting.
Documentation
Full documentation is available at docs.bani.tools:
- Getting Started -- Install and run your first migration in under 10 minutes
- BDL Reference -- Complete specification for the Bani Definition Language
- CLI Reference -- All commands, flags, and output formats
- Python SDK -- Programmatic migration API
- MCP Server -- AI agent integration guide
- Connector Reference -- Per-database configuration and type mappings
Development
# Clone and install
git clone https://github.com/mugumedavid/bani.git
cd bani
uv sync --all-extras --dev
# Run quality gates
ruff check && ruff format --check
mypy --strict
pytest
# Or use make
make all
See CONTRIBUTING.md for the full development guide.
License
Apache-2.0. See LICENSE for details.
Links
- Website -- Project homepage
- Documentation -- Technical docs and guides
- GitHub -- Source code and issues
- Docker Hub -- Container images
- Discord -- Community chat