Use JDBC database drivers from Python 3 with a DB-API, accelerated with Apache Arrow.

Maintainers

HenryNebula

Project description

JayDeBeApiArrow - High-Performance JDBC to Python DB-API Bridge

The JayDeBeApiArrow module allows you to connect from Python code to databases using Java JDBC. It provides a Python DB-API v2.0 interface to that database.

Note: This is a fork of the original JayDeBeApi project.

Key Differences in this Fork

- High Performance with Apache Arrow: The primary goal of this fork is to significantly improve data fetch performance. Instead of iterating through JDBC ResultSets row-by-row in Python (which has high overhead), this library uses a custom Java extension (arrow-jdbc-extension) to convert JDBC data into Apache Arrow record batches directly within the JVM. These batches are then efficiently transferred to Python.
- Modernization:
  - Python 3 Only: Support for Python 2 has been removed.
  - JPype Only: Support for Jython has been removed to focus on the CPython + JPype architecture.
  - Strict Typing: Enforces stricter typing for Decimal and temporal types.
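As a rough illustration of the batching idea, the following toy sketch groups row tuples into column-oriented chunks in pure Python. It is not the library's implementation (the real conversion happens inside the JVM and produces Arrow record batches); the function name `to_column_batches` and the sample data are hypothetical.

```python
from itertools import islice


def to_column_batches(rows, column_names, batch_size):
    """Group an iterable of row tuples into column-oriented batches.

    Each yielded batch is a dict mapping a column name to the list of
    values for that column, covering at most `batch_size` rows.
    """
    it = iter(rows)
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        # zip(*chunk) transposes the row tuples into per-column tuples.
        columns = zip(*chunk)
        yield {name: list(col) for name, col in zip(column_names, columns)}


# Hypothetical sample data: three rows, handed over in batches of two.
rows = [(1, "John"), (2, "Jane"), (3, "Joe")]
batches = list(to_column_batches(rows, ["CUST_ID", "NAME"], batch_size=2))
for batch in batches:
    print(batch)
```

The point of the sketch is that the consumer receives one object per batch rather than one per row, which is where a columnar hand-off saves per-row overhead.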

It works on ordinary Python (CPython) using the JPype Java integration.

Install

You can get and install JayDeBeApiArrow with pip:

pip install JayDeBeApiArrow

Or you can get a copy of the source by cloning the JayDeBeApiArrow GitHub project and install it with:

uv sync

Ensure that you have installed JPype properly (it will be installed automatically by uv sync).

Usage

Import the jaydebeapiarrow Python module and call its connect method. This gives you a DB-API-compliant connection to the database.

The first argument to connect is the name of the Java driver class. The second argument is a string with the JDBC connection URL. The third argument is optional: either a sequence consisting of user and password, or a dictionary of arguments that are passed internally as properties to the Java DriverManager.getConnection method. See the Javadoc of the DriverManager class for details.

The fourth argument is also optional and specifies the JAR files of the driver if your classpath isn't already set up. The classpath set in the CLASSPATH environment variable will be honored.

Here is an example:

import jaydebeapiarrow
conn = jaydebeapiarrow.connect(
    "org.hsqldb.jdbcDriver",
    "jdbc:hsqldb:mem:.",
    ["SA", ""],
    "/path/to/hsqldb.jar"
)
curs = conn.cursor()
curs.execute('create table CUSTOMER'
             '("CUST_ID" INTEGER not null,'
             ' "NAME" VARCHAR(50) not null,'
             ' primary key ("CUST_ID"))')
curs.execute("insert into CUSTOMER values (?, ?)", (1, 'John'))
curs.execute("select * from CUSTOMER")
print(curs.fetchall())
# Output: [(1, 'John')]
curs.close()
conn.close()

If you're having trouble getting this to work, check that your JAVA_HOME environment variable is set correctly. For example:

JAVA_HOME=/usr/lib/jvm/java-8-openjdk python
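A quick way to rule out a bad JAVA_HOME before starting the JVM is a small check like the one below. This is a minimal sketch that uses only the standard library; the helper name `check_java_home` is hypothetical and not part of jaydebeapiarrow, and it only checks that a `bin/java` executable exists, not that the JVM version is suitable.

```python
import os


def check_java_home(env=None):
    """Return a description of the problem with JAVA_HOME, or None if it looks usable."""
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return "JAVA_HOME is not set"
    if not os.path.isdir(java_home):
        return f"JAVA_HOME points to a missing directory: {java_home}"
    java_bin = os.path.join(java_home, "bin", "java")
    # On Windows the executable is java.exe.
    if not (os.path.isfile(java_bin) or os.path.isfile(java_bin + ".exe")):
        return f"no java executable found under {os.path.join(java_home, 'bin')}"
    return None


problem = check_java_home()
print(problem or "JAVA_HOME looks usable")
```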

An alternative way to establish connection using connection properties:

conn = jaydebeapiarrow.connect(
    "org.hsqldb.jdbcDriver",
    "jdbc:hsqldb:mem:.",
    {
        'user': "SA", 'password': "",
        'other_property': "foobar"
    },
    "/path/to/hsqldb.jar"
)
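When a driver ships as several JAR files, it can be convenient to collect them from a directory rather than listing each path by hand. The sketch below assumes that connect accepts a sequence of JAR paths as its fourth argument, as the original JayDeBeApi does; this is not confirmed here for the fork. The helper `collect_jars` and the directory path are hypothetical.

```python
import glob
import os


def collect_jars(directory):
    """Return a sorted list of absolute paths to the .jar files in `directory`."""
    pattern = os.path.join(directory, "*.jar")
    return sorted(os.path.abspath(path) for path in glob.glob(pattern))


# Assumed usage (requires a JVM and the driver JARs, so it is left commented out):
#
# import jaydebeapiarrow
# conn = jaydebeapiarrow.connect(
#     "org.hsqldb.jdbcDriver",
#     "jdbc:hsqldb:mem:.",
#     ["SA", ""],
#     collect_jars("/path/to/drivers"),  # hypothetical directory
# )
```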

Connections and cursors can also be used with the with statement, which closes them automatically:

with jaydebeapiarrow.connect(
    "org.hsqldb.jdbcDriver",
    "jdbc:hsqldb:mem:.",
    ["SA", ""],
    "/path/to/hsqldb.jar"
) as conn:
    with conn.cursor() as curs:
        curs.execute("select count(*) from CUSTOMER")
        print(curs.fetchall())
        # Output: [(1,)]
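For large result sets, DB-API v2.0 also defines `fetchmany`, which lets you process rows in chunks instead of loading everything with `fetchall`. The sketch below is generic DB-API code; it uses the standard library's sqlite3 module as a stand-in so it runs without a JVM, and the helper `iter_rows` is hypothetical. It assumes, without confirming it here, that a jaydebeapiarrow cursor can be passed in the same way.

```python
import sqlite3


def iter_rows(cursor, batch_size=1000):
    """Yield rows from a DB-API cursor, fetching `batch_size` rows at a time."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield from rows


# sqlite3 stands in for a JDBC-backed connection in this sketch.
conn = sqlite3.connect(":memory:")
curs = conn.cursor()
curs.execute('create table CUSTOMER'
             '("CUST_ID" INTEGER not null,'
             ' "NAME" VARCHAR(50) not null,'
             ' primary key ("CUST_ID"))')
curs.executemany("insert into CUSTOMER values (?, ?)",
                 [(1, 'John'), (2, 'Jane'), (3, 'Joe')])
curs.execute("select * from CUSTOMER order by CUST_ID")
fetched = list(iter_rows(curs, batch_size=2))
print(fetched)
curs.close()
conn.close()
```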

Supported Databases

In theory every database with a suitable JDBC driver should work. It is confirmed to work with the following databases:

SQLite
Hypersonic SQL (HSQLDB)
IBM DB2
IBM DB2 for mainframes
Oracle
Teradata DB
Netezza
Mimer DB
Microsoft SQL Server
MySQL
PostgreSQL
...and many more.

Testing

Integration tests are located in test/. Tests run via pytest and cover all supported databases: SQLite (in-memory), HSQLDB, PostgreSQL, MySQL, MSSQL, Oracle, DB2, Trino, and Apache Drill.

Build JARs and download drivers

uv run bash test/build.sh                 # Build arrow-jdbc-extension and MockDriver JARs
uv run bash test/download_jdbc_drivers.sh # Download JDBC drivers

Run tests

CLASSPATH="test/jars/*:test/mock-jars/*" uv run pytest test/test_mock.py test/test_infrastructure.py -v   # Mock + infrastructure
CLASSPATH="test/jars/*" uv run pytest test/test_hsqldb.py -v                                                # HSQLDB
CLASSPATH="test/jars/*" uv run pytest test/test_sqlite.py::SqliteXerialTest -v                              # SQLite JDBC
CLASSPATH="test/jars/*" uv run pytest test/ -v --tb=short                                                  # All tests

Pytest is configured in pyproject.toml to run tests in parallel across files using pytest-xdist with --dist loadfile.

External database tests

Container-based databases are managed via Docker Compose:

# Start all databases
cd test && docker compose up -d

# Check status
cd test && docker compose ps

# Stop all databases
cd test && docker compose down

Database connection defaults (overridable via environment variables):

Database	Host	Port	DB	User	Password	Env prefix
PostgreSQL	localhost	15432	test_db	user	password	`JY_PG_*`
MySQL	localhost	13306	test_db	user	password	`JY_MYSQL_*`
MSSQL	localhost	11433	—	sa	Password123!	`JY_MSSQL_*`
Oracle	localhost	11521	XEPDB1	system	Password123!	`JY_ORACLE_*`
DB2	localhost	15000	test_db	db2inst1	Password123!	`JY_DB2_*`
Trino	localhost	18080	—	test	—	`JY_TRINO_*`
Drill	localhost	31010	—	—	—	`JY_DRILL_*`
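The override mechanism can be pictured as a lookup that prefers an environment variable and falls back to the table's default. The sketch below is only an illustration: the source gives the `JY_PG_*` prefix but not the individual variable names, so the suffixes `HOST`, `PORT`, `DB`, `USER` and `PASSWORD` are assumptions, as is the helper `load_settings`. Check the test suite for the names it actually reads.

```python
import os

# Defaults for PostgreSQL, taken from the table above.
# The key names (HOST, PORT, ...) are assumed suffixes, not confirmed names.
PG_DEFAULTS = {
    "HOST": "localhost",
    "PORT": "15432",
    "DB": "test_db",
    "USER": "user",
    "PASSWORD": "password",
}


def load_settings(prefix, defaults, env=None):
    """Return `defaults` with any value overridden by the variable `prefix + key`."""
    env = os.environ if env is None else env
    return {key: env.get(prefix + key, value) for key, value in defaults.items()}


# Example: override only the port, keep every other default.
pg = load_settings("JY_PG_", PG_DEFAULTS, env={"JY_PG_PORT": "5432"})
print(pg)
```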

Benchmarks

This approach was inspired by Uwe Korn's work on pyarrow.jvm (Apache Drill) and Razvi Noorul's Trino benchmarks, both demonstrating 100x+ speedups by using Arrow to bypass JPype's row-by-row serialization.

Our benchmarks (local PostgreSQL, 5M rows, 4 columns) show a 23.7x speedup over plain jaydebeapi using the Native Arrow API. The multiplier is smaller than in the referenced posts because of methodology: they tested against distributed query engines (Drill, Trino) over network connections, where per-row JDBC overhead is higher. PostgreSQL's JDBC driver is fast at row retrieval, so the row-by-row baseline takes less time and the relative speedup is smaller. The absolute Arrow throughput is comparable across all three.

The reading path uses the Arrow C Data Interface (Data.exportVectorSchemaRoot → pa.RecordBatch._import_from_c), which bypasses pyarrow.jvm entirely. This brings the Native Arrow API to within 6% of psycopg2, a native C driver.

Method	5M rows	Throughput	vs jaydebeapi
jaydebeapi (baseline)	180.1s	28K rows/s	—
Drop-in replacement	26.5s	189K rows/s	6.8x
Native Arrow API (C Data Interface)	7.6s	658K rows/s	23.7x
psycopg2 (native driver)	7.2s	694K rows/s	25.0x

See benchmark/ for scripts to reproduce these results.
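The throughput and speedup columns follow directly from the measured times: throughput is rows divided by seconds, and the speedup is the baseline time divided by each method's time. The short calculation below reproduces the table from its timing column; the variable names are illustrative only.

```python
ROWS = 5_000_000

# Wall-clock seconds for 5M rows, taken from the table above.
timings = {
    "jaydebeapi (baseline)": 180.1,
    "Drop-in replacement": 26.5,
    "Native Arrow API (C Data Interface)": 7.6,
    "psycopg2 (native driver)": 7.2,
}

baseline = timings["jaydebeapi (baseline)"]

# For each method: (throughput in thousands of rows per second, speedup vs baseline).
summary = {
    name: (round(ROWS / seconds / 1000), round(baseline / seconds, 1))
    for name, seconds in timings.items()
}

for name, (krows_per_s, speedup) in summary.items():
    print(f"{name}: {krows_per_s}K rows/s, {speedup}x")
```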

Contributing

Please submit bugs and patches to the JayDeBeApiArrow issue tracker. All contributors will be acknowledged. Thanks!

License

JayDeBeApiArrow is released under the GNU Lesser General Public License (LGPL). See the files COPYING and COPYING.LESSER in the distribution for details.

Release history

2.1.5: Jun 2, 2026
2.1.4: May 10, 2026 (this version)
2.1.3: Apr 22, 2026
2.1.3.dev1 (pre-release): Apr 22, 2026
2.1.2rc1 (pre-release): Apr 22, 2026
2.1.1: Apr 15, 2026

Download files

Source Distribution

jaydebeapiarrow-2.1.4.tar.gz (9.7 MB)

Uploaded May 10, 2026 Source

Built Distribution

jaydebeapiarrow-2.1.4-py3-none-any.whl (9.7 MB)

Uploaded May 10, 2026 Python 3

File details

Details for the file jaydebeapiarrow-2.1.4.tar.gz.

File metadata

Download URL: jaydebeapiarrow-2.1.4.tar.gz
Upload date: May 10, 2026
Size: 9.7 MB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for jaydebeapiarrow-2.1.4.tar.gz
Algorithm	Hash digest
SHA256	`ebaa0a9270e20be1b86842dc0ccba499e5c13cbd22f63c1793df2566d44a9aee`
MD5	`0f85fcec65f48ba1f4dd13737a95ddc1`
BLAKE2b-256	`dbc4531cc1ffa469a7151896e73fd19c57f5e05811c1954c10dc5babb485cb4f`

Provenance

The following attestation bundles were made for jaydebeapiarrow-2.1.4.tar.gz:

Publisher: publish.yml on HenryNebula/jaydebeapiarrow

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: jaydebeapiarrow-2.1.4.tar.gz
- Subject digest: ebaa0a9270e20be1b86842dc0ccba499e5c13cbd22f63c1793df2566d44a9aee
- Sigstore transparency entry: 1496005631
- Sigstore integration time: May 10, 2026
Source repository:
- Permalink: HenryNebula/jaydebeapiarrow@12d868a62b232560d977b721f15337c83ac18843
- Branch / Tag: refs/tags/v2.1.4
- Owner: https://github.com/HenryNebula
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@12d868a62b232560d977b721f15337c83ac18843
- Trigger Event: release

File details

Details for the file jaydebeapiarrow-2.1.4-py3-none-any.whl.

File metadata

Download URL: jaydebeapiarrow-2.1.4-py3-none-any.whl
Upload date: May 10, 2026
Size: 9.7 MB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for jaydebeapiarrow-2.1.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`0d880183f3f05f3fb0cc6e889829eca9296abc93736f99cff7d8c4f7da4839c4`
MD5	`ce7f74229cb028dbee79095febc0d75f`
BLAKE2b-256	`52e7986dd603286381439232819a3621ba2d4ade48b2ff4e97f48cc0c22b074b`

Provenance

The following attestation bundles were made for jaydebeapiarrow-2.1.4-py3-none-any.whl:

Publisher: publish.yml on HenryNebula/jaydebeapiarrow

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: jaydebeapiarrow-2.1.4-py3-none-any.whl
- Subject digest: 0d880183f3f05f3fb0cc6e889829eca9296abc93736f99cff7d8c4f7da4839c4
- Sigstore transparency entry: 1496006239
- Sigstore integration time: May 10, 2026
Source repository:
- Permalink: HenryNebula/jaydebeapiarrow@12d868a62b232560d977b721f15337c83ac18843
- Branch / Tag: refs/tags/v2.1.4
- Owner: https://github.com/HenryNebula
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@12d868a62b232560d977b721f15337c83ac18843
- Trigger Event: release
