Minimal Cython-based DuckDB Bindings

Maintainers

pt1

Project description

bareduckdb

Simplified, Dynamically Linked DuckDB Python Bindings — Fast, simple, and free-threaded.

Overview

bareduckdb provides extensible, easy-to-build Python bindings to DuckDB using Cython.

Simple: ~2k lines of C++ and ~2k lines of Python, easy to extend or customize
Arrow-first data conversion supporting Polars, PyArrow, and Pandas
Support for the latest Python features: free threading, subinterpreters, ABI3, and asyncio
Dynamically linked to DuckDB's official library

Experimental Enhancements

Explicit Stream vs Materialization Modes - At connection & execution time, select whether you want materialized arrow_tables or streaming arrow_readers.
Arrow Deadlock Detection - detects certain use cases involving reuse of Arrow Readers that can cause deadlocks
Table Statistics - Extracts and passes table statistics at registration time
Polars - No PyArrow Required - Polars can be read and produced without importing / installing PyArrow
Polars - Native LazyFrame Pushdown - whereas DuckDB calls collect() on LazyFrames before pushdown, bareduckdb pushes down native Polars predicates
Inline Registration - bareduckdb.execute("query", data={...}) allows registration at call time
User Defined Table Functions - extracts UDTFs at parse time and executes registered functions
Appender - Row by Row - Exposes DuckDB's appender API for fast sequential writes to DuckDB databases
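
To picture why the stream vs. materialization choice matters, here is a pure-Python sketch that uses neither bareduckdb nor Arrow. The names batch_reader and materialize are hypothetical stand-ins: a streaming reader yields batches lazily and can be consumed only once, while a materialized result can be reused freely.

```python
def batch_reader(total_rows: int, batch_size: int):
    """Yield batches of row ids lazily, the way a streaming reader would."""
    for start in range(0, total_rows, batch_size):
        yield list(range(start, min(start + batch_size, total_rows)))


def materialize(reader) -> list:
    """Drain a reader into one in-memory list of rows."""
    return [row for batch in reader for row in batch]


reader = batch_reader(10, 4)
first_pass = materialize(reader)   # consumes every batch
second_pass = materialize(reader)  # the same reader is now exhausted

print(len(first_pass), len(second_pass))
```

The single-pass behaviour is the reason reusing a reader needs care, and the reason choosing a mode explicitly can be preferable to guessing.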

Installation

From PyPI

pip install bareduckdb

From Source

git clone --recurse-submodules https://github.com/iqmo-org/bareduckdb.git
cd bareduckdb
uv sync -v # or: pip install -e .

Basic Usage

import bareduckdb

# Connect to in-memory database
conn = bareduckdb.connect()

# Execute query and get Arrow Table
result = conn.execute("SELECT 42 as answer").arrow_table()
print(result)

# Convert to Polars/Pandas/PyArrow
df_polars = conn.execute("SELECT * FROM range(100)").pl()
df_pandas = conn.execute("SELECT * FROM range(100)").df()

Async API

import asyncio
from bareduckdb.aio import connect_async

async def run_query():
    async with await connect_async() as conn:
        result = await conn.execute("SELECT * FROM generate_series(1, 1000)")
        return result

result = asyncio.run(run_query())

Polars Integration

import bareduckdb
import polars as pl

conn = bareduckdb.connect()

# Polars -> DuckDB (Arrow Capsule protocol)
df = pl.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
conn.register("my_table", df)

# DuckDB -> Polars (direct conversion)
result = conn.execute("SELECT * FROM my_table", output_type="polars")

Architecture

Design Principles

Keep it in Python — Business logic lives in Python, not Cython/C++
No GIL interaction from DuckDB threads — All Python operations happen before/after query execution
Semantic Versioning — Strict stability guarantees
Arrow-first — All data types map through Arrow's type system

Why Arrow-First?

By forcing all conversions through Arrow, bareduckdb achieves:

Consistent type mappings across Polars/Pandas/PyArrow
Reduced code complexity (no per-library conversion paths)
Better memory efficiency (zero-copy where possible)
Future-proof (Arrow is the lingua franca for columnar data)

Thread Safety & Free-Threading

Free-threading support (Python 3.13+):

No global locks in critical paths
DuckDB threads never acquire the GIL
Safe concurrent query execution in --disable-gil mode
Atomic operations for Arrow stream coordination
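
Before relying on free-threaded behaviour, it can help to confirm what the running interpreter actually supports. The sketch below uses only the standard library; free_threading_status and run_concurrently are hypothetical helper names, not part of bareduckdb.

```python
import sys
import sysconfig
from concurrent.futures import ThreadPoolExecutor


def free_threading_status() -> dict:
    """Report whether this interpreter was built, and is running, without the GIL."""
    # Py_GIL_DISABLED is 1 on free-threaded builds and 0 or None otherwise.
    built_free_threaded = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() exists on Python 3.13+; assume the GIL is on if absent.
    gil_check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled = gil_check() if gil_check is not None else True
    return {"built_free_threaded": built_free_threaded, "gil_enabled": gil_enabled}


def run_concurrently(task, inputs, max_workers: int = 4) -> list:
    """Run task over inputs on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(task, inputs))


status = free_threading_status()
squares = run_concurrently(lambda n: n * n, range(5))
print(status, squares)
```

On a regular build the thread pool still works; the threads simply take turns holding the GIL while running Python code.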

APIs

bareduckdb provides multiple API layers for different use cases:

1. Core API (`bareduckdb.core`)

Minimal, no-frills interface for maximum performance.

from bareduckdb.core import Connection
conn = Connection()
result = conn.execute("SELECT 1")

2. Async API (`bareduckdb.aio`)

Non-blocking operations with async/await.

import asyncio
from bareduckdb.aio import connect_async

async def main():
    conn = await connect_async()
    return await conn.execute("SELECT 1")

result = asyncio.run(main())
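
As a general illustration of non-blocking execution, the sketch below offloads a blocking call to worker threads with asyncio.to_thread. It is a generic asyncio pattern and is not claimed to be how bareduckdb.aio is implemented; blocking_query is a hypothetical stand-in for a database call.

```python
import asyncio
import time


def blocking_query(n: int) -> int:
    """Stand-in for a blocking database call."""
    time.sleep(0.05)
    return n * 2


async def run_all(inputs) -> list:
    # asyncio.to_thread runs each blocking call on a worker thread so the
    # event loop stays responsive; gather returns results in input order.
    return await asyncio.gather(
        *(asyncio.to_thread(blocking_query, n) for n in inputs)
    )


results = asyncio.run(run_all([1, 2, 3]))
print(results)
```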

3. Compatibility API (`bareduckdb.compat`)

Familiar interface similar to duckdb-python (with intentional differences).

import bareduckdb
conn = bareduckdb.connect()
result = conn.sql("SELECT 1")  # Eager execution

4. DBAPI 2.0 (`bareduckdb.dbapi`)

Standard Python database interface for compatibility with tools like SQLAlchemy.

from bareduckdb.dbapi import connect
conn = connect()
cursor = conn.cursor()
cursor.execute("SELECT 1")
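
Because DBAPI 2.0 (PEP 249) standardises the connect / cursor / execute / fetch sequence, helper code can be written once against that shape. The sketch below uses the standard library's sqlite3 as a stand-in driver so it runs anywhere; the assumption, not verified here, is that bareduckdb.dbapi.connect could be passed in the same way. fetch_all is a hypothetical helper name.

```python
import sqlite3


def fetch_all(connect, sql: str, params: tuple = ()) -> list:
    """Run one statement through any PEP 249 connect() callable and return all rows."""
    conn = connect()
    try:
        cursor = conn.cursor()
        cursor.execute(sql, params)
        return cursor.fetchall()
    finally:
        conn.close()


# sqlite3 stands in for the driver here (assumption: any PEP 249 module fits).
rows = fetch_all(lambda: sqlite3.connect(":memory:"), "SELECT 1")
print(rows)
```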

Key Differences

Experimental Features

When pyarrow is installed, two experimental features are available:

Arrow Statistics and Cardinality

In duckdb-python, Arrow Tables, Readers and Capsules are all converted to Streams via DataSet->Scanner->Reader. These Streams carry neither cardinality (the number of rows) nor statistics (such as min/max, number of distinct values, or whether nulls are present).

Cardinality is used to decide whether to use TopN, which significantly speeds up "order by X limit N" queries (and uses less memory) when N is small relative to the size of the table. Statistics are used by the optimizer for query planning.
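
The benefit of TopN can be sketched in plain Python: a bounded heap keeps only N rows in memory, while the alternative sorts every row before slicing. The choose_strategy heuristic and its ratio threshold are invented for illustration and are not DuckDB's actual rule.

```python
import heapq
import random


def top_n(rows, n, key):
    """Keep only the n smallest rows by key, using a bounded heap."""
    return heapq.nsmallest(n, rows, key=key)


def sort_then_limit(rows, n, key):
    """Sort every row, then slice off the first n."""
    return sorted(rows, key=key)[:n]


def choose_strategy(cardinality: int, n: int, ratio: float = 0.01) -> str:
    """Hypothetical heuristic: prefer TopN when n is small relative to the row count."""
    return "top_n" if n <= cardinality * ratio else "sort_then_limit"


random.seed(0)
rows = [{"id": i, "x": random.random()} for i in range(1000)]
by_x = lambda row: row["x"]

# Both strategies return the same rows; they differ only in the work done.
same = top_n(rows, 5, by_x) == sort_then_limit(rows, 5, by_x)
print(same, choose_strategy(len(rows), 5))
```

Without a known row count, a planner has nothing to compare N against, which is why cardinality matters for this decision.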

In bareduckdb, Arrow Tables are registered directly (as Tables, not Streams) and used by arrow_scan_dataset which can then retrieve cardinality and column level statistics.
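
To make "column level statistics" concrete, the sketch below computes the statistics named above for a plain Python list and shows one way a planner could use them. Both function names are hypothetical, and the real extraction works on Arrow data rather than Python lists.

```python
def column_statistics(values: list) -> dict:
    """Compute min, max, distinct count, and null presence for one column."""
    present = [v for v in values if v is not None]
    return {
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        "distinct_count": len(set(present)),
        "contains_nulls": len(present) != len(values),
    }


def can_skip_greater_than(stats: dict, threshold) -> bool:
    """True when `column > threshold` cannot match any row, judging by max alone."""
    return stats["max"] is None or stats["max"] <= threshold


stats = column_statistics([3, None, 7, 7])
print(stats, can_skip_greater_than(stats, 10))
```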

Statistics Options:

The register() method accepts a statistics parameter to control which columns have statistics computed:

import bareduckdb

conn = bareduckdb.connect()

# No statistics (fastest registration, default)
conn.register("table", df, statistics=None)

# Numeric columns only (recommended for most use cases)
conn.register("table", df, statistics="numeric")

# All columns (slowest - includes string min/max)
conn.register("table", df, statistics=True)

# Specific columns by name
conn.register("table", df, statistics=["id", "price", "date"])

# Regex pattern to match column names
conn.register("table", df, statistics=".*_id")  # all columns ending with _id
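
One way to picture how a single statistics argument could select columns is the resolver sketched below. It is a guess at the semantics, not bareduckdb's implementation: the type tags, the set of types treated as "numeric" (dates are included on the strength of the remark about IDs, dates, and prices), and the use of a full-name regex match are all assumptions.

```python
import re

# Assumption: which simple type tags count as "numeric".
NUMERIC_KINDS = {"int", "float", "decimal", "date", "timestamp"}


def resolve_statistics_columns(schema: dict, statistics) -> list:
    """Pick the columns to compute statistics for.

    schema maps column name -> a simple type tag such as "int" or "str".
    """
    if statistics is None or statistics is False:
        return []
    if statistics is True:
        return list(schema)
    if statistics == "numeric":
        return [name for name, kind in schema.items() if kind in NUMERIC_KINDS]
    if isinstance(statistics, str):
        # Any other string is treated as a regex over the whole column name.
        pattern = re.compile(statistics)
        return [name for name in schema if pattern.fullmatch(name)]
    # Otherwise expect an iterable of explicit column names.
    return [name for name in statistics if name in schema]


schema = {"id": "int", "user_id": "int", "name": "str", "price": "float"}
print(resolve_statistics_columns(schema, "numeric"))
print(resolve_statistics_columns(schema, ".*_id"))
```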

Setting a Default:

Configure the default statistics mode at connection level:

# All register() calls will use numeric statistics by default
conn = bareduckdb.connect(default_statistics="numeric")
conn.register("table1", df1)  # uses numeric stats
conn.register("table2", df2)  # uses numeric stats
conn.register("table3", df3, statistics=False)  # override: no stats

Performance Impact (500K rows, 2 numeric + 2 string columns):

Mode	Registration Time	Use Case
`None`	~0.4ms	No filter pushdown needed
`"numeric"`	~10ms	JOIN/filter on numeric columns
`True`	~22ms	Filter pushdown on all columns

The "numeric" option provides the best balance: fast registration with statistics for the columns most commonly used in filters and JOINs (IDs, dates, prices).

Arrow Pushdown

Arrow projection and filter pushdowns are implemented using the Arrow C++ library. Pushdowns are currently implemented only for Tables.

Relational API

Use Ibis

Replacement Scans

Automatically discover Arrow tables in the caller's scope without explicit registration:

import bareduckdb
import pyarrow as pa

conn = bareduckdb.connect(enable_replacement_scan=True)
my_data = pa.table({"a": [1, 2, 3], "b": [4, 5, 6]})

result = conn.execute("SELECT * FROM my_data").arrow_table()

Customization: Override the _get_replacement(name) method for custom discovery logic (e.g., loading from disk, fetching from an API).

Manual Registration: Use .register() for explicit control or .execute(..., data={"name": df}) for inline registration.
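
Discovery "in the caller's scope" can be sketched with standard-library frame inspection. This is only an illustration of the idea: find_in_caller_scope is a hypothetical helper, and how bareduckdb actually resolves names is not shown here.

```python
import sys


def find_in_caller_scope(name: str, depth: int = 1):
    """Look up name in the caller's local variables, then its globals."""
    frame = sys._getframe(depth)
    try:
        if name in frame.f_locals:
            return frame.f_locals[name]
        return frame.f_globals.get(name)
    finally:
        del frame  # avoid keeping the caller's frame alive


def demo():
    my_data = {"a": [1, 2, 3]}
    # The lookup sees my_data because it inspects this function's frame.
    return find_in_caller_scope("my_data")


print(demo())
```

A lookup that returns None would then fall through to whatever custom discovery logic is configured, or to a regular catalog error.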

Not (Yet?) Supported

No Python UDFs (scalar functions)
No fsspec integration

User Defined Table Functions

Table functions execute in Python before query execution, enabling data generation and connection injection without GIL interaction:

import bareduckdb
import pyarrow as pa

def generate_data(rows: int, multiplier: int = 1) -> pa.Table:
    return pa.table({
        "id": range(rows),
        "value": [i * multiplier for i in range(rows)]
    })

conn = bareduckdb.connect()
conn.register_udtf("generate_data", generate_data)

result = conn.execute("""
    SELECT * FROM generate_data(100, 10)
    WHERE value > 500
""").arrow_table()

Features:

AST-based query preprocessing - pure Python
Connection injection: Add conn parameter to access connection during execution
Supports any Arrow-compatible object: PyArrow Table, Polars DataFrame, Pandas DataFrame
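
The "execute before the query runs" model can be sketched as a rewrite step: find calls to registered functions, run them in Python, and substitute a placeholder table name. In this sketch a regular expression stands in for the AST-based preprocessing, only literal arguments are handled, and expand_table_functions and the __udtf_N placeholder names are hypothetical.

```python
import ast
import re


def expand_table_functions(sql: str, functions: dict):
    """Replace calls to registered functions with placeholder table names.

    Returns the rewritten SQL and a dict of placeholder name -> function result.
    """
    tables = {}

    def substitute(match):
        name, raw_args = match.group(1), match.group(2)
        if name not in functions:
            return match.group(0)  # not registered: leave the call untouched
        # Parse literal arguments such as "100, 10" into a Python tuple.
        args = ast.literal_eval(f"({raw_args},)") if raw_args.strip() else ()
        placeholder = f"__udtf_{len(tables)}"
        tables[placeholder] = functions[name](*args)
        return placeholder

    rewritten = re.sub(r"\b(\w+)\(([^()]*)\)", substitute, sql)
    return rewritten, tables


def generate_data(rows: int, multiplier: int = 1) -> list:
    return [(i, i * multiplier) for i in range(rows)]


sql, tables = expand_table_functions(
    "SELECT * FROM generate_data(3, 10) WHERE value > 5",
    {"generate_data": generate_data},
)
print(sql)
```

The results held in tables would then be registered under their placeholder names before the rewritten SQL is handed to the engine, so no Python code needs to run while the query executes.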

Arrow Enhancements

Deadlock detection

Type Mappings

All types convert through Arrow:

UUIDs: Returned as strings (Arrow doesn't have native UUID type)
Decimals: Arrow Decimal128/Decimal256
Timestamps: Arrow Timestamp with timezone preservation
Nested Types: Struct/List/Map fully supported
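
Since UUIDs come back as strings, code that needs uuid.UUID objects can convert them after the fact. The sketch below works on a plain Python list; parse_uuid_column is a hypothetical helper, and how the strings are pulled out of a result is left aside.

```python
import uuid


def parse_uuid_column(values: list) -> list:
    """Convert UUID strings to uuid.UUID objects, passing nulls through."""
    return [uuid.UUID(v) if v is not None else None for v in values]


raw = ["12345678-1234-5678-1234-567812345678", None]
parsed = parse_uuid_column(raw)
print(parsed)
```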

Development

Building from Source

# Clone with submodules (sparse checkout is automatic)
git clone --recurse-submodules https://github.com/iqmo-org/bareduckdb.git
cd bareduckdb

# Install development dependencies
uv sync

# Build in development mode
pip install -e .

* Note 1: DuckDB submodule version must match the library version.
* Note 2: PyArrow version must match the runtime version for Table registration / Pushdown

Disclaimer

For official Python bindings, see: https://github.com/duckdb/duckdb-python

License

bareduckdb is licensed under the MIT License. See LICENSE for details.

All original copyrights are retained by their respective owners, including DuckDB and DuckDB-Python.

Project details

Release history

0.8.144

Jan 27, 2026

This version

0.6.143

Dec 23, 2025

0.3.142

Dec 3, 2025

0.2.142

Dec 3, 2025

Download files

Source Distributions

No source distribution files available for this release.

Built Distributions

bareduckdb-0.6.143-cp314-cp314t-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl (32.8 MB)

Uploaded Dec 23, 2025; CPython 3.14t; manylinux: glibc 2.26+ x86-64; manylinux: glibc 2.28+ x86-64

bareduckdb-0.6.143-cp314-cp314t-macosx_11_0_arm64.whl (34.7 MB)

Uploaded Dec 23, 2025; CPython 3.14t; macOS 11.0+ ARM64

bareduckdb-0.6.143-cp312-abi3-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl (32.6 MB)

Uploaded Dec 23, 2025; CPython 3.12+; manylinux: glibc 2.26+ x86-64; manylinux: glibc 2.28+ x86-64

bareduckdb-0.6.143-cp312-abi3-macosx_11_0_arm64.whl (34.6 MB)

Uploaded Dec 23, 2025; CPython 3.12+; macOS 11.0+ ARM64

File details

Details for the file bareduckdb-0.6.143-cp314-cp314t-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl.

File metadata

Download URL: bareduckdb-0.6.143-cp314-cp314t-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl
Upload date: Dec 23, 2025
Size: 32.8 MB
Tags: CPython 3.14t, manylinux: glibc 2.26+ x86-64, manylinux: glibc 2.28+ x86-64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for bareduckdb-0.6.143-cp314-cp314t-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl
Algorithm	Hash digest
SHA256	`a03d0905349aa8cc65778023998c05e538e7c036b8c69790600b7dfe803104d9`
MD5	`0e5dcd63a89fa9b90c9c7fff8075a7a2`
BLAKE2b-256	`b01b5effd761617ffb4725b2149ae0b159236ab33099ce5dcdded28467f8f153`

Provenance

The following attestation bundles were made for bareduckdb-0.6.143-cp314-cp314t-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl:

Publisher: build_wheels.yml on iqmo-org/bareduckdb

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: bareduckdb-0.6.143-cp314-cp314t-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl
- Subject digest: a03d0905349aa8cc65778023998c05e538e7c036b8c69790600b7dfe803104d9
- Sigstore transparency entry: 777800758
- Sigstore integration time: Dec 23, 2025
Source repository:
- Permalink: iqmo-org/bareduckdb@2efa9039bf1e0fbae2035515b9b13ac582fca7bf
- Branch / Tag: refs/tags/v0.6.143
- Owner: https://github.com/iqmo-org
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: build_wheels.yml@2efa9039bf1e0fbae2035515b9b13ac582fca7bf
- Trigger Event: workflow_dispatch

File details

Details for the file bareduckdb-0.6.143-cp314-cp314t-macosx_11_0_arm64.whl.

File metadata

Download URL: bareduckdb-0.6.143-cp314-cp314t-macosx_11_0_arm64.whl
Upload date: Dec 23, 2025
Size: 34.7 MB
Tags: CPython 3.14t, macOS 11.0+ ARM64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for bareduckdb-0.6.143-cp314-cp314t-macosx_11_0_arm64.whl
Algorithm	Hash digest
SHA256	`f3e3d69aef6f402886bd9b063d206e346b7266a78c4f9e97aced79fb625c9c03`
MD5	`87e7c97a2ba1e9043ff1693f52b8618e`
BLAKE2b-256	`16be92d28f32e1786c044ce23437b788c20fc9b8a0652359e2b6b996fb20700d`

Provenance

The following attestation bundles were made for bareduckdb-0.6.143-cp314-cp314t-macosx_11_0_arm64.whl:

Publisher: build_wheels.yml on iqmo-org/bareduckdb

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: bareduckdb-0.6.143-cp314-cp314t-macosx_11_0_arm64.whl
- Subject digest: f3e3d69aef6f402886bd9b063d206e346b7266a78c4f9e97aced79fb625c9c03
- Sigstore transparency entry: 777800747
- Sigstore integration time: Dec 23, 2025
Source repository:
- Permalink: iqmo-org/bareduckdb@2efa9039bf1e0fbae2035515b9b13ac582fca7bf
- Branch / Tag: refs/tags/v0.6.143
- Owner: https://github.com/iqmo-org
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: build_wheels.yml@2efa9039bf1e0fbae2035515b9b13ac582fca7bf
- Trigger Event: workflow_dispatch

File details

Details for the file bareduckdb-0.6.143-cp312-abi3-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl.

File metadata

Download URL: bareduckdb-0.6.143-cp312-abi3-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl
Upload date: Dec 23, 2025
Size: 32.6 MB
Tags: CPython 3.12+, manylinux: glibc 2.26+ x86-64, manylinux: glibc 2.28+ x86-64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for bareduckdb-0.6.143-cp312-abi3-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl
Algorithm	Hash digest
SHA256	`a377d4b435464acf6429d6158c58c2c605d486e6515829cfa736976a5c770f86`
MD5	`04bf78014691f6889ce9803ffb2818ab`
BLAKE2b-256	`5fd866c3353fa8d931ba3f267a8b3b26e4ea1a2443a0b03bd1c65b215996c1ee`

Provenance

The following attestation bundles were made for bareduckdb-0.6.143-cp312-abi3-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl:

Publisher: build_wheels.yml on iqmo-org/bareduckdb

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: bareduckdb-0.6.143-cp312-abi3-manylinux_2_26_x86_64.manylinux_2_28_x86_64.whl
- Subject digest: a377d4b435464acf6429d6158c58c2c605d486e6515829cfa736976a5c770f86
- Sigstore transparency entry: 777800735
- Sigstore integration time: Dec 23, 2025
Source repository:
- Permalink: iqmo-org/bareduckdb@2efa9039bf1e0fbae2035515b9b13ac582fca7bf
- Branch / Tag: refs/tags/v0.6.143
- Owner: https://github.com/iqmo-org
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: build_wheels.yml@2efa9039bf1e0fbae2035515b9b13ac582fca7bf
- Trigger Event: workflow_dispatch

File details

Details for the file bareduckdb-0.6.143-cp312-abi3-macosx_11_0_arm64.whl.

File metadata

Download URL: bareduckdb-0.6.143-cp312-abi3-macosx_11_0_arm64.whl
Upload date: Dec 23, 2025
Size: 34.6 MB
Tags: CPython 3.12+, macOS 11.0+ ARM64
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for bareduckdb-0.6.143-cp312-abi3-macosx_11_0_arm64.whl
Algorithm	Hash digest
SHA256	`97eed78b44e3480124079c571b1dbe1318f4e3132e7fcacd0407347c0b6a461d`
MD5	`3e27736187b73e8f32a91687ab065bfe`
BLAKE2b-256	`7a7736c20907281b4ab75058573635b0ab75ab78394a3a8511d011918527bd74`

Provenance

The following attestation bundles were made for bareduckdb-0.6.143-cp312-abi3-macosx_11_0_arm64.whl:

Publisher: build_wheels.yml on iqmo-org/bareduckdb

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: bareduckdb-0.6.143-cp312-abi3-macosx_11_0_arm64.whl
- Subject digest: 97eed78b44e3480124079c571b1dbe1318f4e3132e7fcacd0407347c0b6a461d
- Sigstore transparency entry: 777800742
- Sigstore integration time: Dec 23, 2025
Source repository:
- Permalink: iqmo-org/bareduckdb@2efa9039bf1e0fbae2035515b9b13ac582fca7bf
- Branch / Tag: refs/tags/v0.6.143
- Owner: https://github.com/iqmo-org
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: build_wheels.yml@2efa9039bf1e0fbae2035515b9b13ac582fca7bf
- Trigger Event: workflow_dispatch
