SQLAlchemy driver for duckdb

Maintainers

Mause

Project description

duckdb_engine

Basic SQLAlchemy driver for DuckDB

Installation

$ pip install duckdb-engine

DuckDB Engine also has a conda feedstock available; instructions for using it are in its repository.

Usage

Once you've installed this package, you should be able to just use it, as SQLAlchemy discovers installed dialects automatically:

from sqlalchemy import Column, Integer, Sequence, String, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm.session import Session

Base = declarative_base()


class FakeModel(Base):  # type: ignore
    __tablename__ = "fake"

    id = Column(Integer, Sequence("fakemodel_id_sequence"), primary_key=True)
    name = Column(String)


eng = create_engine("duckdb:///:memory:")
Base.metadata.create_all(eng)
session = Session(bind=eng)

session.add(FakeModel(name="Frank"))
session.commit()

frank = session.query(FakeModel).one()

assert frank.name == "Frank"
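
The examples in this README use two URL forms: `duckdb:///:memory:` for an in-memory database and `duckdb:////path/to/duck.db` for a file at an absolute path. The sketch below illustrates why the second form has four slashes: the database portion is appended after `duckdb:///`, so an absolute POSIX path contributes one more. `duckdb_url` is a hypothetical helper written for illustration, not part of duckdb_engine.

```python
def duckdb_url(database=":memory:"):
    """Build a SQLAlchemy URL string for DuckDB.

    Hypothetical helper for illustration only; it simply appends the
    database portion (":memory:" or a file path) after "duckdb:///".
    """
    return f"duckdb:///{database}"


print(duckdb_url())                    # duckdb:///:memory:
print(duckdb_url("/path/to/duck.db"))  # duckdb:////path/to/duck.db
print(duckdb_url("duck.db"))           # duckdb:///duck.db (relative path)
```

The resulting string can be passed straight to create_engine.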

Usage in IPython/Jupyter

With IPython-SQL and DuckDB-Engine you can query DuckDB natively in your notebook! Check out DuckDB's documentation or Alex Monahan's great demo of this on his blog.

Configuration

You can configure DuckDB by passing connect_args to the create_engine function:

create_engine(
    'duckdb:///:memory:',
    connect_args={
        'read_only': False,
        'config': {
            'memory_limit': '500mb'
        }
    }
)

The supported configuration parameters are listed in the DuckDB docs.

How to register a pandas DataFrame

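import pandas as pd
from sqlalchemy import create_engine, text

df = pd.DataFrame(...)  # replace ... with your own data
conn = create_engine("duckdb:///:memory:").connect()

# with SQLAlchemy 1.3
conn.execute("register", ("dataframe_name", df))

# with SQLAlchemy 1.4+
conn.execute(text("register(:name, :df)"), {"name": "dataframe_name", "df": df})

# query the registered DataFrame by name
# (on SQLAlchemy 1.3 a plain string also works here)
conn.execute(text("select * from dataframe_name"))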

Things to keep in mind

DuckDB's SQL parser is based on the PostgreSQL parser, but not all PostgreSQL features are supported in DuckDB. Because the duckdb_engine dialect is derived from the postgresql dialect, SQLAlchemy may try to use PostgreSQL-only features. Below are some caveats to look out for.

Auto-incrementing ID columns

When defining an Integer column as a primary key, SQLAlchemy uses the SERIAL datatype for PostgreSQL. DuckDB does not yet support this datatype because it's a non-standard PostgreSQL legacy type, so a workaround is to use the sqlalchemy.Sequence() object to auto-increment the key. For more information on sequences, see the SQLAlchemy Sequence documentation.

The following example demonstrates how to create an auto-incrementing ID column for a simple table:

>>> import sqlalchemy
>>> engine = sqlalchemy.create_engine('duckdb:////path/to/duck.db')
>>> metadata = sqlalchemy.MetaData()  # the engine is passed to create_all() below
>>> user_id_seq = sqlalchemy.Sequence('user_id_seq')
>>> users_table = sqlalchemy.Table(
...     'users',
...     metadata,
...     sqlalchemy.Column(
...         'id',
...         sqlalchemy.Integer,
...         user_id_seq,
...         server_default=user_id_seq.next_value(),
...         primary_key=True,
...     ),
... )
>>> metadata.create_all(bind=engine)

Pandas `read_sql()` chunksize

NOTE: this is no longer an issue in versions >=0.5.0 of duckdb

The pandas.read_sql() method can read tables from duckdb_engine into DataFrames, but the sqlalchemy.engine.result.ResultProxy fails when fetchmany() is called. On affected versions, chunksize=None (the default) is therefore necessary when reading DuckDB tables into DataFrames. For example:

>>> import pandas as pd
>>> import sqlalchemy
>>> engine = sqlalchemy.create_engine('duckdb:////path/to/duck.db')
>>> df = pd.read_sql('users', engine)                ### Works as expected
>>> df = pd.read_sql('users', engine, chunksize=25)  ### Throws an exception

Unsigned integer support

Unsigned integers are supported by DuckDB, and are available in duckdb_engine.datatypes.

Alembic Integration

SQLAlchemy's companion library alembic can optionally be used to manage database migrations.

This support can be enabled by adding an Alembic implementation class for the duckdb dialect:

from alembic.ddl.impl import DefaultImpl

class AlembicDuckDBImpl(DefaultImpl):
    """Alembic implementation for DuckDB."""

    __dialect__ = "duckdb"

Once your program has loaded this class, Alembic will no longer raise an error when generating or applying migrations.

Preloading extensions (experimental)

DuckDB 0.9.0+ includes built-in support for autoinstalling and autoloading extensions; see the extension documentation for more information.

Until the DuckDB Python client allows you to natively preload extensions, I've added experimental support via a connect_args parameter:

from sqlalchemy import create_engine

create_engine(
    'duckdb:///:memory:',
    connect_args={
        'preload_extensions': ['httpfs'],
        'config': {
            's3_region': 'ap-southeast-1'
        }
    }
)

Registering Filesystems

DuckDB allows registering filesystems from fsspec; see the documentation for more information.

Support is provided via the connect_args parameter:

from sqlalchemy import create_engine
from fsspec import filesystem

create_engine(
    'duckdb:///:memory:',
    connect_args={
        'register_filesystems': [filesystem('gcs')],
    }
)

The name

Yes, I'm aware this package should be named duckdb-driver or something. I wasn't thinking when I named it, and it's too hard to change the name now.

Release history

0.17.0

Mar 29, 2025

0.16.0

Mar 29, 2025

0.15.1

Mar 29, 2025

0.15.0

Jan 16, 2025

0.14.2

Jan 10, 2025

0.14.1

Jan 10, 2025

0.14.0

Dec 16, 2024

0.13.6

Nov 24, 2024

0.13.5

Nov 7, 2024

0.13.4

Oct 22, 2024

0.13.3

Oct 21, 2024

0.13.2

Sep 4, 2024

0.13.1

Jul 29, 2024

0.13.0

Jun 2, 2024

0.12.1

May 24, 2024

0.12.0

Apr 21, 2024

0.12.0rc0 pre-release

Apr 18, 2024

0.11.5

Apr 16, 2024

0.11.4

Apr 9, 2024

0.11.3

Apr 7, 2024

0.11.2

Mar 1, 2024

0.11.1

Feb 6, 2024

0.11.0

Feb 4, 2024

0.10.0

Dec 24, 2023

0.9.5

Dec 21, 2023

0.9.4

Dec 9, 2023

0.9.3

Dec 5, 2023

0.9.2

Jul 23, 2023

0.9.1

Jul 14, 2023

0.9.0

Jun 21, 2023

0.8.0

Jun 20, 2023

0.7.3

May 19, 2023

0.7.2

May 17, 2023

0.7.1

May 9, 2023

0.7.0

Mar 16, 2023

0.7.0rc1 pre-release

Mar 7, 2023

0.6.9

Mar 1, 2023

0.6.8

Jan 8, 2023

0.6.7

Jan 7, 2023

0.6.6

Dec 17, 2022

0.6.5

Nov 22, 2022

0.6.4

Sep 11, 2022

0.6.3

Sep 8, 2022

0.6.2

Aug 25, 2022

0.6.1

Aug 23, 2022

0.6.0

Aug 22, 2022

0.5.0

Aug 19, 2022

0.4.0

Aug 15, 2022

0.3.4

Aug 12, 2022

0.3.3

Aug 6, 2022

0.3.2

Aug 5, 2022

0.3.1

Aug 5, 2022

0.3.0

Aug 2, 2022

0.2.0

Jul 3, 2022

0.1.12a0 pre-release

Jun 23, 2022

0.1.11

Jun 21, 2022

0.1.10

Jun 19, 2022

0.1.9

Jun 14, 2022

0.1.9a0 pre-release

Jun 14, 2022

0.1.8

Feb 2, 2022

0.1.8rc4 pre-release

Aug 20, 2021

0.1.8rc3 pre-release

Aug 18, 2021

0.1.8rc2 pre-release

Aug 18, 2021

0.1.8rc1 pre-release

Aug 17, 2021

0.1.7

Jul 13, 2021

0.1.6

Jul 5, 2021

0.1.5

Jun 14, 2021

0.1.4

May 20, 2021

0.1.3

May 6, 2021

0.1.2

Oct 25, 2020

0.1.1

Oct 24, 2020

0.1.0

Sep 29, 2020

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

duckdb_engine-0.17.0.tar.gz (48.1 kB)

Uploaded Mar 29, 2025 Source

Built Distribution

duckdb_engine-0.17.0-py3-none-any.whl (49.7 kB)

Uploaded Mar 29, 2025 Python 3

File details

Details for the file duckdb_engine-0.17.0.tar.gz.

File metadata

Download URL: duckdb_engine-0.17.0.tar.gz
Upload date: Mar 29, 2025
Size: 48.1 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for duckdb_engine-0.17.0.tar.gz
Algorithm	Hash digest
SHA256	`396b23869754e536aa80881a92622b8b488015cf711c5a40032d05d2cf08f3cf`
MD5	`0cb2cda221665a6bdc8b379173f86110`
BLAKE2b-256	`89d5c0d8d0a4ca3ffea92266f33d92a375e2794820ad89f9be97cf0c9a9697d0`

See more details on using hashes here.
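
As a minimal sketch of how a published digest can be checked, the snippet below computes the SHA256 of a downloaded file with Python's standard hashlib module and compares it to an expected hex string. The function names `sha256_of` and `matches_published_digest` are hypothetical, chosen for illustration, and the path to the downloaded file is whatever you saved it as.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Compare a file's SHA256 with a published digest, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For the source distribution, `expected_hex` would be the SHA256 value from the table above, and `path` the location of your downloaded duckdb_engine-0.17.0.tar.gz.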

Provenance

The following attestation bundles were made for duckdb_engine-0.17.0.tar.gz:

Publisher: publish.yaml on Mause/duckdb_engine

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: duckdb_engine-0.17.0.tar.gz
- Subject digest: 396b23869754e536aa80881a92622b8b488015cf711c5a40032d05d2cf08f3cf
- Sigstore transparency entry: 189725215
- Sigstore integration time: Mar 29, 2025
Source repository:
- Permalink: Mause/duckdb_engine@9d804a0189d636c83d1b74552eb625f004b06001
- Branch / Tag: refs/tags/v0.17.0
- Owner: https://github.com/Mause
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yaml@9d804a0189d636c83d1b74552eb625f004b06001
- Trigger Event: release

File details

Details for the file duckdb_engine-0.17.0-py3-none-any.whl.

File metadata

Download URL: duckdb_engine-0.17.0-py3-none-any.whl
Upload date: Mar 29, 2025
Size: 49.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for duckdb_engine-0.17.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`3aa72085e536b43faab635f487baf77ddc5750069c16a2f8d9c6c3cb6083e979`
MD5	`cbfbd05e3a2b9003b4cef472c621b1a4`
BLAKE2b-256	`2aa2e90242f53f7ae41554419b1695b4820b364df87c8350aa420b60b20cab92`

See more details on using hashes here.

Provenance

The following attestation bundles were made for duckdb_engine-0.17.0-py3-none-any.whl:

Publisher: publish.yaml on Mause/duckdb_engine

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: duckdb_engine-0.17.0-py3-none-any.whl
- Subject digest: 3aa72085e536b43faab635f487baf77ddc5750069c16a2f8d9c6c3cb6083e979
- Sigstore transparency entry: 189725216
- Sigstore integration time: Mar 29, 2025
Source repository:
- Permalink: Mause/duckdb_engine@9d804a0189d636c83d1b74552eb625f004b06001
- Branch / Tag: refs/tags/v0.17.0
- Owner: https://github.com/Mause
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yaml@9d804a0189d636c83d1b74552eb625f004b06001
- Trigger Event: release
