SQLAlchemy adapter for Databend

databend-sqlalchemy

Databend dialect for SQLAlchemy.

Installation

The package can be installed with pip::

    pip install databend-sqlalchemy

Usage

The DSN format is similar to that of regular Postgres::

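    from sqlalchemy import create_engine, text

    # username, password, host_port_name and database_name are placeholders
    # for your own connection details.
    engine = create_engine(
        f"databend://{username}:{password}@{host_port_name}/{database_name}?sslmode=disable"
    )
    connection = engine.connect()
    result = connection.execute(text("SELECT 1"))
    assert len(result.fetchall()) == 1

    # The bundled DB-API connector can also be used directly
    # (assumed import path: the connector module inside databend_sqlalchemy).
    from databend_sqlalchemy import connector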
    cursor = connector.connect('databend://root:@localhost:8000?sslmode=disable').cursor()
    cursor.execute('SELECT * FROM test')
    # print(cursor.fetchone())
    # print(cursor.fetchall())
    for row in cursor:
        print(row)
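
Because the DSN is a URL, a password containing characters such as ``@`` or ``/`` can break parsing if it is interpolated as-is. The sketch below shows one way to assemble the DSN with percent-encoded credentials using only the standard library. The helper name ``build_databend_dsn`` and the example values are hypothetical, and it assumes the dialect and driver decode percent-encoded credentials the way SQLAlchemy URLs normally do.

```python
from urllib.parse import quote


def build_databend_dsn(username, password, host_port, database, sslmode="disable"):
    """Assemble a databend:// DSN, percent-encoding the credentials.

    Hypothetical helper, not part of databend-sqlalchemy.
    """
    # safe="" makes quote() encode every reserved character, including "/" and "@".
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"databend://{user}:{pwd}@{host_port}/{database}?sslmode={sslmode}"


# Example values only; substitute your own connection details.
dsn = build_databend_dsn("root", "p@ss/word", "localhost:8000", "default")
print(dsn)
```

The resulting string can be passed to ``create_engine`` in place of the f-string shown above.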

Merge Command Support

Databend SQLAlchemy supports upserts via its Merge custom expression. See Merge for full documentation.

The Merge command can be used as below::

    from sqlalchemy.orm import sessionmaker
    from sqlalchemy import MetaData, create_engine
    from databend_sqlalchemy.databend_dialect import Merge

    # db.url is a placeholder for your Databend DSN.
    engine = create_engine(db.url, echo=False)
    session = sessionmaker(bind=engine)()
    connection = engine.connect()

    meta = MetaData()
    meta.reflect(bind=session.bind)
    t1 = meta.tables['t1']
    t2 = meta.tables['t2']

    merge = Merge(target=t1, source=t2, on=t1.c.t1key == t2.c.t2key)
    merge.when_matched_then_delete().where(t2.c.marked == 1)
    merge.when_matched_then_update().where(t2.c.isnewstatus == 1).values(
        val=t2.c.newval, status=t2.c.newstatus
    )
    merge.when_matched_then_update().values(val=t2.c.newval)
    merge.when_not_matched_then_insert().values(val=t2.c.newval, status=t2.c.newstatus)
    connection.execute(merge)
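
To make the effect of the clause order concrete, here is a plain-Python model of what the statement above is expected to do to each row. It is only an illustration: it assumes the usual MERGE behavior in which the matched clauses are checked in the order they were added and the first one whose condition holds is applied. The function ``apply_merge`` and the sample rows are hypothetical and do not touch the database.

```python
def apply_merge(target, source):
    """Model the MERGE above on dicts keyed by the join key (t1key == t2key).

    target rows look like {"val": ..., "status": ...};
    source rows look like {"marked": ..., "isnewstatus": ..., "newval": ..., "newstatus": ...}.
    """
    result = {key: dict(row) for key, row in target.items()}  # leave the input untouched
    for key, src in source.items():
        if key in result:
            if src["marked"] == 1:
                # WHEN MATCHED AND marked = 1 THEN DELETE
                del result[key]
            elif src["isnewstatus"] == 1:
                # WHEN MATCHED AND isnewstatus = 1 THEN UPDATE val, status
                result[key].update(val=src["newval"], status=src["newstatus"])
            else:
                # WHEN MATCHED THEN UPDATE val (unconditional fallback)
                result[key]["val"] = src["newval"]
        else:
            # WHEN NOT MATCHED THEN INSERT
            result[key] = {"val": src["newval"], "status": src["newstatus"]}
    return result


target = {
    1: {"val": "a", "status": "old"},
    2: {"val": "b", "status": "old"},
    3: {"val": "c", "status": "old"},
}
source = {
    1: {"marked": 1, "isnewstatus": 0, "newval": "a2", "newstatus": "new"},
    2: {"marked": 0, "isnewstatus": 1, "newval": "b2", "newstatus": "new"},
    3: {"marked": 0, "isnewstatus": 0, "newval": "c2", "newstatus": "new"},
    4: {"marked": 0, "isnewstatus": 0, "newval": "d2", "newstatus": "new"},
}
merged = apply_merge(target, source)
print(merged)
```

In this model row 1 is deleted, row 2 gets both a new value and a new status, row 3 only gets a new value, and row 4 is inserted.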

Copy Into Command Support

Databend SQLAlchemy supports copy into operations through its CopyIntoTable and CopyIntoLocation expressions. See CopyIntoLocation or CopyIntoTable for full documentation.

The CopyIntoTable command can be used as below::

    import base64

    from sqlalchemy.orm import sessionmaker
    from sqlalchemy import MetaData, create_engine
    from databend_sqlalchemy import (
        CopyIntoTable, GoogleCloudStorage, ParquetFormat, CopyIntoTableOptions,
        FileColumnClause, CSVFormat, Compression,
    )

    # db.url is a placeholder for your Databend DSN.
    engine = create_engine(db.url, echo=False)
    session = sessionmaker(bind=engine)()
    connection = engine.connect()

    meta = MetaData()
    meta.reflect(bind=session.bind)
    t1 = meta.tables['t1']
    t2 = meta.tables['t2']
    gcs_private_key = 'full_gcs_json_private_key'
    case_sensitive_columns = True

    copy_into = CopyIntoTable(
        target=t1,
        from_=GoogleCloudStorage(
            uri='gcs://bucket-name/path/to/file',
            credentials=base64.b64encode(gcs_private_key.encode()).decode(),
        ),
        file_format=ParquetFormat(),
        options=CopyIntoTableOptions(
            force=True,
            column_match_mode='CASE_SENSITIVE' if case_sensitive_columns else None,
        )
    )
    result = connection.execute(copy_into)
    result.fetchall()  # always call fetchall() to ensure the cursor executes to completion

    # More involved example: a column selection clause, which can be altered
    # to perform operations on the columns during import.

    copy_into = CopyIntoTable(
        target=t2,
        from_=FileColumnClause(
            columns=', '.join([
                f'${index + 1}'
                for index, column in enumerate(t2.columns)
            ]),
            from_=GoogleCloudStorage(
                uri='gcs://bucket-name/path/to/file',
                credentials=base64.b64encode(gcs_private_key.encode()).decode(),
            )
        ),
        pattern='*.*',
        file_format=CSVFormat(
            record_delimiter='\n',
            field_delimiter=',',
            quote='"',
            escape='',
            skip_header=1,
            empty_field_as='NULL',
            compression=Compression.AUTO,
        ),
        options=CopyIntoTableOptions(
            force=True,
        )
    )
    result = connection.execute(copy_into)
    result.fetchall()  # always call fetchall() to ensure the cursor executes to completion
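
The ``columns`` argument in the second example is just a string of positional references (``$1, $2, ...``), so it can be generated and then adjusted per column. The sketch below shows one way to do that without a database connection. The helper ``positional_columns`` and its ``transforms`` parameter are hypothetical, and it assumes that ``FileColumnClause`` passes the string through to the generated SQL unchanged, as the example above suggests.

```python
def positional_columns(column_names, transforms=None):
    """Build a '$1, $2, ...' select list for a staged file.

    transforms optionally maps a column name to a template containing one
    '{}' placeholder, which is filled with that column's positional reference.
    """
    transforms = transforms or {}
    parts = []
    for index, name in enumerate(column_names, start=1):
        ref = f"${index}"
        template = transforms.get(name)
        parts.append(template.format(ref) if template else ref)
    return ", ".join(parts)


plain = positional_columns(["id", "name", "created_at"])
wrapped = positional_columns(["id", "name"], {"name": "UPPER({})"})
print(plain)    # $1, $2, $3
print(wrapped)  # $1, UPPER($2)
```

With a reflected table, the names could come from ``[column.name for column in t2.columns]``.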

The CopyIntoLocation command can be used as below::

    import base64

    from sqlalchemy.orm import sessionmaker
    from sqlalchemy import MetaData, create_engine, select
    from databend_sqlalchemy import (
        CopyIntoLocation, GoogleCloudStorage, ParquetFormat, CopyIntoLocationOptions,
    )

    # db.url is a placeholder for your Databend DSN.
    engine = create_engine(db.url, echo=False)
    session = sessionmaker(bind=engine)()
    connection = engine.connect()

    meta = MetaData()
    meta.reflect(bind=session.bind)
    t1 = meta.tables['t1']
    gcs_private_key = 'full_gcs_json_private_key'

    copy_into = CopyIntoLocation(
        target=GoogleCloudStorage(
            uri='gcs://bucket-name/path/to/target_file',
            credentials=base64.b64encode(gcs_private_key.encode()).decode(),
        ),
        from_=select(t1).where(t1.c['col1'] == 1),
        file_format=ParquetFormat(),
        options=CopyIntoLocationOptions(
            single=True,
            overwrite=True,
            include_query_id=False,
            use_raw_path=True,
        )
    )
    result = connection.execute(copy_into)
    result.fetchall()  # always call fetchall() to ensure the cursor executes to completion
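
The examples above pass the GCS key as base64-encoded JSON text. A small helper can do the encoding and also catch the common mistake of passing a file path instead of the key contents. This is a sketch using only the standard library; ``encode_gcs_credentials`` and the sample key are hypothetical.

```python
import base64
import json


def encode_gcs_credentials(key_json: str) -> str:
    """Base64-encode the JSON text of a service-account key.

    Raises ValueError if the input is not valid JSON (for example, a file path).
    """
    json.loads(key_json)  # validation only; the original text is what gets encoded
    return base64.b64encode(key_json.encode("utf-8")).decode("ascii")


# Made-up, incomplete key used purely for illustration.
key_json = json.dumps({"type": "service_account", "project_id": "example-project"})
encoded = encode_gcs_credentials(key_json)
print(encoded)
```

The returned string is what the examples pass as ``credentials`` to ``GoogleCloudStorage``.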

Table Options

Databend SQLAlchemy supports Databend-specific table options for Engine, Cluster Keys and Transient tables.

The table options can be used as below::

    from sqlalchemy import Table, Column, Integer, String, cast
    from sqlalchemy import MetaData, create_engine

    # db.url is a placeholder for your Databend DSN.
    engine = create_engine(db.url, echo=False)

    meta = MetaData()
    # Example of Transient Table
    t_transient = Table(
        "t_transient",
        meta,
        Column("c1", Integer),
        databend_transient=True,
    )

    # Example of Engine
    t_engine = Table(
        "t_engine",
        meta,
        Column("c1", Integer),
        databend_engine='Memory',
    )

    # Examples of tables with cluster keys
    c1 = Column("c1", Integer)
    t_cluster_1 = Table(
        "t_cluster_1",
        meta,
        c1,
        databend_cluster_by=[c1],
    )

    # Cluster keys may also be expressions over columns
    c = Column("id", Integer)
    c2 = Column("Name", String)
    t_cluster_2 = Table(
        't_cluster_2',
        meta,
        c,
        c2,
        databend_cluster_by=[cast(c, String), c2],
    )

    meta.create_all(engine)

Compatibility

  • If you run Databend v0.9.0 or later, use databend-sqlalchemy v0.1.0 or later.
  • databend-sqlalchemy versions before v0.4.0 use databend-py as the internal driver; versions v0.4.0 and later use the Databend driver Python binding. The only difference between the two is the set of connection parameters accepted in the DSN, so refer to the connection parameters documented by the driver that matches your version.
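
The version rule above can be written as a small check, which is handy when deciding which driver's DSN parameters to consult. This is a sketch: ``internal_driver`` is a hypothetical helper, ``"databend-driver"`` is used here as shorthand for the Databend driver Python binding, and only plain ``X.Y.Z`` version strings (optionally prefixed with ``v``) are handled.

```python
def internal_driver(version: str) -> str:
    """Return the internal driver implied by a databend-sqlalchemy version string."""
    numbers = [int(part) for part in version.lstrip("v").split(".")[:3]]
    numbers += [0] * (3 - len(numbers))  # pad "0.4" to (0, 4, 0) before comparing
    return "databend-py" if tuple(numbers) < (0, 4, 0) else "databend-driver"


print(internal_driver("0.3.2"))  # databend-py
print(internal_driver("0.5.4"))  # databend-driver
```

The installed version can be read with ``importlib.metadata.version("databend-sqlalchemy")`` and passed to the helper.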
