Read the data of an ODBC data source as a sequence of Apache Arrow record batches.

arrow-odbc-py

Fill Apache Arrow arrays from ODBC data sources. This package is built on top of the pyarrow Python package and the arrow-odbc Rust crate and enables you to read the data of an ODBC data source as a sequence of Apache Arrow record batches.

Users looking for a mature solution for bulk fetching data from ODBC data sources in Python should also take a look at turbodbc, which has a helpful community and has seen a lot more battle testing than this package. This Python package is also narrower in scope (which is a fancy way of saying it has fewer features), as it is only concerned with bulk fetching Arrow arrays and nothing else.

About Arrow

Apache Arrow defines a language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware like CPUs and GPUs. The Arrow memory format also supports zero-copy reads for lightning-fast data access without serialization overhead.

About ODBC

ODBC (Open DataBase Connectivity) is a standard which enables you to access data from a wide variety of data sources using SQL.

Usage

from arrow_odbc import read_arrow_batches_from_odbc

connection_string = "Driver={ODBC Driver 17 for SQL Server};Server=localhost;UID=SA;PWD=My@Test@Password1;"
query = "SELECT * FROM MyTable"

# Fetch the result set in batches of at most 1000 rows each
reader = read_arrow_batches_from_odbc(
    query=query, batch_size=1000, connection_string=connection_string
)

for batch in reader:
    # Each batch is a pyarrow RecordBatch
    df = batch.to_pandas()
    # ...
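The connection string in the usage example is written by hand. As a minimal sketch, it could also be assembled from key/value pairs. The helpers `quote_value` and `build_connection_string` and the `ODBC_PASSWORD` environment variable are hypothetical names used only for illustration and are not part of arrow-odbc. The brace quoting follows the convention used by common drivers such as the Microsoft SQL Server driver; other drivers may parse values differently.

```python
import os
from typing import Mapping


def quote_value(value: str) -> str:
    """Wrap a value in braces if it contains characters that would break parsing.

    Assumption: the driver accepts {...} quoting with '}' escaped as '}}'.
    """
    if value.startswith("{") and value.endswith("}"):
        # Already quoted by the caller, e.g. "{ODBC Driver 17 for SQL Server}"
        return value
    needs_quoting = any(ch in value for ch in ";{}") or value != value.strip()
    if needs_quoting:
        return "{" + value.replace("}", "}}") + "}"
    return value


def build_connection_string(attributes: Mapping[str, str]) -> str:
    """Join KEY=VALUE pairs with ';' and terminate with a trailing ';'."""
    parts = [f"{key}={quote_value(value)}" for key, value in attributes.items()]
    return ";".join(parts) + ";"


if __name__ == "__main__":
    connection_string = build_connection_string(
        {
            "Driver": "{ODBC Driver 17 for SQL Server}",
            "Server": "localhost",
            "UID": "SA",
            # Hypothetical variable name; falls back to the README's test password
            "PWD": os.environ.get("ODBC_PASSWORD", "My@Test@Password1"),
        }
    )
    print(connection_string)
```

The resulting string can be passed as the `connection_string` argument of `read_arrow_batches_from_odbc`. Reading the password from the environment keeps it out of the source code.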

Installation

Wheels have been uploaded to PyPI and can be installed using pip.

pip install arrow-odbc

arrow-odbc utilizes cffi and the Arrow C-Interface to glue Rust and Python code together. Therefore the wheel does not need to be built against the precise version of either Python or Arrow.

To build from source you need to install the Rust toolchain. Installation instructions can be found here: https://www.rust-lang.org/tools/install

Installing ODBC driver manager

The provided wheels dynamically link against the driver manager, which must be provided by the system.

Windows

Nothing to do. The ODBC driver manager is preinstalled.

Ubuntu

sudo apt-get install unixodbc-dev

To build the wheel from source you need the same package:

sudo apt-get install unixodbc-dev

OS-X

You can use Homebrew to install unixODBC:

brew install unixodbc
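Since the wheels link dynamically against the driver manager, it can help to check that the library is discoverable before importing arrow-odbc. The following is a rough sketch using only the Python standard library. The function names are hypothetical, and the library names (`odbc32` on Windows, `odbc` for unixODBC elsewhere) are assumptions about a typical setup.

```python
import ctypes.util
import sys
from typing import Optional


def driver_manager_library_name(platform: str = sys.platform) -> str:
    """Return the assumed short library name of the ODBC driver manager."""
    # Assumption: Windows ships odbc32, other systems use unixODBC (libodbc)
    return "odbc32" if platform.startswith("win") else "odbc"


def find_odbc_driver_manager() -> Optional[str]:
    """Return the library name or path found by the system, or None."""
    return ctypes.util.find_library(driver_manager_library_name())


if __name__ == "__main__":
    location = find_odbc_driver_manager()
    if location is None:
        print("No ODBC driver manager found; install unixODBC first.")
    else:
        print(f"Found ODBC driver manager: {location}")
```

A result of `None` only means the standard lookup did not find the library. A driver manager installed in a non-standard location may still be usable if the dynamic loader is configured to find it.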

Matching of ODBC to Arrow types

ODBC                   Arrow
Numeric(p <= 38)       Decimal
Decimal(p <= 38)       Decimal
Integer                Int32
SmallInt               Int16
Real                   Float32
Float(p <= 24)         Float32
Double                 Float64
Float(p > 24)          Float64
Date                   Date32
LongVarbinary          Binary
Timestamp(p = 0)       TimestampSecond
Timestamp(p: 1..3)     TimestampMilliSecond
Timestamp(p: 4..6)     TimestampMicroSecond
Timestamp(p >= 7)      TimestampNanoSecond
BigInt                 Int64
TinyInt                Int8
Bit                    Boolean
Varbinary              Binary
Binary                 FixedSizeBinary
All others             Utf8
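The precision-dependent rows of the table can be read as a small decision procedure. The sketch below restates the table as a Python function purely for illustration. `arrow_type_for` is a hypothetical helper, not an arrow-odbc API, and it returns type names as plain strings rather than pyarrow types.

```python
from typing import Optional

# ODBC types whose Arrow counterpart does not depend on precision
FIXED_MAPPING = {
    "Integer": "Int32",
    "SmallInt": "Int16",
    "Real": "Float32",
    "Double": "Float64",
    "Date": "Date32",
    "LongVarbinary": "Binary",
    "BigInt": "Int64",
    "TinyInt": "Int8",
    "Bit": "Boolean",
    "Varbinary": "Binary",
    "Binary": "FixedSizeBinary",
}


def arrow_type_for(odbc_type: str, precision: Optional[int] = None) -> str:
    """Look up the Arrow type name for an ODBC type as listed in the table."""
    if odbc_type in FIXED_MAPPING:
        return FIXED_MAPPING[odbc_type]
    if odbc_type in ("Numeric", "Decimal", "Float", "Timestamp"):
        if precision is None:
            # The table only defines these types together with a precision
            raise ValueError(f"{odbc_type} requires a precision")
        if odbc_type in ("Numeric", "Decimal"):
            # Precisions above 38 fall under "All others"
            return "Decimal" if precision <= 38 else "Utf8"
        if odbc_type == "Float":
            return "Float32" if precision <= 24 else "Float64"
        # Timestamp: the unit gets finer as the precision grows
        if precision == 0:
            return "TimestampSecond"
        if precision <= 3:
            return "TimestampMilliSecond"
        if precision <= 6:
            return "TimestampMicroSecond"
        return "TimestampNanoSecond"
    # Everything not listed explicitly is fetched as text
    return "Utf8"


if __name__ == "__main__":
    print(arrow_type_for("Timestamp", 3))
    print(arrow_type_for("Float", 53))
    print(arrow_type_for("Varchar"))
```

For example, a `Timestamp` column with precision 3 maps to `TimestampMilliSecond`, while any type not named in the table, such as a character column, ends up as `Utf8`.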

This version

0.1.9