Arrow, pydantic style

Project description

Welcome to arrowdantic

Arrowdantic is a small Python library backed by a mature Rust implementation of Apache Arrow that can interoperate with Apache Parquet, Apache Arrow IPC files and ODBC-compliant databases.

For simple (but data-heavy) data engineering tasks, this package essentially replaces pyarrow: it supports reading from and writing to Parquet and Arrow at the same or higher performance and with higher safety (e.g. no segfaults).

Furthermore, it supports reading from and writing to ODBC-compliant databases at the same or higher performance than turbodbc.

This package is particularly suitable for size-constrained environments such as AWS Lambda: it takes 8 MB of disk space, compared to the 82 MB taken by pyarrow.

Features

  • declare and access Arrow-backed arrays (integers, floats, boolean, string, binary)
  • read from and write to Apache Arrow IPC files
  • read from and write to Apache Parquet
  • read from and write to ODBC-compliant databases (e.g. PostgreSQL, MongoDB)
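
To make "Arrow-backed" concrete: in the Arrow columnar format, a nullable array is stored as a values buffer plus a validity bitmap (one bit per slot, least-significant bit first, 1 meaning "not null"). The following is a minimal pure-Python sketch of that layout. The helper names to_arrow_layout and from_arrow_layout are hypothetical and are not part of arrowdantic, whose internals live in Rust.

```python
from typing import List, Optional, Tuple


def to_arrow_layout(values: List[Optional[int]]) -> Tuple[List[int], bytes]:
    """Split a list with None entries into a values buffer and a validity bitmap."""
    # Null slots still occupy a position in the values buffer; 0 is a placeholder.
    data = [0 if v is None else v for v in values]
    bitmap = bytearray((len(values) + 7) // 8)
    for i, v in enumerate(values):
        if v is not None:
            bitmap[i // 8] |= 1 << (i % 8)  # least-significant bit first
    return data, bytes(bitmap)


def from_arrow_layout(data: List[int], bitmap: bytes) -> List[Optional[int]]:
    """Rebuild the Python list by consulting the validity bit of each slot."""
    return [
        v if (bitmap[i // 8] >> (i % 8)) & 1 else None
        for i, v in enumerate(data)
    ]


data, bitmap = to_arrow_layout([1, None])
assert (data, bitmap) == ([1, 0], b"\x01")
assert from_arrow_layout(data, bitmap) == [1, None]
```

This is why a value such as [1, None] can be held in a typed array without boxing every element: the None only costs one bit.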

Examples

Use Parquet

import io
import arrowdantic as ad

original_arrays = [ad.UInt32Array([1, None])]

schema = ad.Schema(
    [ad.Field(f"c{i}", array.type, True) for i, array in enumerate(original_arrays)]
)

data = io.BytesIO()
with ad.ParquetFileWriter(data, schema) as writer:
    writer.write(ad.Chunk(original_arrays))
data.seek(0)

reader = ad.ParquetFileReader(data)
chunk = next(reader)
assert chunk.arrays() == original_arrays

Use Arrow files

import io

import arrowdantic as ad

original_arrays = [ad.UInt32Array([1, None])]

schema = ad.Schema(
    [ad.Field(f"c{i}", array.type, True) for i, array in enumerate(original_arrays)]
)

data = io.BytesIO()
with ad.ArrowFileWriter(data, schema) as writer:
    writer.write(ad.Chunk(original_arrays))
data.seek(0)

reader = ad.ArrowFileReader(data)
chunk = next(reader)
assert chunk.arrays() == original_arrays
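
The Parquet and Arrow examples both write into an in-memory io.BytesIO buffer. When such buffers are passed around, it can help to check which of the two formats a buffer holds before choosing a reader. A minimal sketch using only the standard library, relying on the magic bytes defined by the two file formats (PAR1 at both ends of a Parquet file, ARROW1 at both ends of an Arrow IPC file). The function name sniff_format is hypothetical and not part of arrowdantic.

```python
import io

PARQUET_MAGIC = b"PAR1"
ARROW_MAGIC = b"ARROW1"


def sniff_format(buffer: io.BytesIO) -> str:
    """Return "parquet", "arrow" or "unknown" without moving the read position."""
    raw = buffer.getvalue()
    # A Parquet file starts and ends with the 4-byte magic.
    if len(raw) >= 8 and raw[:4] == PARQUET_MAGIC and raw[-4:] == PARQUET_MAGIC:
        return "parquet"
    # An Arrow IPC file starts with the magic (padded to 8 bytes) and ends with it.
    if len(raw) >= 12 and raw[:6] == ARROW_MAGIC and raw[-6:] == ARROW_MAGIC:
        return "arrow"
    return "unknown"
```

With the buffers from the examples above, sniff_format(data) would be expected to return "parquet" or "arrow" respectively. The check only inspects the magic bytes and does not validate the rest of the file.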

Use ODBC

import arrowdantic as ad


arrays = [ad.Int32Array([1, None]), ad.StringArray(["aa", None])]

with ad.ODBCConnector(r"Driver={SQLite3};Database=sqlite-test.db") as con:
    # create an empty table with a schema
    con.execute("DROP TABLE IF EXISTS example;")
    con.execute("CREATE TABLE example (c1 INT, c2 TEXT);")

    # insert the arrays
    con.write("INSERT INTO example (c1, c2) VALUES (?, ?)", ad.Chunk(arrays))

    # read the arrays
    with con.execute("SELECT c1, c2 FROM example", 1024) as chunks:
        assert chunks.fields() == [
            ad.Field("c1", ad.DataType.int32(), True),
            ad.Field("c2", ad.DataType.string(), True),
        ]
        chunk = next(chunks)
assert chunk.arrays() == arrays
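
Conceptually, writing a chunk through a parameterized INSERT means turning columns into one parameter tuple per row, and reading means turning batches of rows back into columns. The sketch below shows that transposition with the standard-library sqlite3 module only. It is not how arrowdantic is implemented, the function name roundtrip is hypothetical, and it assumes that the 1024 in the example above is the maximum number of rows per chunk.

```python
import sqlite3
from typing import Any, List


def roundtrip(columns: List[List[Any]], batch_size: int = 1024) -> List[List[List[Any]]]:
    """Insert column-oriented data row by row, then read it back as column chunks."""
    con = sqlite3.connect(":memory:")
    try:
        con.execute("CREATE TABLE example (c1 INT, c2 TEXT);")

        # Transpose columns into one parameter tuple per row.
        rows = list(zip(*columns))
        con.executemany("INSERT INTO example (c1, c2) VALUES (?, ?)", rows)

        # Read back in batches, transposing each batch into columns again.
        cursor = con.execute("SELECT c1, c2 FROM example ORDER BY rowid")
        chunks = []
        while True:
            batch = cursor.fetchmany(batch_size)
            if not batch:
                break
            chunks.append([list(column) for column in zip(*batch)])
        return chunks
    finally:
        con.close()


columns = [[1, None], ["aa", None]]  # same values as the arrays above
assert roundtrip(columns) == [columns]
```

A smaller batch_size yields more, shorter chunks, which is the usual trade-off between memory use and the number of fetch calls.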

Use timezones

This package fully supports Python datetime objects, including timezone-aware ones, and conversion between them and Arrow:

import datetime

import arrowdantic as ad


dt = datetime.datetime(
    year=2021,
    month=1,
    day=1,
    hour=1,
    minute=1,
    second=1,
    microsecond=1,
    tzinfo=datetime.timezone.utc,
)
a = ad.TimestampArray([dt, None])
assert (
    str(a)
    == 'Timestamp(Microsecond, Some("+00:00"))[2021-01-01 01:01:01.000001 +00:00, None]'
)
assert list(a) == [dt, None]
assert a.type == ad.DataType.timestamp(datetime.timezone.utc)
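
The Timestamp(Microsecond, ...) type printed above reflects how Arrow stores timestamps: as 64-bit integers counting time units since the UNIX epoch, with the timezone kept as metadata on the type. A minimal standard-library sketch of that conversion follows. The helpers to_micros and from_micros are hypothetical names used for illustration and are not arrowdantic functions.

```python
import datetime

EPOCH = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)


def to_micros(dt: datetime.datetime) -> int:
    """Microseconds since the UNIX epoch, using integer arithmetic only."""
    delta = dt - EPOCH
    return (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds


def from_micros(micros: int) -> datetime.datetime:
    """Inverse of to_micros; the result is expressed in UTC."""
    return EPOCH + datetime.timedelta(microseconds=micros)


dt = datetime.datetime(2021, 1, 1, 1, 1, 1, 1, tzinfo=datetime.timezone.utc)
assert to_micros(dt) == 1_609_462_861_000_001
assert from_micros(to_micros(dt)) == dt
```

Integer arithmetic is used instead of dt.timestamp() so that the microsecond component is not subject to floating-point rounding.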

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

  • arrowdantic-0.2.3-cp310-none-win_amd64.whl (2.7 MB): CPython 3.10, Windows x86-64
  • arrowdantic-0.2.3-cp310-cp310-macosx_10_7_x86_64.whl (3.0 MB): CPython 3.10, macOS 10.7+ x86-64
  • arrowdantic-0.2.3-cp39-none-win_amd64.whl (2.7 MB): CPython 3.9, Windows x86-64
  • arrowdantic-0.2.3-cp39-cp39-macosx_10_7_x86_64.whl (3.0 MB): CPython 3.9, macOS 10.7+ x86-64
  • arrowdantic-0.2.3-cp38-none-win_amd64.whl (2.7 MB): CPython 3.8, Windows x86-64
  • arrowdantic-0.2.3-cp38-cp38-macosx_10_7_x86_64.whl (3.0 MB): CPython 3.8, macOS 10.7+ x86-64
  • arrowdantic-0.2.3-cp37-none-win_amd64.whl (2.7 MB): CPython 3.7, Windows x86-64
  • arrowdantic-0.2.3-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.2 MB): CPython 3.7m, manylinux (glibc 2.17+) x86-64
  • arrowdantic-0.2.3-cp37-cp37m-macosx_10_7_x86_64.whl (3.0 MB): CPython 3.7m, macOS 10.7+ x86-64
