Apache Arrow adapter for the Cassandra python driver
Cassarrow
Arrow-based Cassandra Python driver.
TL;DR
Speeds up the Cassandra Python driver by using C++ to parse query results into Apache Arrow tables.
Key features:
- 20x faster parsing of results
- 14x lower memory usage
- Support for most native types, UDTs, lists and sets
Getting Started
Installation
pip install cassarrow
Usage
import cassarrow
import pyarrow as pa
# ... `session` is an existing Cassandra driver session
with cassarrow.install_cassarrow(session) as cassarrow_session:
    table: pa.Table = cassarrow.result_set_to_table(cassarrow_session.execute("SELECT * FROM my_table"))
Type Mapping
Native Types
Cassandra | pyarrow | Note |
---|---|---|
ascii | pa.string() | |
bigint | pa.int64() | |
blob | pa.binary() | |
boolean | pa.bool_() | |
counter | TODO | |
date | pa.date32() | |
decimal | Incompatible | |
double | pa.float64() | |
duration | pa.duration("ns") | |
float | pa.float32() | |
inet | TODO | |
int | pa.int32() | |
smallint | pa.int16() | |
text | pa.string() | |
time | pa.time64("ns") | |
timestamp | pa.timestamp("ms") | |
timeuuid | pa.binary(16) | |
tinyint | pa.int8() | |
uuid | pa.binary(16) | |
varchar | pa.string() | |
varint | Incompatible | |
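Because `uuid` and `timeuuid` columns arrive as 16-byte binary values, you may want to turn them back into `uuid.UUID` objects on the Python side. The following is a minimal sketch using only the standard library. `decode_uuid_column` is a hypothetical helper, not part of cassarrow. It assumes the column has already been converted to Python values (for example with pyarrow's `to_pylist()`) and that the 16 bytes are in the standard big-endian UUID layout.

```python
import uuid
from typing import List, Optional


def decode_uuid_column(values: List[Optional[bytes]]) -> List[Optional[uuid.UUID]]:
    """Turn 16-byte binary values into uuid.UUID objects, keeping nulls as None."""
    decoded = []
    for raw in values:
        if raw is None:
            decoded.append(None)
        else:
            # uuid.UUID raises ValueError if raw is not exactly 16 bytes long.
            decoded.append(uuid.UUID(bytes=raw))
    return decoded


# Stand-in for the Python values of a pa.binary(16) column (assumed data).
original = uuid.UUID("12345678-1234-5678-1234-567812345678")
column = [original.bytes, None]
decoded = decode_uuid_column(column)
```

The same helper should work for `timeuuid` columns, since both types share the `pa.binary(16)` representation.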
Collections / UDT
Cassandra | pyarrow | Note |
---|---|---|
list | pa.list_ | |
map | pa.map_ | |
set | pa.list_ | |
udt | pa.struct | |
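Since a Cassandra `set` is mapped to `pa.list_`, set semantics are lost once the data is in Arrow. If you need them back in Python, one option is to post-process the rows. The sketch below is stdlib-only and hypothetical: `restore_collections` is not part of cassarrow. It assumes rows are plain dicts (for example from `Table.to_pylist()`) and that map values come back as an iterable of key/value pairs, which may depend on the pyarrow version and options in use.

```python
from typing import Any, Dict, Iterable


def restore_collections(
    row: Dict[str, Any],
    set_columns: Iterable[str] = (),
    map_columns: Iterable[str] = (),
) -> Dict[str, Any]:
    """Return a copy of row with the named columns converted to set / dict."""
    restored = dict(row)  # shallow copy, the input row is left untouched
    for name in set_columns:
        if restored.get(name) is not None:
            # Elements must be hashable; nested lists would need extra handling.
            restored[name] = set(restored[name])
    for name in map_columns:
        if restored.get(name) is not None:
            # dict() accepts both a list of (key, value) pairs and a dict.
            restored[name] = dict(restored[name])
    return restored


# Stand-in for one row of converted query results (assumed data).
row = {"id": 1, "tags": ["a", "b"], "attrs": [("color", "red"), ("size", "L")]}
restored = restore_collections(row, set_columns=["tags"], map_columns=["attrs"])
```

Null collections are left as `None` rather than being converted to empty containers.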