Skip to main content

Python client for a DataPress dataset server, backed by the native Rust client (datapress-client).

Project description

datap-rs-client

Python client for a DataPress dataset server, backed by the native Rust client (datapress-client) via PyO3.

Requests are plain Python dicts; responses come back as dicts. Structured queries can optionally be decoded into a pyarrow.Table.

Install

pip install datap-rs-client          # core
pip install datap-rs-client[arrow]   # + pyarrow for query_arrow()

This project standardises on uv: uv pip install datap-rs-client[arrow].

Usage

from datap_rs_client import DataPressClient

client = DataPressClient("http://127.0.0.1:8000")

client.datasets()
# ['accidents']

client.count("accidents", predicates=[{"col": "Severity", "op": "gte", "val": 3}])
# 123456

rows = client.query(
    "accidents",
    columns=["State", "Severity"],
    predicates=[{"col": "Severity", "op": "gte", "val": 3}],
    page_size=1000,
)
rows["page"], len(rows["data"])
# (1, 1000)

# Arrow (requires the [arrow] extra)
table = client.query_arrow("accidents", columns=["State", "Severity"], page_size=100_000)
table.num_rows

Authentication

client = DataPressClient(
    "http://127.0.0.1:8000",
    bearer_token="…",     # servers with auth enabled
    admin_token="…",      # required by reload()
)

SQL

client.sql("SELECT State, COUNT(*) AS n FROM accidents GROUP BY State", max_rows=100)

DataFrames

query_arrow(...) returns a pyarrow.Table (install the [arrow] extra). Arrow is the zero-copy interchange format for every popular dataframe library, so a single query feeds them all:

from datap_rs_client import DataPressClient

client = DataPressClient("http://127.0.0.1:8000")
table = client.query_arrow(
    "accidents",
    columns=["State", "Severity"],
    predicates=[{"col": "Severity", "op": "gte", "val": 3}],
    page_size=1_000_000,
)

Polars

import polars as pl

# Zero-copy from the Arrow table.
df = pl.from_arrow(table)
df.group_by("State").len().sort("len", descending=True)

pandas

import pandas as pd  # noqa: F401  (pyarrow drives the conversion)

# Arrow-backed dtypes (recommended) …
df = table.to_pandas(types_mapper=pd.ArrowDtype)
# … or classic NumPy-backed dtypes:
df = table.to_pandas()
df.groupby("State")["Severity"].mean()

DuckDB

import duckdb

# DuckDB queries the Arrow table in place — no copy, no temp files.
duckdb.sql("SELECT State, COUNT(*) AS n FROM table GROUP BY State ORDER BY n DESC")

PySpark

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
# Spark has no direct Arrow-table constructor; go via pandas (Arrow-accelerated).
sdf = spark.createDataFrame(table.to_pandas())
sdf.groupBy("State").count().orderBy("count", ascending=False).show()

DataFusion

from datafusion import SessionContext

ctx = SessionContext()
df = ctx.from_arrow(table)
df.aggregate([df["State"]], [df["Severity"].mean()])

PyArrow / Arrow ecosystem

# The result is already a pyarrow.Table.
table.column("Severity").combine_chunks()
table.to_batches()           # -> list[pyarrow.RecordBatch]
table.to_pydict()            # -> dict[str, list]

Anything implementing the Arrow C Data Interface (Polars, DuckDB, DataFusion, Vaex, cuDF, …) can consume the table directly. For libraries without an Arrow constructor, table.to_pandas() is the universal fallback.

Relationship to datap-rs

datap-rs ships the server; datap-rs-client is a standalone client. They are independent packages — install whichever you need.

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

datap_rs_client-0.4.22.tar.gz (69.4 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

datap_rs_client-0.4.22-cp39-abi3-win_amd64.whl (1.9 MB view details)

Uploaded CPython 3.9+Windows x86-64

datap_rs_client-0.4.22-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.0 MB view details)

Uploaded CPython 3.9+manylinux: glibc 2.17+ x86-64

datap_rs_client-0.4.22-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.9 MB view details)

Uploaded CPython 3.9+manylinux: glibc 2.17+ ARM64

datap_rs_client-0.4.22-cp39-abi3-macosx_11_0_arm64.whl (1.8 MB view details)

Uploaded CPython 3.9+macOS 11.0+ ARM64

File details

Details for the file datap_rs_client-0.4.22.tar.gz.

File metadata

  • Download URL: datap_rs_client-0.4.22.tar.gz
  • Upload date:
  • Size: 69.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for datap_rs_client-0.4.22.tar.gz
Algorithm Hash digest
SHA256 9665f71da59710557604e9cc63cd5b05b84d747c33996005ae570c53cffc862f
MD5 c9a0762cc0e4e4a15b03e1866db2a259
BLAKE2b-256 52d14bac58f4dc1197c960c51da93a0215a1c884e4e6944ed4ae151aa9cbc4c3

See more details on using hashes here.

File details

Details for the file datap_rs_client-0.4.22-cp39-abi3-win_amd64.whl.

File metadata

File hashes

Hashes for datap_rs_client-0.4.22-cp39-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 5f5a97aa5c4fd5862aeaa64712239dd99bdb9e4ba60d4e89e6834110630f66d3
MD5 c14523cc42f4272632b35933ac24e544
BLAKE2b-256 f961195bf9e9ae1da1495ac5fdb637d9dff510b3fdc7a51506e5170ca7cdae43

See more details on using hashes here.

File details

Details for the file datap_rs_client-0.4.22-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for datap_rs_client-0.4.22-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 021b27d0bf3103ed44027f2e7b93c6f58d802512ef29c4b14bf5d91c41a88521
MD5 82d2adeb217381dcc9a959d07f258e30
BLAKE2b-256 8c38114c669d5e52c2fae8b3205192ed07068d9a6aa02147a7c107ccfa49a658

See more details on using hashes here.

File details

Details for the file datap_rs_client-0.4.22-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File metadata

File hashes

Hashes for datap_rs_client-0.4.22-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 81d165006f7458201578434bf5c17763783cc57e120cb97bbc81be08853878af
MD5 1b08e99b3835511fc884a90161d22828
BLAKE2b-256 336d18bb804a27fe8aea1f57f6716e9ec075b5064abf59cf17fcd7bb2c6039ee

See more details on using hashes here.

File details

Details for the file datap_rs_client-0.4.22-cp39-abi3-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for datap_rs_client-0.4.22-cp39-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 cdef89d1c8e2ee8cf09e814b8e2b83c3a368439c66c4ad1e56f0e328a17d3db3
MD5 f91fc1cb9dad1ac77c02a291edc4ddaa
BLAKE2b-256 128afb6110babae7d06475da24fa9ea71d2d115e6b1ca3fe98b72dc5a8b661ee

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page