Skip to main content

Python client for a DataPress dataset server, backed by the native Rust client (datapress-client).

Project description

datap-rs-client

Python client for a DataPress dataset server, backed by the native Rust client (datapress-client) via PyO3.

Requests are plain Python dicts; responses come back as dicts. Structured queries can optionally be decoded into a pyarrow.Table.

Install

pip install datap-rs-client          # core
pip install datap-rs-client[arrow]   # + pyarrow for query_arrow()

This project standardises on uv: uv pip install datap-rs-client[arrow].

Usage

from datap_rs_client import DataPressClient

client = DataPressClient("http://127.0.0.1:8000")

client.datasets()
# ['accidents']

client.count("accidents", predicates=[{"col": "Severity", "op": "gte", "val": 3}])
# 123456

rows = client.query(
    "accidents",
    columns=["State", "Severity"],
    predicates=[{"col": "Severity", "op": "gte", "val": 3}],
    page_size=1000,
)
rows["page"], len(rows["data"])
# (1, 1000)

# Arrow (requires the [arrow] extra)
table = client.query_arrow("accidents", columns=["State", "Severity"], page_size=100_000)
table.num_rows

Authentication

client = DataPressClient(
    "http://127.0.0.1:8000",
    bearer_token="…",     # servers with auth enabled
    admin_token="…",      # required by reload()
)

SQL

client.sql("SELECT State, COUNT(*) AS n FROM accidents GROUP BY State", max_rows=100)

DataFrames

query_arrow(...) returns a pyarrow.Table (install the [arrow] extra). Arrow is the zero-copy interchange format for every popular dataframe library, so a single query feeds them all:

from datap_rs_client import DataPressClient

client = DataPressClient("http://127.0.0.1:8000")
table = client.query_arrow(
    "accidents",
    columns=["State", "Severity"],
    predicates=[{"col": "Severity", "op": "gte", "val": 3}],
    page_size=1_000_000,
)

Polars

import polars as pl

# Zero-copy from the Arrow table.
df = pl.from_arrow(table)
df.group_by("State").len().sort("len", descending=True)

pandas

import pandas as pd  # noqa: F401  (pyarrow drives the conversion)

# Arrow-backed dtypes (recommended) …
df = table.to_pandas(types_mapper=pd.ArrowDtype)
# … or classic NumPy-backed dtypes:
df = table.to_pandas()
df.groupby("State")["Severity"].mean()

DuckDB

import duckdb

# DuckDB queries the Arrow table in place — no copy, no temp files.
duckdb.sql("SELECT State, COUNT(*) AS n FROM table GROUP BY State ORDER BY n DESC")

PySpark

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
# Spark has no direct Arrow-table constructor; go via pandas (Arrow-accelerated).
sdf = spark.createDataFrame(table.to_pandas())
sdf.groupBy("State").count().orderBy("count", ascending=False).show()

DataFusion

from datafusion import SessionContext

ctx = SessionContext()
df = ctx.from_arrow(table)
df.aggregate([df["State"]], [df["Severity"].mean()])

PyArrow / Arrow ecosystem

# The result is already a pyarrow.Table.
table.column("Severity").combine_chunks()
table.to_batches()           # -> list[pyarrow.RecordBatch]
table.to_pydict()            # -> dict[str, list]

Anything implementing the Arrow C Data Interface (Polars, DuckDB, DataFusion, Vaex, cuDF, …) can consume the table directly. For libraries without an Arrow constructor, table.to_pandas() is the universal fallback.

Relationship to datap-rs

datap-rs ships the server; datap-rs-client is a standalone client. They are independent packages — install whichever you need.

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

datap_rs_client-0.4.21.tar.gz (69.4 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

datap_rs_client-0.4.21-cp39-abi3-win_amd64.whl (1.9 MB view details)

Uploaded CPython 3.9+Windows x86-64

datap_rs_client-0.4.21-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.0 MB view details)

Uploaded CPython 3.9+manylinux: glibc 2.17+ x86-64

datap_rs_client-0.4.21-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.9 MB view details)

Uploaded CPython 3.9+manylinux: glibc 2.17+ ARM64

datap_rs_client-0.4.21-cp39-abi3-macosx_11_0_arm64.whl (1.8 MB view details)

Uploaded CPython 3.9+macOS 11.0+ ARM64

File details

Details for the file datap_rs_client-0.4.21.tar.gz.

File metadata

  • Download URL: datap_rs_client-0.4.21.tar.gz
  • Upload date:
  • Size: 69.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for datap_rs_client-0.4.21.tar.gz
Algorithm Hash digest
SHA256 d14516f3451490916b25dda10d6c6b984b2e9e62433214ac0cea730afeefba03
MD5 b1694c11d68ab71fe58e51c785e050c3
BLAKE2b-256 dae04f89fb61e2a62299f52570a37607e4937847e35b16a7fb3b9226f4451564

See more details on using hashes here.

File details

Details for the file datap_rs_client-0.4.21-cp39-abi3-win_amd64.whl.

File metadata

File hashes

Hashes for datap_rs_client-0.4.21-cp39-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 e9444845522c0933d1ba3e62d86c2aee84ae5b373f2be29b3757cd4e2e30cb9b
MD5 600d85fe612265e3ca98daf336d9a4d2
BLAKE2b-256 503c445c46ee7b46878901068807a4cd527540713d7a884c4785c45483dda605

See more details on using hashes here.

File details

Details for the file datap_rs_client-0.4.21-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for datap_rs_client-0.4.21-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 e468c15185852ecaaae7501e78a7ec15ead6e8e7853b4d200a4399fa8c11cbe4
MD5 63449d26fd1c91eb5ab4df2d1afffca7
BLAKE2b-256 f1530058ee53c6a1fa991efba44c1a554fdb176e8e9113aeb13c6e994baaf1da

See more details on using hashes here.

File details

Details for the file datap_rs_client-0.4.21-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File metadata

File hashes

Hashes for datap_rs_client-0.4.21-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 1997febdc557f26cd00e7add7e30e4cb62062c59eeaccb9e31dadcd02200f461
MD5 e9577d1e18251790cafd1b7823054457
BLAKE2b-256 320b2d47e7897c273b6107d74c4796520209d22eebd8871652bf76f54cce6dc1

See more details on using hashes here.

File details

Details for the file datap_rs_client-0.4.21-cp39-abi3-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for datap_rs_client-0.4.21-cp39-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 58b5cf08b5cdc70233d48f5c150c8c21c5e5a1df2a2b9395b309f07cad592638
MD5 abc7370a29783b41f7f0de5ab5bc2395
BLAKE2b-256 5912882ebadaf1ebcfc00e80d88ff35f77546b14cfd50890cae065b37381d5b6

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page