datap-rs-client

Python client for a DataPress dataset server, backed by the native Rust client (datapress-client) via PyO3.

Requests are plain Python dicts; responses come back as dicts. Structured queries can optionally be decoded into a pyarrow.Table.

Install

pip install datap-rs-client            # core
pip install "datap-rs-client[arrow]"   # + pyarrow for query_arrow()

This project standardises on uv: uv pip install "datap-rs-client[arrow]". The quotes stop shells such as zsh from treating the square brackets as a glob pattern.

Usage

from datap_rs_client import DataPressClient

client = DataPressClient("http://127.0.0.1:8000")

client.datasets()
# ['accidents']

client.count("accidents", predicates=[{"col": "Severity", "op": "gte", "val": 3}])
# 123456

rows = client.query(
    "accidents",
    columns=["State", "Severity"],
    predicates=[{"col": "Severity", "op": "gte", "val": 3}],
    page_size=1000,
)
rows["page"], len(rows["data"])
# (1, 1000)

# Arrow (requires the [arrow] extra)
table = client.query_arrow("accidents", columns=["State", "Severity"], page_size=100_000)
table.num_rows
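
The same predicate list is passed to count() and query() above, so it can help to build it once. The following is a minimal sketch, not part of the package: the helper names predicate and severity_at_least are hypothetical, and only the {"col", "op", "val"} dict shape and the "gte" operator come from the examples above.

```python
from typing import Any


def predicate(col: str, op: str, val: Any) -> dict[str, Any]:
    """Build one predicate dict in the {"col", "op", "val"} shape shown above.

    Hypothetical helper: the client itself only needs plain dicts.
    """
    return {"col": col, "op": op, "val": val}


def severity_at_least(threshold: int) -> list[dict[str, Any]]:
    """Predicate list selecting rows whose Severity is >= threshold."""
    return [predicate("Severity", "gte", threshold)]


preds = severity_at_least(3)

# The list can then be reused across calls, for example:
# client.count("accidents", predicates=preds)
# client.query("accidents", columns=["State", "Severity"], predicates=preds, page_size=1000)
```

Because requests are plain dicts, nothing here depends on the client; the helpers only keep the filter definition in one place.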

Authentication

client = DataPressClient(
    "http://127.0.0.1:8000",
    bearer_token="…",     # servers with auth enabled
    admin_token="…",      # required by reload()
)
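
To keep tokens out of source code, one option is to read them from the environment. This is a sketch under assumptions: the variable names DATAPRESS_BEARER_TOKEN and DATAPRESS_ADMIN_TOKEN and the helper client_kwargs_from_env are hypothetical, and it assumes both keyword arguments can be omitted when a token is not needed, as in the Usage example.

```python
import os
from typing import Mapping

# Hypothetical environment variable names; pick whatever fits your deployment.
_TOKEN_VARS = {
    "bearer_token": "DATAPRESS_BEARER_TOKEN",
    "admin_token": "DATAPRESS_ADMIN_TOKEN",
}


def client_kwargs_from_env(environ: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return keyword arguments for the tokens that are set and non-empty."""
    return {kw: environ[var] for kw, var in _TOKEN_VARS.items() if environ.get(var)}


kwargs = client_kwargs_from_env()

# Then pass them through to the constructor shown above:
# client = DataPressClient("http://127.0.0.1:8000", **kwargs)
```

Unset or empty variables are skipped, so the same code works against servers with and without auth enabled.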

SQL

client.sql("SELECT State, COUNT(*) AS n FROM accidents GROUP BY State", max_rows=100)

DataFrames

query_arrow(...) returns a pyarrow.Table (install the [arrow] extra). Arrow is an interchange format that most popular dataframe libraries can read, usually without copying, so a single query can feed them all:

from datap_rs_client import DataPressClient

client = DataPressClient("http://127.0.0.1:8000")
table = client.query_arrow(
    "accidents",
    columns=["State", "Severity"],
    predicates=[{"col": "Severity", "op": "gte", "val": 3}],
    page_size=1_000_000,
)

Polars

import polars as pl

# Zero-copy from the Arrow table.
df = pl.from_arrow(table)
df.group_by("State").len().sort("len", descending=True)

pandas

import pandas as pd  # needed for pd.ArrowDtype; pyarrow drives the conversion

# Arrow-backed dtypes (recommended) …
df = table.to_pandas(types_mapper=pd.ArrowDtype)
# … or classic NumPy-backed dtypes:
df = table.to_pandas()
df.groupby("State")["Severity"].mean()

DuckDB

import duckdb

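# DuckDB queries the Arrow table in place: no copy, no temp files.
# TABLE is a reserved SQL keyword, so bind the Arrow table to another name first.
accidents = table
duckdb.sql("SELECT State, COUNT(*) AS n FROM accidents GROUP BY State ORDER BY n DESC")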

PySpark

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
# Spark versions before 4.0 have no direct Arrow-table constructor; go via pandas
# (Arrow-accelerated when spark.sql.execution.arrow.pyspark.enabled is set).
sdf = spark.createDataFrame(table.to_pandas())
sdf.groupBy("State").count().orderBy("count", ascending=False).show()

DataFusion

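from datafusion import SessionContext, col, functions as f

ctx = SessionContext()
df = ctx.from_arrow(table)
# Double-quote mixed-case column names so DataFusion does not lowercase them.
df.aggregate([col('"State"')], [f.avg(col('"Severity"'))])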

PyArrow / Arrow ecosystem

# The result is already a pyarrow.Table.
table.column("Severity").combine_chunks()
table.to_batches()           # -> list[pyarrow.RecordBatch]
table.to_pydict()            # -> dict[str, list]

Anything implementing the Arrow C Data Interface (Polars, DuckDB, DataFusion, Vaex, cuDF, …) can consume the table directly. For libraries without an Arrow constructor, table.to_pandas() is the universal fallback.

Relationship to datap-rs

datap-rs ships the server; datap-rs-client is a standalone client. They are independent packages — install whichever you need.

License

MIT
