Skip to main content

Python client for a DataPress dataset server, backed by the native Rust client (datapress-client).

Project description

datap-rs-client

Python client for a DataPress dataset server, backed by the native Rust client (datapress-client) via PyO3.

Requests are plain Python dicts; responses come back as dicts. Structured queries can optionally be decoded into a pyarrow.Table.

Install

pip install datap-rs-client          # core
pip install datap-rs-client[arrow]   # + pyarrow for query_arrow()

This project standardises on uv: uv pip install datap-rs-client[arrow].

Usage

from datap_rs_client import DataPressClient

client = DataPressClient("http://127.0.0.1:8000")

client.datasets()
# ['accidents']

client.count("accidents", predicates=[{"col": "Severity", "op": "gte", "val": 3}])
# 123456

rows = client.query(
    "accidents",
    columns=["State", "Severity"],
    predicates=[{"col": "Severity", "op": "gte", "val": 3}],
    page_size=1000,
)
rows["page"], len(rows["data"])
# (1, 1000)

# Arrow (requires the [arrow] extra)
table = client.query_arrow("accidents", columns=["State", "Severity"], page_size=100_000)
table.num_rows

Authentication

client = DataPressClient(
    "http://127.0.0.1:8000",
    bearer_token="…",     # servers with auth enabled
    admin_token="…",      # required by reload()
)

SQL

client.sql("SELECT State, COUNT(*) AS n FROM accidents GROUP BY State", max_rows=100)

DataFrames

query_arrow(...) returns a pyarrow.Table (install the [arrow] extra). Arrow is the zero-copy interchange format for every popular dataframe library, so a single query feeds them all:

from datap_rs_client import DataPressClient

client = DataPressClient("http://127.0.0.1:8000")
table = client.query_arrow(
    "accidents",
    columns=["State", "Severity"],
    predicates=[{"col": "Severity", "op": "gte", "val": 3}],
    page_size=1_000_000,
)

Polars

import polars as pl

# Zero-copy from the Arrow table.
df = pl.from_arrow(table)
df.group_by("State").len().sort("len", descending=True)

pandas

import pandas as pd  # noqa: F401  (pyarrow drives the conversion)

# Arrow-backed dtypes (recommended) …
df = table.to_pandas(types_mapper=pd.ArrowDtype)
# … or classic NumPy-backed dtypes:
df = table.to_pandas()
df.groupby("State")["Severity"].mean()

DuckDB

import duckdb

# DuckDB queries the Arrow table in place — no copy, no temp files.
duckdb.sql("SELECT State, COUNT(*) AS n FROM table GROUP BY State ORDER BY n DESC")

PySpark

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
# Spark has no direct Arrow-table constructor; go via pandas (Arrow-accelerated).
sdf = spark.createDataFrame(table.to_pandas())
sdf.groupBy("State").count().orderBy("count", ascending=False).show()

DataFusion

from datafusion import SessionContext

ctx = SessionContext()
df = ctx.from_arrow(table)
df.aggregate([df["State"]], [df["Severity"].mean()])

PyArrow / Arrow ecosystem

# The result is already a pyarrow.Table.
table.column("Severity").combine_chunks()
table.to_batches()           # -> list[pyarrow.RecordBatch]
table.to_pydict()            # -> dict[str, list]

Anything implementing the Arrow C Data Interface (Polars, DuckDB, DataFusion, Vaex, cuDF, …) can consume the table directly. For libraries without an Arrow constructor, table.to_pandas() is the universal fallback.

Relationship to datap-rs

datap-rs ships the server; datap-rs-client is a standalone client. They are independent packages — install whichever you need.

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

datap_rs_client-0.4.15.tar.gz (69.4 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

datap_rs_client-0.4.15-cp39-abi3-win_amd64.whl (1.9 MB view details)

Uploaded CPython 3.9+Windows x86-64

datap_rs_client-0.4.15-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.0 MB view details)

Uploaded CPython 3.9+manylinux: glibc 2.17+ x86-64

datap_rs_client-0.4.15-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.9 MB view details)

Uploaded CPython 3.9+manylinux: glibc 2.17+ ARM64

datap_rs_client-0.4.15-cp39-abi3-macosx_11_0_arm64.whl (1.8 MB view details)

Uploaded CPython 3.9+macOS 11.0+ ARM64

File details

Details for the file datap_rs_client-0.4.15.tar.gz.

File metadata

  • Download URL: datap_rs_client-0.4.15.tar.gz
  • Upload date:
  • Size: 69.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for datap_rs_client-0.4.15.tar.gz
Algorithm Hash digest
SHA256 534dc563fa0f5ddbe4d12f8632462fd4d315230b2d96c4d48f96e1628a883909
MD5 b67d9e6dd2614885fa53afc3243b6790
BLAKE2b-256 4f0979297d2392b2459f5e63830cda63abcdb85d4b5f65ce9313606c6334414b

See more details on using hashes here.

File details

Details for the file datap_rs_client-0.4.15-cp39-abi3-win_amd64.whl.

File metadata

File hashes

Hashes for datap_rs_client-0.4.15-cp39-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 651e1e95765fb11aa64b78f51001f7e2f8c320f1e60eb92628dc7a6521406f6e
MD5 48b71a7a7d0f4b38c73618efc9e0a721
BLAKE2b-256 7e82c20fa0bbf46b49a53be664abc5100ed56cf6c321bcc4a9e8360c568111c5

See more details on using hashes here.

File details

Details for the file datap_rs_client-0.4.15-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for datap_rs_client-0.4.15-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 4af9c96420399eafd3cfa0a875fd657784ce799585f95aa85eeb4d1496c29096
MD5 365e977f6d6eef07ffe117582615fc1c
BLAKE2b-256 e4380b9e84f9c945c5ea292e082372bcc379e8a3ca0909d4de3368d4d8a891de

See more details on using hashes here.

File details

Details for the file datap_rs_client-0.4.15-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File metadata

File hashes

Hashes for datap_rs_client-0.4.15-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 171cc6416c6fd50ebc295a69c6b3da8e8882ff568e09659eb378392a76431f7d
MD5 4d85ef0406a7731d3d5720525cb79b34
BLAKE2b-256 96a57f9545e6a36ef16cd99c87930f1f63fa2cb8bc882209626618fd021458ba

See more details on using hashes here.

File details

Details for the file datap_rs_client-0.4.15-cp39-abi3-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for datap_rs_client-0.4.15-cp39-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 8b74807721c1e905b70ffc23e04eac3652dcac0f9adc9efbd98de1f24dbfe8f6
MD5 08ade20ba2ec149a9ddf7dde1e86f36f
BLAKE2b-256 27feee1416a7e4b51a4728e474445b457fe04e8231b1aa6e44f8ecb6d17331c7

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page