Skip to main content

Python client for a DataPress dataset server, backed by the native Rust client (datapress-client).

Project description

datap-rs-client

Python client for a DataPress dataset server, backed by the native Rust client (datapress-client) via PyO3.

Requests are plain Python dicts; responses come back as dicts. Structured queries can optionally be decoded into a pyarrow.Table.

Install

pip install datap-rs-client          # core
pip install datap-rs-client[arrow]   # + pyarrow for query_arrow()

This project standardises on uv: uv pip install datap-rs-client[arrow].

Usage

from datap_rs_client import DataPressClient

client = DataPressClient("http://127.0.0.1:8000")

client.datasets()
# ['accidents']

client.count("accidents", predicates=[{"col": "Severity", "op": "gte", "val": 3}])
# 123456

rows = client.query(
    "accidents",
    columns=["State", "Severity"],
    predicates=[{"col": "Severity", "op": "gte", "val": 3}],
    page_size=1000,
)
rows["page"], len(rows["data"])
# (1, 1000)

# Arrow (requires the [arrow] extra)
table = client.query_arrow("accidents", columns=["State", "Severity"], page_size=100_000)
table.num_rows

Authentication

client = DataPressClient(
    "http://127.0.0.1:8000",
    bearer_token="…",     # servers with auth enabled
    admin_token="…",      # required by reload()
)

SQL

client.sql("SELECT State, COUNT(*) AS n FROM accidents GROUP BY State", max_rows=100)

DataFrames

query_arrow(...) returns a pyarrow.Table (install the [arrow] extra). Arrow is the zero-copy interchange format for every popular dataframe library, so a single query feeds them all:

from datap_rs_client import DataPressClient

client = DataPressClient("http://127.0.0.1:8000")
table = client.query_arrow(
    "accidents",
    columns=["State", "Severity"],
    predicates=[{"col": "Severity", "op": "gte", "val": 3}],
    page_size=1_000_000,
)

Polars

import polars as pl

# Zero-copy from the Arrow table.
df = pl.from_arrow(table)
df.group_by("State").len().sort("len", descending=True)

pandas

import pandas as pd  # noqa: F401  (pyarrow drives the conversion)

# Arrow-backed dtypes (recommended) …
df = table.to_pandas(types_mapper=pd.ArrowDtype)
# … or classic NumPy-backed dtypes:
df = table.to_pandas()
df.groupby("State")["Severity"].mean()

DuckDB

import duckdb

# DuckDB queries the Arrow table in place — no copy, no temp files.
duckdb.sql("SELECT State, COUNT(*) AS n FROM table GROUP BY State ORDER BY n DESC")

PySpark

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
# Spark has no direct Arrow-table constructor; go via pandas (Arrow-accelerated).
sdf = spark.createDataFrame(table.to_pandas())
sdf.groupBy("State").count().orderBy("count", ascending=False).show()

DataFusion

from datafusion import SessionContext

ctx = SessionContext()
df = ctx.from_arrow(table)
df.aggregate([df["State"]], [df["Severity"].mean()])

PyArrow / Arrow ecosystem

# The result is already a pyarrow.Table.
table.column("Severity").combine_chunks()
table.to_batches()           # -> list[pyarrow.RecordBatch]
table.to_pydict()            # -> dict[str, list]

Anything implementing the Arrow C Data Interface (Polars, DuckDB, DataFusion, Vaex, cuDF, …) can consume the table directly. For libraries without an Arrow constructor, table.to_pandas() is the universal fallback.

Relationship to datap-rs

datap-rs ships the server; datap-rs-client is a standalone client. They are independent packages — install whichever you need.

License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

datap_rs_client-0.4.17.tar.gz (69.4 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

datap_rs_client-0.4.17-cp39-abi3-win_amd64.whl (1.9 MB view details)

Uploaded CPython 3.9+Windows x86-64

datap_rs_client-0.4.17-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.0 MB view details)

Uploaded CPython 3.9+manylinux: glibc 2.17+ x86-64

datap_rs_client-0.4.17-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.9 MB view details)

Uploaded CPython 3.9+manylinux: glibc 2.17+ ARM64

datap_rs_client-0.4.17-cp39-abi3-macosx_11_0_arm64.whl (1.8 MB view details)

Uploaded CPython 3.9+macOS 11.0+ ARM64

File details

Details for the file datap_rs_client-0.4.17.tar.gz.

File metadata

  • Download URL: datap_rs_client-0.4.17.tar.gz
  • Upload date:
  • Size: 69.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for datap_rs_client-0.4.17.tar.gz
Algorithm Hash digest
SHA256 d395d49cdd2fd91a79eb6ffcd93908de70cf08dc0f4fbda553cd0e766b7c0824
MD5 cc76a62ceb9eda979e36644012d3727a
BLAKE2b-256 8948aea9232c3b027db4a7ebb6c1fc40c5b73bc1a5cdc52de3827f8c6e4c98f3

See more details on using hashes here.

File details

Details for the file datap_rs_client-0.4.17-cp39-abi3-win_amd64.whl.

File metadata

File hashes

Hashes for datap_rs_client-0.4.17-cp39-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 63a3cb60bc05b66101e322a385a7b7354cbe20ae39ac0149df3e7b0a36ca86e1
MD5 b865cfcf9f643086bb261a08908fa4cc
BLAKE2b-256 f6215305cd6fe4232637f6d93fbf4059ebdd843e9f1a8f1d5cf8ff090bf79f4e

See more details on using hashes here.

File details

Details for the file datap_rs_client-0.4.17-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for datap_rs_client-0.4.17-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 20a857b7370865321af41762011997c1f266b1115567cd571d5cbe7e7f01b21d
MD5 22bc531f4224cb3224ebf75dcce9976d
BLAKE2b-256 30cd9807547149d7aa8b5edee57d237f4bb4ddf553f21bb40db1b8780437e526

See more details on using hashes here.

File details

Details for the file datap_rs_client-0.4.17-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.

File metadata

File hashes

Hashes for datap_rs_client-0.4.17-cp39-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 7e4aa4d6c62c3400d83e5966debaee1ceb6aec6a4d60eccf6fe83a656fb27840
MD5 07f36e1dc32ea96992245314a820a7bf
BLAKE2b-256 3c6050fd657dc479337cfc530264b2ba43c3ecd2b6967cb8f4d7af4ea651dd2c

See more details on using hashes here.

File details

Details for the file datap_rs_client-0.4.17-cp39-abi3-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for datap_rs_client-0.4.17-cp39-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 62ad8375f3f05dd5fdf776c360985d7579a9fe55ef89109aa7713bcd653a951f
MD5 5a64ac2adedd30952c0c867dba9a1aeb
BLAKE2b-256 fc1d8171ce039cbeaf99019234499cfa1e7bdc10a1b7c60eeff83172277a00fd

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page