Strongly-typed DataFrames for Python, powered by Rust.

PydanTable

CI · Documentation · PyPI version · Python versions · License: MIT

Typed dataframe transformations for FastAPI and Pydantic services, backed by a Rust execution core (Polars inside the native extension).

Current release: 0.18.0 · Python 3.10+


At a glance

  • Schemas first: Pydantic field annotations define column types, nullability (T | None), and which expressions are legal. Many mistakes are caught when you build the Expr, not only when you run the query.
  • Two entry styles: DataFrameModel (SQLModel-like whole-table class with a generated row model) or DataFrame[YourSchema](data) with any Pydantic BaseModel schema.
  • Polars-shaped API: select, with_columns, filter, join, group_by, windows, reshape helpers — semantics are documented in the interface contract, not guaranteed identical to Polars on every edge case.
  • Optional extras: pydantable[polars] for to_polars(); pydantable[arrow] for read_parquet / read_ipc, to_arrow / ato_arrow, and pa.Table / RecordBatch constructors.
  • Optional façades: pydantable.pandas and pydantable.pyspark swap naming and imports; execution stays in the same in-process core (not a real pandas or Spark backend).
  • Service-ready: Sync and async materialization (collect, to_dict, acollect, ato_dict, …), FastAPI patterns, and trusted ingest modes for bulk JSON or Arrow.

Documentation

The canonical manual is on Read the Docs: https://pydantable.readthedocs.io/en/latest/

| Topic | Read the Docs |
| --- | --- |
| Home / overview | Documentation home |
| Changelog & versions | Changelog |
| DataFrameModel (inputs, transforms, collisions, materialization) | DataFrameModel |
| Column types (scalars, structs, list[T], maps, trusted ingest) | Supported data types |
| FastAPI (routers, bodies, async, multipart) | FastAPI integration |
| Execution (collect, to_dict, to_polars, to_arrow, async) | Execution |
| Semantics (nulls, joins, windows, reshape) | Interface contract |
| Roadmap (shipped 0.18.0, planned 0.19+, path to v1.0.0) | Roadmap |
| Why not Polars alone? | Why not just use Polars? |
| Pandas-style API (pydantable.pandas) | Pandas UI |
| PySpark-style API (pydantable.pyspark) | PySpark UI · Parity matrix |
| Polars parity | Scorecard · Workflows · Transformation roadmap |
| Contributors | Developer guide |
| Architecture plan | Plan document |
| Python API (autodoc) | API reference |

Install

pip install pydantable

Optional dependencies (same package, feature extras):

pip install 'pydantable[polars]'  # to_polars()
pip install 'pydantable[arrow]'   # read_parquet/read_ipc, to_arrow, Table/RecordBatch constructors

From a git checkout you need a Rust toolchain and a build of the extension (e.g. Maturin):

pip install .
# editable: maturin develop --manifest-path pydantable-core/Cargo.toml

Full setup, make check-full, and release notes: Developer guide.


Quick start

from pydantable import DataFrameModel

class User(DataFrameModel):
    id: int
    age: int | None

df = User({"id": [1, 2], "age": [20, None]})
df2 = df.with_columns(age2=df.age * 2)
df3 = df2.select("id", "age2")
df4 = df3.filter(df3.age2 > 10)

# Columnar dict (good for JSON APIs)
print(df4.to_dict())
# {'age2': [40], 'id': [1]}

# List of Pydantic row models (default collect)
for row in df4.collect():
    print(row.id, row.age2)

Materialization: collect() → list of row models; to_dict() / collect(as_lists=True) → dict[str, list]; to_polars() / to_arrow() when the matching extra is installed. Async: acollect, ato_dict, ato_polars, ato_arrow offload blocking work from the event loop (Execution, FastAPI).
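The point of the async variants is that request handlers stay responsive while rows materialize. As an illustration only (pydantable's internals are not shown here), the offloading pattern is roughly equivalent to wrapping a blocking collect with asyncio.to_thread:

```python
import asyncio

def collect_blocking() -> list[dict]:
    # Stand-in for a blocking df.collect(); assume it does heavy work.
    return [{"id": 1, "age2": 40}]

async def acollect_sketch() -> list[dict]:
    # Run the blocking materialization in a worker thread so the
    # event loop stays free to serve other requests.
    return await asyncio.to_thread(collect_blocking)

rows = asyncio.run(acollect_sketch())
print(rows)  # [{'id': 1, 'age2': 40}]
```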

Alternate import styles (same engine):

from pydantable.pandas import DataFrameModel as PandasDataFrameModel
from pydantable.pyspark import DataFrameModel as PySparkDataFrameModel
from pydantable import DataFrameModel as DefaultDataFrameModel

More examples: FastAPI, Polars-style workflows.

Validation policy: Constructors validate strictly by default. For messy row lists, pass ignore_errors=True plus on_validation_errors=callback; the callback receives each failed row as (row_index, row, Pydantic errors). Trusted bulk paths use trusted_mode (off / shape_only / strict). Details: DataFrameModel, Supported types.
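The callback contract above can be sketched in plain Python. The names ignore_errors and on_validation_errors come from the docs; the toy check below is a stand-in for Pydantic validation, not the library's actual loader:

```python
def load_rows(rows, *, ignore_errors=False, on_validation_errors=None):
    """Stand-in loader showing the error-callback shape described above."""
    good, failed = [], []
    for i, row in enumerate(rows):
        # Toy check standing in for Pydantic validation: 'id' must be an int.
        if isinstance(row.get("id"), int):
            good.append(row)
        elif ignore_errors:
            failed.append((i, row, ["id: not a valid integer"]))
        else:
            raise ValueError(f"row {i} failed validation")
    if failed and on_validation_errors is not None:
        on_validation_errors(failed)
    return good

captured = []
good = load_rows(
    [{"id": 1}, {"id": "oops"}],
    ignore_errors=True,
    on_validation_errors=captured.extend,
)
print(good)      # [{'id': 1}]
print(captured)  # [(1, {'id': 'oops'}, ['id: not a valid integer'])]
```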


Expression & API surface

Typed Expr builds a Rust AST. Highlights:

  • Globals in select: global_sum, global_mean, global_count, global_min, global_max, and global_row_count(). PySpark façade: F.count() with no argument returns the row count.
  • Windows: row_number, rank, dense_rank, window_sum, window_mean, window_min, window_max, lag, lead with Window.partitionBy(...).orderBy(..., nulls_last=...); framed rowsBetween / rangeBetween where supported (window semantics).
  • Temporal & strings: strptime, unix_timestamp, cast to date/datetime, dt_* parts, strip / lower / upper, str_replace, strip_prefix / suffix / chars, list helpers (list_len, list_get, …).
  • Maps (string keys): map_len, map_get, map_contains_key, map_keys, map_values, map_entries, map_from_entries, element_at; binary_len for bytes columns.

PySpark-named wrappers: pydantable.pyspark.sql.functions mirrors much of the above (parity table).
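To pin down what lag with partitioning and ordering computes, here is a plain-Python reference for the semantics (an illustration only, not pydantable code; the function name and parameters are chosen for this sketch):

```python
from collections import defaultdict

def lag(rows, value_key, partition_key, order_key, offset=1):
    """Reference semantics for lag: the previous row's value within each
    partition after ordering; None where no prior row exists."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[partition_key]].append(row)
    out = {}
    for part_rows in groups.values():
        part_rows.sort(key=lambda r: r[order_key])
        for i, row in enumerate(part_rows):
            prev = part_rows[i - offset][value_key] if i >= offset else None
            out[id(row)] = prev
    # Results are returned in the original row order, like a window column.
    return [out[id(r)] for r in rows]

rows = [
    {"user": "a", "ts": 2, "v": 20},
    {"user": "a", "ts": 1, "v": 10},
    {"user": "b", "ts": 1, "v": 99},
]
print(lag(rows, "v", "user", "ts"))  # [10, None, None]
```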


Recent releases

0.18.0 — Clearer Polars error context for group_by().agg(); explicit deferral of non-string map keys (Supported types, Roadmap); parity/roadmap doc refresh (no new façade APIs); Hypothesis smoke tests for join / group_by.

0.17.0 — Tighter docs and tests for map_get / map_contains_key after PyArrow map<utf8, …> ingest; more pyspark.sql.functions thin wrappers (str_replace, regexp_replace, strip_*, strptime, binary_len, list_*). Non-string map keys (dict[int, T], etc.) remain future work (Roadmap Later).

0.16.x — Arrow interchange (read_parquet / read_ipc, to_arrow / ato_arrow, Table/RecordBatch constructors), FastAPI multipart and deployment docs, map-column arithmetic TypeError fix, DataFrame[Schema](pa.Table) constructor fix.

Older highlights: 0.15.0 async materialization and Arrow map ingest; 0.14.0 window null ordering and FastAPI TestClient coverage. Full history: Changelog.


Development

From a clone with a .venv, pip install -e ".[dev]", and a built extension:

make check-full              # Ruff, mypy, Rust fmt / clippy / tests
PYTHONPATH=python pytest -q  # integration tests (see DEVELOPER.md)

Rust tests need the Makefile PYO3_PYTHON / PYTHONPATH wiring: make rust-test. Details: Developer guide.


License

MIT


