Strongly-typed DataFrames for Python, powered by Rust.

PydanTable

Typed dataframe transformations for FastAPI and Pydantic services, backed by a Rust execution core (Polars inside the native extension).

Current release: 0.19.0 · Python 3.10+


At a glance

  • Schemas first: Pydantic field annotations define column types, nullability (T | None), and which expressions are legal. Many mistakes are caught when you build the Expr, not only when you run the query.
  • Two entry styles: DataFrameModel (SQLModel-like whole-table class with a generated row model) or DataFrame[YourSchema](data) with any Pydantic BaseModel schema.
  • Polars-shaped API: select, with_columns, filter, join, group_by, windows, reshape helpers — semantics are documented in the interface contract, not guaranteed identical to Polars on every edge case.
  • Optional extras: pydantable[polars] for to_polars(); pydantable[arrow] for read_parquet / read_ipc, to_arrow / ato_arrow, and pa.Table / RecordBatch constructors.
  • Optional façades: pydantable.pandas and pydantable.pyspark change only naming and imports; execution still runs on the same in-process core (not a real Spark or pandas backend).
  • Service-ready: Sync and async materialization (collect, to_dict, acollect, ato_dict, …), FastAPI patterns, and trusted ingest modes for bulk JSON or Arrow.
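
To make the "caught when you build the Expr" point concrete, here is a minimal stdlib-only sketch of build-time checking. It is a toy model, not pydantable's implementation: the Col class and its fields are hypothetical names introduced for illustration, and the real Expr is built on a Rust AST.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Col:
    """Toy column reference (hypothetical) carrying a declared type and nullability."""

    name: str
    dtype: type
    nullable: bool = False

    def __mul__(self, other: object) -> "Col":
        # Reject illegal operands while the expression is being built,
        # before any data is touched.
        if self.dtype not in (int, float):
            raise TypeError(
                f"cannot multiply column {self.name!r} of type {self.dtype.__name__}"
            )
        if isinstance(other, bool) or not isinstance(other, (int, float)):
            raise TypeError(f"cannot multiply {self.name!r} by {type(other).__name__}")
        result_type = float if (self.dtype is float or isinstance(other, float)) else int
        # Nullability carries over: (int | None) * 2 is still int | None.
        return Col(f"({self.name} * {other!r})", result_type, self.nullable)


age = Col("age", int, nullable=True)
label = Col("label", str)

age2 = age * 2
print(age2.name, age2.dtype.__name__, age2.nullable)

try:
    label * 2
except TypeError as exc:
    print("rejected at build time:", exc)
```

The design point is that the schema travels with the expression, so an illegal operation fails on the line that wrote it rather than inside a later query run.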

Documentation

The canonical manual is on Read the Docs: https://pydantable.readthedocs.io/en/latest/

Topic | Read the Docs
Home / overview | Documentation home
Changelog & versions | Changelog · Versioning (0.x)
DataFrameModel (inputs, transforms, collisions, materialization) | DataFrameModel
Column types (scalars, structs, list[T], maps, trusted ingest) | Supported data types
FastAPI (routers, bodies, async, multipart) | FastAPI integration
Execution (collect, to_dict, to_polars, to_arrow, async) | Execution
Semantics (nulls, joins, windows, reshape) | Interface contract
Roadmap (shipped 0.19.0, path to v1.0.0) | Roadmap
Why not Polars alone? | Why not just use Polars?
Pandas-style API (pydantable.pandas) | Pandas UI
PySpark-style API (pydantable.pyspark) | PySpark UI · Parity matrix
Polars parity | Scorecard · Workflows · Transformation roadmap
Contributors | Developer guide
Architecture plan | Plan document
Python API (autodoc) | API reference

Install

pip install pydantable

Optional dependencies (same package, feature extras):

pip install 'pydantable[polars]'   # to_polars()
pip install 'pydantable[arrow]'  # read_parquet/read_ipc, to_arrow, Table/RecordBatch constructors

From a git checkout you need a Rust toolchain and a build of the native extension (for example with Maturin):

pip install .
# editable: maturin develop --manifest-path pydantable-core/Cargo.toml

Full setup, make check-full, and release notes: Developer guide.


Quick start

from pydantable import DataFrameModel

class User(DataFrameModel):
    id: int
    age: int | None

df = User({"id": [1, 2], "age": [20, None]})
df2 = df.with_columns(age2=df.age * 2)
df3 = df2.select("id", "age2")
df4 = df3.filter(df3.age2 > 10)

# Columnar dict (good for JSON APIs)
print(df4.to_dict())
# {'age2': [40], 'id': [1]}

# List of Pydantic row models (default collect)
for row in df4.collect():
    print(row.id, row.age2)

Materialization: collect() → list of row models; to_dict() / collect(as_lists=True) → dict[str, list]; to_polars() / to_arrow() when the matching extra is installed. Async: acollect, ato_dict, ato_polars, ato_arrow offload blocking work from the event loop (see Execution, FastAPI).
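
The two output shapes and the idea of offloading blocking work can be sketched with the standard library alone. This is an illustration of the concepts, not pydantable code: rows_to_columns, columns_to_rows and materialize_off_loop are hypothetical helpers, and asyncio.to_thread is one common way to keep an event loop responsive, not necessarily what the library does internally.

```python
import asyncio


def rows_to_columns(rows: list[dict]) -> dict[str, list]:
    """Pivot a list of row dicts into a columnar dict[str, list]."""
    columns: dict[str, list] = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return columns


def columns_to_rows(columns: dict[str, list]) -> list[dict]:
    """Pivot a columnar dict back into one dict per row."""
    names = list(columns)
    return [dict(zip(names, values)) for values in zip(*columns.values())]


async def materialize_off_loop(rows: list[dict]) -> dict[str, list]:
    # asyncio.to_thread runs the blocking call in a worker thread so the
    # event loop stays free to serve other requests.
    return await asyncio.to_thread(rows_to_columns, rows)


rows = [{"id": 1, "age2": 40}, {"id": 3, "age2": 60}]
columns = asyncio.run(materialize_off_loop(rows))
print(columns)                           # {'id': [1, 3], 'age2': [40, 60]}
print(columns_to_rows(columns) == rows)  # True
```

The columnar form serializes compactly to JSON, while the row form is convenient when each record is handled on its own.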

Alternate import styles (same engine):

from pydantable.pandas import DataFrameModel as PandasDataFrameModel
from pydantable.pyspark import DataFrameModel as PySparkDataFrameModel
from pydantable import DataFrameModel as DefaultDataFrameModel

More examples: FastAPI, Polars-style workflows.

Validation policy: Constructors validate strictly by default. For messy row lists, pass ignore_errors=True together with on_validation_errors=callback; the callback receives the failed rows (row_index, row, Pydantic errors). Trusted bulk paths use trusted_mode (off / shape_only / strict). Details: DataFrameModel, Supported types.


Expression & API surface

Typed Expr builds a Rust AST. Highlights:

  • Globals in select: global_sum, global_mean, global_count, global_min, global_max, global_row_count() (row count). In the PySpark façade, F.count() with no argument returns the row count.
  • Windows: row_number, rank, dense_rank, window_sum, window_mean, window_min, window_max, lag, lead with Window.partitionBy(...).orderBy(..., nulls_last=...); framed rowsBetween / rangeBetween where supported (window semantics).
  • Temporal & strings: strptime, unix_timestamp, cast to date/datetime, dt_* parts, strip / lower / upper, str_replace, strip_prefix / suffix / chars, list helpers (list_len, list_get, …).
  • Maps (string keys): map_len, map_get, map_contains_key, map_keys, map_values, map_entries, map_from_entries, element_at; binary_len for bytes columns.

PySpark-named wrappers: pydantable.pyspark.sql.functions mirrors much of the above (parity table).
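
For readers unfamiliar with how row_number, rank and dense_rank differ on ties, here is a small pure-Python sketch of the usual definitions. It is not pydantable's implementation and does not use its Window API; rank_within_partitions is a hypothetical helper, and null ordering (the nulls_last option mentioned above) is left out.

```python
from collections import defaultdict


def rank_within_partitions(rows, partition_key, order_key):
    """Annotate rows with row_number, rank and dense_rank per partition."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[partition_key]].append(row)

    out = []
    for part in partitions.values():
        previous = object()  # sentinel that never equals a real value
        rank = dense = 0
        ordered = sorted(part, key=lambda r: r[order_key])
        for number, row in enumerate(ordered, start=1):
            if row[order_key] != previous:
                rank = number  # rank skips ahead after ties
                dense += 1     # dense_rank never leaves gaps
                previous = row[order_key]
            out.append({**row, "row_number": number, "rank": rank, "dense_rank": dense})
    return out


scores = [
    {"team": "a", "score": 10},
    {"team": "a", "score": 10},
    {"team": "a", "score": 30},
    {"team": "b", "score": 5},
]
ranked = rank_within_partitions(scores, "team", "score")
for row in ranked:
    print(row)
```

In team "a" the two tied scores share rank 1 and dense_rank 1, and the third row gets rank 3 but dense_rank 2, while row_number always counts 1, 2, 3.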


Recent releases

0.19.0 — Pre-1.0 documentation consolidation: Versioning (0.x), interface contract cross-links, parity/README/index refresh for the 0.19 → 1.0 path, PERFORMANCE benchmark spot-check note, release-hygiene alignment with CI; group_by tests sort output where row order is not guaranteed (stable pytest-xdist). No new Expr or PySpark façade methods.

0.18.0 — Clearer Polars error context for group_by().agg(); explicit deferral of non-string map keys (Supported types, Roadmap); parity/roadmap doc refresh (no new façade APIs); Hypothesis smoke for join / group_by.

0.17.0 — Tighter docs and tests for map_get / map_contains_key after PyArrow map<utf8, …> ingest; more pyspark.sql.functions thin wrappers (str_replace, regexp_replace, strip_*, strptime, binary_len, list_*). Non-string map keys (dict[int, T], etc.) remain future work (Roadmap Later).

0.16.x — Arrow interchange (read_parquet / read_ipc, to_arrow / ato_arrow, Table/RecordBatch constructors), FastAPI multipart and deployment docs, map-column arithmetic TypeError fix, DataFrame[Schema](pa.Table) constructor fix.

Older highlights: 0.15.0 async materialization and Arrow map ingest; 0.14.0 window null ordering and FastAPI TestClient coverage. Full history: Changelog.


Development

From a clone with .venv and pip install -e ".[dev]" plus a built extension:

make check-full              # Ruff, mypy, Rust fmt / clippy / tests
PYTHONPATH=python pytest -q  # integration tests (see DEVELOPER.md)

Rust tests depend on the PYO3_PYTHON / PYTHONPATH wiring in the Makefile, so run them with make rust-test. Details: Developer guide.


License

MIT
