
Strongly-typed DataFrames for Python, powered by Rust.


PydanTable


Typed dataframe transformations for FastAPI and Pydantic services, backed by a Rust execution core (Polars inside the native extension).

Current release: 0.17.0 · Python 3.10+


At a glance

  • Schemas first: Pydantic field annotations define column types, nullability (T | None), and which expressions are legal. Many mistakes are caught when you build the Expr, not only when you run the query.
  • Two entry styles: DataFrameModel (SQLModel-like whole-table class with a generated row model) or DataFrame[YourSchema](data) with any Pydantic BaseModel schema.
  • Polars-shaped API: select, with_columns, filter, join, group_by, windows, reshape helpers — semantics are documented in the interface contract, not guaranteed identical to Polars on every edge case.
  • Optional extras: pydantable[polars] for to_polars(); pydantable[arrow] for read_parquet / read_ipc, to_arrow / ato_arrow, and pa.Table / RecordBatch constructors.
  • Optional façades: pydantable.pandas and pydantable.pyspark swap naming/imports only; execution stays in the same in-process core (not a real Spark or pandas backend).
  • Service-ready: Sync and async materialization (collect, to_dict, acollect, ato_dict, …), FastAPI patterns, and trusted ingest modes for bulk JSON or Arrow.

Documentation

The canonical manual is on Read the Docs: https://pydantable.readthedocs.io/en/latest/

| Topic | Read the Docs |
| --- | --- |
| Home / overview | Documentation home |
| Changelog & versions | Changelog |
| DataFrameModel (inputs, transforms, collisions, materialization) | DataFrameModel |
| Column types (scalars, structs, list[T], maps, trusted ingest) | Supported data types |
| FastAPI (routers, bodies, async, multipart) | FastAPI integration |
| Execution (collect, to_dict, to_polars, to_arrow, async) | Execution |
| Semantics (nulls, joins, windows, reshape) | Interface contract |
| Roadmap (shipped 0.17.0, planned 0.18+, path to v1.0.0) | Roadmap |
| Why not Polars alone? | Why not just use Polars? |
| Pandas-style API (pydantable.pandas) | Pandas UI |
| PySpark-style API (pydantable.pyspark) | PySpark UI · Parity matrix |
| Polars parity | Scorecard · Workflows · Transformation roadmap |
| Contributors | Developer guide |
| Architecture plan | Plan document |
| Python API (autodoc) | API reference |

Install

pip install pydantable

Optional dependencies (same package, feature extras):

pip install 'pydantable[polars]'   # to_polars()
pip install 'pydantable[arrow]'  # read_parquet/read_ipc, to_arrow, Table/RecordBatch constructors

Building from a git checkout requires a Rust toolchain and a build of the native extension (e.g. with Maturin):

pip install .
# editable: maturin develop --manifest-path pydantable-core/Cargo.toml

Full setup, make check-full, and release notes: Developer guide.


Quick start

from pydantable import DataFrameModel

class User(DataFrameModel):
    id: int
    age: int | None

df = User({"id": [1, 2], "age": [20, None]})
df2 = df.with_columns(age2=df.age * 2)
df3 = df2.select("id", "age2")
df4 = df3.filter(df3.age2 > 10)

# Columnar dict (good for JSON APIs)
print(df4.to_dict())
# {'age2': [40], 'id': [1]}

# List of Pydantic row models (default collect)
for row in df4.collect():
    print(row.id, row.age2)

Materialization: collect() → list of row models; to_dict() / collect(as_lists=True) → dict[str, list]; to_polars() / to_arrow() when the matching extra is installed. Async: acollect, ato_dict, ato_polars, ato_arrow offload blocking work from the event loop (Execution, FastAPI).
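
The two shapes relate mechanically: collect() is row-wise (one object per row), to_dict() is columnar (one list per column). A plain-Python sketch of that conversion (no pydantable needed; `rows_to_columns` is a hypothetical helper, not part of the library):

```python
# Row-wise records -> columnar dict, i.e. the shape to_dict() returns.
def rows_to_columns(rows: list[dict]) -> dict[str, list]:
    """Transpose a list of records into one list per column."""
    if not rows:
        return {}
    return {key: [row[key] for row in rows] for key in rows[0]}

records = [{"id": 1, "age2": 40}, {"id": 2, "age2": 60}]
print(rows_to_columns(records))  # {'id': [1, 2], 'age2': [40, 60]}
```

The columnar shape is usually the better fit for JSON APIs, since it avoids repeating keys per row.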

Alternate import styles (same engine):

from pydantable.pandas import DataFrameModel as PandasDataFrameModel
from pydantable.pyspark import DataFrameModel as PySparkDataFrameModel
from pydantable import DataFrameModel as DefaultDataFrameModel

More examples: FastAPI, Polars-style workflows.

Validation policy: Constructors validate strictly by default. For messy row lists, pass ignore_errors=True together with an on_validation_errors callback, which receives each failed row as (row_index, row, Pydantic errors). Trusted bulk paths use trusted_mode (off / shape_only / strict). Details: DataFrameModel, Supported types.
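
The callback contract described above can be sketched in stdlib Python. Here `validate_row` and `ingest` are hypothetical stand-ins (the real library validates with Pydantic); the point is the shape of the contract: failed rows are skipped, and the callback sees (row_index, row, errors):

```python
# Stdlib-only sketch of the ignore_errors + on_validation_errors contract.
# `validate_row` is a hypothetical stand-in for Pydantic validation.
def validate_row(row: dict) -> list[dict]:
    errors = []
    if not isinstance(row.get("id"), int):
        errors.append({"loc": ("id",), "msg": "value is not a valid integer"})
    return errors

def ingest(raw_rows, on_validation_errors=None):
    good = []
    for index, row in enumerate(raw_rows):
        errors = validate_row(row)
        if errors:
            if on_validation_errors is not None:
                on_validation_errors(index, row, errors)  # report, don't raise
        else:
            good.append(row)
    return good

failures = []
kept = ingest(
    [{"id": 1}, {"id": "oops"}],
    on_validation_errors=lambda i, row, errs: failures.append((i, row)),
)
print(len(kept), failures)  # 1 [(1, {'id': 'oops'})]
```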


Expression & API surface

Typed Expr builds a Rust AST. Highlights:

  • Globals in select: global_sum, global_mean, global_count, global_min, global_max, global_row_count(). PySpark façade: F.count() with no argument returns the row count.
  • Windows: row_number, rank, dense_rank, window_sum, window_mean, window_min, window_max, lag, lead with Window.partitionBy(...).orderBy(..., nulls_last=...); framed rowsBetween / rangeBetween where supported (window semantics).
  • Temporal & strings: strptime, unix_timestamp, cast to date/datetime, dt_* parts, strip / lower / upper, str_replace, strip_prefix / suffix / chars, list helpers (list_len, list_get, …).
  • Maps (string keys): map_len, map_get, map_contains_key, map_keys, map_values, map_entries, map_from_entries, element_at; binary_len for bytes columns.
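
As a reference for the window semantics above, a plain-Python sketch (no pydantable required; all names here are hypothetical) of what row_number and lag compute over ordered partitions, the behavior that Window.partitionBy(...).orderBy(...) expresses:

```python
# Reference semantics for row_number and lag over ordered partitions.
from itertools import groupby
from operator import itemgetter

def with_lag_and_row_number(rows, part_key, order_key, value_key):
    """Sort by (partition, order), then number rows and carry the prior value."""
    out = []
    rows = sorted(rows, key=itemgetter(part_key, order_key))
    for _, group in groupby(rows, key=itemgetter(part_key)):
        prev = None  # lag is null on each partition's first row
        for n, row in enumerate(group, start=1):
            row = {**row, "row_number": n, "lag": prev}
            prev = row[value_key]
            out.append(row)
    return out

data = [
    {"user": "a", "t": 1, "v": 10},
    {"user": "a", "t": 2, "v": 20},
    {"user": "b", "t": 1, "v": 5},
]
result = with_lag_and_row_number(data, "user", "t", "v")
print([(r["user"], r["row_number"], r["lag"]) for r in result])
# [('a', 1, None), ('a', 2, 10), ('b', 1, None)]
```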

PySpark-named wrappers: pydantable.pyspark.sql.functions mirrors much of the above (parity table).


Recent releases

0.17.0 — Tighter docs and tests for map_get / map_contains_key after PyArrow map<utf8, …> ingest; more pyspark.sql.functions thin wrappers (str_replace, regexp_replace, strip_*, strptime, binary_len, list_*). Non-string map keys (dict[int, T], etc.) remain future work (Roadmap Later).

0.16.x — Arrow interchange (read_parquet / read_ipc, to_arrow / ato_arrow, Table/RecordBatch constructors), FastAPI multipart and deployment docs, map-column arithmetic TypeError fix, DataFrame[Schema](pa.Table) constructor fix.

Older highlights: 0.15.0 async materialization and Arrow map ingest; 0.14.0 window null ordering and FastAPI TestClient coverage. Full history: Changelog.


Development

From a clone, with a .venv, pip install -e ".[dev]", and the extension built:

make check-full              # Ruff, mypy, Rust fmt / clippy / tests
PYTHONPATH=python pytest -q  # integration tests (see DEVELOPER.md)

Rust tests need the PYO3_PYTHON / PYTHONPATH wiring from the Makefile; run make rust-test. Details: Developer guide.


License

MIT
