Strongly-typed DataFrames for Python, powered by Rust.

PydanTable

Typed dataframe transformations for FastAPI and Pydantic services, backed by a Rust execution core.

Current release: 0.16.0 · Python 3.10+


Documentation

The full manual lives on Read the Docs:

https://pydantable.readthedocs.io/en/latest/

That site is the supported entry point for concepts, contracts, API notes, and examples. The sections below point to the same pages so you can jump straight from GitHub.

  • Home / overview → Documentation home
  • DataFrameModel contract (inputs, transforms, collisions, materialization) → DataFrameModel (SQLModel-like)
  • Column types (scalars, structs, list[T], nullability, unsupported cases) → Supported data types
  • FastAPI (routers, bodies, collect, responses) → FastAPI integration
  • Execution model (collect, to_dict, to_polars, optional Python Polars, UI modules) → Execution (Rust engine)
  • Semantics (nulls, joins, ordering, reshaping, windows; Polars-style contract) → Interface contract
  • Roadmap (shipped through 0.16.0, planned 0.17, path to v1.0.0) → Roadmap
  • Why not use Polars directly? → Why not just use Polars?
  • Pandas-style imports (pydantable.pandas) → Pandas UI
  • PySpark-style imports (pydantable.pyspark) → PySpark UI
  • PySpark helpers & parity → PySpark interface · PySpark API parity
  • Polars parity (scorecard, workflows, transformation roadmap) → Parity scorecard · Polars-style workflows · Transformation parity roadmap
  • Contributors (build, test, benchmarks, releases) → Developer guide
  • Plan / vision (architecture phasing) → Plan document
  • Python API reference (autodoc) → API reference

For copy-paste convenience, the site base URL is:

https://pydantable.readthedocs.io/en/latest/


What PydanTable does

PydanTable keeps Pydantic models as the source of truth for:

  • column types and nullability (Optional[T] / T | None)
  • typed expressions — invalid combinations fail when the expression is built (Rust AST), not only at runtime
  • schema evolution — chained transforms produce new model types with stable rules (e.g. with_columns name collisions)

The default API feels Polars-like; optional pydantable.pandas and pydantable.pyspark modules only change naming and imports — execution is always the native core. Details: Execution, Interface contract.
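To make the build-time failure idea concrete without depending on PydanTable itself, here is a toy sketch (none of these names are PydanTable's API) of an expression type that rejects an invalid operand combination while the expression object is constructed, rather than when data flows through it:

```python
# Toy illustration (not PydanTable's API): a typed column expression that
# rejects invalid operand combinations at construction time.
class Col:
    def __init__(self, name: str, dtype: type):
        self.name, self.dtype = name, dtype

    def __mul__(self, other):
        # Multiplying a string column by a number is rejected while the
        # expression is being built, before any data is touched.
        if self.dtype is str and isinstance(other, (int, float)):
            raise TypeError(f"cannot multiply str column {self.name!r} by a number")
        return Col(f"({self.name} * {other!r})", self.dtype)

age = Col("age", int)
name = Col("name", str)

doubled = age * 2          # fine: int * number
try:
    name * 2               # raises immediately, at expression-build time
except TypeError as exc:
    print(exc)
```

PydanTable's real checks live in the Rust AST; the point here is only the timing of the failure.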

Recent releases:

  • 0.16.0: read_parquet / read_ipc; to_arrow / ato_arrow; Table / RecordBatch constructor ingest (pyarrow, via pydantable[arrow]); FastAPI hardening (multipart uploads, Depends executors, background tasks, HTTP status notes).
  • 0.15.0: async materialization (acollect, ato_dict, ato_polars, and DataFrameModel arows / ato_dicts); FastAPI async + lifespan examples; PyArrow map<utf8, …> ingest for dict[str, T] columns; PySpark trim / abs / round / floor / ceil; removed the legacy validate_data constructor argument (use trusted_mode only on DataFrame / DataFrameModel).
  • 0.14.0: window orderBy(..., nulls_last=...); DtypeDriftWarning; validate_data deprecation (removed in 0.15.0); FastAPI TestClient docs/tests; PySpark dayofmonth / lower / upper.

See changelog and Roadmap.

Expression surface (current release, Rust-typed Expr):

  • Globals in select: global_sum, global_mean, global_count, global_min, global_max, and global_row_count() (row count / COUNT(*)). PySpark: F.count() with no argument for row count; F.count(F.col(...)) for non-null column count.
  • Windows: row_number, rank, dense_rank, window_sum, window_mean, window_min, window_max, lag, lead with Window.partitionBy(...).orderBy(..., nulls_last=...) / .spec(), plus framed windows (rowsBetween, rangeBetween) for supported operations.
  • Temporal: strptime, unix_timestamp, cast from str to date / datetime (Polars parsing; use strptime for fixed formats), dt_* parts, dt_nanosecond on datetime / time.
  • Maps / binary: map_len, map_get, map_contains_key, binary_len, including nested JSON-like map value dtypes with string keys.
  • Map utilities: map_keys(), map_values(), map_entries(), and map_from_entries() for per-row key/value extraction and reconstruction on dict[str, T] columns.

PySpark-named helpers live under pydantable.pyspark.sql.functions. Details: Supported types, Interface contract, CHANGELOG.
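As a library-independent sketch of the window semantics above (the real work happens in the Rust core; this only mirrors what Window.partitionBy(...).orderBy(..., nulls_last=True) combined with row_number and lag is meant to compute):

```python
from itertools import groupby

# Plain-Python sketch of row_number and lag over partitions, ordered with
# nulls last, mirroring Window.partitionBy(...).orderBy(..., nulls_last=True).
rows = [
    {"grp": "a", "x": 3},
    {"grp": "a", "x": None},
    {"grp": "a", "x": 1},
    {"grp": "b", "x": 2},
]

def sort_key(row):
    # Nulls sort after every concrete value (nulls_last=True).
    return (row["x"] is None, row["x"] if row["x"] is not None else 0)

out = []
for _, part in groupby(sorted(rows, key=lambda r: r["grp"]), key=lambda r: r["grp"]):
    part = sorted(part, key=sort_key)
    for i, row in enumerate(part):
        out.append({**row,
                    "row_number": i + 1,                        # 1-based within partition
                    "lag_x": part[i - 1]["x"] if i > 0 else None})

# Partition "a" orders as 1, 3, None; partition "b" holds only 2.
```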


Install

pip install pydantable

Optional Python Polars (for to_polars() only):

pip install 'pydantable[polars]'

From a git checkout, the Rust extension must be built (e.g. with maturin):

pip install .

See Developer guide — local setup for maturin develop, release builds, and CI parity.


Quick start

from pydantable import DataFrameModel

class User(DataFrameModel):
    id: int
    age: int | None

df = User({"id": [1, 2], "age": [20, None]})
df2 = df.with_columns(age2=df.age * 2)
df3 = df2.select("id", "age2")
df4 = df3.filter(df3.age2 > 10)

print(df4.to_dict())

Example output:

{'age2': [40], 'id': [1]}

  • Materialization: collect() returns a list of Pydantic row models; to_dict() returns a columnar dict[str, list]; to_arrow / ato_arrow (optional pyarrow) return a PyArrow Table. The async variants from 0.15.0 (acollect / ato_dict / ato_polars) run the same work off the asyncio event loop; ato_arrow followed in 0.16.0.

  • Alternate UIs:

    from pydantable.pandas import DataFrameModel as PandasDataFrameModel
    from pydantable.pyspark import DataFrameModel as PySparkDataFrameModel
    from pydantable import DataFrameModel as DefaultDataFrameModel
    

More examples: FastAPI integration, Polars-style workflows.
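The two materialization shapes relate by a simple transpose. A plain-Python sketch, independent of PydanTable's actual implementation:

```python
# Columnar shape (what to_dict() returns) and row shape (what collect()
# yields as Pydantic models) are transposes of each other.
columnar = {"id": [1, 2], "age2": [40, None]}

# Columnar -> rows
rows = [dict(zip(columnar, values)) for values in zip(*columnar.values())]
# rows == [{'id': 1, 'age2': 40}, {'id': 2, 'age2': None}]

# Rows -> columnar
back = {key: [row[key] for row in rows] for key in columnar}
assert back == columnar
```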

Input quality policy (optional): constructors are strict by default but can be switched to best-effort ingestion with ignore_errors=True, plus on_validation_errors=... to receive the failed rows (row_index, row, validation errors). See the DataFrameModel docs.
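Conceptually, best-effort ingestion validates row by row and routes failures to the callback instead of raising. In this hypothetical sketch, ignore_errors and on_validation_errors are the documented parameter names, but the validation logic and everything else is purely illustrative:

```python
# Illustrative sketch of best-effort ingestion: valid rows are kept,
# failed rows go to an on_validation_errors-style callback carrying their
# index, the raw row, and the error messages.
def ingest(rows, ignore_errors=False, on_validation_errors=None):
    kept, failures = [], []
    for index, row in enumerate(rows):
        errors = []
        if not isinstance(row.get("id"), int):   # stand-in for schema validation
            errors.append("id: expected int")
        if errors:
            if not ignore_errors:
                raise ValueError(f"row {index}: {errors}")
            failures.append({"row_index": index, "row": row, "errors": errors})
        else:
            kept.append(row)
    if on_validation_errors is not None and failures:
        on_validation_errors(failures)
    return kept

bad_rows = []
kept = ingest([{"id": 1}, {"id": "x"}],
              ignore_errors=True,
              on_validation_errors=bad_rows.extend)
# kept == [{'id': 1}]; bad_rows[0]['row_index'] == 1
```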


Development

make check-full   # Ruff, mypy, Rust fmt/clippy/tests (see Makefile for `rust-test` env)
pytest -q         # or: pytest -n auto (with the [dev] extra)

Rust + Python: see Developer guide (formatting, maturin, make rust-test for cargo test with the venv PYTHONPATH, benchmarks, contribution workflow).


License

MIT

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

pydantable-0.16.0.tar.gz (138.8 kB)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

pydantable-0.16.0-cp313-cp313-win_arm64.whl (17.2 MB)

Uploaded CPython 3.13 · Windows ARM64

pydantable-0.16.0-cp313-cp313-macosx_11_0_arm64.whl (17.2 MB)

Uploaded CPython 3.13 · macOS 11.0+ ARM64

pydantable-0.16.0-cp313-cp313-macosx_10_12_x86_64.whl (18.8 MB)

Uploaded CPython 3.13 · macOS 10.12+ x86-64

pydantable-0.16.0-cp312-cp312-win_amd64.whl (18.9 MB)

Uploaded CPython 3.12 · Windows x86-64

pydantable-0.16.0-cp312-cp312-musllinux_1_2_x86_64.whl (17.3 MB)

Uploaded CPython 3.12 · musllinux: musl 1.2+ x86-64

pydantable-0.16.0-cp312-cp312-musllinux_1_2_aarch64.whl (15.8 MB)

Uploaded CPython 3.12 · musllinux: musl 1.2+ ARM64

pydantable-0.16.0-cp312-cp312-macosx_11_0_arm64.whl (17.2 MB)

Uploaded CPython 3.12 · macOS 11.0+ ARM64

pydantable-0.16.0-cp312-cp312-macosx_10_12_x86_64.whl (18.8 MB)

Uploaded CPython 3.12 · macOS 10.12+ x86-64

pydantable-0.16.0-cp311-cp311-manylinux_2_28_aarch64.whl (18.0 MB)

Uploaded CPython 3.11 · manylinux: glibc 2.28+ ARM64

pydantable-0.16.0-cp311-cp311-macosx_11_0_arm64.whl (17.2 MB)

Uploaded CPython 3.11 · macOS 11.0+ ARM64

pydantable-0.16.0-cp311-cp311-macosx_10_12_x86_64.whl (18.8 MB)

Uploaded CPython 3.11 · macOS 10.12+ x86-64

pydantable-0.16.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (19.0 MB)

Uploaded CPython 3.8 · manylinux: glibc 2.17+ x86-64

File details

Details for the file pydantable-0.16.0.tar.gz.

File metadata

  • Download URL: pydantable-0.16.0.tar.gz
  • Upload date:
  • Size: 138.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.12.3

File hashes

Hashes for pydantable-0.16.0.tar.gz
Algorithm Hash digest
SHA256 3243eeabd923e3a3c32b4e5673c7d9956b3faa34753b97738d5fc9494a1949bc
MD5 051d793af4855fa43960608aa1d7b849
BLAKE2b-256 202bc787c351ce3cc4adc043ce4ddb76149ec7dae0bfc9ef400488fc1024b30a

See more details on using hashes here.

File details

Details for the file pydantable-0.16.0-cp313-cp313-win_arm64.whl.

File hashes

Hashes for pydantable-0.16.0-cp313-cp313-win_arm64.whl
Algorithm Hash digest
SHA256 f113cfde681cb734dfd95776a3b192ab07b808a1f335cb14f70c91e4fb1f1c70
MD5 970bfc603ced680f3016176c8a090dd3
BLAKE2b-256 66f6424e4cc32506310c9589d0348316441086c25cf1ba41710c605e6db2087b


File details

Details for the file pydantable-0.16.0-cp313-cp313-macosx_11_0_arm64.whl.

File hashes

Hashes for pydantable-0.16.0-cp313-cp313-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 b664a712554006a0680f4eedfff8ba6779b71d47fa65f2311c308c4223edd5d2
MD5 9850cb2f90e8765c5dca105bfb2f450b
BLAKE2b-256 10fe488247b4a30ca4307b0bd6723afd910f22d36314449d8241583d1e2dc45f


File details

Details for the file pydantable-0.16.0-cp313-cp313-macosx_10_12_x86_64.whl.

File hashes

Hashes for pydantable-0.16.0-cp313-cp313-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 5dde9f9d1d8ee58c65e5d7a481c56254e94ab02f5b2b4768c7f799704edc48e0
MD5 b63673ce8a30f335f96afeb3ae0c7f12
BLAKE2b-256 39a2f7cf3e16151c8f04ad4f1de00e7d1d7e1db4a6bb3e9513c443a98158e9e8


File details

Details for the file pydantable-0.16.0-cp312-cp312-win_amd64.whl.

File hashes

Hashes for pydantable-0.16.0-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 40fed94abf40fc2646a75a177c028e5630b3e296ba487b49c08f1218c621530e
MD5 0b74379734532f9ad130d32e01b9a2ae
BLAKE2b-256 25b5f23d320e9986070ee5440426c477ad83eaa20cde6edbba9f79094527fb66


File details

Details for the file pydantable-0.16.0-cp312-cp312-musllinux_1_2_x86_64.whl.

File hashes

Hashes for pydantable-0.16.0-cp312-cp312-musllinux_1_2_x86_64.whl
Algorithm Hash digest
SHA256 d8159e05c4f1ff6d8e3cf22a9250839f3b093bbd1ee5e2c21acb3836d2456d21
MD5 312fbaad7c583b7bf9f59ec5c854f8ba
BLAKE2b-256 a5f2a943b73298807a9898b68e16fcfba22840df7aa02e0b3d790125e6ba1215


File details

Details for the file pydantable-0.16.0-cp312-cp312-musllinux_1_2_aarch64.whl.

File hashes

Hashes for pydantable-0.16.0-cp312-cp312-musllinux_1_2_aarch64.whl
Algorithm Hash digest
SHA256 7b913c26a4fbf46271252bab945804e7fe138bcab498af00807f302eff0bcefe
MD5 88c5a75b13337222b7f70059ea150b8a
BLAKE2b-256 b3d91489bc30b1111a23ea9cb8393ebe7b47dc2af719adf0ba478867d493a914


File details

Details for the file pydantable-0.16.0-cp312-cp312-macosx_11_0_arm64.whl.

File hashes

Hashes for pydantable-0.16.0-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 159f774ae56d7edf03bb70ef940e3ff39c23f1834525dafe23e1a3d2d64ff949
MD5 875a9291168edfedeae453743f0119c9
BLAKE2b-256 bd8a72380fc6223d34bf26a2f14013b7a5bebf6c31fcad3523a4be5d02cc59db


File details

Details for the file pydantable-0.16.0-cp312-cp312-macosx_10_12_x86_64.whl.

File hashes

Hashes for pydantable-0.16.0-cp312-cp312-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 255af5ffa2787d0b1e130f83e43dc5c8b4120e58dd6e8c4320aea4dd59e94890
MD5 a34e9c57151b2b89c1ce2eb6fac55baa
BLAKE2b-256 94571c64411cec33057db603ee4837298f349eae1523fbe74e72cc4905f86f86


File details

Details for the file pydantable-0.16.0-cp311-cp311-manylinux_2_28_aarch64.whl.

File hashes

Hashes for pydantable-0.16.0-cp311-cp311-manylinux_2_28_aarch64.whl
Algorithm Hash digest
SHA256 90397e273ae86ce6ec13193fb1b1e8f0cbbfebc1e5e88cab7172d892d7efe25c
MD5 253c1d65d5b88258353fbb06909d6570
BLAKE2b-256 3d9c1581b1cb07c59dae932bbf5504d8f7c67e3ad75408064b62470421ead5b8


File details

Details for the file pydantable-0.16.0-cp311-cp311-macosx_11_0_arm64.whl.

File hashes

Hashes for pydantable-0.16.0-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 cf5cb23938a98c430155872d4a92c468705cf38017d922992b1254d70b5a8de0
MD5 87026f955bdd2c973a96bfe914fc7e40
BLAKE2b-256 da0d477a819beb915e56fb6aaf4349d6f63c8c0a1b85e71199d3a217270db28c


File details

Details for the file pydantable-0.16.0-cp311-cp311-macosx_10_12_x86_64.whl.

File hashes

Hashes for pydantable-0.16.0-cp311-cp311-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 13768cbd9316fb3beb76e48007321bfffd59f4f4e471bd0f450cbc521a3dab8e
MD5 8a7d7dac1e27db59ed3eb451fddeb9cb
BLAKE2b-256 d76c3adfc856800fb2105bc0df44554c165fe11383ac49c85eab017faf9f0dc9


File details

Details for the file pydantable-0.16.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for pydantable-0.16.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 508cdeb7b02daa4704bdbee57aceb509b761217823cb6d60927e9f211fec1f3d
MD5 4fd0c4f0f1eb50f8b41ad0c06c32dab5
BLAKE2b-256 f9c2ce96a23a52baf29b0605f1deab28311ef08df3b34cfc189d204d4f58fa35

