PydanTable

Strongly-typed DataFrames for Python, powered by Rust.

Typed dataframe transformations for FastAPI and Pydantic services, backed by a Rust execution core.

Current release: 0.16.1 · Python 3.10+


Documentation

The full manual lives on Read the Docs:

https://pydantable.readthedocs.io/en/latest/

That site is the supported entry point for concepts, contracts, API notes, and examples. The table below points to the same pages so you can jump straight from GitHub.

Topic | Read the Docs page
----- | ------------------
Home / overview | Documentation home
DataFrameModel contract (inputs, transforms, collisions, materialization) | DataFrameModel (SQLModel-like)
Column types (scalars, structs, list[T], nullability, unsupported cases) | Supported data types
FastAPI (routers, bodies, collect, responses) | FastAPI integration
Execution model (collect, to_dict, to_polars, optional Python Polars, UI modules) | Execution (Rust engine)
Semantics (nulls, joins, ordering, reshaping, windows; Polars-style contract) | Interface contract
Roadmap (shipped through 0.16.1, planned 0.17, path to v1.0.0) | Roadmap
Why not use Polars directly? | Why not just use Polars?
Pandas-style imports (pydantable.pandas) | Pandas UI
PySpark-style imports (pydantable.pyspark) | PySpark UI
PySpark helpers & parity | PySpark interface · PySpark API parity
Polars parity (scorecard, workflows, transformation roadmap) | Parity scorecard · Polars-style workflows · Transformation parity roadmap
Contributors (build, test, benchmarks, releases) | Developer guide
Plan / vision (architecture phasing) | Plan document
Python API reference (autodoc) | API reference



What PydanTable does

PydanTable keeps Pydantic models as the source of truth for:

  • column types and nullability (Optional[T] / T | None)
  • typed expressions — invalid combinations fail when the expression is built (Rust AST), not only at runtime
  • schema evolution — chained transforms produce new model types with stable rules (e.g. with_columns name collisions)

The default API feels Polars-like; optional pydantable.pandas and pydantable.pyspark modules only change naming and imports — execution is always the native core. Details: Execution, Interface contract.

Recent releases:

  • 0.16.1 fixes Rust expression typing so that invalid arithmetic on dict[str, T] map columns (for example df.m + 1) raises TypeError instead of panicking, and fixes DataFrame[Schema](pa.Table) / RecordBatch construction (correct pydantable.io imports in validate_columns_strict).
  • 0.16.0 adds read_parquet / read_ipc, to_arrow / ato_arrow, Table / RecordBatch constructor ingest (pyarrow; pydantable[arrow]), and FastAPI hardening (multipart uploads, Depends executors, background tasks, HTTP status notes).
  • 0.15.0 adds async materialization (acollect, ato_dict, ato_polars, and DataFrameModel arows / ato_dicts), FastAPI async + lifespan examples, PyArrow map<utf8, …> ingest for dict[str, T] columns, and PySpark trim / abs / round / floor / ceil; it also removes the legacy validate_data constructor argument (use trusted_mode on DataFrame / DataFrameModel instead).
  • 0.14.0 adds window orderBy(..., nulls_last=...), DtypeDriftWarning, the validate_data deprecation (removed in 0.15.0), FastAPI TestClient docs/tests, and PySpark dayofmonth / lower / upper.

See the changelog and Roadmap.

Expression surface (current release, Rust-typed Expr):

  • Globals in select: global_sum, global_mean, global_count, global_min, global_max, and global_row_count() (row count / COUNT(*)). PySpark: F.count() with no argument for row count; F.count(F.col(...)) for non-null column count.
  • Windows: row_number, rank, dense_rank, window_sum, window_mean, window_min, window_max, lag, lead with Window.partitionBy(...).orderBy(..., nulls_last=...) / .spec(), plus framed windows (rowsBetween, rangeBetween) for supported operations.
  • Temporal: strptime, unix_timestamp, casts from str to date / datetime (Polars parsing; use strptime for fixed formats), dt_* part accessors, and dt_nanosecond on datetime / time.
  • Maps / binary: map_len, map_get, map_contains_key, binary_len, including nested JSON-like map value dtypes with string keys.
  • Map utilities: map_keys(), map_values(), map_entries(), and map_from_entries() for per-row key/value extraction and reconstruction on dict[str, T] columns.

PySpark-named helpers live under pydantable.pyspark.sql.functions. Details: Supported types, Interface contract, CHANGELOG.


Install

pip install pydantable

Optional Python Polars (for to_polars() only):

pip install 'pydantable[polars]'

From a git checkout, the Rust extension must be built (e.g. with maturin):

pip install .

See Developer guide — local setup for maturin develop, release builds, and CI parity.


Quick start

from pydantable import DataFrameModel

class User(DataFrameModel):
    id: int
    age: int | None

df = User({"id": [1, 2], "age": [20, None]})
df2 = df.with_columns(age2=df.age * 2)
df3 = df2.select("id", "age2")
df4 = df3.filter(df3.age2 > 10)

print(df4.to_dict())

Example output:

{'age2': [40], 'id': [1]}

  • Materialization: collect() returns a list of Pydantic row models; to_dict() returns a columnar dict[str, list]; to_arrow / ato_arrow (optional pyarrow) return a PyArrow Table. The 0.15.0 async variants acollect / ato_dict / ato_polars run the same work off the asyncio event loop (ato_arrow followed in 0.16.0).

  • Alternate UIs:

    from pydantable.pandas import DataFrameModel as PandasDataFrameModel
    from pydantable.pyspark import DataFrameModel as PySparkDataFrameModel
    from pydantable import DataFrameModel as DefaultDataFrameModel
    

More examples: FastAPI integration, Polars-style workflows.

Input quality policy (optional): constructors are strict by default; switch to best-effort ingestion with ignore_errors=True plus on_validation_errors=... to receive the failed rows (row_index, row, and the validation errors). See the DataFrameModel docs.


Development

make check-full   # Ruff, mypy, Rust fmt/clippy/tests (see Makefile for `rust-test` env)
pytest -q         # or: pytest -n auto  with the [dev] extra

Rust + Python: see Developer guide (formatting, maturin, make rust-test for cargo test with the venv PYTHONPATH, benchmarks, contribution workflow).


License

MIT
