
Strongly-typed DataFrames for Python, powered by Rust.


PydanTable


Typed dataframe transformations for FastAPI and Pydantic services, backed by a Rust execution core.

Current release: 0.15.0 · Python 3.10+


Documentation

The full manual lives on Read the Docs:

https://pydantable.readthedocs.io/en/latest/

That site is the supported entry point for concepts, contracts, API notes, and examples. The sections below point to the same pages so you can jump straight from GitHub.

Topic | Read the Docs
Home / overview | Documentation home
DataFrameModel contract (inputs, transforms, collisions, materialization) | DataFrameModel (SQLModel-like)
Column types (scalars, structs, list[T], nullability, unsupported cases) | Supported data types
FastAPI (routers, bodies, collect, responses) | FastAPI integration
Execution model (collect, to_dict, to_polars, optional Python Polars, UI modules) | Execution (Rust engine)
Semantics (nulls, joins, ordering, reshaping, windows; Polars-style contract) | Interface contract
Roadmap (shipped through 0.15.0, planned 0.17–0.18, path to v1.0.0) | Roadmap
Why not use Polars directly? | Why not just use Polars?
Pandas-style imports (pydantable.pandas) | Pandas UI
PySpark-style imports (pydantable.pyspark) | PySpark UI
PySpark helpers & parity | PySpark interface · PySpark API parity
Polars parity (scorecard, workflows, transformation roadmap) | Parity scorecard · Polars-style workflows · Transformation parity roadmap
Contributors (build, test, benchmarks, releases) | Developer guide
Plan / vision (architecture phasing) | Plan document
Python API reference (autodoc) | API reference

For copy-paste convenience, the site base URL is:

https://pydantable.readthedocs.io/en/latest/


What PydanTable does

PydanTable keeps Pydantic models as the source of truth for:

  • column types and nullability (Optional[T] / T | None)
  • typed expressions — invalid combinations fail when the expression is built (Rust AST), not only at runtime
  • schema evolution — chained transforms produce new model types with stable rules (e.g. with_columns name collisions)
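To make "fail when the expression is built" concrete, here is a minimal, standard-library-only sketch of build-time type checking. It is not PydanTable's implementation (that lives in the Rust core); the Expr class, the col helper, and the NUMERIC set are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

NUMERIC = {int, float}  # types this toy example allows in arithmetic


@dataclass(frozen=True)
class Expr:
    """A toy typed expression: a label plus the Python type it evaluates to."""

    label: str
    dtype: type

    def __add__(self, other: "Expr") -> "Expr":
        # Type-check while the expression tree is built, before any data exists.
        if self.dtype not in NUMERIC or other.dtype not in NUMERIC:
            raise TypeError(
                f"cannot add {self.dtype.__name__} and {other.dtype.__name__}"
            )
        result = float if float in (self.dtype, other.dtype) else int
        return Expr(f"({self.label} + {other.label})", result)


def col(name: str, schema: dict[str, type]) -> Expr:
    """Look up a column's declared type from the schema (hypothetical helper)."""
    return Expr(name, schema[name])


schema = {"id": int, "score": float, "name": str}
total = col("id", schema) + col("score", schema)  # builds fine, dtype is float

try:
    col("id", schema) + col("name", schema)  # rejected before any rows are read
    build_error = None
except TypeError as exc:
    build_error = str(exc)
```

The point of the design is that the error surfaces where the expression is written, not later when data flows through it.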

The default API feels Polars-like; optional pydantable.pandas and pydantable.pyspark modules only change naming and imports — execution is always the native core. Details: Execution, Interface contract.
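The schema-evolution idea (each transform yields a new schema rather than mutating the old one) can be sketched with plain dicts. This is only an illustration: the with_columns and select functions below are stand-ins, and the replace-in-place collision rule is an assumption made for the sketch. The DataFrameModel docs define the actual rules.

```python
def with_columns(schema: dict[str, type], **new_columns: type) -> dict[str, type]:
    """Return a new schema with the given columns added.

    Assumption for this sketch: a colliding name replaces the old column type
    and keeps its original position.
    """
    evolved = dict(schema)       # copy, so the input schema is never mutated
    evolved.update(new_columns)  # existing keys keep position, new keys append
    return evolved


def select(schema: dict[str, type], *names: str) -> dict[str, type]:
    """Return a new schema holding only the named columns, in the given order."""
    missing = [name for name in names if name not in schema]
    if missing:
        raise KeyError(f"unknown columns: {missing}")
    return {name: schema[name] for name in names}


user = {"id": int, "age": int}
step1 = with_columns(user, age2=int)    # adds a column
step2 = with_columns(step1, age=float)  # collision: age changes type in place
step3 = select(step2, "id", "age2")     # narrows to two columns
```

Each step returns a fresh schema, so earlier schemas such as user stay valid and unchanged.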

0.15.0 adds async materialization (acollect, ato_dict, ato_polars, and DataFrameModel arows / ato_dicts), FastAPI async + lifespan examples, PyArrow map<utf8, …> ingest for dict[str, T] columns, and PySpark trim / abs / round / floor / ceil. It also removes the legacy validate_data constructor argument; only trusted_mode is accepted on DataFrame / DataFrameModel.

0.14.0 added window orderBy(..., nulls_last=...), DtypeDriftWarning, the validate_data deprecation (followed by its removal in 0.15.0), FastAPI TestClient docs/tests, and PySpark dayofmonth / lower / upper. See changelog and Roadmap.

Expression surface (current release, Rust-typed Expr):

  • Globals in select: global_sum, global_mean, global_count, global_min, global_max, and global_row_count() (row count / COUNT(*)). PySpark: F.count() with no argument for row count; F.count(F.col(...)) for non-null column count.
  • Windows: row_number, rank, dense_rank, window_sum, window_mean, window_min, window_max, lag, lead with Window.partitionBy(...).orderBy(..., nulls_last=...) / .spec(), plus framed windows (rowsBetween, rangeBetween) for supported operations.
  • Temporal: strptime, unix_timestamp, cast from str to date / datetime (Polars parsing; use strptime for fixed formats), dt_* parts, dt_nanosecond on datetime / time.
  • Maps / binary: map_len, map_get, map_contains_key, binary_len, including nested JSON-like map value dtypes with string keys.
  • Map utilities: map_keys(), map_values(), map_entries(), and map_from_entries() for per-row key/value extraction and reconstruction on dict[str, T] columns.
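The difference between a row count (COUNT(*)) and a non-null column count, mentioned in the globals bullet above, is easy to miss. The plain-Python sketch below shows only the semantics; it does not call PydanTable, and the variable names are illustrative.

```python
ages = [20, None, 35, None]  # one nullable column with four rows

# Row count, in the spirit of global_row_count() or F.count() with no argument:
# every row counts, nulls included.
row_count = len(ages)

# Non-null column count, in the spirit of F.count(F.col("age")):
# rows whose value is null are skipped.
non_null_count = sum(1 for age in ages if age is not None)
```

Here row_count is 4 and non_null_count is 2.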

PySpark-named helpers live under pydantable.pyspark.sql.functions. Details: Supported types, Interface contract, CHANGELOG.
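For readers less familiar with the ranking window functions listed above, this standard-library sketch shows how row_number, rank, and dense_rank conventionally differ on ties within one partition. It follows the usual SQL definitions and does not call PydanTable; the ranking function is a hypothetical helper.

```python
def ranking(values: list[int]) -> list[tuple[int, int, int, int]]:
    """Return (value, row_number, rank, dense_rank) for values sorted ascending."""
    out = []
    rank = dense = 0
    prev = object()  # sentinel that compares unequal to every real value
    for row_number, value in enumerate(sorted(values), start=1):
        if value != prev:
            rank = row_number  # rank jumps to the current position after ties
            dense += 1         # dense_rank advances by exactly one
            prev = value
        out.append((value, row_number, rank, dense))
    return out


rows = ranking([10, 20, 20, 30])
# rows == [(10, 1, 1, 1), (20, 2, 2, 2), (20, 3, 2, 2), (30, 4, 4, 3)]
```

Tied values share a rank; rank leaves a gap after the tie (2, 2, 4) while dense_rank does not (2, 2, 3).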


Install

pip install pydantable

Optional Python Polars (for to_polars() only):

pip install 'pydantable[polars]'

From a git checkout, the Rust extension must be built (e.g. with maturin):

pip install .

See Developer guide — local setup for maturin develop, release builds, and CI parity.


Quick start

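from pydantable import DataFrameModel

class User(DataFrameModel):
    id: int
    age: int | None  # nullable column

# Build a frame from columnar data.
df = User({"id": [1, 2], "age": [20, None]})
df2 = df.with_columns(age2=df.age * 2)  # derive a new typed column
df3 = df2.select("id", "age2")          # keep only these columns
df4 = df3.filter(df3.age2 > 10)         # the row with a null age2 is dropped

print(df4.to_dict())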

Example output:

{'age2': [40], 'id': [1]}

  • Materialization: collect() returns a list of Pydantic row models; to_dict() returns a columnar dict[str, list]. Since 0.15.0, acollect / ato_dict / ato_polars run the same work off the asyncio event loop.

  • Alternate UIs:

    from pydantable.pandas import DataFrameModel as PandasDataFrameModel
    from pydantable.pyspark import DataFrameModel as PySparkDataFrameModel
    from pydantable import DataFrameModel as DefaultDataFrameModel
    

More examples: FastAPI integration, Polars-style workflows.
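Running materialization "off the asyncio loop" matters in async services because a blocking call would stall every other request. The sketch below shows the general pattern with asyncio.to_thread from the standard library. It is an assumption about the shape of the technique, not PydanTable's actual acollect implementation; collect_blocking and acollect_sketch are hypothetical stand-ins.

```python
import asyncio
import time


def collect_blocking(rows: list[dict]) -> list[dict]:
    """Stand-in for a synchronous materialization step."""
    time.sleep(0.05)  # simulate engine work
    return [dict(row) for row in rows]


async def acollect_sketch(rows: list[dict]) -> list[dict]:
    """Run the blocking step in a worker thread so the event loop stays free."""
    return await asyncio.to_thread(collect_blocking, rows)


async def main() -> tuple[list[dict], int]:
    ticks = 0

    async def heartbeat() -> None:
        # Keeps counting only if the event loop is not blocked.
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.005)

    beat = asyncio.create_task(heartbeat())
    result = await acollect_sketch([{"id": 1, "age2": 40}])
    beat.cancel()
    return result, ticks


result, ticks = asyncio.run(main())
```

The heartbeat task keeps ticking while the worker thread does the blocking work, which is the property an async route handler needs.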

Input quality policy (optional): constructors are strict by default, and can be switched to best-effort ingestion with ignore_errors=True plus on_validation_errors=... to receive failed rows (row_index, row, validation errors). See DataFrameModel docs.
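The strict versus best-effort policy can be pictured with a small standard-library sketch. The parameter names ignore_errors and on_validation_errors mirror the ones above, but the ingest and validate_row functions, the callback signature, and the error messages are assumptions made for illustration; see the DataFrameModel docs for the real contract.

```python
from typing import Any, Callable, Optional

Row = dict[str, Any]
Failure = tuple[int, Row, list[str]]  # (row_index, row, validation errors)


def validate_row(row: Row, schema: dict[str, type]) -> list[str]:
    """Return a list of problems for one row; empty means the row is valid."""
    errors = []
    for name, expected in schema.items():
        if name not in row:
            errors.append(f"{name}: missing")
        elif not isinstance(row[name], expected):
            errors.append(f"{name}: expected {expected.__name__}")
    return errors


def ingest(
    rows: list[Row],
    schema: dict[str, type],
    ignore_errors: bool = False,
    on_validation_errors: Optional[Callable[[list[Failure]], None]] = None,
) -> list[Row]:
    """Strict by default; with ignore_errors=True, bad rows are set aside."""
    good: list[Row] = []
    failed: list[Failure] = []
    for index, row in enumerate(rows):
        errors = validate_row(row, schema)
        if not errors:
            good.append(row)
        elif ignore_errors:
            failed.append((index, row, errors))
        else:
            raise ValueError(f"row {index}: {errors}")
    if failed and on_validation_errors is not None:
        on_validation_errors(failed)  # hand the rejected rows to the caller
    return good


schema = {"id": int, "name": str}
rows = [{"id": 1, "name": "a"}, {"id": "x", "name": "b"}, {"id": 3}]
rejected: list[Failure] = []
kept = ingest(rows, schema, ignore_errors=True, on_validation_errors=rejected.extend)
```

With the defaults, the same input raises on the first bad row instead of returning a partial result.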


Development

make check-full   # Ruff, mypy, Rust fmt/clippy/tests (see Makefile for `rust-test` env)
pytest -q         # or: pytest -n auto  with the [dev] extra

Rust + Python: see Developer guide (formatting, maturin, make rust-test for cargo test with the venv PYTHONPATH, benchmarks, contribution workflow).


License

MIT


