PydanTable
Strongly typed DataFrames for Python, powered by Rust — Pydantic schemas, Polars-backed execution in the native extension, and an API built for services (including optional FastAPI integration).
Current release: 1.17.0 — highlights in the changelog.
Why PydanTable
- One schema, many surfaces: define columns with Pydantic models; use DataFrameModel (SQLModel-style) or DataFrame[YourSchema].
- Typed expressions: Expr and transform chains are validated and lowered in Rust; many errors fail fast at build/plan time.
- Familiar operations: select, filter, join, group_by, windows, melt/pivot, and pandas-flavored helpers where they help.
- Flexible materialization: row models via collect() / rows(), columnar dict[str, list], or Polars/PyArrow with the right extras.
- I/O: lazy read_* / aread_*, streaming writes, NDJSON/JSON Lines, Parquet, CSV, IPC, HTTP, SQL (SQLModel-first fetch_sqlmodel / write_sqlmodel, explicit string SQL fetch_sql_raw / write_sql_raw, or deprecated unprefixed names), and MongoDB eager fetch_mongo / write_mongo (and async mirrors) with pydantable[mongo]. See I/O overview, IO_SQL, MONGO_ENGINE, SQLModel roadmap, and decision tree.
- JSON & struct columns: struct expressions, JSON encode/decode helpers, unnest/nested models. See IO_JSON, SELECTORS.
- FastAPI (optional): shared executor lifespan, NDJSON streaming from astream(), OpenAPI-friendly columnar bodies, register_exception_handlers (503 / 400 / 422). Start with the golden path and FastAPI guide.
- Lazy SQL DataFrame (optional): install pydantable[sql] for SqlDataFrame / SqlDataFrameModel with the SQLAlchemy lazy-SQL ExecutionEngine (pydantable-protocol). The goal is to keep transforms on the SQL side (plans compiled to SQL) instead of loading whole tables into Python, especially when you write results back to the same database. Guide: MOLTRES_SQL; protocol authors: Custom engine packages.
- Mongo engine (optional, 1.17.0+): pip install "pydantable[mongo]" installs PyMongo, Beanie, and the Mongo plan stack for lazy frames. Define collections with Beanie Document models, then MongoDataFrame.from_beanie / fetch_mongo(sync_pymongo_collection(...)) (see MONGO_ENGINE). Pydantic Schema + from_collection remains supported if you use a raw Collection. Under the hood: MongoPydantableEngine (pydantable) and MongoRoot from the plan stack.
Install
pip install pydantable
Common extras:
pip install "pydantable[polars]" # to_polars
pip install "pydantable[arrow]" # to_arrow / Arrow constructors
pip install "pydantable[io]" # full file I/O convenience (arrow + polars)
pip install "pydantable[sql]" # SQLModel + SQLAlchemy + moltres-core lazy SqlDataFrame; add a DB-API driver for your URL
pip install "pydantable[pandas]" # pandas-flavored façade (pandas UI doc)
pip install "pydantable[fastapi]" # FastAPI integration (pydantable.fastapi)
pip install "pydantable[mongo]" # pymongo + Beanie + Mongo plan stack (lazy MongoDataFrame + I/O + from_beanie)
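To see which extras are usable in the current environment, one option is to probe for the modules they bring in. The following is a minimal standard-library sketch, not part of PydanTable: available_extras is a hypothetical helper, and the extra-to-module mapping is an assumption read off the comments above (for example, that pydantable[arrow] installs a module named pyarrow).

```python
from importlib.util import find_spec

# Assumed mapping from extra name to the top-level modules it installs.
EXTRA_MODULES = {
    "polars": ["polars"],
    "arrow": ["pyarrow"],
    "sql": ["sqlmodel", "sqlalchemy"],
    "pandas": ["pandas"],
    "fastapi": ["fastapi"],
    "mongo": ["pymongo", "beanie"],
}


def available_extras(mapping=EXTRA_MODULES):
    """Return {extra: bool}; True when every listed module can be located."""
    return {
        extra: all(find_spec(name) is not None for name in modules)
        for extra, modules in mapping.items()
    }


if __name__ == "__main__":
    for extra, ok in available_extras().items():
        status = "available" if ok else "missing"
        print(f"pydantable[{extra}]: {status}")
```

A missing entry only means the module could not be located; it says nothing about version compatibility.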
Quick start
from pydantable import DataFrameModel

class User(DataFrameModel):
    id: int
    age: int | None

df = User({"id": [1, 2], "age": [20, None]})
result = (
    df.with_columns(age2=df.age * 2)
    .filter(df.age > 10)
    .select("id", "age2")
)
print(result.to_dict())
print([r.model_dump() for r in result.collect()])
Output (exact values depend on filtering; this matches scripts/verify_doc_examples.py):
{'id': [1], 'age2': [40]}
[{'id': 1, 'age2': 40}]
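For intuition about what the chain computes, the same three steps can be written over a plain dict[str, list] with the standard library. This only illustrates the semantics implied by the output above (in particular, that the row with age=None does not pass the filter). It is not how PydanTable executes the plan, and quick_start_equivalent is a hypothetical name.

```python
def quick_start_equivalent(columns):
    """Mimic with_columns(age2=age * 2) -> filter(age > 10) -> select(id, age2)."""
    ids, ages = columns["id"], columns["age"]
    # with_columns: a missing age yields a missing age2
    age2 = [None if a is None else a * 2 for a in ages]
    # filter: a comparison against a missing value is not true, so the row is dropped
    keep = [a is not None and a > 10 for a in ages]
    # select: keep only the requested columns, applying the row mask
    return {
        "id": [v for v, k in zip(ids, keep) if k],
        "age2": [v for v, k in zip(age2, keep) if k],
    }


print(quick_start_equivalent({"id": [1, 2], "age": [20, None]}))
# {'id': [1], 'age2': [40]}
```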
Core concepts
| Piece | Role |
|---|---|
| DataFrameModel | Table class with annotated columns (class Orders(DataFrameModel): ...). |
| DataFrame[Schema] | Generic API over your own Pydantic BaseModel. |
| SqlDataFrame / SqlDataFrameModel | Same shapes with pydantable[sql]: the lazy-SQL bridge compiles plans to SQL so transforms can stay in the database (sql_config= / sql_engine=); prefer when you are not round-tripping full tables through Python (e.g. write back to the same DB). |
| MongoDataFrame / MongoDataFrameModel | Primary: pydantable[mongo] with Beanie Document + from_beanie / sync_pymongo_collection for I/O. Also: Pydantic Schema with from_collection(sync_collection) without wiring Beanie. Lazy execution uses MongoPydantableEngine and MongoRoot. See MONGO_ENGINE. |
| Expr | Typed expressions in with_columns, filter, etc. |
| Errors | Ingest issues such as column length mismatch raise ColumnLengthMismatchError (ValueError subclass) from pydantable.errors — map to HTTP 400 in FastAPI via register_exception_handlers. |
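The kind of ingest check described in the Errors row can be sketched as follows. The class below is a local stand-in that only mirrors the documented name and its ValueError base; the real exception is imported from pydantable.errors, and its constructor and message format are not shown in this document. check_column_lengths is a hypothetical helper.

```python
class ColumnLengthMismatchError(ValueError):
    """Local stand-in; the real class lives in pydantable.errors."""


def check_column_lengths(columns):
    """Return the shared column length, or raise if the lengths differ."""
    lengths = {name: len(values) for name, values in columns.items()}
    if len(set(lengths.values())) > 1:
        raise ColumnLengthMismatchError(f"column lengths differ: {lengths}")
    # An empty mapping has no columns, so report zero rows.
    return next(iter(lengths.values()), 0)


try:
    check_column_lengths({"id": [1, 2], "age": [20]})
except ValueError as exc:  # a ValueError handler also catches the subclass
    print(type(exc).__name__, exc)
```

Because the error subclasses ValueError, existing handlers for ValueError keep working, while a service can still match the specific type to return a 400.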
Static typing
- mypy: schema-evolving return types for many chains via the bundled mypy plugin (plugins in pyproject.toml).
- Pyright / Pylance: use committed stubs under typings/; for explicit targets, as_model(...) / try_as_model(...) / assert_model(...) and typed escape hatches like agg_as_model(...) / rolling_agg_as_model(...). See TYPING.
Rich column types (Literal, ipaddress, WKB, Annotated, …) are covered in SUPPORTED_TYPES.
Materialization: collect() / rows() → row models; to_dict() → dict[str, list]; to_polars() / to_arrow() with matching extras.
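The row and columnar shapes carry the same data: one object per row versus one list per column. A standard-library sketch of converting between them, using plain dicts in place of the Pydantic row models that collect() returns; both helper names are hypothetical, and the sketch assumes every row has the same keys.

```python
def rows_to_columns(rows):
    """[{'id': 1, 'age2': 40}, ...] -> {'id': [1, ...], 'age2': [40, ...]}"""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def columns_to_rows(columns):
    """{'id': [1, ...], 'age2': [40, ...]} -> [{'id': 1, 'age2': 40}, ...]"""
    names = list(columns)
    # zip(*...) walks the columns in lockstep, producing one tuple per row.
    return [dict(zip(names, values)) for values in zip(*columns.values())]


rows = [{"id": 1, "age2": 40}, {"id": 3, "age2": 60}]
cols = rows_to_columns(rows)
print(cols)                   # {'id': [1, 3], 'age2': [40, 60]}
print(columns_to_rows(cols))  # [{'id': 1, 'age2': 40}, {'id': 3, 'age2': 60}]
```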
I/O at a glance
- DataFrameModel / DataFrame[Schema]: lazy read_* / aread_*, export_*, write_*, and SQLModel helpers (fetch_sqlmodel, write_sqlmodel, …). For eager column loads, import materialize_*, fetch_sqlmodel, iter_sqlmodel, … from pydantable (same entrypoints as the internal pydantable.io package) and pass dict[str, list] into constructors.
- SQL details: IO_SQL (recommended APIs, *_raw, deprecations) and SQLMODEL_SQL_ROADMAP (phased migration).
- Large files & NDJSON patterns: IO_JSON, IO_NDJSON, EXECUTION.
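On the eager path, constructors take a dict[str, list]. As an illustration of that shape only, here is a standard-library sketch that turns NDJSON text into columns. It does not use PydanTable's own materialize_* readers, whose signatures are not shown in this document; ndjson_to_columns is a hypothetical helper, and filling absent keys with None is an assumption of the sketch.

```python
import io
import json


def ndjson_to_columns(lines, fields):
    """Read one JSON object per line and collect the named fields column-wise."""
    columns = {name: [] for name in fields}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        for name in fields:
            # Keys missing from a record become None (assumption of this sketch).
            columns[name].append(record.get(name))
    return columns


sample = io.StringIO('{"id": 1, "age": 20}\n{"id": 2}\n')
print(ndjson_to_columns(sample, ["id", "age"]))
# {'id': [1, 2], 'age': [20, None]}
```

The resulting dict has the same shape as the argument passed to User(...) in the quick start.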
Validation controls
- Strict by default on constructors.
- Optional ingest controls: trusted_mode, ignore_errors, on_validation_errors.
- Missing optional fields: fill_missing_optional (default True).
- Validation presets: validation_profile=... (or __pydantable__ = {"validation_profile": "..."}).
- Per-column and nested strictness: STRICTNESS (field policies + profile defaults).
Documentation
| Topic | Link |
|---|---|
| Docs home | pydantable.readthedocs.io |
| Map of all pages | DOCS_MAP |
| Quickstart | QUICKSTART |
| DataFrameModel | DATAFRAMEMODEL |
| Typing (mypy vs Pyright) | TYPING |
| I/O overview | IO_OVERVIEW |
| SQL (SQLModel, raw string SQL) | IO_SQL · SQLMODEL_SQL_ROADMAP |
| Lazy SQL DataFrame | MOLTRES_SQL |
| MongoDB (lazy MongoDataFrame + eager fetch_mongo) | MONGO_ENGINE |
| Pandas-like API | PANDAS_UI |
| FastAPI path | GOLDEN_PATH_FASTAPI → FASTAPI → FASTAPI_ENHANCEMENTS |
| Service ergonomics (OpenAPI, aliases, redaction) | SERVICE_ERGONOMICS |
| Custom dtypes | CUSTOM_DTYPES |
| Strictness | STRICTNESS |
| Cookbooks | Cookbook index (FastAPI, lazy pipelines, JSON logs, …) |
| Example multi-router app | docs/examples/fastapi/service_layout/ in this repo |
| Test helpers | pydantable.testing.fastapi — see FASTAPI |
| Execution & async | EXECUTION · MATERIALIZATION |
| Behavioral contract | INTERFACE_CONTRACT |
| Troubleshooting | TROUBLESHOOTING |
| Versioning | VERSIONING |
| Changelog | CHANGELOG |
Development
Use a virtual environment at .venv in the repo root (the Makefile defaults to .venv/bin/python). Full contributor setup, Maturin/Rust builds, and release notes: DEVELOPER.
make check-full # ruff, ty, pyright, typing snippet tests, Sphinx, Rust
License
MIT