Convert Polars or Pandas DataFrames to lists of Pydantic models with schema inference
❄️ Articuno ❄️
Convert Polars or Pandas DataFrames to Pydantic models with schema inference — and generate clean Python class code.
✨ Features
- Infer Pydantic models dynamically from Polars or Pandas DataFrames
- Supports nested structs, optional fields, and common data types
- Supports PyArrow‑backed Pandas columns (e.g., int64[pyarrow], string[pyarrow])
- Optional force_optional flag to make all fields optional regardless of data
- Generate clean Python model code using datamodel‑code‑generator
- Lightweight, dependency‑flexible design
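To make the idea of schema inference concrete, here is a minimal standard-library sketch that guesses a type annotation for each column of plain Python values. It is not Articuno's implementation: `infer_field_type` and `infer_schema` are hypothetical names, the input is a plain dict of lists rather than a DataFrame, and pandas-style NaN handling is left out.

```python
from typing import Any, Dict, List, Optional


def infer_field_type(values: List[Any]) -> Any:
    """Guess a type annotation for one column of plain Python values."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        # An all-null column carries no type information.
        return Any

    kinds = {type(v) for v in non_null}
    if kinds == {int, float}:
        # Mixed ints and floats are widened to float.
        base: Any = float
    elif len(kinds) == 1:
        base = kinds.pop()
    else:
        # Heterogeneous values: give up and accept anything.
        base = Any

    if base is Any or len(non_null) == len(values):
        return base
    # At least one None was seen, so the field is marked optional.
    return Optional[base]


def infer_schema(columns: Dict[str, List[Any]]) -> Dict[str, Any]:
    """Map each column name to its inferred annotation."""
    return {name: infer_field_type(values) for name, values in columns.items()}


schema = infer_schema(
    {
        "id": [1, 2, 3],
        "name": ["Alice", None, "Charlie"],
        "score": [95.5, 88, 92.3],
    }
)
print(schema)
```

A mapping like this is roughly what a model factory would need in order to build a class; the real library additionally handles DataFrame dtypes and nested structs.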
📦 Installation
Install the core package:
pip install articuno
Add optional dependencies as needed:
- Polars support: pip install articuno[polars]
- Pandas support (with optional PyArrow support): pip install articuno[pandas]
Or install all extras:
pip install articuno[polars,pandas]
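Because the DataFrame backends are optional extras, it can help to check which ones are importable before choosing a code path. The following is a small sketch using only the standard library; `available_backends` is a hypothetical helper and not part of Articuno.

```python
from importlib.util import find_spec
from typing import Dict


def available_backends() -> Dict[str, bool]:
    """Report which optional DataFrame packages can be imported."""
    # find_spec returns None for a top-level package that is not installed.
    return {
        name: find_spec(name) is not None
        for name in ("polars", "pandas", "pyarrow")
    }


backends = available_backends()
print(backends)
```

The result depends on the environment, so no fixed output is shown here.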
🚀 Usage
🔍 Infer a Pydantic model from a DataFrame
from articuno import df_to_pydantic
import polars as pl
df = pl.DataFrame({
"id": [1, 2, 3],
"name": ["Alice", "Bob", "Charlie"],
"score": [95.5, 88.0, 92.3]
})
models = df_to_pydantic(df)
print(models[0])
Output:
id=1 name='Alice' score=95.5
🌟 Using PyArrow‑backed Pandas columns
import pandas as pd
df = pd.DataFrame({
"id": pd.Series([1, 2, 3], dtype="int64[pyarrow]"),
"name": pd.Series(["Alice", "Bob", "Charlie"], dtype="string[pyarrow]")
})
from articuno import infer_pydantic_model
Model = infer_pydantic_model(df, model_name="ArrowUser")
print(Model.schema_json(indent=2))
Output (abbreviated):
{
"title": "ArrowUser",
"type": "object",
"properties": {
"id": { "title": "Id", "type": "integer" },
"name": { "title": "Name", "type": "string" }
},
"required": ["id", "name"]
}
🔥 Force all fields to be optional
Model = infer_pydantic_model(df, model_name="MyOptionalModel", force_optional=True)
🧾 Generate Python class code from a Pydantic model
from articuno.codegen import generate_class_code
code = generate_class_code(Model)
print(code)
Output (example):
from pydantic import BaseModel
class ArrowUser(BaseModel):
    id: Optional[int] = None
    name: Optional[str] = None
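For readers curious how class source of this shape can be produced, here is a hand-rolled string-template sketch. It deliberately does not use datamodel‑code‑generator, and `render_model_source` is a hypothetical name; it only illustrates the idea of turning a field mapping into importable Python source, including the `typing` import that `Optional` needs.

```python
from typing import Dict


def render_model_source(class_name: str, fields: Dict[str, str]) -> str:
    """Render Pydantic-style class source where every field is optional."""
    lines = [
        "from typing import Optional",
        "",
        "from pydantic import BaseModel",
        "",
        "",
        f"class {class_name}(BaseModel):",
    ]
    if not fields:
        # A class body cannot be empty.
        lines.append("    pass")
    for name, type_name in fields.items():
        lines.append(f"    {name}: Optional[{type_name}] = None")
    return "\n".join(lines) + "\n"


source = render_model_source("ArrowUser", {"id": "int", "name": "str"})

# compile() checks the syntax of the generated text without importing pydantic.
compile(source, "<generated>", "exec")
print(source)
```

Type names are passed as strings here to keep the sketch short; a real generator would derive them from the model's schema.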
📜 Patito vs Articuno
| Feature | 🦜 Patito | ❄️ Articuno |
|---|---|---|
| Polars–Pydantic bridge | ✅ Declarative schema | ✅ Dynamic inference |
| Validation constraints | ✅ Unique, bounds | ⚠️ Basic types, nullables |
| Nested Structs | ❌ Not supported | ✅ Fully recursive |
| Code generation | ❌ | ✅ via datamodel‑code‑gen |
| Example/mock data | ✅ .examples | ❌ |
| Direct Pandas/Polars support | ❌ Indirect via dicts | ✅ Native support with inference |
Patito is ideal for static schema validation with custom constraints and ETL pipelines.
Articuno excels at dynamic schema inference, nested model generation, and code export for API use cases.
⚙️ Supported Type Mappings
| Polars Type | Pandas Type (incl. PyArrow) | Pydantic Type |
|---|---|---|
| pl.Int*, pl.UInt* | int64, int32, Int64 (nullable int), int64[pyarrow], int32[pyarrow] | int |
| pl.Float* | float64, float32, float64[pyarrow], float32[pyarrow] | float |
| pl.Utf8 | object (string), string[pyarrow] | str |
| pl.Boolean | bool, boolean, bool[pyarrow] | bool |
| pl.Date | datetime64[ns] (date only) | datetime.date |
| pl.Datetime | datetime64[ns] (timestamp) | datetime.datetime |
| pl.Duration | timedelta64[ns] | datetime.timedelta |
| pl.List | list, object with lists | List[...] |
| pl.Struct | dict, object with nested dicts | Nested Pydantic model |
| pl.Null | NaN, None (nullable fields) | Optional[...] |
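The Pandas column of the table can be read as a lookup from dtype name to Python type. The sketch below transcribes part of it into a dictionary, assuming dtypes are given as strings such as `"int64[pyarrow]"`. The names `BASE_TYPES` and `python_type_for_dtype` are hypothetical, and this is not how Articuno resolves types internally.

```python
import datetime
from typing import Any, Dict

# Assumed lookup, transcribed from the Pandas column of the table above.
BASE_TYPES: Dict[str, type] = {
    "int64": int,
    "int32": int,
    "Int64": int,
    "float64": float,
    "float32": float,
    "string": str,
    "bool": bool,
    "boolean": bool,
    "datetime64[ns]": datetime.datetime,
    "timedelta64[ns]": datetime.timedelta,
}

PYARROW_SUFFIX = "[pyarrow]"


def python_type_for_dtype(dtype_name: str) -> Any:
    """Resolve a pandas dtype name to a Python type, or Any if unknown."""
    # Strip the PyArrow marker so "int64[pyarrow]" resolves like "int64".
    if dtype_name.endswith(PYARROW_SUFFIX):
        dtype_name = dtype_name[: -len(PYARROW_SUFFIX)]
    # "object" columns may hold strings, lists, or dicts, so the dtype name
    # alone is not enough; this sketch falls back to Any for them.
    return BASE_TYPES.get(dtype_name, Any)


for name in ("int64[pyarrow]", "string[pyarrow]", "timedelta64[ns]", "object"):
    print(name, "->", python_type_for_dtype(name))
```

Distinguishing date-only from timestamp columns, and lists from nested dicts inside `object` columns, requires looking at the values themselves, which this name-based lookup does not attempt.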
⚡ Force Optional Mode
If you want to enforce that all fields (top-level and nested) are optional, use:
Model = infer_pydantic_model(df, force_optional=True)
Or:
models = df_to_pydantic(df, force_optional=True)
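What "optional at every level" means can be illustrated with a short recursive sketch. It assumes a simplified schema representation, a dict mapping field names to either a type annotation or a nested dict for a struct, and `make_all_optional` is a hypothetical helper rather than Articuno's code.

```python
from typing import Any, Dict, Optional

Schema = Dict[str, Any]


def make_all_optional(schema: Schema) -> Schema:
    """Wrap every leaf annotation in Optional, recursing into nested structs."""
    result: Schema = {}
    for name, annotation in schema.items():
        if isinstance(annotation, dict):
            # Nested struct: apply the same rule to its fields.
            result[name] = make_all_optional(annotation)
        else:
            # Optional[Optional[T]] collapses to Optional[T], so fields that
            # are already optional are left effectively unchanged.
            result[name] = Optional[annotation]
    return result


relaxed = make_all_optional(
    {
        "id": int,
        "name": Optional[str],
        "address": {"city": str, "zip": int},
    }
)
print(relaxed)
```

In this simplified representation the nested mapping itself carries no optional marker; only its leaf fields are relaxed.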
🛠️ Development
To install development dependencies:
pip install articuno[dev]
Run tests with:
pytest
🔗 Links
📄 License
MIT © Odos Matthews