Convert Polars or Pandas DataFrames to lists of Pydantic models with schema inference
❄️ Articuno ❄️
Convert Polars or Pandas DataFrames to Pydantic models with schema inference — and generate clean Python class code.
✨ Features
- Infer Pydantic models dynamically from Polars or Pandas DataFrames
- Infer Pydantic models and instances directly from iterables of dictionaries
- Supports nested structs, optional fields, and common data types
- Supports PyArrow-backed Pandas columns (e.g., int64[pyarrow], string[pyarrow])
- Optional force_optional flag to make all fields optional regardless of data
- Configurable max_scan parameter to limit schema inference to the first N records of an iterable
- Generate clean Python model code using datamodel-code-generator
- Lightweight, dependency-flexible design
📦 Installation
Install the core package:
pip install articuno
Add optional dependencies as needed:
- Polars support:
pip install articuno[polars]
- Pandas support (with optional PyArrow support):
pip install articuno[pandas]
- Full install:
pip install articuno[polars,pandas]
🚀 Usage
🔍 DataFrame-based Inference
Infer models from Polars or Pandas DataFrames:
from articuno import df_to_pydantic
import polars as pl
df = pl.DataFrame({
    "id": [1, 2, 3],
    "name": ["Alice", "Bob", "Charlie"],
    "score": [95.5, 88.0, 92.3],
})
instances = df_to_pydantic(df, model_name="UserModel")
print(instances[0]) # id=1 name='Alice' score=95.5
Or just get the model class:
from articuno import infer_pydantic_model
Model = infer_pydantic_model(df, model_name="UserModel")
print(Model.schema_json(indent=2))
🧰 Iterable-of-Dicts Inference
Infer schemas and instantiate models directly from iterables of dict (e.g., SQL query results, JSON records):
from articuno import (
    df_to_pydantic,
    infer_pydantic_model,
    dicts_to_pydantic,
    infer_generic_model,
)
# Sample records
dicts = [
    {"id": 1, "value": "foo"},
    {"id": 2, "value": "bar"},
    # ...
]
# Convert to Pydantic instances (scans first 1000 by default)
instances = df_to_pydantic(dicts)
for obj in instances:
    print(obj)
# Get model class only with custom name/scan limit
ModelClass = infer_pydantic_model(
    dicts,
    model_name="RecModel",
    max_scan=500,
)
print(ModelClass.schema_json(indent=2))
# Lazy generator of instances
for obj in dicts_to_pydantic(dicts, max_scan=200):
    print(obj)
# Generic model inference
GenericModel = infer_generic_model(dicts, model_name="GenModel")
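To make the max_scan idea concrete, here is a minimal standard-library sketch of scan-limited schema inference over dictionaries. It is not Articuno's implementation: the helper name infer_field_types and its return shape are hypothetical, and it assumes the first non-None value seen for a key decides that key's type.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Tuple


def infer_field_types(
    records: Iterable[Dict[str, Any]], max_scan: int = 1000
) -> Dict[str, Tuple[type, bool]]:
    """Hypothetical helper: map each key to (python_type, is_optional)."""
    seen: Dict[str, type] = {}
    optional: Dict[str, bool] = {}
    scanned = 0
    # islice stops after max_scan records, so generators are not exhausted.
    for record in islice(records, max_scan):
        for key, value in record.items():
            if key not in optional:
                # A key that first appears late was missing from earlier records.
                optional[key] = scanned > 0
            if value is None:
                optional[key] = True
            elif key not in seen:
                # Assumption: the first non-None value fixes the field type.
                seen[key] = type(value)
        # Any known key absent from this record must be optional.
        for key in optional:
            if key not in record:
                optional[key] = True
        scanned += 1
    return {key: (seen.get(key, type(None)), optional[key]) for key in optional}


records = [
    {"id": 1, "value": "foo"},
    {"id": 2, "value": None},
    {"id": 3},
]
full_schema = infer_field_types(records)
short_schema = infer_field_types(records, max_scan=1)
```

With all three records scanned, value is marked optional. With max_scan=1 only the first record is seen, so value looks required. This is the trade-off a scan limit makes: less work up front, at the risk of missing nulls that appear later in the data.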
🌟 PyArrow-backed Pandas Columns
import pandas as pd
from articuno import infer_pydantic_model
df = pd.DataFrame({
    "id": pd.Series([1, 2, 3], dtype="int64[pyarrow]"),
    "name": pd.Series(["A", "B", "C"], dtype="string[pyarrow]"),
})
Model = infer_pydantic_model(df, model_name="ArrowUser")
print(Model.schema_json(indent=2))
🔥 Force Optional Fields
from articuno import infer_pydantic_model, df_to_pydantic
Model = infer_pydantic_model(df, force_optional=True)
models = df_to_pydantic(df, force_optional=True)
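As a rough illustration of what "all fields optional" means, the sketch below wraps every field type in Optional and gives it a default of None. It uses standard-library dataclasses as a stand-in for Pydantic models, and the helper make_all_optional is hypothetical, not part of Articuno.

```python
from dataclasses import field, make_dataclass
from typing import Dict, Optional


def make_all_optional(name: str, field_types: Dict[str, type]) -> type:
    """Hypothetical helper: build a class whose fields are all Optional[...] = None."""
    spec = [
        (field_name, Optional[tp], field(default=None))
        for field_name, tp in field_types.items()
    ]
    return make_dataclass(name, spec)


LooseUser = make_all_optional("LooseUser", {"id": int, "name": str, "score": float})
row = LooseUser(id=1)  # name and score fall back to None
```

A model shaped like this accepts rows with missing values even when the sample data used for inference happened to contain no nulls.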
🧾 Generate Code
from articuno.codegen import generate_class_code
code = generate_class_code(Model)
print(code)
⚙️ Supported Type Mappings
| Polars Type | Pandas Type (incl. PyArrow) | Pydantic Type |
|---|---|---|
| `pl.Int*`, `pl.UInt*` | `int64`, `Int64`, `int64[pyarrow]` | `int` |
| `pl.Float*` | `float64`, `float64[pyarrow]` | `float` |
| `pl.Utf8` | `object`, `string[pyarrow]` | `str` |
| `pl.Boolean` | `bool`, `bool[pyarrow]` | `bool` |
| `pl.Date` | `datetime64[ns]` | `datetime.date` |
| `pl.Datetime` | `datetime64[ns]` | `datetime.datetime` |
| `pl.Duration` | `timedelta64[ns]` | `datetime.timedelta` |
| `pl.List` | `list` | `List[...]` |
| `pl.Struct` | `dict` | Nested model |
| `pl.Null` | `None`, `NaN` | `Optional[...]` |
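The scalar rows of the table can be read as a lookup from a Pandas dtype name to a Python type. The following is a minimal sketch of such a lookup, not the library's internal mapping: the names PANDAS_DTYPE_TO_PYTHON and python_type_for_dtype are hypothetical. Because the table lists datetime64[ns] for both dates and datetimes, the sketch assumes datetime.datetime for that dtype.

```python
import datetime
from typing import Dict

# Hypothetical lookup built from the scalar rows of the table above.
PANDAS_DTYPE_TO_PYTHON: Dict[str, type] = {
    "int64": int,
    "Int64": int,
    "int64[pyarrow]": int,
    "float64": float,
    "float64[pyarrow]": float,
    "object": str,
    "string[pyarrow]": str,
    "bool": bool,
    "bool[pyarrow]": bool,
    "datetime64[ns]": datetime.datetime,  # assumption: datetime, not date
    "timedelta64[ns]": datetime.timedelta,
}


def python_type_for_dtype(dtype_name: str) -> type:
    """Return the Python type for a dtype name, or raise ValueError if unknown."""
    try:
        return PANDAS_DTYPE_TO_PYTHON[dtype_name]
    except KeyError:
        raise ValueError(f"unsupported dtype: {dtype_name!r}") from None
```

Failing loudly on an unknown dtype is one possible design choice; a real implementation might instead fall back to a permissive type such as typing.Any.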
🛠️ Development
pip install articuno[dev]
pytest
🔗 Links
📄 License
MIT © Odos Matthews
File details
Details for the file articuno-0.7.0.tar.gz.
File metadata
- Download URL: articuno-0.7.0.tar.gz
- Upload date:
- Size: 11.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 00c548885d0fc8563c61ad3d335fcc738f9b46ae4b4a0401fc582d5c4b8b9794 |
| MD5 | aa41356c44c50d370f50f380f009e609 |
| BLAKE2b-256 | cac624680f53dc93d5091fd1c0ff904efe25a9d7c20f7010ade3a785bfe77211 |
File details
Details for the file articuno-0.7.0-py3-none-any.whl.
File metadata
- Download URL: articuno-0.7.0-py3-none-any.whl
- Upload date:
- Size: 12.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a276045d9025488037d52d2e957569cd27945c2793288da07282fe51fcdcf1c9 |
| MD5 | 90e9057824dd8bbc574638add2f0b7c6 |
| BLAKE2b-256 | 40143cb8f5991fca93e651ac5b1e72a4366303a71f9b36c296948e7dbb0a9de5 |