
Convert Polars DataFrames to lists of Pydantic models with schema inference

Project description

❄️ Articuno ❄️

Convert Polars DataFrames to Pydantic models — and optionally generate clean Python code from them.

A blazing-fast tool for schema inference, data validation, and model generation powered by Polars and Pydantic.


🚀 Features

  • 🔍 Infer Pydantic models directly from polars.DataFrame schemas
  • 🧪 Validate data by converting DataFrame rows to Pydantic instances
  • 🧱 Supports nested Structs, Lists, Nullable fields, and advanced types
  • 🧬 Generate Python model code from dynamic models using datamodel-code-generator

📦 Installation

pip install articuno

🛠 Usage

1. Convert a DataFrame to Pydantic Models

import polars as pl
from articuno import df_to_pydantic

df = pl.DataFrame({
    "name": ["Alice", "Bob"],
    "age": [30, 25],
    "is_active": [True, False],
})

models = df_to_pydantic(df)

print(models[0])
print(models[0].dict())  # Pydantic v1; use .model_dump() on Pydantic v2

Output:

name='Alice' age=30 is_active=True
{'name': 'Alice', 'age': 30, 'is_active': True}

2. Infer a Model Only

from articuno import infer_pydantic_model

model = infer_pydantic_model(df, model_name="UserModel")
print(model.schema_json(indent=2))  # Pydantic v1; on v2 use json.dumps(model.model_json_schema(), indent=2)

Output (snippet):

{
  "title": "UserModel",
  "type": "object",
  "properties": {
    "name": { "title": "Name", "type": "string" },
    "age": { "title": "Age", "type": "integer" },
    "is_active": { "title": "Is Active", "type": "boolean" }
  },
  "required": ["name", "age", "is_active"]
}

3. Generate Python Source Code from a Model

from articuno import generate_pydantic_class_code

code = generate_pydantic_class_code(model, model_name="UserModel")
print(code)

Output:

from pydantic import BaseModel

class UserModel(BaseModel):
    name: str
    age: int
    is_active: bool

Or write it to a file:

generate_pydantic_class_code(model, output_path="user_model.py")

🧬 Example: Nested Structs

nested_df = pl.DataFrame({
    "user": pl.Series([
        {"name": "Alice", "age": 30},
        {"name": "Bob", "age": 25},
    ], dtype=pl.Struct([
        ("name", pl.Utf8),
        ("age", pl.Int64),
    ]))
})

models = df_to_pydantic(nested_df)
print(models[0])
print(models[0].user.name)

Output:

AutoModel_user_Struct(name='Alice', age=30)
Alice

🎯 When to Use Articuno

  • ✅ You use Polars and want type-safe modeling
  • ✅ You dynamically load or transform tabular data
  • ✅ You want to generate shareable Python classes
  • ✅ You want to validate Polars DataFrames using Pydantic rules

⚙️ Supported Type Mappings

Polars Type         Pydantic Type
pl.Int*, pl.UInt*   int
pl.Float*           float
pl.Utf8             str
pl.Boolean          bool
pl.Date             datetime.date
pl.Datetime         datetime.datetime
pl.Duration         datetime.timedelta
pl.List             List[...]
pl.Struct           Nested Pydantic model
pl.Null             Optional[...]
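The scalar rows of the table above can be sketched as a plain lookup. This is an illustrative, dependency-free sketch; the names `POLARS_TO_PYTHON` and `annotation_for` are hypothetical, and the real mapping lives inside Articuno:

```python
import datetime
from typing import Optional

# Illustrative lookup from scalar Polars dtype names to Python annotations,
# mirroring the table above (hypothetical helper, not Articuno's actual code).
POLARS_TO_PYTHON = {
    "Int64": int,
    "UInt32": int,
    "Float64": float,
    "Utf8": str,
    "Boolean": bool,
    "Date": datetime.date,
    "Datetime": datetime.datetime,
    "Duration": datetime.timedelta,
}

def annotation_for(dtype_name: str, nullable: bool = False):
    """Return the annotation a scalar column would map to."""
    base = POLARS_TO_PYTHON[dtype_name]
    # Nullable columns (and pl.Null) wrap the base type in Optional[...]
    return Optional[base] if nullable else base
```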

🧩 Integration Ideas

  • 🔐 Use for FastAPI or Litestar API schemas
  • 🧼 Use in ETL pipelines to enforce schema contracts
  • 📄 Use to generate Pydantic models from data exports
  • 🔀 Use with polars.read_json / read_parquet to auto-model nested data
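The "schema contract" idea for ETL pipelines boils down to rejecting rows that drift from an expected schema. A dependency-free sketch (in a real pipeline, Articuno's df_to_pydantic would raise pydantic.ValidationError instead of this hand-rolled check; `EXPECTED_SCHEMA` and `check_row` are illustrative names):

```python
# Expected schema for one ETL stage; in practice this would be the
# Pydantic model Articuno infers from a reference DataFrame.
EXPECTED_SCHEMA = {"name": str, "age": int, "is_active": bool}

def check_row(row: dict) -> dict:
    """Raise if a row is missing a field or has a value of the wrong type."""
    for field, expected_type in EXPECTED_SCHEMA.items():
        if field not in row:
            raise ValueError(f"missing field: {field!r}")
        if not isinstance(row[field], expected_type):
            raise TypeError(f"{field!r} should be {expected_type.__name__}")
    return row

check_row({"name": "Alice", "age": 30, "is_active": True})  # passes
```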

🧪 Development & Testing

git clone https://github.com/your-username/articuno
cd articuno
pip install -e ".[dev]"
pytest

🧙‍♂️ FastAPI Integration (Decorator + CLI Bootstrap)

Articuno makes it easy to generate a response_model for FastAPI endpoints that return a polars.DataFrame — no need to define the Pydantic models by hand.

🧩 Step 1: Add the Decorator

Use the @infer_response_model decorator on your FastAPI endpoint. Provide:

  • a name for the generated Pydantic model,
  • an example input dict to simulate a call to your endpoint,
  • an optional path to your models.py file (defaults to models.py next to the FastAPI app file).

from fastapi import FastAPI
from articuno.decorator import infer_response_model
import polars as pl

app = FastAPI()

@infer_response_model(
    name="UserModel",
    example_input={"limit": 2},
    models_path="models.py"  # Optional, relative to this file by default
)
@app.get("/users")
def get_users(limit: int):
    return pl.DataFrame({
        "name": ["Alice", "Bob"],
        "age": [30, 25],
    }).head(limit)

📝 The decorator doesn't change behavior at runtime — it simply registers this endpoint for the CLI to analyze later.

⚙️ Step 2: Run the CLI Bootstrap

After writing or modifying your endpoints, run the Articuno CLI:

python cli.py bootstrap app/main.py

This will:

  1. Import and call all decorated endpoints with the given example_input

  2. Infer a Pydantic model from the returned polars.DataFrame

  3. Write the model to the specified models.py file

  4. Update your FastAPI app:

    • Add response_model=YourModel to the route decorator
    • Import the model at the top
    • Remove the @infer_response_model(...) decorator
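Conceptually, the model-writing part of step 3 amounts to emitting source text from the inferred fields. A simplified, dependency-free sketch (the real CLI delegates to datamodel-code-generator; `render_model` is a hypothetical helper):

```python
def render_model(name: str, fields: dict) -> str:
    """Render a minimal Pydantic class definition from {field_name: type_name}."""
    lines = ["from pydantic import BaseModel", "", "", f"class {name}(BaseModel):"]
    lines += [f"    {field}: {type_name}" for field, type_name in fields.items()]
    return "\n".join(lines) + "\n"

# Produces the same shape of file the CLI writes to models.py:
print(render_model("UserModel", {"name": "str", "age": "int"}))
```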

🎯 Example Result (After Bootstrapping)

Before CLI:

@infer_response_model(name="UserModel", example_input={"limit": 2})
@app.get("/users")
def get_users(limit: int):
    ...

After CLI:

from models import UserModel

@app.get("/users", response_model=UserModel)
def get_users(limit: int):
    ...

models.py will contain:

from pydantic import BaseModel

class UserModel(BaseModel):
    name: str
    age: int

🛠 CLI Options

Usage: cli.py bootstrap [OPTIONS] APP_PATH

Arguments:
  APP_PATH                Path to your FastAPI app file (e.g., app/main.py)

Options:
  --models-path PATH      Optional output path for models.py (defaults to same folder as app)
  --dry-run               Preview changes without writing files
  --help                  Show this message and exit

📜 Patito vs Articuno

Feature                  Patito                   Articuno
Polars–Pydantic bridge   ✅ Declarative schema    ✅ Dynamic inference
Validation constraints   ✅ Unique, bounds        ⚠️ Basic types, nullables
Nested Structs           ❌ Not supported         ✅ Fully recursive
Code generation          ❌                       ✅ via datamodel-code-generator
Example/mock data        ✅ .examples()           ❌

Patito is ideal for static schema validation with custom constraints and ETL pipelines.

Articuno excels at dynamic schema inference, nested model generation, and code export for API use cases.


📄 License

MIT © 2025 Odos Matthews

Project details


Download files

Download the file for your platform.

Source Distribution

articuno-0.3.0.tar.gz (11.2 kB)

Uploaded Source

Built Distribution


articuno-0.3.0-py3-none-any.whl (9.0 kB)

Uploaded Python 3

File details

Details for the file articuno-0.3.0.tar.gz.

File metadata

  • Download URL: articuno-0.3.0.tar.gz
  • Upload date:
  • Size: 11.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.7

File hashes

Hashes for articuno-0.3.0.tar.gz
Algorithm Hash digest
SHA256 0520641e52fd043d964607f5d463f4bc00cd8ad2f3812a6f644b789ed5079074
MD5 16efc59af15eed8cdeb3c0bb55842fd4
BLAKE2b-256 342f2bd6306bd628637a7b0b7461d1110e45b1a021366907fb002f348e3bc8af


File details

Details for the file articuno-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: articuno-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 9.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.7

File hashes

Hashes for articuno-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 3aec363af62369cc7497673322de87632e805b0867097632dd02cd77a1720e8c
MD5 7b4864292c62031d5559bcd94ac35c8a
BLAKE2b-256 6dd529616accd5d1ee283ab1e62dd2707de2f33e01a820436f5ad142e79af48c

