
Convert Polars or Pandas DataFrames to lists of Pydantic models with schema inference


❄️ Articuno ❄️

Convert Polars or Pandas DataFrames to Pydantic models with schema inference, and generate clean Python class code.


✨ Features

  • Infer Pydantic models dynamically from Polars or Pandas DataFrames
  • Supports nested structs, optional fields, and common data types
  • Supports PyArrow‑backed Pandas columns (e.g., int64[pyarrow], string[pyarrow])
  • Optional force_optional flag to make all fields optional regardless of data
  • Generate clean Python model code using datamodel‑code‑generator
  • Lightweight, dependency‑flexible design

📦 Installation

Install the core package:

pip install articuno

Add optional dependencies as needed:

  • Polars support:

    pip install "articuno[polars]"
    
  • Pandas support (with optional PyArrow support):

    pip install "articuno[pandas]"
    

Or install both extras at once:

pip install "articuno[polars,pandas]"

🚀 Usage

🔍 Infer a Pydantic model from a DataFrame

from articuno import df_to_pydantic
import polars as pl

df = pl.DataFrame({
    "id": [1, 2, 3],
    "name": ["Alice", "Bob", "Charlie"],
    "score": [95.5, 88.0, 92.3]
})

models = df_to_pydantic(df)
print(models[0])

Output:

id=1 name='Alice' score=95.5
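
For intuition about what schema inference involves, the sketch below builds a model by hand from plain row dictionaries using Pydantic's `create_model`. It is not Articuno's implementation: the helper name `infer_model_from_rows` is hypothetical, and it only looks at the Python types in the first row.

```python
from typing import Any, Dict, List, Type

from pydantic import BaseModel, create_model


def infer_model_from_rows(
    rows: List[Dict[str, Any]], model_name: str = "Row"
) -> Type[BaseModel]:
    """Hypothetical helper: build a model whose fields mirror the first row's types."""
    if not rows:
        raise ValueError("need at least one row to infer a schema")
    # Each field is declared as (type, ...), where ... marks the field as required.
    fields = {key: (type(value), ...) for key, value in rows[0].items()}
    return create_model(model_name, **fields)


rows = [
    {"id": 1, "name": "Alice", "score": 95.5},
    {"id": 2, "name": "Bob", "score": 88.0},
]

Row = infer_model_from_rows(rows)
models = [Row(**row) for row in rows]  # one validated instance per row
print(models[0])
```

A real inference step also has to handle nulls, nested values, and DataFrame dtypes rather than Python values, which is the part Articuno is described as taking care of.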

🌟 Using PyArrow‑backed Pandas columns

import pandas as pd

from articuno import infer_pydantic_model

df = pd.DataFrame({
    "id": pd.Series([1, 2, 3], dtype="int64[pyarrow]"),
    "name": pd.Series(["Alice", "Bob", "Charlie"], dtype="string[pyarrow]")
})

Model = infer_pydantic_model(df, model_name="ArrowUser")
print(Model.schema_json(indent=2))

Output (abbreviated):

{
  "title": "ArrowUser",
  "type": "object",
  "properties": {
    "id": { "title": "Id", "type": "integer" },
    "name": { "title": "Name", "type": "string" }
  },
  "required": ["id", "name"]
}

🔥 Force all fields to be optional

Model = infer_pydantic_model(df, model_name="MyOptionalModel", force_optional=True)

🧾 Generate Python class code from a Pydantic model

from articuno.codegen import generate_class_code

code = generate_class_code(Model)
print(code)

Output (example):

from typing import Optional

from pydantic import BaseModel


class MyOptionalModel(BaseModel):
    id: Optional[int] = None
    name: Optional[str] = None

📜 Patito vs Articuno

| Feature | 🦜 Patito | ❄️ Articuno |
| --- | --- | --- |
| Polars–Pydantic bridge | ✅ Declarative schema | ✅ Dynamic inference |
| Validation constraints | ✅ Unique, bounds | ⚠️ Basic types, nullables |
| Nested Structs | ❌ Not supported | ✅ Fully recursive |
| Code generation | | ✅ via datamodel‑code‑gen |
| Example/mock data | .examples | |
| Direct Pandas/Polars support | ❌ Indirect via dicts | ✅ Native support with inference |

Patito is ideal for static schema validation with custom constraints and ETL pipelines.

Articuno excels at dynamic schema inference, nested model generation, and code export for API use cases.


⚙️ Supported Type Mappings

| Polars Type | Pandas Type (incl. PyArrow) | Pydantic Type |
| --- | --- | --- |
| pl.Int*, pl.UInt* | int64, int32, Int64 (nullable int), int64[pyarrow], int32[pyarrow] | int |
| pl.Float* | float64, float32, float64[pyarrow], float32[pyarrow] | float |
| pl.Utf8 | object (string), string[pyarrow] | str |
| pl.Boolean | bool, boolean, bool[pyarrow] | bool |
| pl.Date | datetime64[ns] (date only) | datetime.date |
| pl.Datetime | datetime64[ns] (timestamp) | datetime.datetime |
| pl.Duration | timedelta64[ns] | datetime.timedelta |
| pl.List | list, object with lists | List[...] |
| pl.Struct | dict, object with nested dicts | Nested Pydantic model |
| pl.Null | NaN, None (nullable fields) | Optional[...] |
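
The Pandas column of this table can be read as a lookup from dtype name to Python type. The sketch below shows that idea with the standard library only. `PANDAS_DTYPE_TO_PYTHON` and `python_type_for` are hypothetical names for illustration, not part of Articuno, and the lookup leaves out `object` and `datetime64[ns]` because their target type depends on the column's contents.

```python
import datetime
from typing import Dict

# Hypothetical lookup built from the unambiguous scalar rows of the table.
PANDAS_DTYPE_TO_PYTHON: Dict[str, type] = {
    "int64": int,
    "int32": int,
    "Int64": int,
    "float64": float,
    "float32": float,
    "string": str,
    "bool": bool,
    "boolean": bool,
    "timedelta64[ns]": datetime.timedelta,
}

PYARROW_SUFFIX = "[pyarrow]"


def python_type_for(dtype_name: str) -> type:
    """Map a dtype name such as 'int64' or 'int64[pyarrow]' to a Python type."""
    # PyArrow-backed dtypes carry a "[pyarrow]" suffix; strip it before the lookup.
    if dtype_name.endswith(PYARROW_SUFFIX):
        base = dtype_name[: -len(PYARROW_SUFFIX)]
    else:
        base = dtype_name
    try:
        return PANDAS_DTYPE_TO_PYTHON[base]
    except KeyError:
        raise TypeError(f"no mapping for dtype {dtype_name!r}") from None


print(python_type_for("int64[pyarrow]").__name__)   # int
print(python_type_for("timedelta64[ns]").__name__)  # timedelta
```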

⚡ Force Optional Mode

To make every field optional, including fields of nested models, pass force_optional=True:

Model = infer_pydantic_model(df, force_optional=True)

Or:

models = df_to_pydantic(df, force_optional=True)
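
In Pydantic terms, an optional field is one annotated `Optional[...]` with a default of `None`. The sketch below contrasts a required and an optional field using Pydantic's `create_model` directly. The model names `Strict` and `Relaxed` are made up, and this illustrates the effect rather than Articuno's internal code.

```python
from typing import Optional

from pydantic import ValidationError, create_model

# Required field: a value must be supplied and must not be None.
Strict = create_model("Strict", id=(int, ...))

# Optional field with a None default, as in the generated-code example above.
Relaxed = create_model("Relaxed", id=(Optional[int], None))

print(Relaxed())  # id=None

try:
    Strict(id=None)
    strict_rejected_none = False
except ValidationError:
    strict_rejected_none = True

print(strict_rejected_none)  # True
```

Optional fields are convenient when a sample DataFrame happens to contain no nulls but later data might, at the cost of weaker validation.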

🛠️ Development

To install development dependencies:

pip install "articuno[dev]"

Run tests with:

pytest

📄 License

MIT © Odos Matthews
