infermodel

Rust-backed schema inference from Sequence[Mapping] data (e.g. list of dicts) with Pydantic model emission.

Infer a schema from a sequence of mappings (e.g. a list or tuple of dicts), then convert that inferred schema into a Pydantic model on the Python side, without hardcoding schema logic in application code.

Features

  • Rust core: Performance-critical traversal, merge logic, and required/nullable tracking
  • Python ergonomics: Pydantic v2 model creation via infer_model(...)
  • Conservative defaults: Strings stay strings; int+float promotes to float; incompatible mixes become Any
  • Required vs nullable: Tracks presence (missing key → optional) and explicit None (nullable) separately
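
The conservative merge rules above can be illustrated with a small pure-Python sketch. This is not the Rust core's code: the label names, the helper functions (scalar_label, merge_labels, infer_label), the separate treatment of bool, and the choice to skip None during type merging are all assumptions made for illustration.

```python
from typing import Any


# Hypothetical type labels; the package's internal names may differ.
def scalar_label(value: Any) -> str:
    """Map a Python scalar to a simple type label."""
    # bool is checked before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "str"  # strings stay strings; no date or number parsing
    return "any"


def merge_labels(left: str, right: str) -> str:
    """Merge two observed type labels conservatively."""
    if left == right:
        return left
    if {left, right} == {"int", "float"}:
        return "float"  # int + float promotes to float
    return "any"  # incompatible mixes fall back to Any


def infer_label(values: list) -> str:
    """Fold merge_labels over every non-None value in a column."""
    label = None
    for value in values:
        if value is None:
            continue  # assumed: None affects nullability, not the type
        current = scalar_label(value)
        label = current if label is None else merge_labels(label, current)
    return label if label is not None else "any"


print(infer_label([1, 2.5]))      # int and float observed together
print(infer_label(["1", "2"]))    # numeric-looking strings
print(infer_label([1, "a"]))      # incompatible mix
```

Once a column has fallen back to "any", further merges keep it there, so the result does not depend on the order of the rows.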

Installation

From the project root (with a virtualenv activated):

pip install -e .
# or: maturin develop

Requirements: Python 3.10+ and Pydantic v2. Building from source also requires a Rust toolchain for the extension.

Quick start

from infermodel import infer_schema, infer_model

data = [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": None},
    {"id": 3},  # missing "name" -> optional
]

# Get introspectable schema dict
schema = infer_schema(data)
# {"type": "model", "fields": {"id": {"type": "int", "required": True, "nullable": False}, ...}}

# Get a Pydantic model class
Model = infer_model(data, model_name="Record")
instance = Model(id=1, name="Alice")

API

  • infer_schema(data, config=None)
    Returns a nested dict with type, fields, and per-field type, required, nullable.

  • infer_model(data, model_name="InferredModel", config=None)
    Infers the schema and returns a dynamic Pydantic model class.

  • InferConfig
    Dataclass for policy options (e.g. incompatible_scalar_policy, string_date_policy). V1 uses built-in policies only.

  • model_from_schema(schema, model_name="InferredModel")
    Build a Pydantic model from an existing schema dict (e.g. from infer_schema).
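
To make the schema-to-model step more concrete, here is a rough sketch of how one per-field entry (type, required, nullable) might be turned into a type annotation and a default. It is not the package's implementation: TYPE_MAP and field_definition are hypothetical names, only scalar types are handled (nested "model" fields are ignored), and fields that may be missing are annotated as Optional so that a None default is valid, which is a simplification.

```python
from typing import Any, Optional

# Assumed mapping from schema type labels to Python types.
TYPE_MAP = {"int": int, "float": float, "str": str, "bool": bool, "any": Any}


def field_definition(field: dict) -> tuple:
    """Turn one schema field entry into an (annotation, default) pair.

    Ellipsis (...) is the marker Pydantic uses for "required, no default".
    """
    annotation = TYPE_MAP.get(field["type"], Any)
    # Allow None when the field is nullable, or when it may be missing
    # and therefore falls back to a None default.
    if field["nullable"] or not field["required"]:
        annotation = Optional[annotation]
    default = ... if field["required"] else None
    return annotation, default


id_field = {"type": "int", "required": True, "nullable": False}
name_field = {"type": "str", "required": False, "nullable": True}

print(field_definition(id_field))
print(field_definition(name_field))
```

Pairs of this shape match the (type, default) field definitions that pydantic.create_model accepts as keyword arguments, which is one plausible way a dynamic model class could be assembled from the schema dict.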

Required vs nullable

  • Required: field present in every row.
  • Optional: field missing in at least one row.
  • Nullable: at least one row had explicit None for that field.

These are independent: a field can be required and nullable, or optional and not nullable.
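
The two flags can be tracked with simple per-key counters. The following pure-Python sketch shows the idea on the quick start data; track_presence is a hypothetical helper written for this illustration, not a function exported by the package, and the Rust core may track this differently.

```python
from collections.abc import Mapping, Sequence
from typing import Any


def track_presence(rows: Sequence[Mapping[str, Any]]) -> dict[str, dict[str, bool]]:
    """Return required/nullable flags for every key seen in the rows."""
    seen: dict[str, int] = {}    # number of rows that contain the key
    nulls: dict[str, bool] = {}  # whether any row held an explicit None
    for row in rows:
        for key, value in row.items():
            seen[key] = seen.get(key, 0) + 1
            nulls[key] = nulls.get(key, False) or value is None
    total = len(rows)
    return {
        # required: the key appeared in every row
        key: {"required": count == total, "nullable": nulls[key]}
        for key, count in seen.items()
    }


data = [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": None},
    {"id": 3},  # missing "name"
]

flags = track_presence(data)
print(flags["id"])    # present in all three rows, never None
print(flags["name"])  # missing from one row and None in another
```

Because presence and None are recorded separately, a key that appears in every row but is sometimes None comes out as required and nullable, while a key that is sometimes absent but never None comes out as optional and not nullable.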

Development

# Create venv and install in editable mode with Rust extension
python -m venv .venv
source .venv/bin/activate  # or .venv\Scripts\activate on Windows
maturin develop

# Run tests
pytest

License

MIT
