Infer JSON schemas from sample data
philiprehberger-schema-infer
Installation
pip install philiprehberger-schema-infer
Usage
Infer schema from samples
from philiprehberger_schema_infer import infer
samples = [
{"name": "Alice", "age": 30, "active": True},
{"name": "Bob", "age": 25, "email": "bob@test.com"},
]
schema = infer(samples)
# {
# "type": "object",
# "properties": {
# "name": {"type": "string", "minLength": 3, "maxLength": 5},
# "age": {"type": "integer", "minimum": 25, "maximum": 30},
# "active": {"type": "boolean"},
# "email": {"type": "string", "format": "email", ...}
# },
# "required": ["age", "name"]
# }
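The `required` list and the `minimum`/`maximum` values in the output above follow from the samples: only `name` and `age` appear in both dicts, and the ages span 25 to 30. The sketch below reproduces just those two results with the standard library. It is an illustration under stated assumptions, not the package's implementation; `required_keys` and `integer_bounds` are hypothetical helper names.

```python
from typing import Any, Optional


def required_keys(samples: list[dict[str, Any]]) -> list[str]:
    """Return the sorted keys that occur in every sample dict."""
    if not samples:
        return []
    common = set(samples[0])
    for sample in samples[1:]:
        common &= set(sample)
    return sorted(common)


def integer_bounds(samples: list[dict[str, Any]], key: str) -> Optional[tuple[int, int]]:
    """Return (min, max) over the integer values seen for `key`, or None."""
    values = [
        s[key]
        for s in samples
        # bool is a subclass of int in Python, so exclude it explicitly
        if key in s and isinstance(s[key], int) and not isinstance(s[key], bool)
    ]
    if not values:
        return None
    return min(values), max(values)


samples = [
    {"name": "Alice", "age": 30, "active": True},
    {"name": "Bob", "age": 25, "email": "bob@test.com"},
]
print(required_keys(samples))          # ['age', 'name']
print(integer_bounds(samples, "age"))  # (25, 30)
```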
Full JSON Schema output
from philiprehberger_schema_infer import to_json_schema
schema = to_json_schema(samples)
# {
# "$schema": "https://json-schema.org/draft/2020-12/schema",
# "type": "object",
# "properties": { ... },
# "required": [...]
# }
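The wrapped output is an ordinary dict, so it can be serialized with the standard `json` module. The sketch below shows the idea on a hand-written schema dict; `with_schema_uri` is a hypothetical stand-in used only for illustration and may differ from what `to_json_schema` does internally.

```python
import json

DRAFT_2020_12 = "https://json-schema.org/draft/2020-12/schema"


def with_schema_uri(schema: dict) -> dict:
    """Return a copy of `schema` with the draft 2020-12 $schema URI first."""
    return {"$schema": DRAFT_2020_12, **schema}


# Hand-written example schema (not produced by the library).
example = {
    "type": "object",
    "properties": {"active": {"type": "boolean"}},
    "required": ["active"],
}

wrapped = with_schema_uri(example)
text = json.dumps(wrapped, indent=2)  # Python True/False would become JSON true/false
print(text)
```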
Single value type inference
from philiprehberger_schema_infer import infer_type
infer_type([1, 2, 3])
# {"type": "array", "items": {"type": "integer"}}
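For intuition, single-value inference can be pictured as a mapping from Python types to JSON Schema type names, applied recursively to containers. The following is a minimal sketch of that idea, not the library's code: `sketch_infer_type` is a hypothetical name, it skips format and constraint detection, and the `anyOf` handling of mixed-type lists is an assumption.

```python
from typing import Any


def sketch_infer_type(value: Any) -> dict:
    """Map a Python value to a bare JSON Schema fragment (types only)."""
    if value is None:
        return {"type": "null"}
    # Check bool before int: bool is a subclass of int in Python.
    if isinstance(value, bool):
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "integer"}
    if isinstance(value, float):
        return {"type": "number"}
    if isinstance(value, str):
        return {"type": "string"}
    if isinstance(value, list):
        # Collect the distinct item schemas, preserving first-seen order.
        item_schemas: list[dict] = []
        for item in value:
            schema = sketch_infer_type(item)
            if schema not in item_schemas:
                item_schemas.append(schema)
        if not item_schemas:
            return {"type": "array"}
        if len(item_schemas) == 1:
            return {"type": "array", "items": item_schemas[0]}
        return {"type": "array", "items": {"anyOf": item_schemas}}
    if isinstance(value, dict):
        return {
            "type": "object",
            "properties": {k: sketch_infer_type(v) for k, v in value.items()},
        }
    raise TypeError(f"unsupported type: {type(value).__name__}")


print(sketch_infer_type([1, 2, 3]))
# {'type': 'array', 'items': {'type': 'integer'}}
```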
Schema strictness levels
Control how aggressively fields are marked required and constraints are applied:
from philiprehberger_schema_infer import infer
# Loose: no required fields, no numeric/string constraints
schema = infer(samples, strictness="loose")
# Normal (default): fields in all samples are required, constraints included
schema = infer(samples, strictness="normal")
# Strict: all fields required, additionalProperties set to False
schema = infer(samples, strictness="strict")
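To make the three levels concrete, the sketch below computes only the `required` and `additionalProperties` parts that each level would produce, following the comments above. It is an illustration under those stated rules, not the package's implementation; `strictness_keywords` is a hypothetical helper, and string/numeric constraints are left out.

```python
from typing import Any


def strictness_keywords(samples: list[dict[str, Any]], strictness: str = "normal") -> dict:
    """Return the required/additionalProperties keywords for a strictness level."""
    all_keys = set().union(*(s.keys() for s in samples))
    common = set.intersection(*(set(s) for s in samples)) if samples else set()

    if strictness == "loose":
        return {}  # nothing is required
    if strictness == "normal":
        # only keys present in every sample are required
        return {"required": sorted(common)} if common else {}
    if strictness == "strict":
        # every observed key is required and unknown keys are rejected
        return {"required": sorted(all_keys), "additionalProperties": False}
    raise ValueError(f"unknown strictness: {strictness!r}")


samples = [
    {"name": "Alice", "age": 30, "active": True},
    {"name": "Bob", "age": 25, "email": "bob@test.com"},
]
print(strictness_keywords(samples, "loose"))   # {}
print(strictness_keywords(samples, "normal"))  # {'required': ['age', 'name']}
print(strictness_keywords(samples, "strict"))
# {'required': ['active', 'age', 'email', 'name'], 'additionalProperties': False}
```

In this sketch, a strict schema built from samples with differing keys would reject any sample that lacks one of the observed keys, so strict mode suits homogeneous data best.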
Custom format detection
Register domain-specific regex patterns for format detection:
from philiprehberger_schema_infer import register_format, infer_type
register_format("phone", r"^\+\d{1,3}-\d{3,14}$")
register_format("credit-card", r"^\d{4}-\d{4}-\d{4}-\d{4}$")
infer_type("+1-5551234567")
# {"type": "string", "format": "phone"}
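Conceptually, a format registry of this kind is a mapping from format names to compiled regular expressions that is consulted for each string value. The sketch below shows one way that could work with the standard `re` module. The names `sketch_register_format` and `sketch_detect_format` are hypothetical, and the first-match-wins order is an assumption, not documented behavior of the package.

```python
import re
from typing import Optional

# Registry of format name -> compiled pattern (insertion order is preserved).
_FORMATS: dict[str, re.Pattern] = {}


def sketch_register_format(name: str, pattern: str) -> None:
    """Compile `pattern` once and store it under `name`."""
    _FORMATS[name] = re.compile(pattern)


def sketch_detect_format(value: str) -> Optional[str]:
    """Return the first registered format whose pattern matches, else None."""
    for name, pattern in _FORMATS.items():
        if pattern.search(value):
            return name
    return None


sketch_register_format("phone", r"^\+\d{1,3}-\d{3,14}$")
sketch_register_format("credit-card", r"^\d{4}-\d{4}-\d{4}-\d{4}$")

print(sketch_detect_format("+1-5551234567"))        # phone
print(sketch_detect_format("1234-5678-9012-3456"))  # credit-card
print(sketch_detect_format("hello"))                # None
```

Anchoring patterns with `^` and `$`, as the examples do, keeps a pattern from matching a substring of a longer value.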
Merge schemas
Combine multiple inferred schemas with union/intersection logic for required fields:
from philiprehberger_schema_infer import merge_schemas
# schema_a, schema_b, schema_c: object schemas, e.g. returned by earlier infer() calls
merged = merge_schemas(schema_a, schema_b, schema_c)
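The text above does not spell out where union and where intersection applies. One plausible reading, assumed here, is that properties are unioned while `required` is intersected, so that the merged schema still accepts data valid under any input schema. The sketch below implements that reading; `sketch_merge` is a hypothetical name, the `anyOf` treatment of conflicting property schemas is an assumption, and the real `merge_schemas` may behave differently.

```python
def sketch_merge(*schemas: dict) -> dict:
    """Merge object schemas: union of properties, intersection of required."""
    if len(schemas) < 2:
        raise ValueError("need at least two schemas")

    properties: dict[str, dict] = {}
    for schema in schemas:
        for key, prop in schema.get("properties", {}).items():
            if key not in properties:
                properties[key] = prop
            elif properties[key] != prop:
                # Conflicting definitions: accept either one via anyOf.
                existing = properties[key]
                options = existing["anyOf"] if set(existing) == {"anyOf"} else [existing]
                if prop not in options:
                    options = options + [prop]
                properties[key] = {"anyOf": options}

    # A key stays required only if every input schema requires it.
    required = set(schemas[0].get("required", []))
    for schema in schemas[1:]:
        required &= set(schema.get("required", []))

    merged: dict = {"type": "object", "properties": properties}
    if required:
        merged["required"] = sorted(required)
    return merged


schema_a = {
    "type": "object",
    "properties": {"id": {"type": "integer"}, "name": {"type": "string"}},
    "required": ["id", "name"],
}
schema_b = {
    "type": "object",
    "properties": {"id": {"type": "string"}, "email": {"type": "string"}},
    "required": ["id"],
}
merged = sketch_merge(schema_a, schema_b)
print(merged["required"])          # ['id']
print(merged["properties"]["id"])  # {'anyOf': [{'type': 'integer'}, {'type': 'string'}]}
```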
API
| Function | Description |
|---|---|
| `infer(samples, *, strictness="normal")` | Infer JSON Schema from a list of dicts. Supports "loose", "normal", and "strict" levels. |
| `infer_type(value)` | Infer schema type for a single value |
| `merge_schemas(*schemas)` | Merge two or more schemas into one accepting any of them |
| `register_format(name, pattern)` | Register a custom regex pattern for string format detection |
| `to_json_schema(samples, *, strictness="normal")` | Wraps `infer()` output with `$schema` URI for draft 2020-12 |
Development
pip install -e .
python -m pytest tests/ -v
License
File details
Details for the file philiprehberger_schema_infer-0.3.1.tar.gz.
File metadata
- Download URL: philiprehberger_schema_infer-0.3.1.tar.gz
- Upload date:
- Size: 8.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0809cebbbca98b23301b54e04af8662884a7dbe95cfc253fc1c3a5935ddc6bb0 |
| MD5 | 4ca13eedfa9a48ba213cfc88e2a3d216 |
| BLAKE2b-256 | 4240ec4e7f0244b659bb0f771c6cdc766f18fcbe8027d914173228d01e6cc08c |
File details
Details for the file philiprehberger_schema_infer-0.3.1-py3-none-any.whl.
File metadata
- Download URL: philiprehberger_schema_infer-0.3.1-py3-none-any.whl
- Upload date:
- Size: 6.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.13
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | e1b73ddbea4ad5455e83aa4d7eed86543a45502aff0320279073c8836ea5a2f5 |
| MD5 | 16e3b2741460f00614f82462eb0586e9 |
| BLAKE2b-256 | 5f6e80f21c5acfa407f7aa0a6cd4c4d47214a21c1ec4391b46740bb6a18eeaf7 |