A library for comparing JSON documents and schemas and generating new schemas
genschema
A powerful, intelligent library for generating JSON Schema from multiple JSON instances with smart merging, advanced inference, and modular refinements.
Features
- Intelligent Merging: Combines multiple JSON instances into a single schema
- Configurable Combinators: Use anyOf or oneOf for conflicting types/properties
- Advanced Inference: Automatic format detection (email, uuid, date-time, etc.)
- Required & Empty Handling: Smart inference of required, minProperties, minItems, etc.
- Pseudo-Array Detection: Treats inhomogeneous arrays as object-like structures when needed
- Modular Pipeline: Chain of configurable comparators for full control
- CLI & Python API: Flexible usage from the command line or from code
- Rich Output: Colored console feedback with timing and instance count
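To make the format detection feature more concrete, here is a minimal sketch of how string formats such as email, uuid, and date-time could be guessed with the standard library alone. It is not genschema's implementation: the names `guess_format` and `infer_format` and both regular expressions are assumptions made for illustration.

```python
import re
from datetime import datetime
from typing import Iterable, Optional

# Deliberately loose patterns; production-grade validation is stricter.
_EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
_UUID_RE = re.compile(
    r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)


def guess_format(value: str) -> Optional[str]:
    """Return a JSON Schema format name for one string, or None."""
    if _UUID_RE.match(value):
        return "uuid"
    if _EMAIL_RE.match(value):
        return "email"
    if "T" in value:
        # Older Python versions reject a trailing "Z", so normalise it first.
        candidate = value[:-1] + "+00:00" if value.endswith("Z") else value
        try:
            datetime.fromisoformat(candidate)
            return "date-time"
        except ValueError:
            pass
    return None


def infer_format(samples: Iterable[str]) -> Optional[str]:
    """Report a format only when every sample agrees on it."""
    formats = {guess_format(sample) for sample in samples}
    return formats.pop() if len(formats) == 1 else None
```

Requiring all samples to agree keeps the inferred schema from rejecting values that were present in the input data.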
Quick Start
Installation
pip install genschema
30-Second Python Example
from genschema import Converter, PseudoArrayHandler
from genschema.comparators import (
    FormatComparator,
    RequiredComparator,
    EmptyComparator,
    DeleteElement,
)

conv = Converter(
    pseudo_handler=PseudoArrayHandler(),
    base_of="anyOf",  # or "oneOf"
)
# Add JSON data (files, dicts, or existing schemas)
conv.add_json("example1.json")
conv.add_json("example2.json")
conv.add_json({"name": "Alice", "email": "alice@example.com"})
# Register optional refinements
conv.register(FormatComparator())
conv.register(RequiredComparator())
conv.register(EmptyComparator())
conv.register(DeleteElement())
conv.register(DeleteElement("isPseudoArray"))
# Generate schema
result = conv.run()
print(result) # Pretty-printed JSON Schema
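The base_of option chooses between two JSON Schema combinators with different validation semantics: anyOf accepts a value that matches at least one branch, while oneOf accepts a value that matches exactly one. The sketch below illustrates that difference with a type-only checker. It is independent of genschema, and the helper names are hypothetical.

```python
from typing import Any, Dict, List

_JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "null": type(None),
    "array": list,
    "object": dict,
}


def matches_type(value: Any, schema: Dict[str, str]) -> bool:
    """Check a value against a schema that only carries a "type" keyword."""
    # bool is a subclass of int in Python, so keep it out of numeric types.
    if isinstance(value, bool) and schema["type"] in ("number", "integer"):
        return False
    return isinstance(value, _JSON_TYPES[schema["type"]])


def count_matches(value: Any, branches: List[Dict[str, str]]) -> int:
    return sum(matches_type(value, branch) for branch in branches)


def valid_any_of(value: Any, branches: List[Dict[str, str]]) -> bool:
    """anyOf: at least one branch must match."""
    return count_matches(value, branches) >= 1


def valid_one_of(value: Any, branches: List[Dict[str, str]]) -> bool:
    """oneOf: exactly one branch must match."""
    return count_matches(value, branches) == 1


branches = [{"type": "integer"}, {"type": "number"}]
print(valid_any_of(5, branches), valid_one_of(5, branches))
```

An integer matches both branches here, so it passes anyOf but fails oneOf. Overlapping branches are the main reason to prefer anyOf when merged instances are not mutually exclusive.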
CLI Usage
# Basic: single or multiple files
genschema input1.json input2.json -o schema.json
# Use oneOf instead of anyOf
genschema *.json --base-of oneOf -o schema.json
# Disable refinements
genschema data.json --no-format --no-required --no-pseudo-array
# Read from stdin
cat data.json | genschema - -o schema.json
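When the CLI is driven from a script, it can help to assemble the argument list in one place. The following is a sketch only: `build_genschema_command` is a hypothetical helper, it uses just the flags shown in the examples above, and it assumes anyOf is the default when --base-of is omitted.

```python
from typing import List, Sequence


def build_genschema_command(
    inputs: Sequence[str],
    output: str,
    base_of: str = "anyOf",
    use_format: bool = True,
    use_required: bool = True,
    use_pseudo_array: bool = True,
) -> List[str]:
    """Build an argument list for the genschema CLI without running it."""
    if base_of not in ("anyOf", "oneOf"):
        raise ValueError("base_of must be 'anyOf' or 'oneOf'")
    cmd = ["genschema", *inputs]
    # Assumption: anyOf is the default, so the flag is only needed for oneOf.
    if base_of != "anyOf":
        cmd += ["--base-of", base_of]
    if not use_format:
        cmd.append("--no-format")
    if not use_required:
        cmd.append("--no-required")
    if not use_pseudo_array:
        cmd.append("--no-pseudo-array")
    cmd += ["-o", output]
    return cmd


# The list can then be passed to subprocess.run(cmd, check=True).
print(build_genschema_command(["input1.json", "input2.json"], "schema.json"))
```

Passing a list rather than a shell string avoids quoting problems with file names that contain spaces.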
Comparison with GenSON
| Feature | genschema | GenSON |
|---|---|---|
| Multiple Instance Merging | Yes | Yes |
| Variant Type Handling | Configurable anyOf or oneOf | anyOf only |
| Format Inference | Yes (email, date-time, uuid, uri, etc.) | No |
| Required Properties | Configurable inference | Yes (present in all objects) |
| Empty/Min-Max Handling | Yes (minProperties, minItems, etc.) | Limited |
| Pseudo-Array Detection | Yes | No |
| Modular Extensions | Comparator pipeline (easy to add/remove) | SchemaStrategy subclasses |
| CLI Support | Full-featured with rich output | Basic (genson) |
| Performance (avg. benchmark) | ~2.1× slower | Faster |
Note: Performance measured on static datasets of varying complexity. genschema prioritizes richer inference and flexibility over raw speed.
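The "present in all objects" rule from the Required Properties row can be sketched as a set intersection over the keys of every instance. This is an illustration of that rule only; genschema's configurable inference may behave differently, and `infer_required` is a hypothetical name.

```python
from typing import Any, Dict, List, Sequence


def infer_required(instances: Sequence[Dict[str, Any]]) -> List[str]:
    """Return the keys found in every object, sorted for stable output."""
    key_sets = [set(obj) for obj in instances]
    if not key_sets:
        return []
    return sorted(set.intersection(*key_sets))


samples = [
    {"name": "Alice", "email": "alice@example.com"},
    {"name": "Bob"},
]
print(infer_required(samples))
```

Here only "name" appears in both objects, so "email" would remain an optional property.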
Architecture
Modular pipeline design for clean, extensible code:
+-----------------+     +-----------------+
|   Input JSONs   |     |  Input Schemas  |
+-----------------+     +-----------------+
         |                       |
         +-----------+-----------+
                     v
             +---------------+
             | Pipeline Run  |
             +---------------+
                     v
           +-------------------+
           |   Process Layer   |<-----+
           +-------------------+      |
                     |                |
                     v                |
          +---------------------+     |
          |  Comparators Chain  |-----+
          +---------------------+
                     |
                     v
             +---------------+
             |    Result     |
             +---------------+
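The loop between the process layer and the comparators chain can be pictured as a recursive function: each layer runs every comparator over the values at that level, then descends into object properties. The sketch below is a simplified model under that assumption, not genschema's actual code; all function names are hypothetical.

```python
from typing import Any, Callable, Dict, List

Schema = Dict[str, Any]
Comparator = Callable[[Schema, List[Any]], Schema]


def json_type(value: Any) -> str:
    """Map a Python value to its JSON Schema type name."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # must precede int: bool subclasses int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"not a JSON value: {value!r}")


def type_comparator(schema: Schema, instances: List[Any]) -> Schema:
    """Record one type, or an anyOf of types when instances disagree."""
    names = sorted({json_type(v) for v in instances})
    out = dict(schema)
    if len(names) == 1:
        out["type"] = names[0]
    else:
        out["anyOf"] = [{"type": name} for name in names]
    return out


def required_comparator(schema: Schema, instances: List[Any]) -> Schema:
    """Mark keys present in every object as required."""
    if not all(isinstance(v, dict) for v in instances):
        return schema
    out = dict(schema)
    out["required"] = sorted(set.intersection(*(set(v) for v in instances)))
    return out


def process_layer(instances: List[Any], comparators: List[Comparator]) -> Schema:
    """Run the comparator chain on one level, then recurse into properties."""
    if not instances:
        raise ValueError("at least one instance is required")
    schema: Schema = {}
    for comparator in comparators:
        schema = comparator(schema, instances)
    if all(isinstance(v, dict) for v in instances):
        keys = sorted(set().union(*instances))
        schema["properties"] = {
            key: process_layer([v[key] for v in instances if key in v], comparators)
            for key in keys
        }
    return schema


result = process_layer(
    [{"name": "Alice", "age": 30}, {"name": "Bob", "age": "unknown"}],
    [type_comparator, required_comparator],
)
print(result)
```

Because each comparator receives the schema produced by the previous one, steps can be added or removed without touching the traversal logic.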
Development
Setup
git clone https://github.com/Miskler/genschema.git
cd genschema
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -e ".[dev]" # or make install-dev if Makefile exists
Common Commands
make test # Run tests with coverage
make lint # Lint code
make type-check # mypy checking
make format # Format with black
make docs # Build documentation
Documentation
Contributing
We welcome contributions!
Fork the repository, create a feature branch, and submit a pull request.
Ensure that tests pass, the code is formatted with black, and mypy type checking succeeds.
make test
make lint
make type-check
License
AGPL-3.0 License. See the LICENSE file for details.
Made with ❤️ for developers working with evolving JSON data
File details
Details for the file genschema-0.1.3.tar.gz.
File metadata
- Download URL: genschema-0.1.3.tar.gz
- Upload date:
- Size: 53.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 79413b66fb28dcca0e2ab3bc60de412436d984ffec4e15d021f7ada0ab380ab3 |
| MD5 | 0b55d99df246c2c79ee93adccd10cd51 |
| BLAKE2b-256 | 531594f4c2b6e2b854391fb0f2544f2fd16f0a72e132e7de96270bdc76d3c6ee |
Provenance
The following attestation bundles were made for genschema-0.1.3.tar.gz:
Publisher: release.yml on Miskler/genschema
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: genschema-0.1.3.tar.gz
- Subject digest: 79413b66fb28dcca0e2ab3bc60de412436d984ffec4e15d021f7ada0ab380ab3
- Sigstore transparency entry: 971661318
- Sigstore integration time:
- Permalink: Miskler/genschema@829518a2ff3dcd2509d6f7d15bf61d99789b8b6b
- Branch / Tag: refs/heads/main
- Owner: https://github.com/Miskler
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@829518a2ff3dcd2509d6f7d15bf61d99789b8b6b
- Trigger Event: workflow_dispatch
File details
Details for the file genschema-0.1.3-py3-none-any.whl.
File metadata
- Download URL: genschema-0.1.3-py3-none-any.whl
- Upload date:
- Size: 42.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 39a4c3d1160cc1b669ce4e3272e626e51016b65083ad04d0230fb258f4990f0e |
| MD5 | 8dc49ee05c59e19c1e2de45be71bb933 |
| BLAKE2b-256 | be457003e8749254881ef7162426d1ce8739979f887539af39750fca07b02db1 |
Provenance
The following attestation bundles were made for genschema-0.1.3-py3-none-any.whl:
Publisher: release.yml on Miskler/genschema
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: genschema-0.1.3-py3-none-any.whl
- Subject digest: 39a4c3d1160cc1b669ce4e3272e626e51016b65083ad04d0230fb258f4990f0e
- Sigstore transparency entry: 971661321
- Sigstore integration time:
- Permalink: Miskler/genschema@829518a2ff3dcd2509d6f7d15bf61d99789b8b6b
- Branch / Tag: refs/heads/main
- Owner: https://github.com/Miskler
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@829518a2ff3dcd2509d6f7d15bf61d99789b8b6b
- Trigger Event: workflow_dispatch