Infer schema from JSON or CSV data — output as TypeScript, Pydantic, or Markdown
sniff-schema
Infer schema from any JSON or CSV — instantly output as TypeScript, Pydantic, or Markdown.
You get a JSON file or CSV from an API, a colleague, or a legacy system. No docs. No types. Just raw data. sniff-schema reads it and tells you exactly what's in it — field names, types, nullability, null percentages, and sample values — then outputs a ready-to-use TypeScript interface or Pydantic model.
No LLM. No cloud. No auth. Pure local analysis.
Install
pip install sniff-schema
Or run the single-file script without installing the package (its dependencies are still required):
pip install httpx "rich>=13" "typer>=0.12"
python sniff_schema.py data.json
Usage
# Rich terminal table (default)
sniff-schema data.json
# TypeScript interface
sniff-schema data.json --format typescript
# Pydantic v2 model
sniff-schema data.json --format pydantic
# Markdown table — save to file
sniff-schema data.json --format markdown --output schema.md
# Pipe directly from curl
curl https://api.example.com/users | sniff-schema - --format typescript
# CSV works too
sniff-schema report.csv --format pydantic
# Large files — sample just 100 rows
sniff-schema big_dataset.csv --sample 100
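As a rough illustration of what CSV dialect sniffing combined with a record cap can look like, here is a minimal sketch using only the standard library. The function name `read_csv_sample` and the candidate delimiter set are assumptions made for this example, not sniff-schema's actual internals; only the default cap of 200 comes from the options listed in this README.

```python
import csv
import io
from itertools import islice


def read_csv_sample(text: str, sample: int = 200) -> list[dict[str, str]]:
    """Sniff the delimiter, then return at most `sample` rows as dicts."""
    # csv.Sniffer guesses the dialect from a prefix of the data.
    # The candidate delimiters below are an assumption for this sketch.
    dialect = csv.Sniffer().sniff(text[:4096], delimiters=",\t|;")
    reader = csv.DictReader(io.StringIO(text), dialect=dialect)
    # islice stops early, so rows past the cap are never parsed.
    return list(islice(reader, sample))


rows = read_csv_sample("id|name|age\n1|Ada|36\n2|Linus|\n3|Grace|45\n", sample=2)
print(rows)
```

Capping with `islice` rather than slicing a fully parsed list keeps memory use proportional to the sample size, which is the point of sampling a large file.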
What it detects
| Thing | How |
|---|---|
| Integers, floats, booleans, strings | Native Python types |
| ISO 8601 dates / datetimes | Regex match (2024-01-15, 2024-01-15T10:30:00Z) |
| Numeric strings, boolean strings | Pattern matching on string values |
| Nullable fields | Counts null, missing keys, and empty strings |
| Mixed types | Flags when a field contains more than one type |
| Nested JSON | Flattens to dot-notation (user.address.city) |
| JSON arrays | Handles root arrays, wrapped arrays (data, results, items) |
| NDJSON / JSON Lines | Auto-detected line-by-line |
| CSV dialect | Auto-sniffed (comma, tab, pipe, etc.) |
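The detection rules in the table can be sketched in a few dozen lines of Python. This is a simplified illustration under stated assumptions, not the tool's real code: the names `flatten`, `type_name`, and `infer`, the type labels, and the exact date regex are all hypothetical.

```python
import re

# Assumed pattern: a date, optionally followed by a time and a UTC offset.
ISO_DATE = re.compile(
    r"^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}(:\d{2})?(\.\d+)?(Z|[+-]\d{2}:\d{2})?)?$"
)


def flatten(record: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into dot-notation keys, e.g. user.address.city."""
    flat = {}
    for key, value in record.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, f"{path}."))
        else:
            flat[path] = value
    return flat


def type_name(value) -> str:
    """Map one non-null value to a coarse type label."""
    # bool must be checked before int: bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str) and ISO_DATE.match(value):
        return "datetime" if "T" in value else "date"
    return "string" if isinstance(value, str) else type(value).__name__


def infer(records: list[dict]) -> dict[str, dict]:
    """Collect observed types and the null percentage for every field."""
    flat_records = [flatten(r) for r in records]
    total = len(flat_records) or 1  # avoid dividing by zero on empty input
    schema = {}
    for field in sorted({key for rec in flat_records for key in rec}):
        types, nulls = set(), 0
        for rec in flat_records:
            value = rec.get(field)  # a missing key is counted like null
            if value is None or value == "":
                nulls += 1
            else:
                types.add(type_name(value))
        schema[field] = {
            "types": sorted(types),
            "nullable": nulls > 0,
            "null_pct": round(100 * nulls / total, 1),
            "mixed": len(types) > 1,
        }
    return schema


records = [
    {"id": 1, "user": {"address": {"city": "Oslo"}}, "created_at": "2024-01-15"},
    {"id": "2", "user": {"address": {"city": None}}, "created_at": "2024-02-01"},
    {"id": 3, "user": {"address": {}}},
]
schema = infer(records)
```

In this sample, `id` is flagged as mixed because it appears as both an integer and a string, and `user.address.city` is nullable because one record holds `None` and another omits the key entirely.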
Output formats
--format typescript
interface InferredSchema {
  id: number;
  name: string;
  age?: number | null;
  email?: string | null;
  score?: number | null;
  active: boolean;
  created_at: string;
  country: string;
  notes?: string | null;
}
// Inferred from 5 record(s)
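The sample output suggests a simple mapping: integers and floats become `number`, and nullable fields become optional with an explicit `| null`. A minimal sketch of such a renderer follows; `to_typescript`, `TS_TYPES`, and the `(type_label, nullable)` input shape are assumptions for illustration and may differ from how sniff-schema generates its output.

```python
# Hypothetical mapping from inferred type labels to TypeScript types.
TS_TYPES = {
    "integer": "number",
    "float": "number",
    "boolean": "boolean",
    "string": "string",
}


def to_typescript(fields: dict[str, tuple[str, bool]], name: str = "InferredSchema") -> str:
    """Render {field: (type_label, nullable)} as a TypeScript interface."""
    lines = [f"interface {name} {{"]
    for field, (label, nullable) in fields.items():
        ts_type = TS_TYPES.get(label, "unknown")
        if nullable:
            # Nullable fields become optional and also admit an explicit null.
            lines.append(f"  {field}?: {ts_type} | null;")
        else:
            lines.append(f"  {field}: {ts_type};")
    lines.append("}")
    return "\n".join(lines)


rendered = to_typescript(
    {"id": ("integer", False), "age": ("integer", True), "active": ("boolean", False)}
)
print(rendered)
```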
--format pydantic
from pydantic import BaseModel
from typing import Optional
class InferredSchema(BaseModel):
    id: int
    name: str
    age: Optional[int] = None
    email: Optional[str] = None
    score: Optional[float] = None
    active: bool
    created_at: str
    country: str
    notes: Optional[str] = None
# Inferred from 5 record(s)
--format markdown
Outputs a GitHub-flavored Markdown table — paste directly into your wiki, Notion, or PR description.
All options
Arguments:
  SOURCE  Path to JSON/CSV file, or '-' to read from stdin.
Options:
  -f, --format [table|typescript|pydantic|markdown]  Output format (default: table)
  -s, --sample  Max records to sample (default: 200, max: 100000)
  -o, --output  Write output to a file instead of stdout
  --help  Show this message and exit.
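To illustrate how a single SOURCE argument could cover files, stdin, root arrays, wrapped arrays, and NDJSON, here is a hedged sketch. The helpers `read_source` and `parse_records` are hypothetical names, and the fallback order (whole-document JSON first, then line-by-line) is an assumption rather than the tool's documented behavior; the wrapper keys are the ones listed in this README.

```python
import json
import sys

# Wrapper keys named in the "What it detects" table.
WRAPPER_KEYS = ("data", "results", "items")


def read_source(source: str) -> str:
    """Read a file path, or stdin when the source is '-'."""
    if source == "-":
        return sys.stdin.read()
    with open(source, encoding="utf-8") as handle:
        return handle.read()


def parse_records(text: str) -> list:
    """Return a list of records from JSON, wrapped JSON, or NDJSON text."""
    try:
        payload = json.loads(text)
    except json.JSONDecodeError:
        # Not a single JSON document: try one JSON value per non-empty line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    if isinstance(payload, list):
        return payload
    if isinstance(payload, dict):
        for key in WRAPPER_KEYS:
            if isinstance(payload.get(key), list):
                return payload[key]
        return [payload]  # a lone object is treated as one record
    raise ValueError("expected a JSON object or array")


root_array = parse_records('[{"id": 1}]')
wrapped = parse_records('{"results": [{"id": 1}, {"id": 2}]}')
ndjson = parse_records('{"id": 1}\n{"id": 2}\n')
```

Trying a whole-document parse first keeps pretty-printed multi-line JSON from being mistaken for NDJSON.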
Where to install / publish this tool
| Platform | Command | Notes |
|---|---|---|
| PyPI | pip install sniff-schema | Primary distribution |
| Homebrew (tap) | brew install gitwingo/tap/sniff-schema | macOS/Linux users who avoid pip |
| Conda-forge | conda install sniff-schema | Data science audience |
| GitHub Releases | Single-file .py download | For users who want zero install |
| pipx | pipx install sniff-schema | Isolated install, great for CLI tools |
Recommended publishing order: PyPI first → GitHub Release (attach sniff_schema.py) → Conda-forge (after traction) → Homebrew tap.
Why conda-forge matters for this tool specifically
sniff-schema is a good fit for conda-forge because its primary audience, data scientists and ML engineers, often works in conda environments. Submitting a conda-forge recipe makes the tool installable through that ecosystem's standard channel. See the conda-forge contribution docs to submit a recipe after publishing to PyPI.
Development
git clone https://github.com/gitwingo/sniff-schema
cd sniff-schema
pip install -e ".[dev]"
Support
If sniff-schema has been useful to you, consider supporting its development:
Connect
- GitHub: @gitwingo
- Reddit: u/gitwingo
- X / Twitter: @gitwingo
License
MIT © gitwingo
Download files
File details
Details for the file sniff_schema-0.1.0.tar.gz.
File metadata
- Download URL: sniff_schema-0.1.0.tar.gz
- Upload date:
- Size: 100.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 843d1301a38eedda23e20e35e2e15e5f00efd8306d922d67db0e2ec6c6a88856 |
| MD5 | 106ee0e3931fb187847c458e678e4b51 |
| BLAKE2b-256 | 2fd61c4fdbb66ad6ce9357366efa05338036b3bb39c91266a32f7076a97328a1 |
Provenance
The following attestation bundles were made for sniff_schema-0.1.0.tar.gz:
Publisher: publish.yml on gitwingo/sniff-schema
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: sniff_schema-0.1.0.tar.gz
  - Subject digest: 843d1301a38eedda23e20e35e2e15e5f00efd8306d922d67db0e2ec6c6a88856
- Sigstore transparency entry: 1429564454
- Sigstore integration time:
- Permalink: gitwingo/sniff-schema@f055fa6445aa9ae55ff93652f0a36e80bf64eae3
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/gitwingo
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@f055fa6445aa9ae55ff93652f0a36e80bf64eae3
- Trigger Event: release
File details
Details for the file sniff_schema-0.1.0-py3-none-any.whl.
File metadata
- Download URL: sniff_schema-0.1.0-py3-none-any.whl
- Upload date:
- Size: 9.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2f98b032c4faef65c8c8c4a74db2b1d4f1953b352141f77cf2d328d4952926e6 |
| MD5 | d72a94a5024a3508ea297ccc3a3b845b |
| BLAKE2b-256 | f5bb98fe24df3ddeb7919ff6b0c5439d88a4d0076ed34c8cffed0ed151bf615b |
Provenance
The following attestation bundles were made for sniff_schema-0.1.0-py3-none-any.whl:
Publisher: publish.yml on gitwingo/sniff-schema
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: sniff_schema-0.1.0-py3-none-any.whl
  - Subject digest: 2f98b032c4faef65c8c8c4a74db2b1d4f1953b352141f77cf2d328d4952926e6
- Sigstore transparency entry: 1429564462
- Sigstore integration time:
- Permalink: gitwingo/sniff-schema@f055fa6445aa9ae55ff93652f0a36e80bf64eae3
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/gitwingo
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@f055fa6445aa9ae55ff93652f0a36e80bf64eae3
- Trigger Event: release