Skip to main content

A simple, LLM-friendly schema definition library

Project description

Simple Schema

A simple, LLM-friendly schema definition library for Python that converts intuitive string syntax into structured data schemas.

๐Ÿš€ Quick Start

Installation

pip install simple-schema

30-Second Example

from simple_schema import string_to_json_schema

# Define a schema with simple, readable syntax
schema = string_to_json_schema("name:string, email:email, age:int?")

# Get a complete JSON Schema ready for validation
print(schema)
# {
#   "type": "object",
#   "properties": {
#     "name": {"type": "string"},
#     "email": {"type": "string", "format": "email"},
#     "age": {"type": "integer"}
#   },
#   "required": ["name", "email"]
# }

Create Pydantic Models Directly

from simple_schema import string_to_model

# Create Pydantic model from string syntax
UserModel = string_to_model("name:string, email:email, age:int?")

# Use immediately
user = UserModel(name="Alice", email="alice@example.com")
print(user.model_dump_json())  # {"name": "Alice", "email": "alice@example.com"}

๐ŸŽฏ What Simple Schema Does

Simple Schema takes human-readable text descriptions and converts them into structured schemas for data validation, extraction, and API documentation. Perfect for LLM data extraction, API development, and configuration validation.

Input: Human-readable string syntax Output: JSON Schema, Pydantic models, or OpenAPI specifications

๐Ÿš€ Core Functions & Use Cases

๐Ÿ“ Schema Conversion Matrix

Forward Conversions (Source โ†’ Target):

Function Input Output Use Case
string_to_json_schema() String syntax JSON Schema dict Main conversion - string to JSON Schema
string_to_model() String syntax Pydantic model class Direct path - string to Pydantic model
string_to_model_code() String syntax Python code string Code generation - for templates
string_to_openapi() String syntax OpenAPI schema dict Direct path - string to OpenAPI
json_schema_to_model() JSON Schema dict Pydantic model class When you already have JSON Schema
json_schema_to_openapi() JSON Schema dict OpenAPI schema dict When you already have JSON Schema

Reverse Conversions (Target โ†’ Source):

Function Input Output Use Case
model_to_string() Pydantic model String syntax Schema introspection - model to string
model_to_json_schema() Pydantic model JSON Schema dict Export - model to JSON Schema
json_schema_to_string() JSON Schema dict String syntax Migration - JSON Schema to string
openapi_to_string() OpenAPI schema String syntax Import - OpenAPI to string
openapi_to_json_schema() OpenAPI schema JSON Schema dict Conversion - OpenAPI to JSON Schema

๐Ÿ” Data Validation Functions

Function Input Output Use Case
validate_to_dict() Data + schema Validated dict API responses - clean dicts
validate_to_model() Data + schema Pydantic model Business logic - typed models
validate_string_syntax() String syntax Validation result Check syntax and get feedback

๐ŸŽจ Function Decorators

Decorator Purpose Returns Use Case
@returns_dict() Auto-validate to dict Validated dict API endpoints
@returns_model() Auto-validate to model Pydantic model Business logic

๐Ÿ”ง Utility Functions

Function Purpose Returns Use Case
get_model_info() Model introspection Model details dict Debugging & analysis
validate_schema_compatibility() Schema validation Compatibility info Schema validation

๐ŸŽฏ Key Scenarios

  • ๐Ÿค– LLM Data Extraction: Define extraction schemas that LLMs can easily follow
  • ๐Ÿ”ง API Development: Generate Pydantic models and OpenAPI docs from simple syntax
  • โœ… Data Validation: Create robust validation schemas with minimal code
  • ๐Ÿ“‹ Configuration: Define and validate application configuration schemas
  • ๐Ÿ”„ Data Transformation: Convert between different schema formats

โœ… Validate Data

from simple_schema import validate_to_dict, validate_to_model

# Example raw data (from API, user input, etc.)
raw_data = {
    "name": "John Doe",
    "email": "john@example.com",
    "age": "25",  # String that needs conversion
    "extra_field": "ignored"  # Will be filtered out
}

# Validate to clean dictionaries (perfect for API responses)
user_dict = validate_to_dict(raw_data, "name:string, email:email, age:int?")
print(user_dict)  # {"name": "John Doe", "email": "john@example.com", "age": 25}

# Validate to typed models (perfect for business logic)
user_model = validate_to_model(raw_data, "name:string, email:email, age:int?")
print(user_model.name)  # "John Doe" - Full type safety
print(user_model.age)   # 25 - Converted to int

๐ŸŽจ Function Decorators

from simple_schema import returns_dict, returns_model
import uuid

# Auto-validate function returns to dicts
@returns_dict("id:string, name:string, active:bool")
def create_user(name):
    # Input: "Alice"
    # Function returns unvalidated dict
    return {"id": str(uuid.uuid4()), "name": name, "active": True, "extra": "ignored"}
    # Output: {"id": "123e4567-...", "name": "Alice", "active": True}
    # Note: 'extra' field filtered out, types validated

# Auto-validate function returns to models
@returns_model("name:string, email:string")
def process_user(raw_input):
    # Input: {"name": "Bob", "email": "bob@test.com", "junk": "data"}
    # Function returns unvalidated dict
    return {"name": raw_input["name"], "email": raw_input["email"], "junk": "data"}
    # Output: UserModel(name="Bob", email="bob@test.com")
    # Note: Returns typed Pydantic model, 'junk' field filtered out

# Usage examples:
user_dict = create_user("Alice")  # Returns validated dict
user_model = process_user({"name": "Bob", "email": "bob@test.com"})  # Returns Pydantic model
print(user_model.name)  # "Bob" - Full type safety

๐ŸŒ FastAPI Integration

from simple_schema import string_to_model, returns_dict

# Create models for FastAPI
UserRequest = string_to_model("name:string, email:email")

@app.post("/users")
@returns_dict("id:int, name:string, email:string")
def create_user_endpoint(user: UserRequest):
    return {"id": 123, "name": user.name, "email": user.email}

Features: Arrays [{name:string}], nested objects {profile:{bio:text?}}, enums, constraints, decorators.

๐Ÿ“– Complete Documentation

๐ŸŽ“ More Examples

๐ŸŒฑ Simple Example - Basic User Data

from simple_schema import string_to_json_schema

# Define schema using intuitive string syntax
schema = string_to_json_schema("""
name:string(min=1, max=100),
email:email,
age:int(0, 120)?,
active:bool
""")

# Result: JSON Schema dictionary ready for validation
print(schema)
# {
#   "type": "object",
#   "properties": {
#     "name": {"type": "string", "minLength": 1, "maxLength": 100},
#     "email": {"type": "string", "format": "email"},
#     "age": {"type": "integer", "minimum": 0, "maximum": 120},
#     "active": {"type": "boolean"}
#   },
#   "required": ["name", "email", "active"]
# }

๐ŸŒณ Moderately Complex Example - Product Data

# Product with arrays and nested objects
schema = string_to_json_schema("""
{
    id:uuid,
    name:string,
    price:number(min=0),
    tags:[string]?,
    inventory:{
        in_stock:bool,
        quantity:int?
    }
}
""")

# Result: Complete JSON Schema with nested objects and arrays
print(schema)  # Full JSON Schema ready for validation

๐Ÿ“š For more examples and advanced syntax, see our detailed documentation

๐Ÿ”„ Output Formats & Results

๐Ÿ Pydantic Models (Python Classes)

from simple_schema import string_to_model

# Direct conversion: String syntax โ†’ Pydantic model
UserModel = string_to_model("name:string, email:email, active:bool")

# Use the model for validation
user = UserModel(name="John Doe", email="john@example.com", active=True)
print(user.model_dump_json())  # {"name": "John Doe", "email": "john@example.com", "active": true}

๐Ÿ”ง Code Generation (For Templates & Tools)

from simple_schema import string_to_model_code

# Generate Pydantic model code as a string
code = string_to_model_code("User", "name:string, email:email, active:bool")
print(code)

# Output:
# from pydantic import BaseModel
# from typing import Optional
#
# class User(BaseModel):
#     name: str
#     email: str
#     active: bool

# Perfect for code generators, templates, or saving to files
with open('models.py', 'w') as f:
    f.write(code)

๐ŸŒ OpenAPI Schemas (API Documentation)

from simple_schema import string_to_openapi

# Direct conversion: String syntax โ†’ OpenAPI schema
openapi_schema = string_to_openapi("name:string, email:email")
print(openapi_schema)
# {
#   "type": "object",
#   "properties": {
#     "name": {"type": "string"},
#     "email": {"type": "string", "format": "email"}
#   },
#   "required": ["name", "email"]
# }

๐Ÿ”„ Reverse Conversions (Universal Schema Converter)

Simple Schema provides complete bidirectional conversion between all schema formats!

โš ๏ธ Information Loss Notice: Reverse conversions (from JSON Schema/OpenAPI/Pydantic back to string syntax) may lose some information due to format differences. However, the resulting schemas are designed to cover the most common use cases and maintain functional equivalence for typical validation scenarios.

๐Ÿ” Schema Introspection

from simple_schema import model_to_string, string_to_model

# Create a model first
UserModel = string_to_model("name:string, email:email, active:bool")

# Reverse engineer it back to string syntax
schema_string = model_to_string(UserModel)
print(f"Model schema: {schema_string}")
# Output: "name:string, email:email, active:bool"

๐Ÿ“ฆ Migration & Import

from simple_schema import json_schema_to_string

# Example: Convert existing JSON Schema to Simple Schema syntax
json_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "email": {"type": "string", "format": "email"},
        "age": {"type": "integer"}
    },
    "required": ["name", "email"]
}

simple_syntax = json_schema_to_string(json_schema)
print(f"Converted: {simple_syntax}")
# Output: "name:string, email:email, age:int?"

๐Ÿ”ง Schema Comparison & Analysis

from simple_schema import string_to_model, model_to_string

# Create two versions of a model
UserV1 = string_to_model("name:string, email:email")
UserV2 = string_to_model("name:string, email:email, active:bool")

# Compare them
v1_str = model_to_string(UserV1)
v2_str = model_to_string(UserV2)

print("Schema changes:")
print(f"V1: {v1_str}")  # "name:string, email:email"
print(f"V2: {v2_str}")  # "name:string, email:email, active:bool"

๐ŸŽจ String Syntax Reference

Basic Types

  • string, int, number, bool โ†’ Basic data types
  • email, url, datetime, date, uuid, phone โ†’ Special validated types

Field Modifiers

  • field_name:type โ†’ Required field
  • field_name:type? โ†’ Optional field
  • field_name:type(constraints) โ†’ Field with validation

Common Patterns

  • string(min=1, max=100) โ†’ Length constraints
  • int(0, 120) โ†’ Range constraints
  • [string] โ†’ Simple arrays
  • [{name:string, email:email}] โ†’ Object arrays
  • status:enum(active, inactive) โ†’ Enum values
  • id:string|uuid โ†’ Union types

๐Ÿ“– Complete syntax guide: See docs/string-syntax.md for full reference

โœ… Validation

from simple_schema import validate_string_syntax

# Validate your schema syntax
result = validate_string_syntax("name:string, email:email, age:int?")

print(f"Valid: {result['valid']}")  # True
print(f"Features used: {result['features_used']}")  # ['basic_types', 'optional_fields']
print(f"Field count: {len(result['parsed_fields'])}")  # 3

# Example with invalid syntax
bad_result = validate_string_syntax("name:invalid_type")
print(f"Valid: {bad_result['valid']}")  # False
print(f"Errors: {bad_result['errors']}")  # ['Unknown type: invalid_type']

๐Ÿ—๏ธ Common Use Cases

๐Ÿค– LLM Data Extraction

from simple_schema import string_to_json_schema

# Define what data to extract
schema = string_to_json_schema("company:string, employees:[{name:string, email:email}], founded:int?")
# Use schema in LLM prompts for structured data extraction

๐Ÿ”ง FastAPI Development

from simple_schema import string_to_model

# Create model for FastAPI
UserModel = string_to_model("name:string, email:email")

# Use in FastAPI endpoints
@app.post("/users/")
async def create_user(user: UserModel):
    return {"id": 123, "name": user.name, "email": user.email}

๐Ÿ—๏ธ Code Generation & Templates

from simple_schema import string_to_model_code

# Generate model code
code = string_to_model_code("User", "name:string, email:email")
print(code)
# Output: Complete Pydantic model class as string

# Save to file
with open('user_model.py', 'w') as f:
    f.write(code)

๐Ÿ“‹ Configuration Validation

from simple_schema import string_to_json_schema

# Validate app configuration
config_schema = string_to_json_schema("database:{host:string, port:int}, debug:bool")
# Use for validating config files

๐Ÿ“š Documentation

๐Ÿ“‹ Example Schemas

Ready-to-use schema examples are available in the examples/ directory:

# Import example schemas if needed
from simple_schema.examples.presets import user_schema, product_schema
from simple_schema.examples.recipes import create_ecommerce_product_schema

# Or better yet, use string syntax directly:
user_schema = string_to_json_schema("name:string, email:email, age:int?")

๐Ÿงช Testing

The library includes comprehensive tests covering all functionality:

# Run tests (requires pytest)
pip install pytest
pytest tests/

# Results: 66 tests passed, 3 skipped (Pydantic tests when not installed)

๐Ÿค Contributing

Contributions are welcome! The codebase is well-organized and documented:

simple_schema/
โ”œโ”€โ”€ core/          # Core functionality (fields, builders, validators)
โ”œโ”€โ”€ parsing/       # String parsing and syntax
โ”œโ”€โ”€ integrations/  # Pydantic, JSON Schema, OpenAPI
โ””โ”€โ”€ examples/      # Built-in schemas and recipes

๐Ÿ“„ License

This project is licensed under the MIT License - see the LICENSE file for details.


Simple Schema - Making data validation simple, intuitive, and LLM-friendly! ๐Ÿš€

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

schema_string-0.1.0.tar.gz (71.3 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

schema_string-0.1.0-py3-none-any.whl (52.0 kB view details)

Uploaded Python 3

File details

Details for the file schema_string-0.1.0.tar.gz.

File metadata

  • Download URL: schema_string-0.1.0.tar.gz
  • Upload date:
  • Size: 71.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.3

File hashes

Hashes for schema_string-0.1.0.tar.gz
Algorithm Hash digest
SHA256 fa37484c0b2145864f90de2b700f69416f93662b7904a2751bd1355c2da9a587
MD5 9f2fffab8630d2c502a7c7508a7b22ce
BLAKE2b-256 4e8bfd5150f3bc1680f91bd0d323ba7c5421747dcfe687555dbceb255966e5b8

See more details on using hashes here.

File details

Details for the file schema_string-0.1.0-py3-none-any.whl.

File metadata

  • Download URL: schema_string-0.1.0-py3-none-any.whl
  • Upload date:
  • Size: 52.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.3

File hashes

Hashes for schema_string-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 bcaf9935e9ef46475c3b01a249269c9bfc2124adcdabf69f9b5adf9955da56bf
MD5 a4f7460316bc8b373c189f6fd493a8fc
BLAKE2b-256 dd6415c47bc912df9d1407d13d63689db1ce145dae2f04e1930abe344fcb9b19

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page