A simple, LLM-friendly schema definition library for converting string syntax to structured schemas
Project description
String Schema
A simple, LLM-friendly schema definition library for Python that converts intuitive string syntax into structured data schemas.
๐ Quick Start
Installation
pip install string-schema
30-Second Example
from string_schema import string_to_json_schema
# Define a schema with simple, readable syntax
schema = string_to_json_schema("name:string, email:email, age:int?")
# Get a complete JSON Schema ready for validation
print(schema)
# {
# "type": "object",
# "properties": {
# "name": {"type": "string"},
# "email": {"type": "string", "format": "email"},
# "age": {"type": "integer"}
# },
# "required": ["name", "email"]
# }
Create Pydantic Models Directly
from string_schema import string_to_model
# Create Pydantic model from string syntax
UserModel = string_to_model("name:string, email:email, age:int?")
# Use immediately
user = UserModel(name="Alice", email="alice@example.com")
print(user.model_dump_json()) # {"name": "Alice", "email": "alice@example.com"}
๐ฏ What String Schema Does
String Schema takes human-readable text descriptions and converts them into structured schemas for data validation, extraction, and API documentation. Perfect for LLM data extraction, API development, and configuration validation.
Input: Human-readable string syntax Output: JSON Schema, Pydantic models, or OpenAPI specifications
๐ Core Functions & Use Cases
๐ Schema Conversion Matrix
Forward Conversions (Source โ Target):
| Function | Input | Output | Use Case |
|---|---|---|---|
string_to_json_schema() |
String syntax | JSON Schema dict | Main conversion - string to JSON Schema |
string_to_model() |
String syntax | Pydantic model class | Direct path - string to Pydantic model |
string_to_model_code() |
String syntax | Python code string | Code generation - for templates |
string_to_openapi() |
String syntax | OpenAPI schema dict | Direct path - string to OpenAPI |
json_schema_to_model() |
JSON Schema dict | Pydantic model class | When you already have JSON Schema |
json_schema_to_openapi() |
JSON Schema dict | OpenAPI schema dict | When you already have JSON Schema |
Reverse Conversions (Target โ Source):
| Function | Input | Output | Use Case |
|---|---|---|---|
model_to_string() |
Pydantic model | String syntax | Schema introspection - model to string |
model_to_json_schema() |
Pydantic model | JSON Schema dict | Export - model to JSON Schema |
json_schema_to_string() |
JSON Schema dict | String syntax | Migration - JSON Schema to string |
openapi_to_string() |
OpenAPI schema | String syntax | Import - OpenAPI to string |
openapi_to_json_schema() |
OpenAPI schema | JSON Schema dict | Conversion - OpenAPI to JSON Schema |
๐ Data Validation Functions
| Function | Input | Output | Use Case |
|---|---|---|---|
validate_to_dict() |
Data + schema | Validated dict | API responses - clean dicts |
validate_to_model() |
Data + schema | Pydantic model | Business logic - typed models |
validate_string_syntax() |
String syntax | Validation result | Check syntax and get feedback |
๐จ Function Decorators
| Decorator | Purpose | Returns | Use Case |
|---|---|---|---|
@returns_dict() |
Auto-validate to dict | Validated dict | API endpoints |
@returns_model() |
Auto-validate to model | Pydantic model | Business logic |
๐ง Utility Functions
| Function | Purpose | Returns | Use Case |
|---|---|---|---|
get_model_info() |
Model introspection | Model details dict | Debugging & analysis |
validate_schema_compatibility() |
Schema validation | Compatibility info | Schema validation |
๐ฏ Key Scenarios
- ๐ค LLM Data Extraction: Define extraction schemas that LLMs can easily follow
- ๐ง API Development: Generate Pydantic models and OpenAPI docs from simple syntax
- โ Data Validation: Create robust validation schemas with minimal code
- ๐ Configuration: Define and validate application configuration schemas
- ๐ Data Transformation: Convert between different schema formats
โ Validate Data
from string_schema import validate_to_dict, validate_to_model
# Example raw data (from API, user input, etc.)
raw_data = {
"name": "John Doe",
"email": "john@example.com",
"age": "25", # String that needs conversion
"extra_field": "ignored" # Will be filtered out
}
# Validate to clean dictionaries (perfect for API responses)
user_dict = validate_to_dict(raw_data, "name:string, email:email, age:int?")
print(user_dict) # {"name": "John Doe", "email": "john@example.com", "age": 25}
# Validate to typed models (perfect for business logic)
user_model = validate_to_model(raw_data, "name:string, email:email, age:int?")
print(user_model.name) # "John Doe" - Full type safety
print(user_model.age) # 25 - Converted to int
๐จ Function Decorators
from string_schema import returns_dict, returns_model
import uuid
# Auto-validate function returns to dicts
@returns_dict("id:string, name:string, active:bool")
def create_user(name):
# Input: "Alice"
# Function returns unvalidated dict
return {"id": str(uuid.uuid4()), "name": name, "active": True, "extra": "ignored"}
# Output: {"id": "123e4567-...", "name": "Alice", "active": True}
# Note: 'extra' field filtered out, types validated
# Auto-validate function returns to models
@returns_model("name:string, email:string")
def process_user(raw_input):
# Input: {"name": "Bob", "email": "bob@test.com", "junk": "data"}
# Function returns unvalidated dict
return {"name": raw_input["name"], "email": raw_input["email"], "junk": "data"}
# Output: UserModel(name="Bob", email="bob@test.com")
# Note: Returns typed Pydantic model, 'junk' field filtered out
# Usage examples:
user_dict = create_user("Alice") # Returns validated dict
user_model = process_user({"name": "Bob", "email": "bob@test.com"}) # Returns Pydantic model
print(user_model.name) # "Bob" - Full type safety
๐ FastAPI Integration
from string_schema import string_to_model, returns_dict
# Create models for FastAPI
UserRequest = string_to_model("name:string, email:email")
@app.post("/users")
@returns_dict("id:int, name:string, email:string")
def create_user_endpoint(user: UserRequest):
return {"id": 123, "name": user.name, "email": user.email}
Features: Arrays [{name:string}], nested objects {profile:{bio:text?}}, enums, constraints, decorators.
๐ More Examples
๐ฑ Simple Example - Basic User Data
from string_schema import string_to_json_schema
# Define schema using intuitive string syntax
schema = string_to_json_schema("""
name:string(min=1, max=100),
email:email,
age:int(0, 120)?,
active:bool
""")
# Result: JSON Schema dictionary ready for validation
print(schema)
# {
# "type": "object",
# "properties": {
# "name": {"type": "string", "minLength": 1, "maxLength": 100},
# "email": {"type": "string", "format": "email"},
# "age": {"type": "integer", "minimum": 0, "maximum": 120},
# "active": {"type": "boolean"}
# },
# "required": ["name", "email", "active"]
# }
๐ณ Moderately Complex Example - Product Data
# Product with arrays and nested objects
schema = string_to_json_schema("""
{
id:uuid,
name:string,
price:number(min=0),
tags:[string]?,
inventory:{
in_stock:bool,
quantity:int?
}
}
""")
# Result: Complete JSON Schema with nested objects and arrays
print(schema) # Full JSON Schema ready for validation
๐ For more examples and advanced syntax, see our detailed documentation
๐ Output Formats & Results
๐ Pydantic Models (Python Classes)
from string_schema import string_to_model
# Direct conversion: String syntax โ Pydantic model
UserModel = string_to_model("name:string, email:email, active:bool")
# Use the model for validation
user = UserModel(name="John Doe", email="john@example.com", active=True)
print(user.model_dump_json()) # {"name": "John Doe", "email": "john@example.com", "active": true}
๐ง Code Generation (For Templates & Tools)
from string_schema import string_to_model_code
# Generate Pydantic model code as a string
code = string_to_model_code("User", "name:string, email:email, active:bool")
print(code)
# Output:
# from pydantic import BaseModel
# from typing import Optional
#
# class User(BaseModel):
# name: str
# email: str
# active: bool
# Perfect for code generators, templates, or saving to files
with open('models.py', 'w') as f:
f.write(code)
๐ OpenAPI Schemas (API Documentation)
from string_schema import string_to_openapi
# Direct conversion: String syntax โ OpenAPI schema
openapi_schema = string_to_openapi("name:string, email:email")
print(openapi_schema)
# {
# "type": "object",
# "properties": {
# "name": {"type": "string"},
# "email": {"type": "string", "format": "email"}
# },
# "required": ["name", "email"]
# }
๐ Reverse Conversions (Universal Schema Converter)
String Schema provides complete bidirectional conversion between all schema formats!
โ ๏ธ Information Loss Notice: Reverse conversions (from JSON Schema/OpenAPI/Pydantic back to string syntax) may lose some information due to format differences. However, the resulting schemas are designed to cover the most common use cases and maintain functional equivalence for typical validation scenarios.
๐ Schema Introspection
from string_schema import model_to_string, string_to_model
# Create a model first
UserModel = string_to_model("name:string, email:email, active:bool")
# Reverse engineer it back to string syntax
schema_string = model_to_string(UserModel)
print(f"Model schema: {schema_string}")
# Output: "name:string, email:email, active:bool"
๐ฆ Migration & Import
from string_schema import json_schema_to_string
# Example: Convert existing JSON Schema to String Schema syntax
json_schema = {
"type": "object",
"properties": {
"name": {"type": "string"},
"email": {"type": "string", "format": "email"},
"age": {"type": "integer"}
},
"required": ["name", "email"]
}
simple_syntax = json_schema_to_string(json_schema)
print(f"Converted: {simple_syntax}")
# Output: "name:string, email:email, age:int?"
๐ง Schema Comparison & Analysis
from string_schema import string_to_model, model_to_string
# Create two versions of a model
UserV1 = string_to_model("name:string, email:email")
UserV2 = string_to_model("name:string, email:email, active:bool")
# Compare them
v1_str = model_to_string(UserV1)
v2_str = model_to_string(UserV2)
print("Schema changes:")
print(f"V1: {v1_str}") # "name:string, email:email"
print(f"V2: {v2_str}") # "name:string, email:email, active:bool"
๐จ String Syntax Reference
Basic Types
string,int,number,boolโ Basic data typesemail,url,datetime,date,uuid,phoneโ Special validated types
Field Modifiers
field_name:typeโ Required fieldfield_name:type?โ Optional fieldfield_name:type(constraints)โ Field with validation
Common Patterns
string(min=1, max=100)โ Length constraintsint(0, 120)โ Range constraints[string]โ Simple arrays[{name:string, email:email}]โ Object arraysstatus:enum(active, inactive)โ Enum valuesid:string|uuidโ Union types
๐ Complete syntax guide: See docs/string-syntax.md for full reference
โ Validation
from string_schema import validate_string_syntax
# Validate your schema syntax
result = validate_string_syntax("name:string, email:email, age:int?")
print(f"Valid: {result['valid']}") # True
print(f"Features used: {result['features_used']}") # ['basic_types', 'optional_fields']
print(f"Field count: {len(result['parsed_fields'])}") # 3
# Example with invalid syntax
bad_result = validate_string_syntax("name:invalid_type")
print(f"Valid: {bad_result['valid']}") # False
print(f"Errors: {bad_result['errors']}") # ['Unknown type: invalid_type']
๐๏ธ Common Use Cases
๐ค LLM Data Extraction
from string_schema import string_to_json_schema
# Define what data to extract
schema = string_to_json_schema("company:string, employees:[{name:string, email:email}], founded:int?")
# Use schema in LLM prompts for structured data extraction
๐ง FastAPI Development
from string_schema import string_to_model
# Create model for FastAPI
UserModel = string_to_model("name:string, email:email")
# Use in FastAPI endpoints
@app.post("/users/")
async def create_user(user: UserModel):
return {"id": 123, "name": user.name, "email": user.email}
๐๏ธ Code Generation & Templates
from string_schema import string_to_model_code
# Generate model code
code = string_to_model_code("User", "name:string, email:email")
print(code)
# Output: Complete Pydantic model class as string
# Save to file
with open('user_model.py', 'w') as f:
f.write(code)
๐ Configuration Validation
from string_schema import string_to_json_schema
# Validate app configuration
config_schema = string_to_json_schema("database:{host:string, port:int}, debug:bool")
# Use for validating config files
๐ Documentation
- Getting Started - Quick start guide and installation
- String Syntax Guide - Complete syntax reference
- API Reference - Full API documentation
- Examples - Practical examples and patterns
๐ Example Schemas
Ready-to-use schema examples are available in the examples/ directory:
# Import example schemas if needed
from string_schema.examples.presets import user_schema, product_schema
from string_schema.examples.recipes import create_ecommerce_product_schema
# Or better yet, use string syntax directly:
user_schema = string_to_json_schema("name:string, email:email, age:int?")
๐งช Testing
The library includes comprehensive tests covering all functionality:
# Run tests (requires pytest)
pip install pytest
pytest tests/
# Results: 66 tests passed, 3 skipped (Pydantic tests when not installed)
๐ค Contributing
Contributions are welcome! The codebase is well-organized and documented:
string_schema/
โโโ core/ # Core functionality (fields, builders, validators)
โโโ parsing/ # String parsing and syntax
โโโ integrations/ # Pydantic, JSON Schema, OpenAPI
โโโ examples/ # Built-in schemas and recipes
๐ License
This project is licensed under the MIT License - see the LICENSE file for details.
String Schema - Making data validation simple, intuitive, and LLM-friendly! ๐
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file string_schema-0.1.4.tar.gz.
File metadata
- Download URL: string_schema-0.1.4.tar.gz
- Upload date:
- Size: 73.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
4b6938acb58df4bc4a2bc5eb1316530364161352c9740f1779fe6ddd89c3afa6
|
|
| MD5 |
5ed2c226dfa5ee1afe965fe864334771
|
|
| BLAKE2b-256 |
91d2828706ecab8568761132ea9dcd07367593637b8c714a8de66555c6bd8c80
|
File details
Details for the file string_schema-0.1.4-py3-none-any.whl.
File metadata
- Download URL: string_schema-0.1.4-py3-none-any.whl
- Upload date:
- Size: 52.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.3
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
4602f2f877f7aee3b6752f7bac92293524ae261e6bbb76c858d565ec0dc1ca24
|
|
| MD5 |
8ec212ee5fdc864f115b61f560a21d7b
|
|
| BLAKE2b-256 |
dd6e6b106537dd45779a0bb40e37e3faabea1fc15f39def1628bcfc3b82e7ebd
|