Lightweight orchestration toolkit to generate, validate, repair and enforce structured output from LLMs

parsec

⚡ Lightweight orchestration toolkit to generate, validate, repair and enforce structured output from large language models (LLMs). The project provides a provider-agnostic adapter interface, validators (JSON/Pydantic), prompt template management with versioning, caching, dataset collection, and an enforcement engine that retries and repairs LLM output until it conforms to a schema.

This repository contains:

Adapter abstractions for OpenAI, Anthropic, and Google Gemini.
Validation and repair utilities for JSON and Pydantic schemas.
An EnforcementEngine that generates, validates, repairs, and retries.
Prompt template system with versioning and YAML persistence.
LRU caching to reduce redundant API calls and costs.
Dataset collection for training and fine-tuning.
Examples and a comprehensive test suite.
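To give a feel for what "repair utilities" can mean in practice, here is a minimal, standard-library-only sketch of heuristic JSON repair. The helper name `repair_json_text` and the two heuristics it applies are assumptions made for illustration; they are not taken from parsec's source.

```python
import json
import re


def repair_json_text(raw: str) -> str:
    """Apply two simple fixes to near-JSON text (hypothetical helper, not parsec's API)."""
    text = raw.strip()
    # Keep only the outermost {...} span, discarding any chatter around it
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        text = text[start : end + 1]
    # Remove trailing commas that appear right before a closing brace or bracket
    text = re.sub(r",\s*([}\]])", r"\1", text)
    return text


raw_output = 'Here is the JSON you asked for:\n{"name": "John Doe", "age": 30,}\nAnything else?'
repaired = json.loads(repair_json_text(raw_output))
print(repaired)
```

The trailing-comma regex is deliberately naive: it would also rewrite a `,}` sequence inside a string value, so a real repair step would need to be more careful.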

Features

Core Enforcement

Provider-agnostic adapters: OpenAI, Anthropic (Claude), Google Gemini
Multiple validators: JSON Schema, Pydantic models
Automatic repair: Schema-based heuristics fix common formatting issues
Retry loop: Progressive feedback to model for iterative repair
Dataset collection: Capture and export training data (JSONL, JSON, CSV)
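The generate, validate, retry cycle described above can be sketched without any provider SDK. Everything below (`SketchResult`, `enforce_sketch`, the `fake_model` stand-in, and the feedback wording) is hypothetical and only illustrates the control flow; parsec's real engine is asynchronous and may behave differently.

```python
import json
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SketchResult:
    """Hypothetical result container, loosely mirroring the fields used in Quick Start."""
    data: Optional[dict]
    success: bool
    retry_count: int


def enforce_sketch(generate: Callable[[str], str], prompt: str,
                   required_keys: list, max_retries: int = 3) -> SketchResult:
    """Call the model, validate its output, and retry with feedback on failure."""
    current_prompt = prompt
    for attempt in range(max_retries + 1):
        raw = generate(current_prompt)
        try:
            data = json.loads(raw)
            if not isinstance(data, dict):
                error = "top-level value is not an object"
            else:
                missing = [k for k in required_keys if k not in data]
                error = f"missing keys: {missing}" if missing else None
        except json.JSONDecodeError as exc:
            data, error = None, f"invalid JSON: {exc.msg}"
        if error is None:
            return SketchResult(data=data, success=True, retry_count=attempt)
        # Progressive feedback: tell the model why its last answer was rejected
        current_prompt = f"{prompt}\n\nPrevious output was rejected ({error}). Return valid JSON only."
    return SketchResult(data=None, success=False, retry_count=max_retries)


# A fake model that returns truncated JSON first and valid JSON on the retry
replies = iter(['{"name": "John Doe"', '{"name": "John Doe", "age": 30}'])


def fake_model(prompt: str) -> str:
    return next(replies)


result = enforce_sketch(fake_model, "Extract: John Doe is 30 years old", ["name", "age"])
print(result)
```

Here the first reply fails to parse, so the loop retries once and succeeds with `retry_count` equal to 1.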

Prompt Management

Template system: Type-safe variable substitution with validation
Version control: Semantic versioning (1.0.0, 2.0.0, etc.)
YAML persistence: Save/load templates from files
Template registry: Centralized management of all templates
Template manager: One-line API for template + enforcement
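As a rough illustration of type-safe variable substitution, the sketch below checks required variables and their types before formatting. `SketchTemplate` is a hypothetical stand-in written for this example; its fields echo the `PromptTemplate` arguments shown in Quick Start, but its behavior is an assumption, not parsec's implementation.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SketchTemplate:
    """Hypothetical prompt template with typed, validated variables."""
    template: str
    variables: dict            # variable name -> expected Python type
    required: list             # names that must be supplied (or have a default)
    defaults: dict = field(default_factory=dict)

    def render(self, **values: Any) -> str:
        merged = {**self.defaults, **values}
        missing = [name for name in self.required if name not in merged]
        if missing:
            raise ValueError(f"missing required variables: {missing}")
        for name, value in merged.items():
            expected = self.variables.get(name)
            if expected is None:
                raise ValueError(f"unknown variable: {name}")
            if not isinstance(value, expected):
                raise TypeError(f"{name} must be {expected.__name__}")
        return self.template.format(**merged)


tmpl = SketchTemplate(
    template="Extract person info from: {text}\n\nReturn as JSON.",
    variables={"text": str},
    required=["text"],
)
prompt = tmpl.render(text="John Doe, age 30")
print(prompt)
```

Validating before formatting means a wrong type or a missing variable fails with a clear error instead of producing a malformed prompt.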

Performance & Caching

LRU cache: In-memory caching with TTL support
Cost reduction: Avoid redundant API calls for identical requests
Cache integration: Seamless integration with enforcement engine
Statistics tracking: Monitor cache hits, misses, and hit rates
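For readers unfamiliar with the combination of LRU eviction and TTL expiry, here is a small sketch built on `collections.OrderedDict`. The class `SketchLRUCache` and its method names are assumptions for illustration; the constructor arguments merely echo the `InMemoryCache(max_size=..., default_ttl=...)` call shown later in this README.

```python
import time
from collections import OrderedDict
from typing import Any, Optional


class SketchLRUCache:
    """Minimal LRU cache with per-entry TTL and hit/miss counters (illustrative only)."""

    def __init__(self, max_size: int = 100, default_ttl: float = 3600.0) -> None:
        self.max_size = max_size
        self.default_ttl = default_ttl
        self._entries: OrderedDict = OrderedDict()  # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None or entry[0] < time.monotonic():
            # Absent or expired: count a miss and drop any stale entry
            self._entries.pop(key, None)
            self.misses += 1
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        self.hits += 1
        return entry[1]

    def set(self, key: str, value: Any, ttl: Optional[float] = None) -> None:
        expires_at = time.monotonic() + (self.default_ttl if ttl is None else ttl)
        self._entries[key] = (expires_at, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_size:
            self._entries.popitem(last=False)  # evict the least recently used entry

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


cache = SketchLRUCache(max_size=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")        # hit, and "a" becomes the most recently used key
cache.set("c", 3)     # cache is full, so "b" is evicted
print(cache.get("b"), cache.get("a"), cache.hit_rate())
```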

Installation

pip install parsec-llm

Or for development:

git clone https://github.com/olliekm/parsec.git
cd parsec
pip install -e ".[dev]"

Quick Start

Basic Usage

import asyncio

from parsec.models.adapters import OpenAIAdapter
from parsec.validators import JSONValidator
from parsec.enforcement import EnforcementEngine

# Set up components
adapter = OpenAIAdapter(api_key="your-api-key", model="gpt-4o-mini")
validator = JSONValidator()
engine = EnforcementEngine(adapter, validator, max_retries=3)

# Define your schema
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"}
    },
    "required": ["name", "age"]
}

# enforce() is awaited, so it must run inside a coroutine
async def main():
    # Enforce structured output
    result = await engine.enforce(
        "Extract: John Doe is 30 years old",
        schema
    )

    print(result.data)  # {"name": "John Doe", "age": 30}
    print(result.success)  # True
    print(result.retry_count)  # 0

asyncio.run(main())
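To make concrete what validating against the schema above involves, here is a deliberately tiny checker covering only `required` and top-level property types. `check_against_schema` is a hypothetical helper written for this README; it is not parsec's `JSONValidator` and supports only a small subset of JSON Schema.

```python
from typing import Any

# Maps JSON Schema type names to Python types (small subset only)
_TYPE_MAP = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def check_against_schema(data: Any, schema: dict) -> list:
    """Return human-readable errors; an empty list means the data conforms."""
    if schema.get("type") != "object":
        return []  # this sketch only handles object schemas
    if not isinstance(data, dict):
        return ["expected an object"]
    errors = []
    for key in schema.get("required", []):
        if key not in data:
            errors.append(f"missing required property: {key}")
    for key, sub in schema.get("properties", {}).items():
        if key not in data:
            continue
        expected = _TYPE_MAP.get(sub.get("type"))
        value = data[key]
        # bool is a subclass of int in Python, so reject it for numeric types
        bool_as_number = isinstance(value, bool) and sub.get("type") in ("integer", "number")
        if expected and (not isinstance(value, expected) or bool_as_number):
            errors.append(f"{key}: expected {sub['type']}")
    return errors


person_schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
    "required": ["name", "age"],
}
print(check_against_schema({"name": "John Doe", "age": "30"}, person_schema))
```

Error messages of this kind are the sort of feedback a retry loop can pass back to the model.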

With Caching

from parsec.cache import InMemoryCache

# Add cache to reduce redundant API calls
cache = InMemoryCache(max_size=100, default_ttl=3600)
engine = EnforcementEngine(adapter, validator, cache=cache)

# Run the awaits below inside an async function; `prompt` is any prompt string
# First call hits the API
result1 = await engine.enforce(prompt, schema)

# Second identical call returns the cached result (no API call)
result2 = await engine.enforce(prompt, schema)

# Check cache performance
stats = cache.get_stats()
print(stats)  # {'hits': 1, 'misses': 1, 'hit_rate': '50.00%'}
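A cache can only recognize "identical" requests if they map to the same key. One plausible approach, shown here as an assumption rather than a description of parsec's internals, is to hash a canonical JSON encoding of the request. The function name `make_cache_key` and the choice of fields are hypothetical.

```python
import hashlib
import json


def make_cache_key(prompt: str, schema: dict, model: str) -> str:
    """Derive a stable key so that identical requests map to the same cache entry."""
    # sort_keys=True makes the encoding independent of dict insertion order
    payload = json.dumps({"prompt": prompt, "schema": schema, "model": model}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


key_a = make_cache_key("Extract: John Doe", {"type": "object", "required": ["name"]}, "gpt-4o-mini")
key_b = make_cache_key("Extract: John Doe", {"required": ["name"], "type": "object"}, "gpt-4o-mini")
print(key_a == key_b)
```

Including the model name in the key keeps responses from different models from being served for each other.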

With Prompt Templates

from parsec.prompts import PromptTemplate, TemplateRegistry, TemplateManager

# Create a reusable template
template = PromptTemplate(
    name="extract_person",
    template="Extract person info from: {text}\n\nReturn as JSON.",
    variables={"text": str},
    required=["text"]
)

# Register with version
registry = TemplateRegistry()
registry.register(template, "1.0.0")

# Use with enforcement
manager = TemplateManager(registry, engine)
result = await manager.enforce_with_template(
    template_name="extract_person",
    variables={"text": "John Doe, age 30"},
    schema=schema
)

# Save templates to file
registry.save_to_disk("templates.yaml")

# Load templates later
registry.load_from_disk("templates.yaml")

With Pydantic Models

from pydantic import BaseModel
from parsec.validators import PydanticValidator

class Person(BaseModel):
    name: str
    age: int
    email: str

validator = PydanticValidator()
engine = EnforcementEngine(adapter, validator)

result = await engine.enforce(
    "Extract: John Doe, 30 years old, john@example.com",
    Person
)

print(result.data)  # {"name": "John Doe", "age": 30, "email": "john@example.com"}

Development Setup

Requirements: Python 3.9+

Install dependencies:

pip install -e ".[dev]"

Run tests:

poetry run pytest -q

Run the OpenAI example (requires OPENAI_API_KEY):

export OPENAI_API_KEY="sk-..."
export OPENAI_MODEL="gpt-4o-mini"  # optional
poetry run python examples/run_with_openai.py

The example demonstrates using OpenAIAdapter, JSONValidator and EnforcementEngine to extract structured data using a JSON schema.

Code Structure

src/parsec/core/ — Core abstractions and schemas
src/parsec/models/ — LLM provider adapters (OpenAI, Anthropic, Gemini)
src/parsec/validators/ — Validator implementations (JSON, Pydantic)
src/parsec/enforcement/ — Enforcement and orchestration engine
src/parsec/prompts/ — Prompt template system with versioning
src/parsec/cache/ — Caching implementations (InMemoryCache)
src/parsec/training/ — Dataset collection for fine-tuning
src/parsec/utils/ — Utility functions (partial JSON parsing)
examples/ — Working examples with real API calls
tests/ — Comprehensive test suite with pytest

Examples

Check out the examples/ directory for complete working examples:

basic_usage.py - Simple extraction with JSON schema
prompt_template_example.py - Template system with versioning
prompt_persistence_example.py - Save/load templates from YAML
template_manager_example.py - TemplateManager integration
template_manager_live_example.py - Live demo with real API calls
streaming_example.py - Streaming support (experimental)

Run any example:

python3 examples/template_manager_live_example.py

Testing

Run the test suite with:

poetry run pytest -q

Advanced Features

Dataset Collection

Collect and export training data for fine-tuning:

from parsec.training import DatasetCollector

collector = DatasetCollector(
    output_path="./training_data",
    format="jsonl",  # or "json", "csv"
    auto_flush=True
)

engine = EnforcementEngine(adapter, validator, collector=collector)

# Data is automatically collected during enforcement
result = await engine.enforce(prompt, schema)

# Export collected data
collector.flush()  # Writes to disk
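For reference, JSONL simply means one JSON object per line. The sketch below buffers records in memory and appends them to a file on `flush()`. `SketchCollector`, its `record` method, and the record fields are assumptions for illustration and do not reflect the actual `DatasetCollector` internals.

```python
import json
import tempfile
from pathlib import Path


class SketchCollector:
    """Hypothetical buffered JSONL writer."""

    def __init__(self, output_path: Path) -> None:
        self.output_path = output_path
        self._buffer = []

    def record(self, prompt: str, output: dict, success: bool) -> None:
        self._buffer.append({"prompt": prompt, "output": output, "success": success})

    def flush(self) -> int:
        """Append buffered records to the file, one JSON object per line."""
        with self.output_path.open("a", encoding="utf-8") as fh:
            for row in self._buffer:
                fh.write(json.dumps(row) + "\n")
        written = len(self._buffer)
        self._buffer.clear()
        return written


out_file = Path(tempfile.mkdtemp()) / "training_data.jsonl"
sketch_collector = SketchCollector(out_file)
sketch_collector.record("Extract: John Doe is 30 years old", {"name": "John Doe", "age": 30}, True)
written = sketch_collector.flush()
lines = out_file.read_text(encoding="utf-8").splitlines()
print(written, lines)
```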

Template Versioning Workflow

# v1.0.0 - Initial template
template_v1 = PromptTemplate(
    name="extract_person",
    template="Extract: {text}",
    variables={"text": str},
    required=["text"]
)
registry.register(template_v1, "1.0.0")

# v2.0.0 - Improved with validation rules
template_v2 = PromptTemplate(
    name="extract_person",
    template="Extract: {text}\n\nValidation: {rules}",
    variables={"text": str, "rules": str},
    required=["text"],
    defaults={"rules": "Strict validation"}
)
registry.register(template_v2, "2.0.0")

# Use specific version
result = await manager.enforce_with_template(
    template_name="extract_person",
    version="2.0.0",  # Explicit version
    variables={"text": "John Doe, 30"}
)

# Or use latest automatically
result = await manager.enforce_with_template(
    template_name="extract_person",  # Gets v2.0.0
    variables={"text": "John Doe, 30"}
)
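Resolving "latest" requires comparing versions numerically rather than as strings, since "10.0.0" sorts before "2.0.0" lexicographically. A minimal sketch of that lookup follows; `SketchRegistry` and `parse_version` are hypothetical names, and how parsec's `TemplateRegistry` resolves versions internally is an assumption.

```python
from typing import Optional


def parse_version(version: str) -> tuple:
    """Turn 'MAJOR.MINOR.PATCH' into a tuple of ints for numeric comparison."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


class SketchRegistry:
    """Hypothetical registry mapping template name -> version -> template text."""

    def __init__(self) -> None:
        self._templates = {}

    def register(self, name: str, template: str, version: str) -> None:
        parse_version(version)  # fail early on malformed version strings
        self._templates.setdefault(name, {})[version] = template

    def get(self, name: str, version: Optional[str] = None) -> str:
        versions = self._templates[name]
        if version is None:
            # Pick the highest version by numeric comparison
            version = max(versions, key=parse_version)
        return versions[version]


sketch_registry = SketchRegistry()
sketch_registry.register("extract_person", "Extract: {text}", "1.0.0")
sketch_registry.register("extract_person", "Extract: {text}\n\nValidation: {rules}", "2.0.0")
sketch_registry.register("extract_person", "Extract carefully: {text}", "10.0.0")
latest = sketch_registry.get("extract_person")
print(latest)
```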

Multi-Provider Support

from parsec.models.adapters import OpenAIAdapter, AnthropicAdapter

# Switch between providers easily
openai_adapter = OpenAIAdapter(api_key=openai_key, model="gpt-4o-mini")
anthropic_adapter = AnthropicAdapter(api_key=anthropic_key, model="claude-3-5-sonnet-20241022")

# Same enforcement code works with any adapter
engine = EnforcementEngine(anthropic_adapter, validator)
result = await engine.enforce(prompt, schema)
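Swapping providers works when every adapter exposes the same interface. The sketch below shows that idea with `typing.Protocol` and two fake adapters. The `generate` method name and the classes are assumptions for illustration; the actual adapter interface in parsec may differ.

```python
import asyncio
from typing import Protocol


class SketchAdapter(Protocol):
    """Hypothetical minimal interface that any provider adapter would satisfy."""

    async def generate(self, prompt: str) -> str: ...


class FakeProviderA:
    async def generate(self, prompt: str) -> str:
        return '{"provider": "a"}'


class FakeProviderB:
    async def generate(self, prompt: str) -> str:
        return '{"provider": "b"}'


async def run_with(adapter: SketchAdapter, prompt: str) -> str:
    # The calling code depends only on the shared interface, not on a concrete provider
    return await adapter.generate(prompt)


outputs = [asyncio.run(run_with(adapter, "hello")) for adapter in (FakeProviderA(), FakeProviderB())]
print(outputs)
```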

Roadmap

Core enforcement engine with retry logic
Multiple LLM providers (OpenAI, Anthropic, Gemini)
JSON and Pydantic validation
LRU caching with TTL
Prompt template system with versioning
Dataset collection for training
Streaming support for real-time output
Batch processing with rate limiting
Cost tracking and analytics
A/B testing for prompt variants
Output post-processing pipeline

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Notes

Examples with real API calls will incur costs — use test/development API keys
The framework is intentionally modular — extend adapters and validators as needed
Template system supports version control via YAML files for team collaboration

License

This project is licensed under the MIT License - see the LICENSE file for details.

Release history

0.2.1 (Dec 7, 2025)
0.2.0 (Dec 5, 2025), this version
0.1.3 (Dec 2, 2025)
0.1.2 (Nov 28, 2025)

Download files

Source Distribution

parsec_llm-0.2.0.tar.gz (28.2 kB), uploaded Dec 5, 2025

Built Distribution

parsec_llm-0.2.0-py3-none-any.whl (33.6 kB), uploaded Dec 5, 2025, Python 3

File details

Details for the file parsec_llm-0.2.0.tar.gz.

File metadata

Download URL: parsec_llm-0.2.0.tar.gz
Upload date: Dec 5, 2025
Size: 28.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for parsec_llm-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`41065c68fd719430b5e87f156ce0de8d0f12f11280b994064987ac2321d2c3a2`
MD5	`57ba59ca32d293a55b5892a9e0d582e8`
BLAKE2b-256	`bdf4d1d5536d20ba3bcaf12bd6def16c88331453156a2f7c1b6dfe7a668f0219`


File details

Details for the file parsec_llm-0.2.0-py3-none-any.whl.

File metadata

Download URL: parsec_llm-0.2.0-py3-none-any.whl
Upload date: Dec 5, 2025
Size: 33.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.11.14

File hashes

Hashes for parsec_llm-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`110eba85b7967c9eacf68333a055be5ce392bb3ed50fb3efcc5c4435ca60a138`
MD5	`988d3717be06d7bfd34d1c2b675b9a02`
BLAKE2b-256	`d058392ea9853327ff57b1622a529ab1bb98d7449475ab69eed55dec93145923`
