sintezi
Synthetic data generation that actually doesn't hurt.
A type-safe Python library for generating synthetic data using LLMs. Built with structured outputs, automatic retry policies, and support for multiple response formats (JSON, XML).
Why sintezi?
Unlike general-purpose LLM frameworks (LangChain, LlamaIndex), sintezi focuses on bulk synthetic data generation with explicit developer control:
- Bulk generation first — optimized for creating large synthetic datasets, not building chatbots or agents
- Explicit control — you define formats, parsers, and retry logic; no hidden prompt engineering or magic
- Simple by design — no memory systems, RAG pipelines, or high-level abstractions; just clean, predictable data generation
If you need agentic workflows, memory, or RAG, use LangChain. If you need to generate 10,000 structured examples with full control, use sintezi.
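To make the bulk use case concrete, here is a minimal sketch of fanning one async call out over many inputs with bounded concurrency. It uses only the standard library and is not sintezi's API: `generate_bulk`, `fake_call`, and `max_concurrency` are hypothetical names, and `fake_call` stands in for any awaitable callable such as the `ai_call` object built in the quick start.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

In = TypeVar("In")
Out = TypeVar("Out")


async def generate_bulk(
    call: Callable[[In], Awaitable[Out]],
    inputs: list[In],
    max_concurrency: int = 8,
) -> list[Out]:
    """Run `call` over all inputs, at most `max_concurrency` at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(item: In) -> Out:
        # The semaphore caps how many calls are in flight at once.
        async with semaphore:
            return await call(item)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(run_one(item) for item in inputs))


async def fake_call(name: str) -> str:
    """Hypothetical stand-in for an LLM call; swap in a real async callable."""
    await asyncio.sleep(0)
    return f"description of {name}"


results = asyncio.run(generate_bulk(fake_call, ["Laptop", "Phone", "Desk"]))
print(results)
```

Capping concurrency keeps a large run within provider rate limits; the right value depends on the API being called.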
Features
- Type-safe — Pydantic models for requests and responses with full type hints
- Multiple formats — JSON, XML, plain text, or custom formatters
- Smart retry — Separate retry policies for network errors and validation failures
- Auto-parsing — Automatic format selection based on Pydantic models
- LLM-agnostic — Works with any OpenAI-compatible API
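The idea behind separate retry policies can be sketched in plain Python: transport errors and validation failures each get their own budget, and only transport errors back off before the next attempt. This is an illustration of the concept, not sintezi's implementation; `NetworkError`, `ValidationFailure`, `call_with_retries`, and its parameters are all hypothetical names.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class NetworkError(Exception):
    """Hypothetical transport failure (timeout, connection reset)."""


class ValidationFailure(Exception):
    """Hypothetical failure to parse or validate a model response."""


def call_with_retries(
    fn: Callable[[], T],
    max_network_retries: int = 3,
    max_validation_retries: int = 2,
    backoff_seconds: float = 0.0,
) -> T:
    """Call `fn`, tracking a separate retry budget per failure kind."""
    network_failures = 0
    validation_failures = 0
    while True:
        try:
            return fn()
        except NetworkError:
            network_failures += 1
            if network_failures > max_network_retries:
                raise
            # Exponential backoff applies to transport errors only.
            time.sleep(backoff_seconds * 2 ** (network_failures - 1))
        except ValidationFailure:
            validation_failures += 1
            if validation_failures > max_validation_retries:
                raise
            # No sleep here: a fresh sample may simply validate.


# Demo: one network error, one validation failure, then success.
outcomes = iter([NetworkError(), ValidationFailure(), "ok"])


def flaky() -> str:
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


result = call_with_retries(flaky)
print(result)
```

Keeping the two budgets apart means a flaky connection does not use up the attempts reserved for malformed model output, and the reverse.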
Installation
pip install sintezi
Requirements: Python 3.11+
Quick start
import asyncio

from pydantic import BaseModel
from openai import AsyncOpenAI

from sintezi.ai.context import ai_context_from_openai
from sintezi.ai.executor import StructuredAiCall, StructuredAiCallConfig, AiCallParameters
from sintezi.ai.formatter import auto_formatter_for_type
from sintezi.ai.parser import auto_parser_for_type


# Input passed to the model
class ProductInfo(BaseModel):
    name: str
    category: str


# Output parsed from the model's response
class ProductDescription(BaseModel):
    description: str


async def main() -> None:
    # Setup
    client = AsyncOpenAI(api_key="your-api-key")
    ctx = ai_context_from_openai(client)
    config = StructuredAiCallConfig(
        system_message="Generate product descriptions.",
        parameters=AiCallParameters(model="gpt-4o-mini"),
    )
    ai_call = StructuredAiCall(
        ctx=ctx,
        config=config,
        formatter=auto_formatter_for_type(ProductInfo),
        parser=auto_parser_for_type(ProductDescription),
    )

    # Generate
    product = ProductInfo(name="Laptop", category="Electronics")
    result = await ai_call(product)
    print(result.description)


# `await` needs an async context, so the example runs through asyncio.
asyncio.run(main())
See the quick start guide for a complete walkthrough.
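As a rough picture of what a formatter and parser pair does in the quick start, the sketch below serializes a typed input to JSON or XML and validates a JSON response into a typed output. It does not show sintezi's internals: standard-library dataclasses stand in for the Pydantic models to keep the sketch dependency-free, and `format_as_json`, `format_as_xml`, and `parse_json_description` are hypothetical helpers.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import asdict, dataclass


@dataclass
class ProductInfo:
    name: str
    category: str


@dataclass
class ProductDescription:
    description: str


def format_as_json(item: ProductInfo) -> str:
    """Serialize the input as a JSON object for the prompt."""
    return json.dumps(asdict(item))


def format_as_xml(item: ProductInfo) -> str:
    """Serialize the same input as a flat XML element."""
    root = ET.Element("ProductInfo")
    for key, value in asdict(item).items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def parse_json_description(raw: str) -> ProductDescription:
    """Validate a raw JSON response; raise ValueError if it does not fit."""
    data = json.loads(raw)  # JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict) or not isinstance(data.get("description"), str):
        raise ValueError("expected a JSON object with a string 'description'")
    return ProductDescription(description=data["description"])


product = ProductInfo(name="Laptop", category="Electronics")
json_prompt = format_as_json(product)
xml_prompt = format_as_xml(product)
parsed = parse_json_description('{"description": "A lightweight laptop."}')
print(json_prompt)
print(xml_prompt)
print(parsed.description)
```

A parser that raises on malformed output is what gives a validation retry policy something to react to.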
Documentation
Full documentation: https://mrapplexz.github.io/sintezi/
- Quick start guide — complete walkthrough with examples
- Executors — available AI call executors
- Formatters — JSON, XML, custom formats
- Parsers — response parsing and validation
- Retry policies — network and validation retry configuration
- API Reference — complete API documentation
License