Synthetic data generation that actually doesn't hurt.

sintezi

A type-safe Python library for generating synthetic data using LLMs. Built with structured outputs, automatic retry policies, and support for multiple response formats (JSON, XML).

Why sintezi?

Unlike general-purpose LLM frameworks (LangChain, LlamaIndex), sintezi is focused on bulk synthetic data generation with explicit developer control:

  • Bulk generation first — optimized for creating large synthetic datasets, not building chatbots or agents
  • Explicit control — you define formats, parsers, and retry logic; no hidden prompt engineering or magic
  • Simple by design — no memory systems, RAG pipelines, or high-level abstractions; just clean, predictable data generation

If you need agentic workflows, memory, or RAG, use LangChain. If you need to generate 10,000 structured examples with full control, use sintezi.
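To make "explicit control" concrete, here is a minimal sketch of the formatter/parser idea using only the standard library. It is not sintezi's API: `format_as_json`, `format_as_xml`, and `parse_description` are hypothetical names, and dataclasses stand in for Pydantic models so the snippet has no dependencies.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import asdict, dataclass


@dataclass
class ProductInfo:
    name: str
    category: str


@dataclass
class ProductDescription:
    description: str


def format_as_json(item: ProductInfo) -> str:
    """Serialize the input record as a JSON object for the prompt (hypothetical helper)."""
    return json.dumps(asdict(item), sort_keys=True)


def format_as_xml(item: ProductInfo) -> str:
    """Serialize the same record as a small XML document (hypothetical helper)."""
    root = ET.Element("product")
    for key, value in asdict(item).items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def parse_description(raw: str) -> ProductDescription:
    """Parse a JSON reply; raise ValueError if it does not match the expected shape."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict) or not isinstance(data.get("description"), str):
        raise ValueError("reply must be an object with a string 'description'")
    return ProductDescription(description=data["description"])
```

The point of the split is that the prompt-side format and the reply-side parser are chosen independently, and a parse failure surfaces as an ordinary exception that retry logic can act on.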

Features

  • Type-safe — Pydantic models for requests and responses with full type hints
  • Multiple formats — JSON, XML, plain text, or custom formatters
  • Smart retry — Separate retry policies for network errors and validation failures
  • Auto-parsing — Automatic format selection based on Pydantic models
  • LLM-agnostic — Works with any OpenAI-compatible API
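The "separate retry policies" idea can be sketched in plain asyncio. This is an illustration under assumptions, not sintezi's implementation: `RetryPolicy`, `call_with_retries`, and `demo` are hypothetical names, `ConnectionError` stands in for any network failure, and `ValueError` stands in for any validation failure.

```python
import asyncio
import json
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass(frozen=True)
class RetryPolicy:
    """Hypothetical policy: how many attempts to allow and how long to wait between them."""
    max_attempts: int
    delay_seconds: float = 0.0


async def call_with_retries(
    send: Callable[[], Awaitable[str]],
    parse: Callable[[str], dict],
    network_policy: RetryPolicy,
    validation_policy: RetryPolicy,
) -> dict:
    """Retry network errors and validation errors against separate budgets."""
    network_failures = 0
    validation_failures = 0
    while True:
        try:
            raw = await send()
        except ConnectionError:
            # Network failure: counts only against the network budget.
            network_failures += 1
            if network_failures >= network_policy.max_attempts:
                raise
            await asyncio.sleep(network_policy.delay_seconds)
            continue
        try:
            return parse(raw)
        except ValueError:
            # Bad reply: counts only against the validation budget.
            validation_failures += 1
            if validation_failures >= validation_policy.max_attempts:
                raise
            await asyncio.sleep(validation_policy.delay_seconds)


async def demo() -> dict:
    # Scripted replies: one network error, one malformed reply, then a valid one.
    replies = iter([ConnectionError("timeout"), "not json", '{"description": "A laptop"}'])

    async def fake_send() -> str:
        reply = next(replies)
        if isinstance(reply, Exception):
            raise reply
        return reply

    return await call_with_retries(
        fake_send,
        json.loads,
        network_policy=RetryPolicy(max_attempts=3),
        validation_policy=RetryPolicy(max_attempts=2),
    )


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Keeping the two counters apart means a flaky connection does not use up the attempts reserved for malformed replies, and the reverse.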

Installation

pip install sintezi

Requirements: Python 3.11+

Quick start

import asyncio

from openai import AsyncOpenAI
from pydantic import BaseModel
from sintezi.ai.context import ai_context_from_openai
from sintezi.ai.executor import AiCallParameters, StructuredAiCall, StructuredAiCallConfig
from sintezi.ai.formatter import auto_formatter_for_type
from sintezi.ai.parser import auto_parser_for_type


# Input passed to the model
class ProductInfo(BaseModel):
    name: str
    category: str


# Structured output parsed from the model's reply
class ProductDescription(BaseModel):
    description: str


# Setup
client = AsyncOpenAI(api_key="your-api-key")
ctx = ai_context_from_openai(client)

config = StructuredAiCallConfig(
    system_message="Generate product descriptions.",
    parameters=AiCallParameters(model="gpt-4o-mini"),
)

ai_call = StructuredAiCall(
    ctx=ctx,
    config=config,
    formatter=auto_formatter_for_type(ProductInfo),
    parser=auto_parser_for_type(ProductDescription),
)


# Generate. `await` needs a running event loop, so the call lives in a coroutine.
async def main() -> None:
    product = ProductInfo(name="Laptop", category="Electronics")
    result = await ai_call(product)
    print(result.description)


asyncio.run(main())

See the quick start guide for a complete walkthrough.
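For bulk runs, many calls are usually issued concurrently with a cap on how many are in flight. The following is a minimal sketch of that pattern, not a sintezi feature: `generate_bulk` and `fake_ai_call` are hypothetical names, and the stub replaces a real model call so the snippet runs offline. Any awaitable callable, such as the `ai_call` object from the quick start, could be passed as `call`.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

In = TypeVar("In")
Out = TypeVar("Out")


async def generate_bulk(
    call: Callable[[In], Awaitable[Out]],
    inputs: list[In],
    max_concurrency: int = 8,
) -> list[Out]:
    """Run `call` over all inputs with at most `max_concurrency` calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(item: In) -> Out:
        async with semaphore:
            return await call(item)

    # gather returns results in the same order as the inputs.
    return await asyncio.gather(*(run_one(item) for item in inputs))


async def fake_ai_call(name: str) -> str:
    """Stand-in for a real model call (hypothetical)."""
    await asyncio.sleep(0)
    return f"Description of {name}"


if __name__ == "__main__":
    names = ["Laptop", "Phone", "Desk"]
    print(asyncio.run(generate_bulk(fake_ai_call, names, max_concurrency=2)))
```

The semaphore bounds load on the API, while `asyncio.gather` keeps each result aligned with its input, which matters when the outputs are written back into a dataset.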

Documentation

Full documentation: https://mrapplexz.github.io/sintezi/

License

MIT
