SAIDEX — Structured AI Data EXtraction: validated Pydantic models from LLM responses with automatic retry, fallback, and an agentic tool loop
SAIDEX — Structured AI Data Extraction
Extract validated Pydantic models from LLM responses — with automatic retry, field-level error feedback, an optional fallback model, and an agentic tool loop.
Works with any LangChain-compatible model: GPT-4o, Claude, Gemini, and local models via Ollama or vLLM. No tool-calling support required — use ExtractionMode.JSON for any chat model.
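To illustrate the general idea behind a JSON-only mode, here is a minimal sketch that does not use SAIDEX itself: the JSON Schema goes into the prompt, the model answers in plain text, and the reply is parsed and validated with Pydantic. The names `build_prompt`, `fake_chat_model`, and `parse_reply` are hypothetical, and the canned reply stands in for a real LLM call; how SAIDEX implements `ExtractionMode.JSON` internally may differ.

```python
import json

from pydantic import BaseModel


class PersonInfo(BaseModel):
    name: str
    age: int
    occupation: str


def build_prompt(schema_model: type[BaseModel], text: str) -> str:
    """Embed the model's JSON Schema in the prompt (hypothetical helper)."""
    schema = json.dumps(schema_model.model_json_schema())
    return (
        "Return only a JSON object matching this JSON Schema:\n"
        f"{schema}\n\nText:\n{text}"
    )


def fake_chat_model(prompt: str) -> str:
    """Hypothetical stand-in for a chat model without tool-calling support."""
    return (
        'Here is the result: {"name": "Alice Müller", "age": 34, '
        '"occupation": "software engineer"}'
    )


def parse_reply(schema_model: type[BaseModel], reply: str) -> BaseModel:
    """Cut the JSON object out of a free-text reply and validate it."""
    start, end = reply.find("{"), reply.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in reply")
    return schema_model.model_validate_json(reply[start : end + 1])


prompt = build_prompt(PersonInfo, "Alice Müller, 34, works as a software engineer.")
person = parse_reply(PersonInfo, fake_chat_model(prompt))
print(person)
```

Because nothing here depends on the provider's tool-calling API, the same pattern applies to any model that can return text.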
Installation
pip install saidex
With automatic OpenAI rate-limit handling:
pip install "saidex[openai]"
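Rate-limit handling is commonly built on retries with exponential back-off. The sketch below shows that general pattern using only the standard library; it is not the code shipped in the `saidex[openai]` extra, and `RateLimitError` and `call_with_backoff` are hypothetical names chosen for illustration.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit exception."""


def call_with_backoff(func, max_retries=3, base_delay=0.5, sleep=time.sleep):
    """Call func(); on RateLimitError wait base_delay * 2**attempt and retry.

    A small random jitter (up to 10% of the delay) is added so that many
    clients do not retry at exactly the same moment.
    """
    for attempt in range(max_retries + 1):
        try:
            return func()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries, surface the error to the caller
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

The `sleep` parameter is injectable so the waiting behaviour can be checked in tests without actually pausing.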
Quick start
import asyncio

from pydantic import BaseModel, Field
from langchain_openai import ChatOpenAI
from saidex import extract_from_text


class PersonInfo(BaseModel):
    name: str
    age: int
    occupation: str


async def main():
    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
    person, stats = await extract_from_text(
        llm,
        PersonInfo,
        "Alice Müller, 34, works as a software engineer in Munich.",
    )
    print(person)  # name='Alice Müller' age=34 occupation='software engineer'
    print(stats.total_retries)  # 0 — first attempt was valid


asyncio.run(main())
Key features
| Feature | Detail |
|---|---|
| Validated output | Pydantic validation with field-level error feedback fed back to the LLM for self-correction |
| Auto-retry | Configurable retries with exponential back-off for transient errors and rate limits |
| Fallback model | Pair a cheap primary with a powerful fallback — pay for the big model only when needed |
| No tool-calling required | ExtractionMode.JSON works with any chat model including local models via Ollama / vLLM |
| Agentic tool loop | Give the LLM your own tools (lookups, API calls) — it calls them freely, then delivers a validated final answer |
| Multimodal | Pass images via standard LangChain messages to any vision-capable model |
| LangChain-native | Plugs into any LangChain chat model; supports callbacks (Langfuse, LangSmith, …) |
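The validated-output and fallback rows above can be pictured with a small, library-independent sketch. It assumes each "model" is just a callable from prompt to JSON text; `format_errors`, `extract_with_feedback`, and `cheap_model` are hypothetical names, and SAIDEX's real retry loop may be structured differently.

```python
import json

from pydantic import BaseModel, ValidationError


class PersonInfo(BaseModel):
    name: str
    age: int
    occupation: str


def format_errors(exc: ValidationError) -> str:
    """Turn Pydantic errors into one line per field for the next prompt."""
    lines = []
    for err in exc.errors():
        field = ".".join(str(part) for part in err["loc"])
        lines.append(f"- {field}: {err['msg']}")
    return "\n".join(lines)


def extract_with_feedback(models, schema, text, max_retries=2):
    """Try each model in order; re-prompt with field-level errors on failure.

    Returns the validated object and the number of failed attempts.
    """
    failed_attempts = 0
    for model in models:  # primary first, then any fallback
        prompt = text
        for _ in range(max_retries + 1):
            reply = model(prompt)
            try:
                return schema.model_validate_json(reply), failed_attempts
            except ValidationError as exc:
                failed_attempts += 1
                prompt = (
                    f"{text}\n\nYour last answer was invalid:\n"
                    f"{format_errors(exc)}"
                )
    raise RuntimeError("all models failed to produce a valid result")


def cheap_model(prompt: str) -> str:
    """Hypothetical primary model: gets `age` wrong until it sees feedback."""
    age = 34 if "invalid" in prompt else "thirty-four"
    return json.dumps(
        {"name": "Alice Müller", "age": age, "occupation": "software engineer"}
    )


person, failed = extract_with_feedback([cheap_model], PersonInfo, "Alice Müller, 34")
print(person, failed)
```

Passing a second callable in `models` gives the fallback behaviour: it is only consulted once the primary has used up its retries, which is why a cheap primary can handle most inputs.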
License
Copyright 2026 Martin Lauff. Licensed under the Apache License, Version 2.0.