Batch processing for Anthropic's Claude API with structured output
AI Batch
Python SDK for batch processing with structured output and citation mapping.
- 50% cost savings via Anthropic's batch API pricing
- Structured output with Pydantic models
- Field-level citations map results to source documents
- Type safety with full validation
Currently supports Anthropic Claude. OpenAI support coming soon.
Installation
pip install ai-batch
Quick Start
from ai_batch import batch_files
from pydantic import BaseModel
class Invoice(BaseModel):
    company_name: str
    total_amount: str
    date: str
# Process PDFs with structured output + citations
job = batch_files(
    files=["invoice1.pdf", "invoice2.pdf", "invoice3.pdf"],
    prompt="Extract the company name, total amount, and date.",
    model="claude-3-5-sonnet-20241022",
    response_model=Invoice,
    enable_citations=True,
)
results = job.results()
citations = job.citations()
Output
Structured Results:
[
    Invoice(company_name="TechCorp Solutions Inc.", total_amount="$12,500.00", date="March 15, 2024"),
    Invoice(company_name="DataFlow Systems", total_amount="$8,750.00", date="March 18, 2024")
]
Field-Level Citations:
[
    {
        "company_name": [Citation(cited_text="TechCorp Solutions Inc.", start_page=1)],
        "total_amount": [Citation(cited_text="TOTAL: $12,500.00", start_page=2)],
        "date": [Citation(cited_text="Date: March 15, 2024", start_page=1)]
    },
    # ... one dict per result
]
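To audit an extraction, each field value can be lined up with the text that supports it. The sketch below assumes the shapes shown above: one result and one citation dict per document, in the same order. The `Invoice` and `Citation` dataclasses here are stand-ins so the example runs without the library, and `audit_rows` is a hypothetical helper, not part of ai-batch.

```python
from dataclasses import dataclass, asdict

# Stand-ins for illustration only; in real use these values would come
# from job.results() and job.citations().
@dataclass
class Citation:
    cited_text: str
    start_page: int

@dataclass
class Invoice:
    company_name: str
    total_amount: str
    date: str

results = [Invoice("TechCorp Solutions Inc.", "$12,500.00", "March 15, 2024")]
citations = [{
    "company_name": [Citation("TechCorp Solutions Inc.", 1)],
    "total_amount": [Citation("TOTAL: $12,500.00", 2)],
    "date": [Citation("Date: March 15, 2024", 1)],
}]

def audit_rows(results, citations):
    """Return (field, value, cited_text, page) rows, one per citation."""
    rows = []
    # Assumes results and citations are aligned by position.
    for result, field_citations in zip(results, citations):
        for field, value in asdict(result).items():
            for c in field_citations.get(field, []):
                rows.append((field, value, c.cited_text, c.start_page))
    return rows

rows = audit_rows(results, citations)
for row in rows:
    print(row)
```

With real Pydantic results, `asdict(result)` would be replaced by the model's own dict conversion.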
Four Modes
| Response Model | Citations | Returns |
|---|---|---|
| ❌ | ❌ | List of strings |
| ✅ | ❌ | List of Pydantic models |
| ❌ | ✅ | List of strings + flat citation list |
| ✅ | ✅ | List of Pydantic models + field citation dicts |
# Mode 1: Text only
job = batch_files(files=["doc.pdf"], prompt="Summarize this")
# Mode 2: Structured only
job = batch_files(files=["doc.pdf"], prompt="Extract data", response_model=MyModel)
# Mode 3: Text with citations
job = batch_files(files=["doc.pdf"], prompt="Analyze this", enable_citations=True)
# Mode 4: Structured with field citations
job = batch_files(files=["doc.pdf"], prompt="Extract data",
                  response_model=MyModel, enable_citations=True)
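When calling code has to handle more than one mode, the table above can be encoded as a lookup keyed on the two options. This is only a restatement of the table; `expected_return_shape` is a hypothetical helper and does not exist in the library.

```python
def expected_return_shape(has_response_model: bool, enable_citations: bool) -> str:
    """Describe what a job returns for a given combination of options."""
    shapes = {
        (False, False): "list of strings",
        (True, False): "list of Pydantic models",
        (False, True): "list of strings + flat citation list",
        (True, True): "list of Pydantic models + field citation dicts",
    }
    return shapes[(has_response_model, enable_citations)]

print(expected_return_shape(True, True))
```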
Message Processing
For direct message processing:
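from ai_batch import batch
from pydantic import BaseModel

# Illustrative response model; these field names are an assumption, define your own
class SpamResult(BaseModel):
    is_spam: bool
    reason: str

# Each inner list is one conversation, processed as a separate batch request
messages = [
    [{"role": "user", "content": "Is this spam? You've won $1000!"}],
    [{"role": "user", "content": "Meeting at 3pm tomorrow"}],
]

job = batch(
    messages=messages,
    model="claude-3-haiku-20240307",
    response_model=SpamResult,
)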
results = job.results()
Setup
export ANTHROPIC_API_KEY="your-api-key"
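Because batch jobs are submitted remotely, it can help to fail early when the key is missing. The sketch below is a minimal check, assuming the library reads the key from the `ANTHROPIC_API_KEY` environment variable as the export line suggests. `require_api_key` is a hypothetical helper, not part of ai-batch.

```python
import os

def require_api_key(env=os.environ) -> str:
    """Return the API key from the environment, or raise if it is missing or blank."""
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set; export it before submitting a batch"
        )
    return key
```

Calling `require_api_key()` at the top of a script surfaces a configuration problem before any files are read or uploaded.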
Examples
- examples/citation_example.py - Basic citation usage
- examples/citation_with_pydantic.py - Structured output with citations
- examples/spam_detection.py - Email classification
- examples/pdf_extraction.py - PDF processing
Limitations
- Citations only work with flat Pydantic models (no nested models)
- PDFs work best with Sonnet models
- Batch jobs are asynchronous; call job.results() when ready
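Since jobs are asynchronous, a caller usually has to wait before fetching results. This README does not document a status-check method, so the sketch below takes the readiness check as a caller-supplied function. `wait_until` and `is_ready` are hypothetical names, and the default timeout and interval are arbitrary.

```python
import time

def wait_until(is_ready, timeout_s=3600.0, interval_s=30.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll is_ready() until it returns True or the timeout elapses.

    Returns True if the job became ready, False if the wait timed out.
    """
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

The `sleep` and `clock` parameters exist so the loop can be tested without real delays. In use, `is_ready` would wrap whatever completion check the library exposes.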