Batch processing for Anthropic's Claude API with structured output

AI Batch

Python SDK for batch processing with structured output and citation mapping.

  • 50% cost savings via Anthropic's batch API pricing
  • Structured output with Pydantic models
  • Field-level citations map results to source documents
  • Type safety: results are validated against your Pydantic model

Currently supports Anthropic Claude. OpenAI support coming soon.

Installation

pip install ai-batch

Quick Start

from ai_batch import batch_files
from pydantic import BaseModel

class Invoice(BaseModel):
    company_name: str
    total_amount: str
    date: str

# Process PDFs with structured output + citations
job = batch_files(
    files=["invoice1.pdf", "invoice2.pdf", "invoice3.pdf"],
    prompt="Extract the company name, total amount, and date.",
    model="claude-3-5-sonnet-20241022",
    response_model=Invoice,
    enable_citations=True
)

results = job.results()
citations = job.citations()

Output

Structured Results:

[
  Invoice(company_name="TechCorp Solutions Inc.", total_amount="$12,500.00", date="March 15, 2024"),
  Invoice(company_name="DataFlow Systems", total_amount="$8,750.00", date="March 18, 2024")
  # ... one Invoice per result
]

Field-Level Citations:

[
  {
    "company_name": [Citation(cited_text="TechCorp Solutions Inc.", start_page=1)],
    "total_amount": [Citation(cited_text="TOTAL: $12,500.00", start_page=2)],
    "date": [Citation(cited_text="Date: March 15, 2024", start_page=1)]
  },
  # ... one dict per result
]
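
The sample output above suggests that the results and the citation dicts can be walked side by side. The following is a minimal sketch of that pairing, not the library's own code: the `Citation` and `Invoice` classes are dataclass stand-ins that mirror the fields shown in the output, the helper `audit_rows` is hypothetical, and the index alignment between the two lists is an assumption.

```python
from dataclasses import dataclass, asdict


@dataclass
class Citation:
    # Stand-in with the two fields shown in the sample output; not the library's class.
    cited_text: str
    start_page: int


@dataclass
class Invoice:
    # Stand-in for the Pydantic Invoice model so the sketch runs without dependencies.
    company_name: str
    total_amount: str
    date: str


def audit_rows(result, field_citations):
    """Return (field, value, cited_text, page) tuples for one result (hypothetical helper)."""
    rows = []
    for field, value in asdict(result).items():
        # A field may carry zero, one, or several citations.
        for citation in field_citations.get(field, []):
            rows.append((field, value, citation.cited_text, citation.start_page))
    return rows


results = [Invoice("TechCorp Solutions Inc.", "$12,500.00", "March 15, 2024")]
citations = [{
    "company_name": [Citation("TechCorp Solutions Inc.", 1)],
    "total_amount": [Citation("TOTAL: $12,500.00", 2)],
    "date": [Citation("Date: March 15, 2024", 1)],
}]

# Assumption: citations[i] belongs to results[i].
for result, field_citations in zip(results, citations):
    for row in audit_rows(result, field_citations):
        print(row)
```

Flattening each field into rows like this makes it easy to spot-check an extracted value against the page it was cited from.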

Four Modes

Response Model   Citations   Returns
No               No          List of strings
Yes              No          List of Pydantic models
No               Yes         List of strings + flat citation list
Yes              Yes         List of Pydantic models + field citation dicts

# Mode 1: Text only
job = batch_files(files=["doc.pdf"], prompt="Summarize this")

# Mode 2: Structured only  
job = batch_files(files=["doc.pdf"], prompt="Extract data", response_model=MyModel)

# Mode 3: Text with citations
job = batch_files(files=["doc.pdf"], prompt="Analyze this", enable_citations=True)

# Mode 4: Structured with field citations
job = batch_files(files=["doc.pdf"], prompt="Extract data", 
                  response_model=MyModel, enable_citations=True)
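
Since the return shape depends on two options, it can help to make the mapping explicit in calling code. The lookup below is a sketch only: the return descriptions are copied from the Four Modes list, while the function name `expected_return` and the assumed defaults (no response model, citations off) are illustrative and not part of ai_batch.

```python
# Return shapes from the Four Modes list, keyed by
# (response_model given, citations enabled).
RETURN_SHAPES = {
    (False, False): "List of strings",
    (True, False): "List of Pydantic models",
    (False, True): "List of strings + flat citation list",
    (True, True): "List of Pydantic models + field citation dicts",
}


def expected_return(response_model=None, enable_citations=False):
    """Look up what a job is documented to return for the given options."""
    return RETURN_SHAPES[(response_model is not None, bool(enable_citations))]


print(expected_return())                                              # Mode 1
print(expected_return(response_model=dict, enable_citations=True))   # Mode 4
```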

Message Processing

For direct message processing:

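from ai_batch import batch
from pydantic import BaseModel

# Hypothetical response model: the original snippet uses SpamResult without defining it
class SpamResult(BaseModel):
    is_spam: bool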

messages = [
    [{"role": "user", "content": "Is this spam? You've won $1000!"}],
    [{"role": "user", "content": "Meeting at 3pm tomorrow"}],
]

job = batch(
    messages=messages,
    model="claude-3-haiku-20240307",
    response_model=SpamResult
)

results = job.results()
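
The `messages` argument above is a list of conversations, each of which is itself a list of role/content dicts. When the inputs start out as plain strings, that nesting can be built with a small helper. This is a sketch; `build_messages` and its `instruction` parameter are hypothetical names, not part of ai_batch.

```python
def build_messages(texts, instruction=""):
    """Wrap each text in its own single-turn conversation."""
    conversations = []
    for text in texts:
        # Prepend the shared instruction, if any, to every input.
        content = f"{instruction} {text}".strip()
        conversations.append([{"role": "user", "content": content}])
    return conversations


emails = ["You've won $1000!", "Meeting at 3pm tomorrow"]
messages = build_messages(emails, instruction="Is this spam?")
print(messages[0])
```

Each inner list is one independent request, so two emails produce two conversations rather than one two-turn conversation.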

Setup

export ANTHROPIC_API_KEY="your-api-key"
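
Because batch jobs are submitted remotely, a missing key is cheaper to catch before submission. The check below is a minimal sketch that assumes the key is read from the `ANTHROPIC_API_KEY` environment variable as exported above; `require_api_key` is a hypothetical helper, not an ai_batch function.

```python
import os


def require_api_key(env=os.environ):
    """Return the API key from the environment, or raise with a clear message."""
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set; export it before submitting a batch job."
        )
    return key


# Demonstrate both outcomes with explicit mappings instead of the real environment.
print(require_api_key({"ANTHROPIC_API_KEY": "your-api-key"}))
try:
    require_api_key({})
except RuntimeError as err:
    print(err)
```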

Examples

  • examples/citation_example.py - Basic citation usage
  • examples/citation_with_pydantic.py - Structured output with citations
  • examples/spam_detection.py - Email classification
  • examples/pdf_extraction.py - PDF processing

Limitations

  • Citations only work with flat Pydantic models (no nested models)
  • PDFs work best with Sonnet models
  • Batch jobs are asynchronous: call job.results() once the job has finished
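
The flat-model restriction can be checked before a job is submitted. The sketch below is one way to do that under stated assumptions: `nested_fields` is a hypothetical helper, and the demo uses a plain stand-in base class so that it runs without Pydantic installed. With Pydantic v2, the mapping could instead be built as `{name: f.annotation for name, f in Model.model_fields.items()}` with `BaseModel` passed as `base`.

```python
import typing


def nested_fields(annotations, base):
    """Return names of fields whose annotation is, or contains, a subclass of `base`."""
    def contains_model(tp):
        args = typing.get_args(tp)
        if args:
            # Recurse into generics such as List[Item] or Optional[Item].
            return any(contains_model(arg) for arg in args)
        return isinstance(tp, type) and issubclass(tp, base)

    return [name for name, tp in annotations.items() if contains_model(tp)]


class Base:
    """Stand-in for pydantic.BaseModel (assumption for this demo)."""


class LineItem(Base):
    description: str
    amount: str


class FlatInvoice(Base):
    company_name: str
    total_amount: str


class NestedInvoice(Base):
    company_name: str
    items: typing.List[LineItem]


print(nested_fields(FlatInvoice.__annotations__, Base))    # []
print(nested_fields(NestedInvoice.__annotations__, Base))  # ['items']
```

A model for which this returns an empty list is flat in the sense of the limitation above; any reported field would need to be flattened before enabling citations.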

Download files

Source Distribution

ai_batch-0.0.1.tar.gz (43.3 kB)

Uploaded Source

Built Distribution

ai_batch-0.0.1-py3-none-any.whl (13.4 kB)

Uploaded Python 3

File details

Details for the file ai_batch-0.0.1.tar.gz.

File metadata

  • Download URL: ai_batch-0.0.1.tar.gz
  • Upload date:
  • Size: 43.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for ai_batch-0.0.1.tar.gz
Algorithm Hash digest
SHA256 c0da7ee764e447392c65f44bcf63e9da229ed4ba787e1aab70c12efae6532daf
MD5 7968b21b3ce99b3a614ba53ef756fc4e
BLAKE2b-256 7b45503c8d385c252dc6ea7a7969c1a9223b9b27f6e6ba3b975bae4c0c2769be

Provenance

The following attestation bundles were made for ai_batch-0.0.1.tar.gz:

Publisher: publish.yml on agamm/ai-batch

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file ai_batch-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: ai_batch-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 13.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for ai_batch-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 56fd8a3bf5c834552b9319b83b80a9c27a38c6b4cd312dce7972d87aabb83a8e
MD5 c3317039ccf2a97610d2b431f03aa286
BLAKE2b-256 c337f33ccfde46e22f7a7c7a83ac1e440f1fae23f235996406a186f65d92a487

Provenance

The following attestation bundles were made for ai_batch-0.0.1-py3-none-any.whl:

Publisher: publish.yml on agamm/ai-batch

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
