
LangExtract Anthropic Provider

A provider plugin for LangExtract that integrates Anthropic's Claude API for robust, structured information extraction.

Python 3.10+ · License: Apache 2.0 · Code style: black

Features

  • Native Anthropic API: Uses the official anthropic Python SDK for Claude models.
  • Safe parameter handling: Whitelist filtering; unsupported params raise clear errors.
  • Concurrent batching: Parallel inference for multi-prompt workloads.
  • Schema-aware: Optional structured output mode (JSON) from LangExtract examples.
  • Modern packaging: pyproject.toml with Hatch; works well with uv.
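To illustrate what "safe parameter handling" can mean in practice, here is a minimal sketch of whitelist filtering. The names `ALLOWED_PARAMS` and `filter_params` are illustrative, not the provider's actual internals:

```python
# Parameters the Anthropic Messages API accepts (illustrative subset).
ALLOWED_PARAMS = {
    "max_tokens", "temperature", "top_p", "top_k",
    "stop_sequences", "metadata",
}

def filter_params(params: dict) -> dict:
    """Drop None values and raise a clear error on unsupported keys."""
    unknown = set(params) - ALLOWED_PARAMS
    if unknown:
        raise ValueError(f"Unsupported Anthropic parameters: {sorted(unknown)}")
    return {k: v for k, v in params.items() if v is not None}
```

Rejecting unknown keys up front means a typo like `frequency_penalty` fails with a readable message instead of an opaque API error.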

Installation

Using UV (Recommended)

# Install UV if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install the package
uv add langextract-anthropic

Using pip

pip install langextract-anthropic

From Source

git clone <repository-url>
cd langextract-anthropic
uv sync

Quick Start

1. Set up Anthropic API credentials

export ANTHROPIC_API_KEY="your-api-key"

2. Use with LangExtract

import langextract as lx

# Define extraction examples
examples = [
    lx.data.ExampleData(
        text="John Smith works at Microsoft in Seattle.",
        extractions=[
            lx.data.Extraction(
                extraction_class="Person",
                extraction_text="John Smith",
                attributes={"name": "John Smith"},
            ),
            lx.data.Extraction(
                extraction_class="Organization",
                extraction_text="Microsoft",
                attributes={"name": "Microsoft"},
            ),
            lx.data.Extraction(
                extraction_class="Location",
                extraction_text="Seattle",
                attributes={"name": "Seattle"},
            ),
        ],
    ),
]

# Extract information using Anthropic Claude
result = lx.extract(
    text_or_documents="Sarah Johnson is a data scientist at Google in Mountain View.",
    prompt_description="Extract people, organizations, and locations.",
    examples=examples,
    model_id="anthropic-claude-3-5-sonnet-latest",
    temperature=0.1,
    max_tokens=512,
)

print(result.extractions)

Supported Models

This provider supports Anthropic Claude models, including:

  • claude-3-5-sonnet-latest (recommended)
  • claude-3-5-sonnet-20241022
  • claude-3-5-haiku-latest
  • claude-3-opus-latest
  • claude-3-sonnet-20240229
  • claude-3-haiku-20240307

Model ID Format

Use the anthropic- prefix or specify the model name directly:

  • anthropic-claude-3-5-sonnet-latest → Uses model: claude-3-5-sonnet-latest
  • anthropic-claude-3-opus-latest → Uses model: claude-3-opus-latest
  • claude-3-5-sonnet-latest → Uses model directly
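The mapping above amounts to stripping an optional prefix. A minimal sketch (`resolve_model_id` is an illustrative name, not the plugin's actual function):

```python
def resolve_model_id(model_id: str) -> str:
    """Strip the optional 'anthropic-' prefix to get the Claude model name."""
    prefix = "anthropic-"
    if model_id.startswith(prefix):
        return model_id[len(prefix):]
    return model_id
```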

Configuration Parameters

Core Parameters

Parameter    Type   Description                Default
model_id     str    Model identifier           "claude-3-5-sonnet-latest"
api_key      str    Anthropic API key          ANTHROPIC_API_KEY env var
temperature  float  Controls randomness (0-1)  None
max_workers  int    Parallel request workers   10

Anthropic API Parameters

Parameter       Type       Description                 Range
max_tokens      int        Maximum tokens to generate  1-8192
temperature     float      Sampling temperature        0.0-1.0
top_p           float      Nucleus sampling            0.0-1.0
top_k           int        Top-k sampling              0-200
stop_sequences  list[str]  Stop sequences              Max 4 items
metadata        dict       Request tracking metadata   -

Usage Examples

# Basic extraction
result = lx.extract(
    text_or_documents=text,
    prompt_description=prompt,
    examples=examples,
    model_id="anthropic-claude-3-5-sonnet-latest",
)

# With custom parameters
result = lx.extract(
    text_or_documents=text,
    prompt_description=prompt,
    examples=examples,
    model_id="anthropic-claude-3-5-sonnet-latest",
    temperature=0.3,
    max_tokens=1000,
    top_p=0.9,
    stop_sequences=["END", "STOP"],
    metadata={"user_id": "user123"},
)

Environment Variables

Variable           Description        Required
ANTHROPIC_API_KEY  Anthropic API key  Yes
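If you prefer to fail fast at startup rather than at the first API call, a simple check can verify the key is present. This is a sketch; `require_api_key` is not part of the provider's API:

```python
import os

def require_api_key(env=os.environ) -> str:
    """Return the Anthropic API key, or raise a clear error if it is missing."""
    key = env.get("ANTHROPIC_API_KEY")
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set; export it or pass api_key explicitly."
        )
    return key
```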

Development

Setup Development Environment

# Clone the repository
git clone <repository-url>
cd langextract-anthropic

# Install UV
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install dependencies
uv sync --dev

Running Tests

# Run unit tests (no API calls)
uv run pytest tests/ -m "unit"

# Run integration tests (requires ANTHROPIC_API_KEY)
uv run pytest tests/ -m "integration" 

# Run all tests with coverage
uv run pytest tests/ --cov=langextract_anthropic --cov-report=html

Development Commands

# Format code
uv run black langextract_anthropic tests
uv run isort langextract_anthropic tests

# Lint code
uv run ruff check langextract_anthropic tests
uv run mypy langextract_anthropic

# Build package
uv build

# Bump version
python scripts/bump_version.py patch  # 0.1.0 -> 0.1.1
python scripts/bump_version.py minor  # 0.1.0 -> 0.2.0
python scripts/bump_version.py major  # 0.1.0 -> 1.0.0

Testing

This provider includes comprehensive testing:

  • Unit tests: Mock-based testing of provider logic
  • Parameter tests: Validation of API parameter filtering
  • Integration tests: Real API testing (requires credentials)

# Set up test environment
export ANTHROPIC_API_KEY="your-api-key"

# Run specific test categories
uv run pytest tests/test_provider_unit.py -v
uv run pytest tests/test_parameter_filtering.py -v
uv run pytest tests/test_anthropic_integration.py -v  # requires API key

Error Handling

The provider surfaces clear error messages for common issues:

try:
    result = lx.extract(...)
except lx.exceptions.InferenceConfigError as e:
    # Configuration errors (missing API key, invalid params)
    print(f"Configuration error: {e}")
except lx.exceptions.InferenceRuntimeError as e:
    # Runtime errors (API failures, network issues)
    print(f"Runtime error: {e}")
    print(f"Original error: {e.original}")

Troubleshooting

Common Issues

  1. Missing API Key

    InferenceConfigError: Anthropic API key not provided
    

    Solution: Set ANTHROPIC_API_KEY environment variable or pass api_key parameter.

  2. Invalid Model Name

    AnthropicAPIError: model not found
    

    Solution: Use a valid Claude model name (see supported models above).

  3. Rate Limiting

    AnthropicAPIError: 429 Too Many Requests
    

    Solution: Reduce max_workers or add retry logic in your application.

  4. Token Limit Exceeded

    AnthropicAPIError: maximum context length exceeded
    

    Solution: Reduce the input text length or lower max_tokens so the prompt and completion together fit within the model's context window.
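For the rate-limiting case above, retry logic with exponential backoff is one common approach. This is a generic sketch; in real code, catch the SDK's rate-limit exception (e.g. anthropic.RateLimitError) rather than bare Exception:

```python
import random
import time

def with_retries(call, max_attempts=5, base_delay=1.0):
    """Invoke call() and retry with exponential backoff plus jitter on failure."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:  # in practice, catch anthropic.RateLimitError
            if attempt == max_attempts - 1:
                raise  # out of attempts: re-raise the last error
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))
```

Combined with a lower max_workers, this smooths out bursts that would otherwise trigger 429 responses.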

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

Changelog

See CHANGELOG.md for a list of changes and version history.

Project details


Download files


Source Distribution

langextract_anthropic-0.2.1.tar.gz (13.0 kB)

Uploaded Source

Built Distribution


langextract_anthropic-0.2.1-py3-none-any.whl (13.7 kB)

Uploaded Python 3

File details

Details for the file langextract_anthropic-0.2.1.tar.gz.

File metadata

  • Download URL: langextract_anthropic-0.2.1.tar.gz
  • Upload date:
  • Size: 13.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for langextract_anthropic-0.2.1.tar.gz
Algorithm Hash digest
SHA256 73c76fe9920bcd855a0498a629b6b7758cc804966a316d017925b7a38527f33c
MD5 523d3fa5490449963f60675ec3b465c3
BLAKE2b-256 2f177588ef960c595cf101a1ecf79bf87ccbc7697350326582d38e3ca85ce9c4


Provenance

The following attestation bundles were made for langextract_anthropic-0.2.1.tar.gz:

Publisher: release.yml on Nobbettt/langextract-anthropic

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file langextract_anthropic-0.2.1-py3-none-any.whl.

File metadata

File hashes

Hashes for langextract_anthropic-0.2.1-py3-none-any.whl
Algorithm Hash digest
SHA256 514f22d644029c757859c81fbaab5893b9b58c097257a8e93488df3e0f23f940
MD5 4d95e6719db972bc4783ebc2ac51d886
BLAKE2b-256 414a945fb9b089a8ed90a8f63a733901921ba829edb62bdef707a620c9be5b4c


Provenance

The following attestation bundles were made for langextract_anthropic-0.2.1-py3-none-any.whl:

Publisher: release.yml on Nobbettt/langextract-anthropic

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
