LangExtract provider plugin for Anthropic Claude

LangExtract Anthropic Provider

A provider plugin for LangExtract that integrates Anthropic's Claude API for robust, structured information extraction.

Python 3.10+ | License: Apache 2.0 | Code style: black

Features

  • Native Anthropic API: Uses the official anthropic Python SDK for Claude models.
  • Safe parameter handling: Whitelist filtering; unsupported params raise clear errors.
  • Concurrent batching: Parallel inference for multi-prompt workloads.
  • Schema-aware: Optional structured output mode (JSON) from LangExtract examples.
  • Modern packaging: pyproject.toml with Hatch; works well with uv.
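
The whitelist behaviour listed under "Safe parameter handling" can be pictured with a minimal sketch. The `ALLOWED_PARAMS` set and the `filter_params` helper are hypothetical names used only for illustration (the allowed keys are taken from the parameter tables in this README), and the provider's actual implementation may differ.

```python
# Hypothetical sketch of whitelist filtering; not the provider's actual code.
ALLOWED_PARAMS = frozenset(
    {"max_tokens", "temperature", "top_p", "top_k", "stop_sequences", "metadata"}
)


def filter_params(params: dict) -> dict:
    """Return a copy of params, rejecting any key outside the whitelist."""
    unsupported = sorted(set(params) - ALLOWED_PARAMS)
    if unsupported:
        raise ValueError(f"Unsupported parameter(s): {', '.join(unsupported)}")
    # Drop None values so unset options are not forwarded to the API.
    return {key: value for key, value in params.items() if value is not None}


print(filter_params({"temperature": 0.1, "top_p": None}))  # {'temperature': 0.1}
```

Failing loudly on an unknown key, rather than silently dropping it, makes typos such as `max_token` visible at call time.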

Installation

Using UV (Recommended)

# Install UV if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install the package
uv add langextract-anthropic

Using pip

pip install langextract-anthropic

From Source

git clone <repository-url>
cd langextract-anthropic
uv sync

Quick Start

1. Set up Anthropic API credentials

export ANTHROPIC_API_KEY="your-api-key"

2. Use with LangExtract

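import langextract as lx

# Define extraction examples. LangExtract's data classes live in lx.data, and
# each extraction needs the exact source span (extraction_text) it refers to.
examples = [
    lx.data.ExampleData(
        text="John Smith works at Microsoft in Seattle.",
        extractions=[
            lx.data.Extraction(
                extraction_class="Person",
                extraction_text="John Smith",
                attributes={"name": "John Smith"},
            ),
            lx.data.Extraction(
                extraction_class="Organization",
                extraction_text="Microsoft",
                attributes={"name": "Microsoft"},
            ),
            lx.data.Extraction(
                extraction_class="Location",
                extraction_text="Seattle",
                attributes={"name": "Seattle"},
            ),
        ],
    ),
]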

# Extract information using Anthropic Claude
result = lx.extract(
    text_or_documents="Sarah Johnson is a data scientist at Google in Mountain View.",
    prompt_description="Extract people, organizations, and locations.",
    examples=examples,
    model_id="anthropic-claude-3-5-sonnet-latest",
    temperature=0.1,
    max_tokens=512,
)

print(result.extractions)

Supported Models

This provider supports Anthropic Claude models, including:

  • claude-3-5-sonnet-latest (recommended)
  • claude-3-5-sonnet-20241022
  • claude-3-5-haiku-latest
  • claude-3-opus-latest
  • claude-3-sonnet-20240229
  • claude-3-haiku-20240307

Model ID Format

Use the anthropic- prefix or specify the model name directly:

  • anthropic-claude-3-5-sonnet-latest → Uses model: claude-3-5-sonnet-latest
  • anthropic-claude-3-opus-latest → Uses model: claude-3-opus-latest
  • claude-3-5-sonnet-latest → Uses model directly
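
The mapping above amounts to stripping an optional prefix. A minimal sketch, assuming a hypothetical helper named `resolve_model_name` (the provider's own resolution logic is not shown in this README and may handle more cases):

```python
# Hypothetical helper that mirrors the mapping listed above.
PREFIX = "anthropic-"


def resolve_model_name(model_id: str) -> str:
    """Strip an optional 'anthropic-' prefix to get the Claude model name."""
    return model_id.removeprefix(PREFIX)


print(resolve_model_name("anthropic-claude-3-5-sonnet-latest"))  # claude-3-5-sonnet-latest
print(resolve_model_name("claude-3-5-sonnet-latest"))  # claude-3-5-sonnet-latest
```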

Configuration Parameters

Core Parameters

| Parameter | Type | Description | Default |
|---|---|---|---|
| model_id | str | Model identifier | "claude-3-5-sonnet-latest" |
| api_key | str | Anthropic API key | ANTHROPIC_API_KEY env var |
| temperature | float | Controls randomness (0-1) | None |
| max_workers | int | Parallel request workers | 10 |
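
The max_workers setting bounds how many requests run in parallel. A minimal sketch of that pattern with a thread pool is shown here; `run_batch` and `infer_one` are hypothetical names, and the provider's real batching code may be organized differently.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def run_batch(
    prompts: List[str],
    infer_one: Callable[[str], str],
    max_workers: int = 10,
) -> List[str]:
    """Run infer_one over prompts in parallel, preserving input order."""
    if not prompts:
        return []
    # Never start more threads than there are prompts.
    workers = min(max_workers, len(prompts))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the order of the inputs.
        return list(pool.map(infer_one, prompts))


# str.upper stands in for a real API call so the sketch runs offline.
print(run_batch(["a", "b", "c"], str.upper, max_workers=2))  # ['A', 'B', 'C']
```

Threads suit this workload because each request spends most of its time waiting on the network. Lowering max_workers trades throughput for fewer simultaneous requests against the API's rate limits.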

Anthropic API Parameters

| Parameter | Type | Description | Range |
|---|---|---|---|
| max_tokens | int | Maximum tokens to generate | 1-8192 |
| temperature | float | Sampling temperature | 0.0-1.0 |
| top_p | float | Nucleus sampling | 0.0-1.0 |
| top_k | int | Top-k sampling | 0-200 |
| stop_sequences | list[str] | Stop sequences | Max 4 items |
| metadata | dict | Request tracking metadata | - |
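
To catch out-of-range values before a request is sent, the ranges in the table can be checked on the client side. This is a sketch under stated assumptions: `RANGES` and `validate_ranges` are hypothetical names, the bounds are copied from the table above, and the provider may or may not perform this check itself.

```python
# Hypothetical client-side range checks based on the table above.
RANGES = {
    "max_tokens": (1, 8192),
    "temperature": (0.0, 1.0),
    "top_p": (0.0, 1.0),
    "top_k": (0, 200),
}
MAX_STOP_SEQUENCES = 4


def validate_ranges(params: dict) -> None:
    """Raise ValueError if a known parameter is outside its documented range."""
    for name, (low, high) in RANGES.items():
        value = params.get(name)
        if value is not None and not low <= value <= high:
            raise ValueError(f"{name}={value} is outside [{low}, {high}]")
    stops = params.get("stop_sequences") or []
    if len(stops) > MAX_STOP_SEQUENCES:
        raise ValueError(f"stop_sequences allows at most {MAX_STOP_SEQUENCES} items")


# Passes silently: every value is inside its range.
validate_ranges({"temperature": 0.3, "max_tokens": 1000, "top_p": 0.9})
```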

Usage Examples

# Basic extraction
result = lx.extract(
    text_or_documents=text,
    prompt_description=prompt,
    examples=examples,
    model_id="anthropic-claude-3-5-sonnet-latest",
)

# With custom parameters
result = lx.extract(
    text_or_documents=text,
    prompt_description=prompt,
    examples=examples,
    model_id="anthropic-claude-3-5-sonnet-latest",
    temperature=0.3,
    max_tokens=1000,
    top_p=0.9,
    stop_sequences=["END", "STOP"],
    metadata={"user_id": "user123"},
)

Environment Variables

| Variable | Description | Required |
|---|---|---|
| ANTHROPIC_API_KEY | Anthropic API key | Yes |
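
The key can come from either the api_key parameter or the environment. The sketch below assumes an explicit api_key takes precedence over ANTHROPIC_API_KEY, as the Core Parameters table suggests; `resolve_api_key` is a hypothetical helper, not part of the provider's API.

```python
import os
from typing import Optional


def resolve_api_key(api_key: Optional[str] = None) -> str:
    """Prefer an explicit api_key, else fall back to ANTHROPIC_API_KEY."""
    key = api_key or os.environ.get("ANTHROPIC_API_KEY")
    if not key:
        raise RuntimeError(
            "Anthropic API key not provided: set ANTHROPIC_API_KEY or pass api_key"
        )
    return key
```

Running a check like this at application startup surfaces a missing key before any extraction work begins.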

Development

Setup Development Environment

# Clone the repository
git clone <repository-url>
cd langextract-anthropic

# Install UV
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install dependencies
uv sync --dev

Running Tests

# Run unit tests (no API calls)
uv run pytest tests/ -m "unit"

# Run integration tests (requires ANTHROPIC_API_KEY)
uv run pytest tests/ -m "integration" 

# Run all tests with coverage
uv run pytest tests/ --cov=langextract_anthropic --cov-report=html

Development Commands

# Format code
uv run black langextract_anthropic tests
uv run isort langextract_anthropic tests

# Lint code
uv run ruff check langextract_anthropic tests
uv run mypy langextract_anthropic

# Build package
uv build

# Bump version
python scripts/bump_version.py patch  # 0.1.0 -> 0.1.1
python scripts/bump_version.py minor  # 0.1.0 -> 0.2.0
python scripts/bump_version.py major  # 0.1.0 -> 1.0.0
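
The three bump modes follow the usual MAJOR.MINOR.PATCH rule: the chosen part is incremented and every lower-order part is reset to zero. A minimal sketch of that arithmetic follows; `bump_version` is a hypothetical function written for illustration and is not the contents of scripts/bump_version.py.

```python
def bump_version(version: str, part: str) -> str:
    """Bump a MAJOR.MINOR.PATCH string, resetting lower-order parts to zero."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unknown part: {part!r}")


print(bump_version("0.1.0", "patch"))  # 0.1.1
print(bump_version("0.1.0", "minor"))  # 0.2.0
print(bump_version("0.1.0", "major"))  # 1.0.0
```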

Testing

This provider includes comprehensive testing:

  • Unit tests: Mock-based testing of provider logic
  • Parameter tests: Validation of API parameter filtering
  • Integration tests: Real API testing (requires credentials)

# Set up test environment
export ANTHROPIC_API_KEY="your-api-key"

# Run specific test categories
uv run pytest tests/test_provider_unit.py -v
uv run pytest tests/test_parameter_filtering.py -v
uv run pytest tests/test_anthropic_integration.py -v  # requires API key

Error Handling

The provider raises LangExtract exceptions with clear messages for common issues:

try:
    result = lx.extract(...)
except lx.exceptions.InferenceConfigError as e:
    # Configuration errors (missing API key, invalid params)
    print(f"Configuration error: {e}")
except lx.exceptions.InferenceRuntimeError as e:
    # Runtime errors (API failures, network issues)
    print(f"Runtime error: {e}")
    print(f"Original error: {e.original}")

Troubleshooting

Common Issues

  1. Missing API Key

    InferenceConfigError: Anthropic API key not provided

    Solution: Set the ANTHROPIC_API_KEY environment variable or pass the api_key parameter.

  2. Invalid Model Name

    AnthropicAPIError: model not found

    Solution: Use a valid Claude model name (see Supported Models above).

  3. Rate Limiting

    AnthropicAPIError: 429 Too Many Requests

    Solution: Reduce max_workers or add retry logic in your application.

  4. Token Limit Exceeded

    AnthropicAPIError: maximum context length exceeded

    Solution: Reduce the input text length or lower max_tokens. The input and the requested output tokens must fit in the model's context window together, so increasing max_tokens does not help.
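
For the rate-limiting case, application-level retry logic can be as small as a wrapper with exponential backoff. The sketch below is generic and makes no assumptions about the provider's internals: `with_retries` and `is_retryable` are hypothetical names, and deciding which exceptions count as retryable is left to the caller.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(
    call: Callable[[], T],
    is_retryable: Callable[[Exception], bool],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying retryable failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception as exc:
            # Give up on the last attempt or on non-retryable errors.
            if attempt == max_attempts or not is_retryable(exc):
                raise
            # Delays grow as 1x, 2x, 4x, ... of base_delay, plus up to 10% jitter.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay * 0.1))
    # Only reached when max_attempts is less than 1.
    raise ValueError("max_attempts must be at least 1")
```

One way to use it is to wrap the `lx.extract(...)` call in a zero-argument function and pass a predicate that recognizes rate-limit failures, for example by checking for "429" in the error message. That string check is an assumption about the message format, not documented behaviour. The `sleep` parameter exists so tests can substitute a no-op instead of actually waiting.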

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

Changelog

See CHANGELOG.md for a list of changes and version history.
