
LangExtract Anthropic Provider

A provider plugin for LangExtract that integrates Anthropic's Claude API for robust, structured information extraction.

Python 3.10+ · License: Apache 2.0 · Code style: black

Features

  • Native Anthropic API: Uses the official anthropic Python SDK for Claude models.
  • Safe parameter handling: Whitelist filtering; unsupported params raise clear errors.
  • Concurrent batching: Parallel inference for multi-prompt workloads.
  • Schema-aware: Optional structured output mode (JSON) from LangExtract examples.
  • Modern packaging: pyproject.toml with Hatch; works well with uv.

Installation

Using UV (Recommended)

# Install UV if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install the package
uv add langextract-anthropic

Using pip

pip install langextract-anthropic

From Source

git clone <repository-url>
cd langextract-anthropic
uv sync

Quick Start

1. Set up Anthropic API credentials

export ANTHROPIC_API_KEY="your-api-key"
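The key can also be checked from Python before any extraction runs. A minimal stdlib-only sketch (the helper name `require_api_key` is ours, not part of the provider; the provider simply reads `ANTHROPIC_API_KEY` by default):

```python
import os

def require_api_key(env_var: str = "ANTHROPIC_API_KEY") -> str:
    """Fail fast when the key the provider expects is missing."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set {env_var} before running extractions.")
    return key
```

Calling this once at startup turns a confusing mid-run configuration error into an immediate, explicit failure.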

2. Use with LangExtract

import langextract as lx

# Define extraction examples
examples = [
    lx.ExampleData(
        text="John Smith works at Microsoft in Seattle.",
        extractions=[
            lx.ExtractionData(
                extraction_class="Person",
                attributes={"name": "John Smith"}
            ),
            lx.ExtractionData(
                extraction_class="Organization", 
                attributes={"name": "Microsoft"}
            ),
            lx.ExtractionData(
                extraction_class="Location",
                attributes={"name": "Seattle"}
            ),
        ],
    ),
]

# Extract information using Anthropic Claude
result = lx.extract(
    text_or_documents="Sarah Johnson is a data scientist at Google in Mountain View.",
    prompt_description="Extract people, organizations, and locations.",
    examples=examples,
    model_id="anthropic-claude-3-5-sonnet-latest",
    temperature=0.1,
    max_tokens=512,
)

print(result.extractions)
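The returned extractions can then be grouped by class for downstream use. A minimal sketch, assuming each extraction exposes `extraction_class` and `attributes` as in the examples above (plain stand-in objects are used here so the snippet runs without an API call):

```python
from collections import defaultdict
from types import SimpleNamespace

# Stand-ins for result.extractions; the real objects expose the same two fields.
extractions = [
    SimpleNamespace(extraction_class="Person", attributes={"name": "Sarah Johnson"}),
    SimpleNamespace(extraction_class="Organization", attributes={"name": "Google"}),
    SimpleNamespace(extraction_class="Location", attributes={"name": "Mountain View"}),
]

# Group attribute dicts under their extraction class.
by_class = defaultdict(list)
for e in extractions:
    by_class[e.extraction_class].append(e.attributes)

print(dict(by_class))
```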

Supported Models

This provider supports Anthropic's Claude models, including:

  • claude-3-5-sonnet-latest (recommended)
  • claude-3-5-sonnet-20241022
  • claude-3-5-haiku-latest
  • claude-3-opus-latest
  • claude-3-sonnet-20240229
  • claude-3-haiku-20240307

Model ID Format

Use the anthropic- prefix or specify the model name directly:

  • anthropic-claude-3-5-sonnet-latest → Uses model: claude-3-5-sonnet-latest
  • anthropic-claude-3-opus-latest → Uses model: claude-3-opus-latest
  • claude-3-5-sonnet-latest → Uses model directly
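The mapping rule above amounts to stripping an optional prefix. A hypothetical sketch of that rule (the function name `resolve_model_id` is ours for illustration, not the provider's actual internal API):

```python
def resolve_model_id(model_id: str) -> str:
    """Map a LangExtract model_id to the underlying Anthropic model name."""
    prefix = "anthropic-"
    if model_id.startswith(prefix):
        return model_id[len(prefix):]
    return model_id

print(resolve_model_id("anthropic-claude-3-5-sonnet-latest"))  # claude-3-5-sonnet-latest
print(resolve_model_id("claude-3-opus-latest"))                # claude-3-opus-latest
```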

Configuration Parameters

Core Parameters

Parameter    Type   Description                Default
model_id     str    Model identifier           "claude-3-5-sonnet-latest"
api_key      str    Anthropic API key          ANTHROPIC_API_KEY env var
temperature  float  Controls randomness (0-1)  None
max_workers  int    Parallel request workers   10

Anthropic API Parameters

Parameter       Type       Description                 Range
max_tokens      int        Maximum tokens to generate  1-8192
temperature     float      Sampling temperature        0.0-1.0
top_p           float      Nucleus sampling            0.0-1.0
top_k           int        Top-k sampling              0-200
stop_sequences  list[str]  Stop sequences              Max 4 items
metadata        dict       Request tracking metadata   -
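The "safe parameter handling" feature rejects anything outside this table. A hypothetical sketch of that whitelist filtering (the names `ALLOWED_API_PARAMS` and `filter_params` are ours for illustration, not the provider's internals):

```python
# Parameters accepted by the Anthropic Messages API, per the table above.
ALLOWED_API_PARAMS = {
    "max_tokens", "temperature", "top_p", "top_k", "stop_sequences", "metadata",
}

def filter_params(params: dict) -> dict:
    """Reject unsupported parameters with a clear error instead of silently dropping them."""
    unknown = set(params) - ALLOWED_API_PARAMS
    if unknown:
        raise ValueError(f"Unsupported Anthropic API parameters: {sorted(unknown)}")
    return params
```

Raising on unknown keys (rather than ignoring them) surfaces typos like `max_token` immediately.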

Usage Examples

# Basic extraction
result = lx.extract(
    text_or_documents=text,
    prompt_description=prompt,
    examples=examples,
    model_id="anthropic-claude-3-5-sonnet-latest",
)

# With custom parameters
result = lx.extract(
    text_or_documents=text,
    prompt_description=prompt,
    examples=examples,
    model_id="anthropic-claude-3-5-sonnet-latest",
    temperature=0.3,
    max_tokens=1000,
    top_p=0.9,
    stop_sequences=["END", "STOP"],
    metadata={"user_id": "user123"},
)

Environment Variables

Variable           Description        Required
ANTHROPIC_API_KEY  Anthropic API key  Yes

Development

Setup Development Environment

# Clone the repository
git clone <repository-url>
cd langextract-anthropic

# Install UV
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install dependencies
uv sync --dev

Running Tests

# Run unit tests (no API calls)
uv run pytest tests/ -m "unit"

# Run integration tests (requires ANTHROPIC_API_KEY)
uv run pytest tests/ -m "integration" 

# Run all tests with coverage
uv run pytest tests/ --cov=langextract_anthropic --cov-report=html

Development Commands

# Format code
uv run black langextract_anthropic tests
uv run isort langextract_anthropic tests

# Lint code
uv run ruff check langextract_anthropic tests
uv run mypy langextract_anthropic

# Build package
uv build

# Bump version
python scripts/bump_version.py patch  # 0.1.0 -> 0.1.1
python scripts/bump_version.py minor  # 0.1.0 -> 0.2.0
python scripts/bump_version.py major  # 0.1.0 -> 1.0.0

Testing

This provider includes comprehensive testing:

  • Unit tests: Mock-based testing of provider logic
  • Parameter tests: Validation of API parameter filtering
  • Integration tests: Real API testing (requires credentials)

# Set up test environment
export ANTHROPIC_API_KEY="your-api-key"

# Run specific test categories
uv run pytest tests/test_provider_unit.py -v
uv run pytest tests/test_parameter_filtering.py -v
uv run pytest tests/test_anthropic_integration.py -v  # requires API key

Error Handling

The provider raises clear, typed errors for common issues:

try:
    result = lx.extract(...)
except lx.exceptions.InferenceConfigError as e:
    # Configuration errors (missing API key, invalid params)
    print(f"Configuration error: {e}")
except lx.exceptions.InferenceRuntimeError as e:
    # Runtime errors (API failures, network issues)
    print(f"Runtime error: {e}")
    print(f"Original error: {e.original}")

Troubleshooting

Common Issues

  1. Missing API Key

    InferenceConfigError: Anthropic API key not provided
    

    Solution: Set the ANTHROPIC_API_KEY environment variable or pass the api_key parameter.

  2. Invalid Model Name

    AnthropicAPIError: model not found
    

    Solution: Use a valid Claude model name (see supported models above).

  3. Rate Limiting

    AnthropicAPIError: 429 Too Many Requests
    

    Solution: Reduce max_workers or add retry logic in your application.

  4. Token Limit Exceeded

    AnthropicAPIError: maximum context length exceeded
    

    Solution: Reduce the input text length, or lower max_tokens, so that the prompt and the requested output together fit within the model's context window.
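For the rate-limiting case (item 3), retry logic usually means exponential backoff with jitter. A generic, library-agnostic sketch (`with_retries` is our illustrative helper; pass the provider's 429 exception type as `retryable` in real code):

```python
import random
import time

def with_retries(call, retryable=(Exception,), attempts=5, base_delay=1.0):
    """Retry a zero-argument callable with exponential backoff plus jitter."""
    for attempt in range(attempts):
        try:
            return call()
        except retryable:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            # 1s, 2s, 4s, ... plus up to 0.5s of jitter to avoid thundering herds
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))
```

Usage: `with_retries(lambda: lx.extract(...), retryable=(SomeRateLimitError,))`, combined with a lower max_workers setting.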

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

Changelog

See CHANGELOG.md for a list of changes and version history.
