# LangExtract Anthropic Provider

A provider plugin for LangExtract that integrates Anthropic's Claude API for robust, structured information extraction.
## Features

- **Native Anthropic API**: Uses the official `anthropic` Python SDK for Claude models.
- **Safe parameter handling**: Whitelist filtering; unsupported params raise clear errors.
- **Concurrent batching**: Parallel inference for multi-prompt workloads.
- **Schema-aware**: Optional structured output mode (JSON) from LangExtract examples.
- **Modern packaging**: `pyproject.toml` with Hatch; works well with `uv`.
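The concurrent batching feature can be pictured as fanning prompts out across a thread pool. This is only an illustrative sketch, not the plugin's actual implementation; `run_batch` and `infer_one` are hypothetical names:

```python
from concurrent.futures import ThreadPoolExecutor

def run_batch(prompts, infer_one, max_workers=10):
    """Run inference over many prompts in parallel, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(infer_one, prompts))

# Demo with a stand-in for the real API call:
results = run_batch(["a", "b", "c"], lambda p: p.upper(), max_workers=2)
print(results)  # ['A', 'B', 'C']
```

`pool.map` keeps results aligned with the input order even though requests complete out of order, which is what lets batched extractions map back to their source documents.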
## Installation

### Using UV (Recommended)

```bash
# Install UV if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install the package
uv add langextract-anthropic
```

### Using pip

```bash
pip install langextract-anthropic
```

### From Source

```bash
git clone <repository-url>
cd langextract-anthropic
uv sync
```
## Quick Start

### 1. Set up Anthropic API credentials

```bash
export ANTHROPIC_API_KEY="your-api-key"
```

### 2. Use with LangExtract

```python
import langextract as lx

# Define extraction examples
examples = [
    lx.ExampleData(
        text="John Smith works at Microsoft in Seattle.",
        extractions=[
            lx.ExtractionData(
                extraction_class="Person",
                attributes={"name": "John Smith"},
            ),
            lx.ExtractionData(
                extraction_class="Organization",
                attributes={"name": "Microsoft"},
            ),
            lx.ExtractionData(
                extraction_class="Location",
                attributes={"name": "Seattle"},
            ),
        ],
    ),
]

# Extract information using Anthropic Claude
result = lx.extract(
    text_or_documents="Sarah Johnson is a data scientist at Google in Mountain View.",
    prompt_description="Extract people, organizations, and locations.",
    examples=examples,
    model_id="anthropic-claude-3-5-sonnet-latest",
    temperature=0.1,
    max_tokens=512,
)

print(result.extractions)
```
## Supported Models

This provider supports all Anthropic Claude models:

- `claude-3-5-sonnet-latest` (recommended)
- `claude-3-5-sonnet-20241022`
- `claude-3-5-haiku-latest`
- `claude-3-opus-latest`
- `claude-3-sonnet-20240229`
- `claude-3-haiku-20240307`

### Model ID Format

Use the `anthropic-` prefix or specify the model name directly:

- `anthropic-claude-3-5-sonnet-latest` → uses model `claude-3-5-sonnet-latest`
- `anthropic-claude-3-opus-latest` → uses model `claude-3-opus-latest`
- `claude-3-5-sonnet-latest` → uses the model directly
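The prefix handling above amounts to stripping `anthropic-` when present and passing the name through otherwise. A minimal sketch of that resolution logic (`resolve_model_id` is a hypothetical name, not the plugin's API):

```python
def resolve_model_id(model_id: str, prefix: str = "anthropic-") -> str:
    """Strip the provider prefix, if present, to get the Claude model name."""
    if model_id.startswith(prefix):
        return model_id[len(prefix):]
    return model_id

print(resolve_model_id("anthropic-claude-3-5-sonnet-latest"))  # claude-3-5-sonnet-latest
print(resolve_model_id("claude-3-5-sonnet-latest"))            # claude-3-5-sonnet-latest
```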
## Configuration Parameters

### Core Parameters

| Parameter | Type | Description | Default |
|---|---|---|---|
| `model_id` | `str` | Model identifier | `"claude-3-5-sonnet-latest"` |
| `api_key` | `str` | Anthropic API key | `ANTHROPIC_API_KEY` env var |
| `temperature` | `float` | Controls randomness (0-1) | `None` |
| `max_workers` | `int` | Parallel request workers | `10` |
### Anthropic API Parameters

| Parameter | Type | Description | Range |
|---|---|---|---|
| `max_tokens` | `int` | Maximum tokens to generate | 1-8192 |
| `temperature` | `float` | Sampling temperature | 0.0-1.0 |
| `top_p` | `float` | Nucleus sampling | 0.0-1.0 |
| `top_k` | `int` | Top-k sampling | 0-200 |
| `stop_sequences` | `list[str]` | Stop sequences | Max 4 items |
| `metadata` | `dict` | Request tracking metadata | - |
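The "safe parameter handling" feature mentioned earlier presumably checks supplied kwargs against a whitelist like the table above. A sketch of that idea, under the assumption that the allowed set matches this table (`filter_params` and `ALLOWED_PARAMS` are illustrative names, not the plugin's internals):

```python
# Hypothetical whitelist mirroring the parameter table above.
ALLOWED_PARAMS = {"max_tokens", "temperature", "top_p", "top_k",
                  "stop_sequences", "metadata"}

def filter_params(params: dict) -> dict:
    """Reject unknown parameters with a clear error instead of silently dropping them."""
    unknown = set(params) - ALLOWED_PARAMS
    if unknown:
        raise ValueError(f"Unsupported Anthropic parameters: {sorted(unknown)}")
    return params

print(filter_params({"temperature": 0.3, "max_tokens": 1000}))
```

Raising on unknown keys (rather than ignoring them) surfaces typos like `max_token` immediately instead of producing silently different model behavior.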
## Usage Examples

```python
# Basic extraction
result = lx.extract(
    text_or_documents=text,
    prompt_description=prompt,
    examples=examples,
    model_id="anthropic-claude-3-5-sonnet-latest",
)

# With custom parameters
result = lx.extract(
    text_or_documents=text,
    prompt_description=prompt,
    examples=examples,
    model_id="anthropic-claude-3-5-sonnet-latest",
    temperature=0.3,
    max_tokens=1000,
    top_p=0.9,
    stop_sequences=["END", "STOP"],
    metadata={"user_id": "user123"},
)
```
## Environment Variables

| Variable | Description | Required |
|---|---|---|
| `ANTHROPIC_API_KEY` | Anthropic API key | Yes |
## Development

### Setup Development Environment

```bash
# Clone the repository
git clone <repository-url>
cd langextract-anthropic

# Install UV
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install dependencies
uv sync --dev
```

### Running Tests

```bash
# Run unit tests (no API calls)
uv run pytest tests/ -m "unit"

# Run integration tests (requires ANTHROPIC_API_KEY)
uv run pytest tests/ -m "integration"

# Run all tests with coverage
uv run pytest tests/ --cov=langextract_anthropic --cov-report=html
```

### Development Commands

```bash
# Format code
uv run black langextract_anthropic tests
uv run isort langextract_anthropic tests

# Lint code
uv run ruff check langextract_anthropic tests
uv run mypy langextract_anthropic

# Build package
uv build

# Bump version
python scripts/bump_version.py patch  # 0.1.0 -> 0.1.1
python scripts/bump_version.py minor  # 0.1.0 -> 0.2.0
python scripts/bump_version.py major  # 0.1.0 -> 1.0.0
```
## Testing

This provider includes comprehensive testing:

- **Unit tests**: Mock-based testing of provider logic
- **Parameter tests**: Validation of API parameter filtering
- **Integration tests**: Real API testing (requires credentials)

```bash
# Set up test environment
export ANTHROPIC_API_KEY="your-api-key"

# Run specific test categories
uv run pytest tests/test_provider_unit.py -v
uv run pytest tests/test_parameter_filtering.py -v
uv run pytest tests/test_anthropic_integration.py -v  # requires API key
```
## Error Handling

The provider raises clear, typed errors for common issues:

```python
try:
    result = lx.extract(...)
except lx.exceptions.InferenceConfigError as e:
    # Configuration errors (missing API key, invalid params)
    print(f"Configuration error: {e}")
except lx.exceptions.InferenceRuntimeError as e:
    # Runtime errors (API failures, network issues)
    print(f"Runtime error: {e}")
    print(f"Original error: {e.original}")
```
## Troubleshooting

### Common Issues

1. **Missing API Key**

   `InferenceConfigError: Anthropic API key not provided`

   Solution: Set the `ANTHROPIC_API_KEY` environment variable or pass the `api_key` parameter.

2. **Invalid Model Name**

   `AnthropicAPIError: model not found`

   Solution: Use a valid Claude model name (see supported models above).

3. **Rate Limiting**

   `AnthropicAPIError: 429 Too Many Requests`

   Solution: Reduce `max_workers` or add retry logic in your application.

4. **Token Limit Exceeded**

   `AnthropicAPIError: maximum context length exceeded`

   Solution: Reduce the input text length or increase the `max_tokens` parameter.
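For the rate-limiting case, "add retry logic in your application" typically means exponential backoff with jitter. A minimal, generic sketch (the `RateLimitError` class here is a stand-in for whatever rate-limit exception your stack raises; `with_retries` is not part of this plugin):

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for the SDK's 429 rate-limit error type."""

def with_retries(call, max_attempts=4, base_delay=0.1):
    """Retry a callable on rate-limit errors with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; let the caller handle it
            time.sleep(base_delay * 2 ** attempt + random.random() * base_delay)

# Demo: fail twice, then succeed.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimitError
    return "ok"

print(with_retries(flaky))  # ok
```

The jitter term spreads out retries from concurrent workers so they don't all hit the API again at the same instant.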
## License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

## Changelog

See CHANGELOG.md for a list of changes and version history.