# LangExtract Azure OpenAI Provider
A provider plugin for LangExtract that integrates the Azure OpenAI Chat Completions API for robust, structured information extraction.
## Features
- Native Azure OpenAI: Uses the official `openai` Python SDK with Azure endpoints.
- Safe parameter handling: Whitelist filtering; unsupported parameters raise clear errors.
- Concurrent batching: Parallel inference for multi-prompt workloads.
- Schema-aware: Optional structured output mode (JSON mode) derived from LangExtract examples.
- Modern packaging: `pyproject.toml` with Hatch; works well with `uv`.
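The concurrent batching pattern can be sketched with a standard thread pool. This is illustrative only; `infer_one` and `infer_batch` are hypothetical names, not the provider's actual internals:

```python
from concurrent.futures import ThreadPoolExecutor


def infer_one(prompt: str) -> str:
    """Placeholder for a single chat-completion call."""
    return f"result for: {prompt}"


def infer_batch(prompts: list[str], max_workers: int = 4) -> list[str]:
    """Run inference for many prompts in parallel, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(infer_one, prompts))


print(infer_batch(["p1", "p2", "p3"]))
```

`pool.map` keeps results in the same order as the input prompts, which is what a multi-prompt extraction workload typically needs.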
## Installation

### Using UV (Recommended)

```bash
# Install UV if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install the package
uv add langextract-azureopenai
```

### Using pip

```bash
pip install langextract-azureopenai
```

### From Source

```bash
git clone <repository-url>
cd langextract-azureopenai
uv sync
```
## Quick Start

### 1. Set up Azure OpenAI credentials

```bash
export AZURE_OPENAI_API_KEY="your-api-key"
export AZURE_OPENAI_ENDPOINT="https://your-endpoint.openai.azure.com/"
export AZURE_OPENAI_API_VERSION="2024-12-01-preview"  # or your current API version
```
### 2. Basic Usage

```python
import os

import langextract as lx

# Explicit configuration is recommended
config = lx.factory.ModelConfig(
    model_id="azureopenai-gpt-4.1",  # your Azure deployment name
    provider="AzureOpenAILanguageModel",
    provider_kwargs={
        "api_key": os.getenv("AZURE_OPENAI_API_KEY"),
        "azure_endpoint": os.getenv("AZURE_OPENAI_ENDPOINT"),
        "api_version": os.getenv("AZURE_OPENAI_API_VERSION"),
    },
)

prompt = "Extract people, organizations, and locations from the text."

examples = [
    lx.data.ExampleData(
        text="John Smith works at Microsoft in Seattle.",
        extractions=[
            lx.data.Extraction(
                extraction_class="person",
                extraction_text="John Smith",
                attributes={"role": "employee"},
            ),
            lx.data.Extraction(
                extraction_class="organization",
                extraction_text="Microsoft",
                attributes={"type": "company"},
            ),
            lx.data.Extraction(
                extraction_class="location",
                extraction_text="Seattle",
                attributes={"type": "city"},
            ),
        ],
    )
]

result = lx.extract(
    text_or_documents="Sarah Johnson is the CEO of TechCorp in San Francisco.",
    prompt_description=prompt,
    examples=examples,
    config=config,
)

for e in result.extractions:
    print(e.extraction_class, "->", e.extraction_text, e.attributes)
```
### 3. Advanced Parameters

```python
result = lx.extract(
    text_or_documents=input_text,
    prompt_description=prompt,
    examples=examples,
    config=config,  # reuse the explicit configuration
    # Azure OpenAI generation params
    temperature=0.3,
    top_p=0.9,
    frequency_penalty=0.1,
    presence_penalty=0.1,
    seed=42,
    user="user-123",
    logprobs=True,
    top_logprobs=2,
)
```
## Supported Model IDs

The provider handles model IDs with the `azureopenai-` prefix:

- `azureopenai-gpt-4` → uses deployment `gpt-4`
- `azureopenai-gpt-35-turbo` → uses deployment `gpt-35-turbo`
- `azureopenai-your-custom-deployment` → uses deployment `your-custom-deployment`
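The prefix convention above amounts to stripping `azureopenai-` to obtain the deployment name. A minimal sketch (this helper is hypothetical, not the provider's actual code):

```python
def deployment_from_model_id(model_id: str) -> str:
    """Strip the 'azureopenai-' prefix to get the Azure deployment name."""
    prefix = "azureopenai-"
    if not model_id.startswith(prefix):
        raise ValueError(f"Not an Azure OpenAI model ID: {model_id!r}")
    return model_id[len(prefix):]


print(deployment_from_model_id("azureopenai-gpt-35-turbo"))  # gpt-35-turbo
```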
You can also specify the deployment name explicitly:

```python
result = lx.extract(
    # ... other parameters
    model_id="azureopenai-any-name",
    provider_kwargs={"deployment_name": "your-actual-deployment-name"},
)
```
## Environment Variables

| Variable | Description | Required |
|---|---|---|
| `AZURE_OPENAI_API_KEY` | Azure OpenAI API key | ✅ Yes |
| `AZURE_OPENAI_ENDPOINT` | Azure OpenAI endpoint URL | ✅ Yes |
| `AZURE_OPENAI_API_VERSION` | Azure OpenAI API version (e.g., `2024-12-01-preview`) | ✅ Yes |
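The three required variables can be sanity-checked before running an extraction with a short snippet (plain Python; this helper is illustrative and not part of the package):

```python
import os

REQUIRED = ("AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_API_VERSION")


def missing_azure_env() -> list[str]:
    """Return the names of required Azure OpenAI variables that are unset or empty."""
    return [name for name in REQUIRED if not os.environ.get(name)]


if missing_azure_env():
    print("Missing:", ", ".join(missing_azure_env()))
```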
## Parameters

- Supported: `temperature`, `top_p`, `frequency_penalty`, `presence_penalty`, `stop`, `logprobs`, `top_logprobs`, `seed`, `user`, `logit_bias`, and advanced `response_format`.
- Unsupported (raises `InferenceConfigError`): `stream`, `tools`, `tool_choice`, `parallel_tool_calls`.

Notes:

- When schema constraints are enabled via examples, the provider sets `response_format={"type": "json_object"}` to encourage valid JSON output. Strict JSON Schema mode is not enabled at this time.
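The whitelist behavior can be illustrated in plain Python. Only the parameter sets and the `InferenceConfigError` name come from this README; the handling of unknown keys and the function itself are assumptions, not the provider's real internals:

```python
SUPPORTED = {
    "temperature", "top_p", "frequency_penalty", "presence_penalty", "stop",
    "logprobs", "top_logprobs", "seed", "user", "logit_bias", "response_format",
}
UNSUPPORTED = {"stream", "tools", "tool_choice", "parallel_tool_calls"}


class InferenceConfigError(ValueError):
    """Raised when an explicitly unsupported parameter is passed."""


def filter_params(kwargs: dict) -> dict:
    """Reject explicitly unsupported params loudly; keep only whitelisted ones."""
    bad = UNSUPPORTED & kwargs.keys()
    if bad:
        raise InferenceConfigError(f"Unsupported parameters: {sorted(bad)}")
    return {k: v for k, v in kwargs.items() if k in SUPPORTED}


print(filter_params({"temperature": 0.3, "seed": 42}))
```

The point of a whitelist over a blocklist is that anything not explicitly supported never reaches the API call.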
## Development

### Prerequisites

```bash
# Install UV
curl -LsSf https://astral.sh/uv/install.sh | sh

# Verify installation
uv --version
```

### Setup

```bash
# Clone repository
git clone <repository-url>
cd langextract-azureopenai

# Install dependencies
uv sync

# Install development dependencies
uv sync --group dev
```
### Testing

```bash
# Run unit tests
uv run pytest tests/ -v

# Run parameter filtering tests (no credentials required)
uv run python tests/test_parameter_filtering.py

# Run full Azure API tests (requires credentials)
export AZURE_OPENAI_API_VERSION="2024-12-01-preview"
uv run python tests/test_azure_parameters.py

# Run with coverage
uv run pytest tests/ --cov=langextract_azureopenai --cov-report=html
```
### Code Quality

```bash
# Format code
uv run black .
uv run isort .

# Lint code
uv run ruff check .

# Type checking
uv run mypy langextract_azureopenai
```
### Building and Publishing

```bash
# Build package
uv build

# Check build
ls dist/

# Publish to PyPI (requires API token)
uv publish --token your-pypi-token
```
### Version Management

```bash
# Bump version
python scripts/bump_version.py patch  # 0.1.0 -> 0.1.1
python scripts/bump_version.py minor  # 0.1.0 -> 0.2.0
python scripts/bump_version.py major  # 0.1.0 -> 1.0.0
```
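The bump semantics follow standard major.minor.patch versioning, where bumping a part resets every lower part to zero. A minimal sketch of that logic (illustrative, not the actual `bump_version.py`):

```python
def bump(version: str, part: str) -> str:
    """Bump a 'major.minor.patch' version string; lower parts reset to 0."""
    major, minor, patch = (int(x) for x in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unknown part: {part!r}")


print(bump("0.1.0", "patch"))  # 0.1.1
```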
### Developer Scripts

For a quick overview of the helper scripts (testing, releasing, and versioning), see `scripts/README.md`, which summarizes usage for `bump_version.py`, `run_tests.py`, `check_build.py`, and `release.py`.
## Examples

See the `examples/` directory for complete usage examples:

- `examples/example_usage.py` - Basic entity extraction
- `tests/test_azure_parameters.py` - Parameter compatibility testing
- `tests/test_parameter_filtering.py` - Security validation
## Testing

The package includes a comprehensive test suite:

- Unit Tests: Core functionality without API calls
- Integration Tests: Real Azure OpenAI API testing
- Security Tests: Parameter filtering validation
- Performance Tests: Batch processing and concurrency

Run specific test categories:

```bash
# Unit tests only
uv run pytest tests/ -m "unit"

# Integration tests (requires credentials)
uv run pytest tests/ -m "integration"
```
## Contributing

1. Fork the repository
2. Create a feature branch: `git checkout -b feature-name`
3. Make changes and add tests
4. Run the test suite: `uv run pytest`
5. Format code: `uv run black . && uv run isort .`
6. Submit a pull request
## Troubleshooting

### Common Issues

ImportError after installation:

```bash
# Ensure package is properly installed
uv sync
uv run python -c "import langextract_azureopenai; print('✅ Import successful')"
```

Authentication errors:

```bash
# Verify credentials are set
echo $AZURE_OPENAI_API_KEY
echo $AZURE_OPENAI_ENDPOINT
echo $AZURE_OPENAI_API_VERSION

# Test credentials
uv run python tests/test_provider_basic.py
```

Parameter errors:

```bash
# Check parameter compatibility
uv run python tests/test_azure_parameters.py
```
## License

This project is licensed under the Apache License 2.0; see the LICENSE file for details.
## Links
Happy Extracting with Azure OpenAI! 🚀