Skip to main content

LangExtract provider plugin for Azure OpenAI

Project description

LangExtract Azure OpenAI Provider

A provider plugin for LangExtract that integrates the Azure OpenAI Chat Completions API for robust, structured information extraction.

Python 3.10+ License: Apache 2.0 Code style: black

Features

  • Native Azure OpenAI: Uses the official openai Python SDK with Azure endpoints.
  • Safe parameter handling: Whitelist filtering; unsupported params raise clear errors.
  • Concurrent batching: Parallel inference for multi-prompt workloads.
  • Schema-aware: Optional structured output mode (JSON mode) from LangExtract examples.
  • Modern packaging: pyproject.toml with Hatch; works well with uv.

Installation

Using UV (Recommended)

# Install UV if you haven't already
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install the package
uv add langextract-azureopenai

Using pip

pip install langextract-azureopenai

From Source

git clone <repository-url>
cd langextract-azureopenai
uv sync

Quick Start

1. Set up Azure OpenAI credentials

export AZURE_OPENAI_API_KEY="your-api-key"
export AZURE_OPENAI_ENDPOINT="https://your-endpoint.openai.azure.com/"
export AZURE_OPENAI_API_VERSION="2024-12-01-preview"  # or your current API version

2. Basic Usage

import os
import langextract as lx

# Explicit configuration is recommended
config = lx.factory.ModelConfig(
    model_id="azureopenai-gpt-4.1",  # your Azure deployment name
    provider="AzureOpenAILanguageModel",
    provider_kwargs={
        "api_key": os.getenv("AZURE_OPENAI_API_KEY"),
        "azure_endpoint": os.getenv("AZURE_OPENAI_ENDPOINT"),
        "api_version": os.getenv("AZURE_OPENAI_API_VERSION"),
    },
)

prompt = "Extract people, organizations, and locations from the text."
examples = [
    lx.data.ExampleData(
        text="John Smith works at Microsoft in Seattle.",
        extractions=[
            lx.data.Extraction(
                extraction_class="person",
                extraction_text="John Smith",
                attributes={"role": "employee"},
            ),
            lx.data.Extraction(
                extraction_class="organization",
                extraction_text="Microsoft",
                attributes={"type": "company"},
            ),
            lx.data.Extraction(
                extraction_class="location",
                extraction_text="Seattle",
                attributes={"type": "city"},
            ),
        ],
    )
]

result = lx.extract(
    text_or_documents="Sarah Johnson is the CEO of TechCorp in San Francisco.",
    prompt_description=prompt,
    examples=examples,
    config=config,
)

for e in result.extractions:
    print(e.extraction_class, "->", e.extraction_text, e.attributes)

3. Advanced Parameters

result = lx.extract(
    text_or_documents=input_text,
    prompt_description=prompt,
    examples=examples,
    config=config,                 # reuse the explicit configuration
    # Azure OpenAI generation params
    temperature=0.3,
    top_p=0.9,
    frequency_penalty=0.1,
    presence_penalty=0.1,
    seed=42,
    user="user-123",
    logprobs=True,
    top_logprobs=2,
)

Supported Model IDs

The provider handles model IDs with the pattern ^azureopenai:

  • azureopenai-gpt-4 → Uses deployment: gpt-4
  • azureopenai-gpt-35-turbo → Uses deployment: gpt-35-turbo
  • azureopenai-your-custom-deployment → Uses deployment: your-custom-deployment

You can also specify the deployment name explicitly:

result = lx.extract(
    # ... other parameters
    model_id="azureopenai-any-name",
    provider_kwargs={"deployment_name": "your-actual-deployment-name"}
)

Environment Variables

Variable Description Required
AZURE_OPENAI_API_KEY Azure OpenAI API key ✅ Yes
AZURE_OPENAI_ENDPOINT Azure OpenAI endpoint URL ✅ Yes
AZURE_OPENAI_API_VERSION Azure OpenAI API version (e.g., 2024-12-01-preview) ✅ Yes

Parameters

  • Supported: temperature, top_p, frequency_penalty, presence_penalty, stop, logprobs, top_logprobs, seed, user, logit_bias, and advanced response_format.
  • Unsupported (raises InferenceConfigError): stream, tools, tool_choice, parallel_tool_calls.

Notes:

  • When schema constraints are enabled via examples, the provider sets response_format={"type": "json_object"} to encourage valid JSON output. Strict JSON Schema mode is not enabled at this time.

Development

Prerequisites

# Install UV
curl -LsSf https://astral.sh/uv/install.sh | sh

# Verify installation  
uv --version

Setup

# Clone repository
git clone <repository-url>
cd langextract-azureopenai

# Install dependencies
uv sync

# Install development dependencies
uv sync --group dev

Testing

# Run unit tests
uv run pytest tests/ -v

# Run parameter filtering tests (no credentials required)
uv run python tests/test_parameter_filtering.py

# Run full Azure API tests (requires credentials)
export AZURE_OPENAI_API_VERSION="2024-12-01-preview"
uv run python tests/test_azure_parameters.py

# Run with coverage
uv run pytest tests/ --cov=langextract_azureopenai --cov-report=html

Code Quality

# Format code
uv run black .
uv run isort .

# Lint code  
uv run ruff check .

# Type checking
uv run mypy langextract_azureopenai

Building and Publishing

# Build package
uv build

# Check build
ls dist/

# Publish to PyPI (requires API token)
uv publish --token your-pypi-token

Version Management

# Bump version
python scripts/bump_version.py patch   # 0.1.0 -> 0.1.1
python scripts/bump_version.py minor   # 0.1.0 -> 0.2.0  
python scripts/bump_version.py major   # 0.1.0 -> 1.0.0

Developer Scripts

For a quick overview of the helper scripts (testing, releasing, and versioning), see:

  • scripts/README.md — summaries and usage for bump_version.py, run_tests.py, check_build.py, and release.py.

Examples

See the examples/ directory for complete usage examples:

  • examples/example_usage.py - Basic entity extraction
  • tests/test_azure_parameters.py - Parameter compatibility testing
  • tests/test_parameter_filtering.py - Security validation

Testing

The package includes a comprehensive test suite:

  • Unit Tests: Core functionality without API calls
  • Integration Tests: Real Azure OpenAI API testing
  • Security Tests: Parameter filtering validation
  • Performance Tests: Batch processing and concurrency

Run specific test categories:

# Unit tests only
uv run pytest tests/ -m "unit"

# Integration tests (requires credentials)
uv run pytest tests/ -m "integration"

Contributing

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature-name
  3. Make changes and add tests
  4. Run the test suite: uv run pytest
  5. Format code: uv run black . && uv run isort .
  6. Submit a pull request

Troubleshooting

Common Issues

ImportError after installation:

# Ensure package is properly installed
uv sync
uv run python -c "import langextract_azureopenai; print('✅ Import successful')"

Authentication errors:

# Verify credentials are set
echo $AZURE_OPENAI_API_KEY
echo $AZURE_OPENAI_ENDPOINT
echo $AZURE_OPENAI_API_VERSION

# Test credentials
uv run python tests/test_provider_basic.py

Parameter errors:

# Check parameter compatibility
uv run python tests/test_azure_parameters.py

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Links


Happy Extracting with Azure OpenAI! 🚀

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

langextract_azureopenai-0.1.4.tar.gz (13.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

langextract_azureopenai-0.1.4-py3-none-any.whl (14.3 kB view details)

Uploaded Python 3

File details

Details for the file langextract_azureopenai-0.1.4.tar.gz.

File metadata

  • Download URL: langextract_azureopenai-0.1.4.tar.gz
  • Upload date:
  • Size: 13.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for langextract_azureopenai-0.1.4.tar.gz
Algorithm Hash digest
SHA256 2f86b3c9ae550619776a6efd7ec97bd320d069c92deffc82a8e7799e229e390f
MD5 7efeb6f9278c5b51cf798f993856d9b2
BLAKE2b-256 b58252f4eb7c36eaf6c363fcae6989ccf492aa64ef3138ff152932b5747d04a9

See more details on using hashes here.

Provenance

The following attestation bundles were made for langextract_azureopenai-0.1.4.tar.gz:

Publisher: release.yml on Nobbettt/langextract-azureopenai

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file langextract_azureopenai-0.1.4-py3-none-any.whl.

File metadata

File hashes

Hashes for langextract_azureopenai-0.1.4-py3-none-any.whl
Algorithm Hash digest
SHA256 9b6b66a3070033bc2ce795f3eb430876be4dd47f2938f62132633272bcf615c4
MD5 2964ad1d04075dfbe3fda1b688962b78
BLAKE2b-256 a1e17c865876f75da4ce3d0e2c2f8e04dd63cafd9f860530905663a781c37fff

See more details on using hashes here.

Provenance

The following attestation bundles were made for langextract_azureopenai-0.1.4-py3-none-any.whl:

Publisher: release.yml on Nobbettt/langextract-azureopenai

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page