Skip to main content

LangCore provider plugin for LiteLLM

Project description

LangCore LiteLLM

Provider plugin for LangCore — access 100+ language models through a single, unified interface via LiteLLM.

PyPI version Python License


Overview

langcore-litellm is a provider plugin for LangCore that adds support for 100+ language models through LiteLLM's unified API. Install it, prefix your model ID with litellm/, and every LangCore extraction call routes through LiteLLM transparently — auto-discovered via Python entry points.


Features

  • 100+ model support — OpenAI, Anthropic, Google, Azure, Mistral, Groq, Cohere, HuggingFace, Ollama, vLLM, and more through a single provider
  • Native async — uses litellm.acompletion() with asyncio.Semaphore for true non-blocking concurrent I/O (no thread pool overhead)
  • Multi-pass cache bypass — automatic per-pass cache control ensures fresh LLM responses on repeat extraction passes while keeping the first pass cacheable
  • Token usage tracking — captures prompt, completion, and total token counts (UsageStats) from every inference call
  • Concurrency control — configurable max_workers semaphore limits parallel async requests to prevent rate-limit errors
  • Zero-config plugin — auto-registered via Python entry points; no manual wiring required
  • Full parameter passthrough — forward any LiteLLM-supported parameter (temperature, top_p, timeout, etc.) through provider_kwargs

Installation

pip install langcore-litellm

Or install from source:

git clone https://github.com/JustStas/langcore-litellm
cd langcore-litellm
pip install -e .

Quick Start

Integration with LangCore

langcore-litellm integrates with LangCore through the provider plugin system. Create a model configuration, and LangCore handles the rest:

import langcore as lx

# Configure the LiteLLM provider
config = lx.factory.ModelConfig(
    model_id="litellm/gpt-4o",
    provider="LiteLLMLanguageModel",
)
model = lx.factory.create_model(config)

# Use with any LangCore extraction
result = lx.extract(
    text_or_documents="Acme Corp agrees to pay $50,000 to Beta LLC by March 2025.",
    model=model,
    prompt_description="Extract parties, monetary amounts, and dates.",
    examples=[
        lx.data.ExampleData(
            text="Alpha Inc will pay $10,000 to Omega Ltd by January 2024.",
            extractions=[
                lx.data.Extraction("party", "Alpha Inc", attributes={"role": "payer"}),
                lx.data.Extraction("party", "Omega Ltd", attributes={"role": "payee"}),
                lx.data.Extraction("monetary_amount", "$10,000"),
                lx.data.Extraction("date", "January 2024", attributes={"type": "deadline"}),
            ],
        )
    ],
)

print(result)

Usage

Supported Models

Model IDs must be prefixed with litellm/ (or litellm-) to route through this provider:

Provider Example Model IDs
OpenAI litellm/gpt-4o, litellm/gpt-4o-mini, litellm/gpt-4-turbo
Anthropic litellm/claude-3-opus, litellm/claude-3.5-sonnet, litellm/claude-3-haiku
Google litellm/gemini-2.5-pro, litellm/gemini-2.0-flash
Azure OpenAI litellm/azure/your-deployment-name
Mistral litellm/mistral-large-latest
Groq litellm/groq/llama-3.1-70b
Ollama litellm/ollama/llama3.1
And 100+ more See LiteLLM providers

Environment Variables

Set the appropriate API key for your provider:

# OpenAI
export OPENAI_API_KEY="sk-..."

# Anthropic
export ANTHROPIC_API_KEY="sk-ant-..."

# Google (Gemini)
export GEMINI_API_KEY="..."

# Azure OpenAI
export AZURE_API_KEY="..."
export AZURE_API_BASE="https://your-resource.openai.azure.com/"
export AZURE_API_VERSION="2024-02-01"

# Ollama (local)
export OLLAMA_API_BASE="http://localhost:11434"

See the LiteLLM documentation for the full list of provider-specific variables.

Async Extraction

The provider uses litellm.acompletion() for native async, avoiding thread overhead:

import asyncio
import langcore as lx

config = lx.factory.ModelConfig(
    model_id="litellm/gpt-4o",
    provider="LiteLLMLanguageModel",
)
model = lx.factory.create_model(config)

async def main():
    result = await lx.async_extract(
        text_or_documents="Agreement between Acme Corp and Beta LLC...",
        model=model,
        prompt_description="Extract parties and obligations.",
        examples=[...],
    )
    print(result)

asyncio.run(main())

Multi-Pass Extraction

When running multiple extraction passes (extraction_passes > 1), the provider automatically manages cache behaviour:

Pass Cache Behaviour
1st pass Normal — may be served from cache
2nd pass Bypass — forces a fresh LLM response
3rd+ passes Bypass — forces a fresh LLM response

This is fully automatic. LangCore threads a pass_num argument to the provider, which injects cache={"no-cache": True} for passes ≥ 1:

result = lx.extract(
    text_or_documents="Contract text...",
    model=model,
    prompt_description="Extract all entities.",
    examples=[...],
    extraction_passes=3,  # 3 passes, only the first is cacheable
)

Advanced Configuration

Forward any LiteLLM parameter through provider_kwargs:

config = lx.factory.ModelConfig(
    model_id="litellm/gpt-4o",
    provider="LiteLLMLanguageModel",
    provider_kwargs={
        "temperature": 0.2,
        "max_tokens": 2000,
        "top_p": 0.9,
        "timeout": 60,
        "api_key": "sk-...",  # override env var
    },
)
model = lx.factory.create_model(config)

Reserved Parameters

These parameters are consumed internally and are not forwarded to LiteLLM:

Parameter Type Default Description
max_workers int 10 Maximum concurrent async requests (semaphore size)
pass_num int 0 Current extraction pass index — set automatically by LangCore during multi-pass extraction

Composing with Other Plugins

langcore-litellm serves as the base provider that other LangCore plugins wrap. Stack decorators to add audit logging, output validation, or hybrid rule-based extraction:

import langcore as lx
from langcore_audit import AuditLanguageModel, LoggingSink
from langcore_guardrails import GuardrailLanguageModel, SchemaValidator, OnFailAction

# Base LLM provider
config = lx.factory.ModelConfig(
    model_id="litellm/gpt-4o",
    provider="LiteLLMLanguageModel",
)
llm = lx.factory.create_model(config)

# Add output validation
guarded = GuardrailLanguageModel(
    model_id="guardrails/gpt-4o",
    inner=llm,
    validators=[SchemaValidator(MySchema, on_fail=OnFailAction.REASK)],
    max_retries=3,
)

# Add audit logging
audited = AuditLanguageModel(
    model_id="audit/gpt-4o",
    inner=guarded,
    sinks=[LoggingSink()],
)

# Use the full stack with LangCore
result = lx.extract(
    text_or_documents="Contract text...",
    model=audited,
    prompt_description="Extract entities.",
    examples=[...],
)

Output Format

Extractions return LangCore's standard AnnotatedDocument with precise character intervals:

AnnotatedDocument(
    extractions=[
        Extraction(
            extraction_class='party',
            extraction_text='Acme Corp',
            char_interval=CharInterval(start_pos=0, end_pos=9),
            alignment_status=<AlignmentStatus.MATCH_EXACT: 'match_exact'>,
            attributes={'role': 'payer'}
        ),
        Extraction(
            extraction_class='monetary_amount',
            extraction_text='$50,000',
            char_interval=CharInterval(start_pos=24, end_pos=31),
            alignment_status=<AlignmentStatus.MATCH_EXACT: 'match_exact'>,
            attributes={}
        ),
    ],
    text='Acme Corp agrees to pay $50,000 to Beta LLC by March 2025.'
)

Error Handling

API failures are captured gracefully — no unhandled exceptions:

ScoredOutput(score=0.0, output="LiteLLM API error: [error details]")

Development

pip install -e .            # Install in development mode
python test_plugin.py       # Run tests
python -m build             # Build package
twine upload dist/*         # Publish to PyPI

Requirements

  • Python ≥ 3.12
  • langcore
  • litellm ≥ 1.81.13

License

Apache License 2.0 — see LICENSE for details.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

langcore_litellm-1.0.5.tar.gz (21.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

langcore_litellm-1.0.5-py3-none-any.whl (14.3 kB view details)

Uploaded Python 3

File details

Details for the file langcore_litellm-1.0.5.tar.gz.

File metadata

  • Download URL: langcore_litellm-1.0.5.tar.gz
  • Upload date:
  • Size: 21.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for langcore_litellm-1.0.5.tar.gz
Algorithm Hash digest
SHA256 4397ccd6b2541dc4ec81f04ea3415cb711e7b697ae2aa56f44f319b93b253fbd
MD5 f24504acca3c0bb2bee7d71cfee65f99
BLAKE2b-256 9df7c45cd1dc38b8ad92855a067cfd72517e6fc8032dbd629d0677db9b1619c4

See more details on using hashes here.

Provenance

The following attestation bundles were made for langcore_litellm-1.0.5.tar.gz:

Publisher: release.yml on IgnatG/langcore-litellm

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file langcore_litellm-1.0.5-py3-none-any.whl.

File metadata

File hashes

Hashes for langcore_litellm-1.0.5-py3-none-any.whl
Algorithm Hash digest
SHA256 ad83813ac1c4d95bf057296e18c7b532f47a7ba4a242683bd82fea4a7b6bde0f
MD5 342671117871ffbc951dc6779fb55ac3
BLAKE2b-256 f26e2ab4b894398b773923b589e0770d05f6ce53cbc9ec4e2caf2d6378bb16f6

See more details on using hashes here.

Provenance

The following attestation bundles were made for langcore_litellm-1.0.5-py3-none-any.whl:

Publisher: release.yml on IgnatG/langcore-litellm

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page