
DSPy-powered prompt optimization plugin for LangCore — auto-optimize extraction prompts with MIPROv2 and GEPA

LangCore DSPy

Provider plugin for LangCore that automatically optimizes extraction prompts and few-shot examples using DSPy.


Overview

langcore-dspy is a plugin for LangCore that uses DSPy's optimization framework to automatically refine extraction prompts and curate few-shot examples. Given training data, it searches for the best prompt description and example set to maximize extraction precision and recall — then produces a portable OptimizedConfig you can save, load, and pass directly to lx.extract().


Features

  • MIPROv2 optimizer — fast, general-purpose prompt optimization that explores candidate prompts and selects the best performer
  • GEPA optimizer — reflective, feedback-driven optimization that falls back to BootstrapFewShot when dspy.GEPA is unavailable
  • Optimizer aliases — use mipro, mipro_v2, or miprov2 interchangeably; gepa for the reflective optimizer
  • Persist & load configs — save optimized configurations to disk (config.json + examples.json) and reload them later
  • Built-in evaluation — measure precision, recall, and F1 on held-out test sets with per-document detail
  • Native LangCore integration — pass optimized_config directly to lx.extract(), which overrides prompt_description and examples
  • Any LLM backend — works with any model supported by DSPy's LM abstraction (OpenAI, Google, Anthropic, etc.)
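
The alias behaviour listed above can be pictured as a small lookup table. This is only a minimal sketch: the name `resolve_optimizer`, the table itself, and the case-insensitive matching are assumptions made for illustration, not the plugin's actual internals.

```python
# Hypothetical alias table mirroring the aliases listed in the features above.
_OPTIMIZER_ALIASES = {
    "mipro": "miprov2",
    "mipro_v2": "miprov2",
    "miprov2": "miprov2",
    "gepa": "gepa",
}


def resolve_optimizer(name: str) -> str:
    """Map a user-supplied optimizer name to its canonical key."""
    # Assumption: names are matched case-insensitively after trimming spaces.
    key = name.strip().lower()
    try:
        return _OPTIMIZER_ALIASES[key]
    except KeyError:
        valid = ", ".join(sorted(_OPTIMIZER_ALIASES))
        raise ValueError(
            f"Unknown optimizer {name!r}; expected one of: {valid}"
        ) from None
```

Normalizing to a single key early means the rest of a program only has to branch on two values, "miprov2" and "gepa".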

Installation

pip install langcore-dspy

Quick Start

1. Optimize Your Extraction Prompt

from langcore_dspy import DSPyOptimizer
import langcore as lx

# Prepare few-shot examples to guide optimization
examples = [
    lx.data.ExampleData(
        text="Invoice INV-001 for $500 due Jan 1, 2024",
        extractions=[
            lx.data.Extraction("invoice", "INV-001",
                               attributes={"amount": "500", "due": "2024-01-01"})
        ],
    )
]

# Training data the optimizer will use to evaluate candidates
train_texts = [
    "Invoice INV-002 totalling $1,200 payable by March 15, 2024",
    "Bill INV-003: $750, due date April 30, 2024",
]
expected_results = [
    [lx.data.Extraction("invoice", "INV-002",
                        attributes={"amount": "1200", "due": "2024-03-15"})],
    [lx.data.Extraction("invoice", "INV-003",
                        attributes={"amount": "750", "due": "2024-04-30"})],
]

# Run optimization
optimizer = DSPyOptimizer(model_id="openai/gpt-4o-mini")
config = optimizer.optimize(
    prompt_description="Extract invoice details: number, amount, due date.",
    examples=examples,
    train_texts=train_texts,
    expected_results=expected_results,
    optimizer="miprov2",
)

print(f"Optimized prompt: {config.prompt_description}")
print(f"Metadata: {config.metadata}")

2. Save & Load Optimized Configs

# Save to disk
config.save("./optimized_invoice_extractor")

# Load later
from langcore_dspy import OptimizedConfig
config = OptimizedConfig.load("./optimized_invoice_extractor")

The saved directory contains:

  • config.json — optimized prompt description and metadata
  • examples.json — curated few-shot examples
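
To check what a saved directory holds without going through the plugin, the two files can be read with the standard library. The sketch below assumes only that both files are valid JSON. It does not assume any particular key names, and `summarize_saved_config` is a hypothetical helper, not part of langcore-dspy.

```python
import json
from pathlib import Path


def summarize_saved_config(directory: str) -> dict:
    """Describe the top-level shape of config.json and examples.json."""
    summary = {}
    for name in ("config.json", "examples.json"):
        path = Path(directory) / name
        if not path.is_file():
            summary[name] = "missing"
            continue
        data = json.loads(path.read_text(encoding="utf-8"))
        if isinstance(data, dict):
            # Report key names only; the schema is not documented here.
            summary[name] = f"object with keys {sorted(data)}"
        elif isinstance(data, list):
            summary[name] = f"array with {len(data)} item(s)"
        else:
            summary[name] = type(data).__name__
    return summary
```

Calling `summarize_saved_config("./optimized_invoice_extractor")` after `config.save(...)` gives a quick sanity check that both files were written.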

3. Use in LangCore Extraction

Pass the optimized config directly to lx.extract() — it overrides prompt_description and examples with the optimized values:

import langcore as lx
from langcore_dspy import OptimizedConfig

config = OptimizedConfig.load("./optimized_invoice_extractor")

result = lx.extract(
    text_or_documents="Invoice INV-100 for $2,300 due June 1, 2024",
    model_id="gemini-2.5-flash",
    optimized_config=config,
)

print(result)
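
The override rule described in this step can be sketched in a few lines. `ConfigSketch` and `resolve_prompt_inputs` are hypothetical stand-ins written for illustration; how lx.extract() applies the override internally is not shown in this README.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigSketch:
    """Stand-in for OptimizedConfig with only the two overriding fields."""
    prompt_description: str
    examples: list = field(default_factory=list)


def resolve_prompt_inputs(prompt_description=None, examples=None,
                          optimized_config=None):
    """Return the (prompt, examples) pair an extraction call would use."""
    if optimized_config is not None:
        # The optimized values win over anything passed explicitly.
        return optimized_config.prompt_description, optimized_config.examples
    return prompt_description, examples
```

Under this reading, passing both a hand-written prompt and an optimized config is harmless, but the hand-written prompt is ignored.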

4. Evaluate Performance

Measure extraction quality on a held-out test set:

metrics = config.evaluate(
    test_texts=["Invoice INV-200 for $900 due July 1, 2024"],
    expected_results=[
        [lx.data.Extraction("invoice", "INV-200",
                            attributes={"amount": "900", "due": "2024-07-01"})]
    ],
    extract_fn=lambda text: lx.extract(
        text_or_documents=text,
        model_id="gemini-2.5-flash",
        optimized_config=config,
    ),
    model_id="gemini-2.5-flash",
)

print(f"Precision: {metrics['precision']:.2f}")
print(f"Recall:    {metrics['recall']:.2f}")
print(f"F1:        {metrics['f1']:.2f}")
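
For readers who want to see what the three numbers mean, here is a minimal sketch of micro-averaged precision, recall, and F1. It assumes that an extraction counts as correct only when its (class, text) pair matches an expected pair exactly. The plugin's real matching rule may differ, and `prf1` is a hypothetical name.

```python
def prf1(predicted, expected):
    """Micro-averaged precision/recall/F1 over per-document extractions.

    Both arguments are lists with one entry per document; each entry is an
    iterable of hashable items such as (extraction_class, extraction_text).
    """
    tp = fp = fn = 0
    for pred, gold in zip(predicted, expected):
        pred, gold = set(pred), set(gold)
        tp += len(pred & gold)   # predicted and expected
        fp += len(pred - gold)   # predicted but not expected
        fn += len(gold - pred)   # expected but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall, "f1": f1}
```

With one correct and one spurious extraction against a single expected item, precision is 0.5 and recall is 1.0, which is why both numbers are worth reporting alongside F1.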

Supported Optimizers

Optimizer | Key     | Aliases         | Description
----------|---------|-----------------|------------
MIPROv2   | miprov2 | mipro, mipro_v2 | Fast, general-purpose prompt optimization. Recommended default.
GEPA      | gepa    | (none)          | Reflective optimizer with feedback-driven refinement. Falls back to BootstrapFewShot if dspy.GEPA is unavailable.
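
The GEPA fallback can be expressed as an attribute check. This sketch takes the module as a parameter so it runs without DSPy installed. `pick_reflective_optimizer` is a hypothetical helper, and the plugin may implement its fallback differently.

```python
def pick_reflective_optimizer(dspy_module):
    """Return the GEPA class if the module exposes one, else BootstrapFewShot."""
    gepa = getattr(dspy_module, "GEPA", None)
    if gepa is not None:
        return gepa
    # Older DSPy releases without GEPA: fall back to the bootstrap optimizer.
    return dspy_module.BootstrapFewShot


# Typical use, assuming DSPy is installed:
#   import dspy
#   optimizer_cls = pick_reflective_optimizer(dspy)
```

Checking for the attribute at call time, rather than pinning a DSPy version, lets the same code run on releases with and without GEPA.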

API Reference

DSPyOptimizer

DSPyOptimizer(model_id: str, api_key: str | None = None, **lm_kwargs)
Parameter   | Description
------------|------------
model_id    | DSPy-compatible model identifier (e.g., "openai/gpt-4o-mini", "gemini/gemini-2.5-flash")
api_key     | Optional API key for the model provider
**lm_kwargs | Additional keyword arguments forwarded to dspy.LM()

optimize()

optimizer.optimize(
    prompt_description: str,
    examples: list[ExampleData],
    train_texts: list[str],
    expected_results: list[list[Extraction]],
    optimizer: str = "miprov2",
    num_candidates: int = 7,
    max_bootstrapped_demos: int = 3,
    max_labeled_demos: int = 4,
) -> OptimizedConfig
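
In the Quick Start, train_texts and expected_results are parallel lists: entry i of one describes entry i of the other. Before calling optimize(), it can help to verify that pairing and to set aside a held-out portion for evaluation. The helper below is a hypothetical sketch using only the standard library; the split fraction and seed are arbitrary illustrative defaults.

```python
import random


def split_dataset(texts, expected, test_fraction=0.2, seed=0):
    """Shuffle paired data and split it into (train, test) portions."""
    if len(texts) != len(expected):
        raise ValueError(
            f"{len(texts)} texts but {len(expected)} expected results"
        )
    indices = list(range(len(texts)))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    # Keep at least one test item whenever there are two or more documents.
    n_test = max(1, round(len(indices) * test_fraction)) if len(indices) > 1 else 0
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = ([texts[i] for i in train_idx], [expected[i] for i in train_idx])
    test = ([texts[i] for i in test_idx], [expected[i] for i in test_idx])
    return train, test
```

The train portion would go to optimize() and the test portion to evaluate(), so the reported metrics are not measured on data the optimizer has already seen.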

OptimizedConfig

@dataclasses.dataclass
class OptimizedConfig:
    prompt_description: str
    examples: list[ExampleData]
    metadata: dict

Method                                                       | Description
-------------------------------------------------------------|------------
save(path)                                                   | Persist to a directory (config.json + examples.json)
load(path)                                                   | Class method that restores a config from a saved directory
evaluate(test_texts, expected_results, extract_fn, model_id) | Compute precision, recall, and F1 on a test set

Composing with Other Plugins

langcore-dspy produces an OptimizedConfig that works with any LangCore provider stack:

import langcore as lx
from langcore_dspy import OptimizedConfig
from langcore_audit import AuditLanguageModel, LoggingSink
from langcore_guardrails import GuardrailLanguageModel, SchemaValidator, OnFailAction

# Load optimized prompt + examples
config = OptimizedConfig.load("./optimized_invoice_extractor")

# Build provider stack
llm = lx.factory.create_model(
    lx.factory.ModelConfig(model_id="litellm/gpt-4o", provider="LiteLLMLanguageModel")
)
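# NOTE: `Invoice` is your own schema class; it is not defined in this snippet,
# so define or import it before running.
guarded = GuardrailLanguageModel(
    model_id="guardrails/gpt-4o", inner=llm,
    validators=[SchemaValidator(Invoice, on_fail=OnFailAction.REASK)],
)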
audited = AuditLanguageModel(
    model_id="audit/gpt-4o", inner=guarded,
    sinks=[LoggingSink()],
)

# Extract with optimized config + full provider stack
result = lx.extract(
    text_or_documents="Invoice INV-500 for $8,200 due Dec 31, 2025",
    model=audited,
    optimized_config=config,
)
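
The stack above nests one model inside another through the inner argument. The toy classes below are hypothetical and unrelated to the real AuditLanguageModel or GuardrailLanguageModel APIs. They only illustrate the order in which nested wrappers see a request and its response.

```python
class EchoModel:
    """Innermost stand-in model: returns the prompt with a prefix."""

    def infer(self, prompt):
        return f"echo:{prompt}"


class Wrapper:
    """Generic wrapper that records when it runs, then delegates inward."""

    def __init__(self, name, inner, log):
        self.name = name
        self.inner = inner
        self.log = log

    def infer(self, prompt):
        self.log.append(f"{self.name}:before")
        result = self.inner.infer(prompt)
        self.log.append(f"{self.name}:after")
        return result


log = []
stack = Wrapper("audit", Wrapper("guardrails", EchoModel(), log), log)
stack.infer("hi")
```

In this sketch the outermost wrapper sees the request first and the response last. If the real plugins behave the same way, placing the audit layer outside the guardrail layer means it records the final validated result.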

Development

pip install -e ".[dev]"
pytest

Requirements

  • Python ≥ 3.12
  • langcore
  • dspy ≥ 2.6.0
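
A quick way to check the Python and DSPy minimums before installing is sketched below with the standard library. `missing_requirements` and `version_tuple` are hypothetical helpers, and the distribution name "dspy" is an assumption about how DSPy is installed in your environment.

```python
import re
import sys
from importlib import metadata


def version_tuple(version: str) -> tuple:
    """Leading numeric components of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    return tuple(int(p) for p in match.group().split(".")) if match else ()


def missing_requirements(python_version=sys.version_info[:2]):
    """Return a list of human-readable problems; empty means all good."""
    problems = []
    if tuple(python_version) < (3, 12):
        problems.append("Python 3.12 or newer is required")
    try:
        installed = metadata.version("dspy")  # assumed distribution name
    except metadata.PackageNotFoundError:
        problems.append("dspy is not installed")
    else:
        if version_tuple(installed) < (2, 6, 0):
            problems.append(f"dspy {installed} is older than 2.6.0")
    return problems
```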

License

Apache License 2.0 — see LICENSE for details.
