DSPy-powered prompt optimization plugin for LangCore — auto-optimize extraction prompts with MIPROv2 and GEPA
LangCore DSPy
Provider plugin for LangCore: automatically optimize extraction prompts and few-shot examples using DSPy.
Overview
langcore-dspy is a plugin for LangCore that uses DSPy's optimization framework to automatically refine extraction prompts and curate few-shot examples. Given training data, it searches for the best prompt description and example set to maximize extraction precision and recall — then produces a portable OptimizedConfig you can save, load, and pass directly to lx.extract().
Features
- MIPROv2 optimizer: fast, general-purpose prompt optimization that explores candidate prompts and selects the best performer
- GEPA optimizer: reflective, feedback-driven optimization that falls back to `BootstrapFewShot` when `dspy.GEPA` is unavailable
- Optimizer aliases: use `mipro`, `mipro_v2`, or `miprov2` interchangeably; `gepa` for the reflective optimizer
- Persist & load configs: save optimized configurations to disk (`config.json` + `examples.json`) and reload them later
- Built-in evaluation: measure precision, recall, and F1 on held-out test sets with per-document detail
- Native LangCore integration: pass `optimized_config` directly to `lx.extract()`, which overrides `prompt_description` and `examples`
- Any LLM backend: works with any model supported by DSPy's LM abstraction (OpenAI, Google, Anthropic, etc.)
Installation
```bash
pip install langcore-dspy
```
Quick Start
1. Optimize Your Extraction Prompt
```python
from langcore_dspy import DSPyOptimizer
import langcore as lx

# Prepare few-shot examples to guide optimization
examples = [
    lx.data.ExampleData(
        text="Invoice INV-001 for $500 due Jan 1, 2024",
        extractions=[
            lx.data.Extraction("invoice", "INV-001",
                               attributes={"amount": "500", "due": "2024-01-01"})
        ],
    )
]

# Training data the optimizer will use to evaluate candidates
train_texts = [
    "Invoice INV-002 totalling $1,200 payable by March 15, 2024",
    "Bill INV-003: $750, due date April 30, 2024",
]
expected_results = [
    [lx.data.Extraction("invoice", "INV-002",
                        attributes={"amount": "1200", "due": "2024-03-15"})],
    [lx.data.Extraction("invoice", "INV-003",
                        attributes={"amount": "750", "due": "2024-04-30"})],
]

# Run optimization
optimizer = DSPyOptimizer(model_id="openai/gpt-4o-mini")
config = optimizer.optimize(
    prompt_description="Extract invoice details: number, amount, due date.",
    examples=examples,
    train_texts=train_texts,
    expected_results=expected_results,
    optimizer="miprov2",
)

print(f"Optimized prompt: {config.prompt_description}")
print(f"Metadata: {config.metadata}")
```
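The `train_texts` and `expected_results` lists are parallel: each text appears to be scored against the expected extractions at the same index, so a length mismatch is worth catching before an optimization run starts. A minimal pre-flight check, assuming only plain Python lists; `check_training_data` is a hypothetical helper and not part of langcore-dspy:

```python
def check_training_data(train_texts, expected_results):
    """Raise ValueError if the parallel training lists are misaligned.

    Hypothetical helper for illustration; not part of langcore-dspy.
    Returns the number of training pairs when everything lines up.
    """
    if len(train_texts) != len(expected_results):
        raise ValueError(
            f"{len(train_texts)} texts but {len(expected_results)} expected results"
        )
    for i, (text, expected) in enumerate(zip(train_texts, expected_results)):
        if not isinstance(text, str) or not text.strip():
            raise ValueError(f"train_texts[{i}] is empty or not a string")
        if not isinstance(expected, list):
            raise ValueError(f"expected_results[{i}] must be a list of extractions")
    return len(train_texts)
```

Calling it just before `optimizer.optimize(...)` turns a silent misalignment into an immediate error.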
2. Save & Load Optimized Configs
```python
# Save to disk
config.save("./optimized_invoice_extractor")

# Load later
from langcore_dspy import OptimizedConfig

config = OptimizedConfig.load("./optimized_invoice_extractor")
```
The saved directory contains:
- `config.json`: optimized prompt description and metadata
- `examples.json`: curated few-shot examples
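Because the saved config is two JSON files, it can be inspected with the standard library alone. The sketch below assumes only that both files are valid JSON and makes no assumption about their internal schema; `summarize_saved_config` is a hypothetical helper, not part of langcore-dspy:

```python
import json
from pathlib import Path


def summarize_saved_config(directory):
    """Return {file name: top-level JSON type name} for the two saved files.

    Hypothetical helper; the internal layout of each file is not assumed.
    """
    summary = {}
    for name in ("config.json", "examples.json"):
        path = Path(directory) / name
        if not path.is_file():
            raise FileNotFoundError(f"missing {name} in {directory}")
        with path.open(encoding="utf-8") as fh:
            summary[name] = type(json.load(fh)).__name__
    return summary
```

A check like this can run in CI to confirm that a committed config directory is complete before it is loaded with `OptimizedConfig.load()`.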
3. Use in LangCore Extraction
Pass the optimized config directly to lx.extract() — it overrides prompt_description and examples with the optimized values:
```python
import langcore as lx
from langcore_dspy import OptimizedConfig

config = OptimizedConfig.load("./optimized_invoice_extractor")

result = lx.extract(
    text_or_documents="Invoice INV-100 for $2,300 due June 1, 2024",
    model_id="gemini-2.5-flash",
    optimized_config=config,
)
print(result)
```
4. Evaluate Performance
Measure extraction quality on a held-out test set:
```python
metrics = config.evaluate(
    test_texts=["Invoice INV-200 for $900 due July 1, 2024"],
    expected_results=[
        [lx.data.Extraction("invoice", "INV-200",
                            attributes={"amount": "900", "due": "2024-07-01"})]
    ],
    extract_fn=lambda text: lx.extract(
        text_or_documents=text,
        model_id="gemini-2.5-flash",
        optimized_config=config,
    ),
    model_id="gemini-2.5-flash",
)

print(f"Precision: {metrics['precision']:.2f}")
print(f"Recall: {metrics['recall']:.2f}")
print(f"F1: {metrics['f1']:.2f}")
```
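For intuition about the three numbers, here is a minimal sketch of how precision, recall, and F1 relate for one document. It assumes exact-match scoring on `(extraction_class, extraction_text)` pairs; the matching rule that `evaluate()` actually applies may differ, and `precision_recall_f1` is a hypothetical function, not part of langcore-dspy:

```python
def precision_recall_f1(predicted, expected):
    """Score two collections of (extraction_class, extraction_text) pairs.

    Assumption: exact-match scoring on hashable pairs. The matching rule
    used by OptimizedConfig.evaluate() may differ.
    """
    predicted, expected = set(predicted), set(expected)
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# One correct extraction plus one spurious one.
scores = precision_recall_f1(
    predicted=[("invoice", "INV-200"), ("invoice", "INV-999")],
    expected=[("invoice", "INV-200")],
)
print(scores)
```

In this example the spurious `INV-999` halves precision while recall stays at 1.0, which is the kind of trade-off the optimizer is searching over.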
Supported Optimizers
| Optimizer | Key | Aliases | Description |
|---|---|---|---|
| MIPROv2 | `miprov2` | `mipro`, `mipro_v2` | Fast, general-purpose prompt optimization. Recommended default. |
| GEPA | `gepa` | none | Reflective optimizer with feedback-driven refinement. Falls back to `BootstrapFewShot` if `dspy.GEPA` is unavailable. |
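If optimizer names come from user input or a config file, it can help to normalize them to a canonical key before calling `optimize()`. A minimal sketch using the keys and aliases listed in the table; `resolve_optimizer` is a hypothetical helper, and the case-insensitive matching is an assumption rather than documented plugin behavior:

```python
# Keys and aliases as listed in the Supported Optimizers table.
_OPTIMIZER_ALIASES = {
    "miprov2": "miprov2",
    "mipro": "miprov2",
    "mipro_v2": "miprov2",
    "gepa": "gepa",
}


def resolve_optimizer(name):
    """Map a user-supplied optimizer name to its canonical key.

    Hypothetical helper. Lower-casing and whitespace stripping are
    assumptions made here, not documented langcore-dspy behavior.
    """
    key = name.strip().lower()
    try:
        return _OPTIMIZER_ALIASES[key]
    except KeyError:
        known = ", ".join(sorted(_OPTIMIZER_ALIASES))
        raise ValueError(
            f"unknown optimizer {name!r}; expected one of: {known}"
        ) from None
```

Failing early with the list of accepted names avoids starting a long optimization run with a misspelled key.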
API Reference
DSPyOptimizer
```python
DSPyOptimizer(model_id: str, api_key: str | None = None, **lm_kwargs)
```
| Parameter | Description |
|---|---|
| `model_id` | DSPy-compatible model identifier (e.g., `"openai/gpt-4o-mini"`, `"gemini/gemini-2.5-flash"`) |
| `api_key` | Optional API key for the model provider |
| `**lm_kwargs` | Additional keyword arguments forwarded to `dspy.LM()` |
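Since extra keyword arguments are forwarded to `dspy.LM()`, sampling settings such as `temperature` and `max_tokens` can be supplied at construction time. The sketch below gathers the constructor arguments in one place; `build_optimizer_kwargs` is a hypothetical helper, the `OPENAI_API_KEY` variable name is an assumption about your environment, and the specific values are illustrative:

```python
import os


def build_optimizer_kwargs(env=os.environ):
    """Collect constructor arguments for DSPyOptimizer (hypothetical helper).

    OPENAI_API_KEY is an assumed variable name; use whichever one your
    provider expects. temperature and max_tokens are ordinary dspy.LM()
    keyword arguments and travel through **lm_kwargs.
    """
    return {
        "model_id": "openai/gpt-4o-mini",
        "api_key": env.get("OPENAI_API_KEY"),  # None matches the documented default
        "temperature": 0.0,
        "max_tokens": 2000,
    }


# Usage (requires langcore-dspy to be installed and a valid key):
# from langcore_dspy import DSPyOptimizer
# optimizer = DSPyOptimizer(**build_optimizer_kwargs())
```

Keeping the key in the environment rather than in source avoids committing credentials alongside saved configs.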
optimize()
```python
optimizer.optimize(
    prompt_description: str,
    examples: list[ExampleData],
    train_texts: list[str],
    expected_results: list[list[Extraction]],
    optimizer: str = "miprov2",
    num_candidates: int = 7,
    max_bootstrapped_demos: int = 3,
    max_labeled_demos: int = 4,
) -> OptimizedConfig
```
OptimizedConfig
```python
@dataclasses.dataclass
class OptimizedConfig:
    prompt_description: str
    examples: list[ExampleData]
    metadata: dict
```
| Method | Description |
|---|---|
| `save(path)` | Persist to a directory (`config.json` + `examples.json`) |
| `load(path)` | Class method that restores a config from a saved directory |
| `evaluate(test_texts, expected_results, extract_fn, model_id)` | Compute precision, recall, and F1 on a test set |
Composing with Other Plugins
langcore-dspy produces an OptimizedConfig that works with any LangCore provider stack:
```python
import langcore as lx
from langcore_dspy import OptimizedConfig
from langcore_audit import AuditLanguageModel, LoggingSink
from langcore_guardrails import GuardrailLanguageModel, SchemaValidator, OnFailAction

# Load optimized prompt + examples
config = OptimizedConfig.load("./optimized_invoice_extractor")

# Build provider stack
llm = lx.factory.create_model(
    lx.factory.ModelConfig(model_id="litellm/gpt-4o", provider="LiteLLMLanguageModel")
)
# NOTE: Invoice is your own schema class; it is not defined in this snippet.
guarded = GuardrailLanguageModel(
    model_id="guardrails/gpt-4o", inner=llm,
    validators=[SchemaValidator(Invoice, on_fail=OnFailAction.REASK)],
)
audited = AuditLanguageModel(
    model_id="audit/gpt-4o", inner=guarded,
    sinks=[LoggingSink()],
)

# Extract with optimized config + full provider stack
result = lx.extract(
    text_or_documents="Invoice INV-500 for $8,200 due Dec 31, 2025",
    model=audited,
    optimized_config=config,
)
```
Development
```bash
pip install -e ".[dev]"
pytest
```
Requirements
- Python ≥ 3.12
- `langcore`
- `dspy` ≥ 2.6.0
License
Apache License 2.0 — see LICENSE for details.