SIE integration for DSPy
sie-dspy
SIE integration for DSPy.
Installation
pip install sie-dspy
Features
- SIEEmbedder: Embedding function for use with dspy.Embedder or dspy.retrievers.Embeddings
- SIEReranker: Module to rerank passages by relevance to a query
- SIEExtractor: Module to extract entities from text
Quick Start
Embeddings with FAISS Retriever
import dspy
from sie_dspy import SIEEmbedder

# Create SIE embedder
embedder = SIEEmbedder(
    base_url="http://localhost:8080",
    model="BAAI/bge-m3",
)

# Use with DSPy's built-in FAISS retriever
corpus = [
    "Machine learning enables systems to learn from data.",
    "Deep learning uses neural networks with multiple layers.",
    "Natural language processing analyzes human language.",
]
retriever = dspy.retrievers.Embeddings(
    corpus=corpus,
    embedder=embedder,
    k=2,
)

# Retrieve relevant passages
results = retriever("What is deep learning?")
print(results.passages)
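For intuition about what an embedding retriever does with the vectors an embedder returns, here is a minimal pure-Python sketch. It is not the dspy.retrievers.Embeddings implementation. The toy_embed function and its VOCAB list are hypothetical stand-ins for SIEEmbedder, chosen so the example runs without a server.

```python
import math

# Hypothetical stand-in for a real embedder: one dimension per vocabulary word.
VOCAB = ["learning", "deep", "neural", "language", "data"]


def toy_embed(text):
    """Return a bag-of-words count vector over VOCAB (illustration only)."""
    words = text.lower().replace(".", "").replace("?", "").split()
    return [float(words.count(term)) for term in VOCAB]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, corpus, k):
    """Rank corpus passages by similarity to the query and keep the best k."""
    query_vec = toy_embed(query)
    scored = [(cosine(query_vec, toy_embed(doc)), doc) for doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


corpus = [
    "Machine learning enables systems to learn from data.",
    "Deep learning uses neural networks with multiple layers.",
    "Natural language processing analyzes human language.",
]
top_passages = top_k("What is deep learning?", corpus, k=2)
print(top_passages)
```

A real embedding model replaces the word counts with dense vectors, but the ranking step (score every passage against the query, keep the best k) has the same shape.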
Reranking Retrieved Passages
from sie_dspy import SIEReranker

# Create reranker module
reranker = SIEReranker(
    base_url="http://localhost:8080",
    model="jinaai/jina-reranker-v2-base-multilingual",
)

# Rerank passages
query = "How do neural networks learn?"
passages = [
    "The weather is sunny today.",
    "Neural networks learn through backpropagation.",
    "Deep learning models require large datasets.",
]
result = reranker(query=query, passages=passages, k=2)
print(result.passages)  # Top 2 most relevant passages
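The reranking contract (score each passage against the query, sort, keep the top k) can be sketched without a server. The toy_score function below is a hypothetical word-overlap scorer used only for illustration. A reranker model would supply a learned relevance score instead, and this is not how SIEReranker is implemented.

```python
def toy_score(query, passage):
    """Hypothetical relevance score: number of shared lowercase words."""
    strip = str.maketrans("", "", ".,?!")
    query_words = set(query.lower().translate(strip).split())
    passage_words = set(passage.lower().translate(strip).split())
    return len(query_words & passage_words)


def rerank(query, passages, k):
    """Sort passages by descending score and return the top k with scores."""
    scored = sorted(
        ((toy_score(query, p), p) for p in passages),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return scored[:k]


query = "How do neural networks learn?"
passages = [
    "The weather is sunny today.",
    "Neural networks learn through backpropagation.",
    "Deep learning models require large datasets.",
]
ranked = rerank(query, passages, k=2)
for score, passage in ranked:
    print(score, passage)
```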
Entity Extraction
from sie_dspy import SIEExtractor

# Create extractor module
extractor = SIEExtractor(
    base_url="http://localhost:8080",
    model="urchade/gliner_multi-v2.1",
    labels=["person", "organization", "location"],
)

# Extract entities
text = "John Smith is the CEO of TechCorp in San Francisco."
result = extractor(text=text)
print(result.entities)  # [{"text": "John Smith", "label": "person", ...}, ...]
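A common next step is to group the extracted entities by label. The sketch below assumes only the two keys visible in the comment above ("text" and "label"). Any other keys the extractor returns are not relied on, and the sample list is hand-written rather than real model output.

```python
from collections import defaultdict


def group_by_label(entities):
    """Map each label to the list of entity texts carrying that label."""
    grouped = defaultdict(list)
    for entity in entities:
        grouped[entity["label"]].append(entity["text"])
    return dict(grouped)


# Hand-written sample in the shape shown above, not real model output.
sample_entities = [
    {"text": "John Smith", "label": "person"},
    {"text": "TechCorp", "label": "organization"},
    {"text": "San Francisco", "label": "location"},
]
grouped = group_by_label(sample_entities)
print(grouped)
```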
RAG Pipeline with Reranking
import dspy
from sie_dspy import SIEEmbedder, SIEReranker


class RAGWithReranking(dspy.Module):
    def __init__(self, corpus, embedder, reranker, k=5, rerank_k=3):
        super().__init__()
        self.retriever = dspy.retrievers.Embeddings(
            corpus=corpus,
            embedder=embedder,
            k=k,
        )
        self.reranker = reranker
        self.rerank_k = rerank_k
        self.generate = dspy.ChainOfThought("context, question -> answer")

    def forward(self, question):
        # Retrieve initial candidates
        retrieved = self.retriever(question)

        # Rerank to get most relevant
        reranked = self.reranker(
            query=question,
            passages=retrieved.passages,
            k=self.rerank_k,
        )

        # Generate answer
        context = "\n".join(reranked.passages)
        return self.generate(context=context, question=question)


# Usage
# Note: dspy.ChainOfThought needs a language model configured beforehand,
# for example with dspy.configure(lm=...).
embedder = SIEEmbedder(base_url="http://localhost:8080", model="BAAI/bge-m3")
reranker = SIEReranker(base_url="http://localhost:8080")
rag = RAGWithReranking(
    corpus=["...your documents..."],
    embedder=embedder,
    reranker=reranker,
)
result = rag("What is machine learning?")
SIE Server
Start the SIE server before using this integration:
mise run serve -d cpu -p 8080
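Before running the examples, it can help to confirm that something is listening on the port. The sketch below uses only the Python standard library and checks plain TCP reachability. It does not verify that the listener is actually the SIE server, and it assumes no particular SIE HTTP endpoint. The is_port_open helper is a hypothetical name, not part of sie-dspy.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if is_port_open("localhost", 8080):
    print("Something is listening on localhost:8080")
else:
    print("Nothing is listening on localhost:8080; start the SIE server first")
```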
Testing
# Unit tests (no server required)
pytest
# Integration tests (requires running server)
pytest -m integration
File details
Details for the file sie_dspy-0.1.7.tar.gz.
File metadata
- Download URL: sie_dspy-0.1.7.tar.gz
- Upload date:
- Size: 13.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 64ae2ed52a680eaae7818c24ee8c9f652c9542382013ccc479c06403e0fec0d3 |
| MD5 | c67e70d9a849ecd9784c1e0c0f13bdd5 |
| BLAKE2b-256 | e2ceee9f5f96a8a39f9d99e71149c075924c81b3b7051e710a77624418c38c77 |
File details
Details for the file sie_dspy-0.1.7-py3-none-any.whl.
File metadata
- Download URL: sie_dspy-0.1.7-py3-none-any.whl
- Upload date:
- Size: 7.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d6568c182431a25c11532da8197a1e1f8a2c6ebd7d1741e33e8b3e824652cd49 |
| MD5 | 4740610a8ec11607cf1507e38348d4ed |
| BLAKE2b-256 | 9864e3553286099d458c430268db5a3b519d5d0b40ed62c7557e7806bbcc8386 |