SIE integration for DSPy
sie-dspy
SIE integration for DSPy.
Installation
pip install sie-dspy
Features
- SIEEmbedder: Embedding function for use with dspy.Embedder or dspy.retrievers.Embeddings
- SIEReranker: Module to rerank passages by relevance to a query
- SIEExtractor: Module to extract entities from text
Quick Start
Embeddings with FAISS Retriever
import dspy
from sie_dspy import SIEEmbedder

# Create SIE embedder
embedder = SIEEmbedder(
    base_url="http://localhost:8080",
    model="BAAI/bge-m3",
)

# Use with DSPy's built-in FAISS retriever
corpus = [
    "Machine learning enables systems to learn from data.",
    "Deep learning uses neural networks with multiple layers.",
    "Natural language processing analyzes human language.",
]
retriever = dspy.retrievers.Embeddings(
    corpus=corpus,
    embedder=embedder,
    k=2,
)

# Retrieve relevant passages
results = retriever("What is deep learning?")
print(results.passages)
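To make the retrieval step above more concrete, here is a minimal, standard-library-only sketch of top-k selection over embedding vectors. It is not the DSPy or sie-dspy implementation: the helper names `cosine` and `top_k`, the toy three-dimensional vectors, and the choice of cosine similarity as the scoring function are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, corpus_vecs, k):
    """Return the indices of the k corpus vectors most similar to the query."""
    ranked = sorted(
        range(len(corpus_vecs)),
        key=lambda i: cosine(query_vec, corpus_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


# Toy vectors standing in for real embeddings (hypothetical values).
corpus_vecs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.9, 0.1],
]
query_vec = [0.0, 1.0, 0.0]

print(top_k(query_vec, corpus_vecs, k=2))  # [1, 2]
```

In the real pipeline the vectors would come from the embedder and the index lookup would be handled by the retriever; the sketch only shows what "return the k closest passages" means.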
Reranking Retrieved Passages
from sie_dspy import SIEReranker

# Create reranker module
reranker = SIEReranker(
    base_url="http://localhost:8080",
    model="jinaai/jina-reranker-v2-base-multilingual",
)

# Rerank passages
query = "How do neural networks learn?"
passages = [
    "The weather is sunny today.",
    "Neural networks learn through backpropagation.",
    "Deep learning models require large datasets.",
]
result = reranker(query=query, passages=passages, k=2)
print(result.passages)  # Top 2 most relevant passages
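Conceptually, reranking scores each passage against the query, sorts by score, and keeps the top k. The sketch below illustrates that shape with the standard library only. The `rerank` helper and the `word_overlap` scorer are hypothetical stand-ins; a real reranker model such as the one configured above produces learned relevance scores, not word counts.

```python
def word_overlap(query, passage):
    """Hypothetical stand-in scorer: number of lowercase words shared by both texts."""
    punctuation = ".,?!"
    query_words = {w.strip(punctuation) for w in query.lower().split()}
    passage_words = {w.strip(punctuation) for w in passage.lower().split()}
    return len(query_words & passage_words)


def rerank(query, passages, score_fn, k):
    """Sort passages by score_fn(query, passage), highest first, and keep the top k."""
    ranked = sorted(passages, key=lambda p: score_fn(query, p), reverse=True)
    return ranked[:k]


query = "How do neural networks learn?"
passages = [
    "The weather is sunny today.",
    "Neural networks learn through backpropagation.",
    "Deep learning models require large datasets.",
]

top = rerank(query, passages, word_overlap, k=1)
print(top)  # ['Neural networks learn through backpropagation.']
```

Passing the scorer in as a function keeps the sort-and-truncate logic separate from how relevance is computed, which is the part a reranker model replaces.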
Entity Extraction
from sie_dspy import SIEExtractor

# Create extractor module
extractor = SIEExtractor(
    base_url="http://localhost:8080",
    model="urchade/gliner_multi-v2.1",
    labels=["person", "organization", "location"],
)

# Extract entities
text = "John Smith is the CEO of TechCorp in San Francisco."
result = extractor(text=text)
print(result.entities)  # [{"text": "John Smith", "label": "person", ...}, ...]
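Downstream code often wants the extracted entities grouped by label. The following sketch assumes the entities are a list of dicts with at least `text` and `label` keys, as the comment above suggests; any other fields are ignored. The `group_by_label` helper is hypothetical, and the sample list is hand-written rather than real extractor output.

```python
from collections import defaultdict


def group_by_label(entities):
    """Group entity texts by their label, preserving the original order."""
    grouped = defaultdict(list)
    for entity in entities:
        grouped[entity["label"]].append(entity["text"])
    return dict(grouped)


# Hand-written sample in the assumed shape; not actual extractor output.
entities = [
    {"text": "John Smith", "label": "person"},
    {"text": "TechCorp", "label": "organization"},
    {"text": "San Francisco", "label": "location"},
]

print(group_by_label(entities))
```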
RAG Pipeline with Reranking
import dspy
from sie_dspy import SIEEmbedder, SIEReranker


class RAGWithReranking(dspy.Module):
    def __init__(self, corpus, embedder, reranker, k=5, rerank_k=3):
        super().__init__()
        self.retriever = dspy.retrievers.Embeddings(
            corpus=corpus,
            embedder=embedder,
            k=k,
        )
        self.reranker = reranker
        self.rerank_k = rerank_k
        self.generate = dspy.ChainOfThought("context, question -> answer")

    def forward(self, question):
        # Retrieve initial candidates
        retrieved = self.retriever(question)

        # Rerank to get most relevant
        reranked = self.reranker(
            query=question,
            passages=retrieved.passages,
            k=self.rerank_k,
        )

        # Generate answer
        context = "\n".join(reranked.passages)
        return self.generate(context=context, question=question)

# Usage
# Assumes a language model has already been configured, e.g. via dspy.configure(lm=...)
embedder = SIEEmbedder(base_url="http://localhost:8080", model="BAAI/bge-m3")
reranker = SIEReranker(base_url="http://localhost:8080")
rag = RAGWithReranking(
    corpus=["...your documents..."],
    embedder=embedder,
    reranker=reranker,
)
result = rag("What is machine learning?")
print(result.answer)
SIE Server
Start the SIE server before using this integration:
mise run serve -d cpu -p 8080
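Before running the examples, it can help to confirm that something is listening on the server port. The sketch below checks plain TCP reachability using only the standard library. It makes no assumption about SIE's HTTP endpoints, so a `True` result means only that the port accepts connections, not that the server is healthy. The `is_port_open` helper is hypothetical; host `localhost` and port `8080` are taken from the command above.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


print(is_port_open("localhost", 8080))
```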
Testing
# Unit tests (no server required)
pytest
# Integration tests (requires running server)
pytest -m integration