Skip to main content

SIE integration for DSPy

Project description

sie-dspy

SIE integration for DSPy.

Installation

pip install sie-dspy

Features

  • SIEEmbedder: Embedding function for use with dspy.Embedder or dspy.retrievers.Embeddings
  • SIEReranker: Module to rerank passages by relevance to a query
  • SIEExtractor: Module to extract entities from text

Quick Start

Embeddings with FAISS Retriever

import dspy
from sie_dspy import SIEEmbedder

# Create SIE embedder
embedder = SIEEmbedder(
    base_url="http://localhost:8080",
    model="BAAI/bge-m3",
)

# Use with DSPy's built-in FAISS retriever
corpus = [
    "Machine learning enables systems to learn from data.",
    "Deep learning uses neural networks with multiple layers.",
    "Natural language processing analyzes human language.",
]

retriever = dspy.retrievers.Embeddings(
    corpus=corpus,
    embedder=embedder,
    k=2,
)

# Retrieve relevant passages
results = retriever("What is deep learning?")
print(results.passages)

Reranking Retrieved Passages

from sie_dspy import SIEReranker

# Create reranker module
reranker = SIEReranker(
    base_url="http://localhost:8080",
    model="jinaai/jina-reranker-v2-base-multilingual",
)

# Rerank passages
query = "How do neural networks learn?"
passages = [
    "The weather is sunny today.",
    "Neural networks learn through backpropagation.",
    "Deep learning models require large datasets.",
]

result = reranker(query=query, passages=passages, k=2)
print(result.passages)  # Top 2 most relevant passages

Entity Extraction

from sie_dspy import SIEExtractor

# Create extractor module
extractor = SIEExtractor(
    base_url="http://localhost:8080",
    model="urchade/gliner_multi-v2.1",
    labels=["person", "organization", "location"],
)

# Extract entities
text = "John Smith is the CEO of TechCorp in San Francisco."
result = extractor(text=text)
print(result.entities)  # [{"text": "John Smith", "label": "person", ...}, ...]

RAG Pipeline with Reranking

import dspy
from sie_dspy import SIEEmbedder, SIEReranker

class RAGWithReranking(dspy.Module):
    def __init__(self, corpus, embedder, reranker, k=5, rerank_k=3):
        super().__init__()
        self.retriever = dspy.retrievers.Embeddings(
            corpus=corpus,
            embedder=embedder,
            k=k,
        )
        self.reranker = reranker
        self.rerank_k = rerank_k
        self.generate = dspy.ChainOfThought("context, question -> answer")

    def forward(self, question):
        # Retrieve initial candidates
        retrieved = self.retriever(question)

        # Rerank to get most relevant
        reranked = self.reranker(
            query=question,
            passages=retrieved.passages,
            k=self.rerank_k,
        )

        # Generate answer
        context = "\n".join(reranked.passages)
        return self.generate(context=context, question=question)

# Usage
embedder = SIEEmbedder(base_url="http://localhost:8080", model="BAAI/bge-m3")
reranker = SIEReranker(base_url="http://localhost:8080")

rag = RAGWithReranking(
    corpus=["...your documents..."],
    embedder=embedder,
    reranker=reranker,
)
result = rag("What is machine learning?")

SIE Server

Start the SIE server before using this integration:

mise run serve -d cpu -p 8080

Testing

# Unit tests (no server required)
pytest

# Integration tests (requires running server)
pytest -m integration

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sie_dspy-0.1.7.tar.gz (13.6 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

sie_dspy-0.1.7-py3-none-any.whl (7.1 kB view details)

Uploaded Python 3

File details

Details for the file sie_dspy-0.1.7.tar.gz.

File metadata

  • Download URL: sie_dspy-0.1.7.tar.gz
  • Upload date:
  • Size: 13.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sie_dspy-0.1.7.tar.gz
Algorithm Hash digest
SHA256 64ae2ed52a680eaae7818c24ee8c9f652c9542382013ccc479c06403e0fec0d3
MD5 c67e70d9a849ecd9784c1e0c0f13bdd5
BLAKE2b-256 e2ceee9f5f96a8a39f9d99e71149c075924c81b3b7051e710a77624418c38c77

See more details on using hashes here.

File details

Details for the file sie_dspy-0.1.7-py3-none-any.whl.

File metadata

  • Download URL: sie_dspy-0.1.7-py3-none-any.whl
  • Upload date:
  • Size: 7.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sie_dspy-0.1.7-py3-none-any.whl
Algorithm Hash digest
SHA256 d6568c182431a25c11532da8197a1e1f8a2c6ebd7d1741e33e8b3e824652cd49
MD5 4740610a8ec11607cf1507e38348d4ed
BLAKE2b-256 9864e3553286099d458c430268db5a3b519d5d0b40ed62c7557e7806bbcc8386

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page