SIE integration for DSPy

sie-dspy

Installation

pip install sie-dspy

Features

  • SIEEmbedder: Embedding function for use with dspy.Embedder or dspy.retrievers.Embeddings
  • SIEReranker: Module to rerank passages by relevance to a query
  • SIEExtractor: Module to extract entities from text

Quick Start

Embeddings with FAISS Retriever

import dspy
from sie_dspy import SIEEmbedder

# Create SIE embedder
embedder = SIEEmbedder(
    base_url="http://localhost:8080",
    model="BAAI/bge-m3",
)

# Use with DSPy's built-in FAISS retriever
corpus = [
    "Machine learning enables systems to learn from data.",
    "Deep learning uses neural networks with multiple layers.",
    "Natural language processing analyzes human language.",
]

retriever = dspy.retrievers.Embeddings(
    corpus=corpus,
    embedder=embedder,
    k=2,
)

# Retrieve relevant passages
results = retriever("What is deep learning?")
print(results.passages)
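
To make the retrieval step concrete, here is a minimal sketch of what an embedding-based top-k lookup does conceptually. It is not how dspy.retrievers.Embeddings or SIEEmbedder is implemented. `toy_embedder`, `cosine`, and `top_k` are hypothetical names, and the keyword-count "embedding" only stands in for a real model so the example runs without a server.

```python
import math
from typing import Callable, List, Sequence


def toy_embedder(texts: List[str]) -> List[List[float]]:
    """Hypothetical stand-in for an embedder: counts a few keywords per text."""
    vocab = ["deep", "learning", "neural", "language"]
    return [[float(t.lower().count(word)) for word in vocab] for t in texts]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: str, corpus: List[str], embed: Callable, k: int) -> List[str]:
    """Embed the query and the corpus, then return the k most similar texts."""
    query_vec = embed([query])[0]
    corpus_vecs = embed(corpus)
    ranked = sorted(
        zip(corpus, corpus_vecs),
        key=lambda pair: cosine(query_vec, pair[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


corpus = [
    "Machine learning enables systems to learn from data.",
    "Deep learning uses neural networks with multiple layers.",
    "Natural language processing analyzes human language.",
]
best = top_k("What is deep learning?", corpus, toy_embedder, k=2)
```

With this toy embedder, the deep learning sentence ranks first and the machine learning sentence second. A real embedding model would replace the keyword counts with dense vectors.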

Reranking Retrieved Passages

from sie_dspy import SIEReranker

# Create reranker module
reranker = SIEReranker(
    base_url="http://localhost:8080",
    model="jinaai/jina-reranker-v2-base-multilingual",
)

# Rerank passages
query = "How do neural networks learn?"
passages = [
    "The weather is sunny today.",
    "Neural networks learn through backpropagation.",
    "Deep learning models require large datasets.",
]

result = reranker(query=query, passages=passages, k=2)
print(result.passages)  # Top 2 most relevant passages
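
The reranking call above can be read as "score each passage against the query, sort, keep the top k". The sketch below illustrates only that flow. `overlap_score` and `rerank` are hypothetical helpers, and the word-overlap score is a toy substitute for the relevance score a reranker model would produce.

```python
from typing import Callable, List


def overlap_score(query: str, passage: str) -> float:
    """Toy relevance score: fraction of query words that appear in the passage."""
    query_words = {w.strip("?.,!").lower() for w in query.split()}
    passage_words = {w.strip("?.,!").lower() for w in passage.split()}
    if not query_words:
        return 0.0
    return len(query_words & passage_words) / len(query_words)


def rerank(
    query: str,
    passages: List[str],
    k: int,
    score: Callable[[str, str], float] = overlap_score,
) -> List[str]:
    """Sort passages by descending score and keep the first k."""
    ranked = sorted(passages, key=lambda p: score(query, p), reverse=True)
    return ranked[:k]


query = "How do neural networks learn?"
passages = [
    "The weather is sunny today.",
    "Neural networks learn through backpropagation.",
    "Deep learning models require large datasets.",
]
top = rerank(query, passages, k=1)
```

Here the backpropagation passage comes out on top. The toy score cannot tell the remaining two passages apart, which is the kind of case a trained reranker is meant to handle.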

Entity Extraction

from sie_dspy import SIEExtractor

# Create extractor module
extractor = SIEExtractor(
    base_url="http://localhost:8080",
    model="urchade/gliner_multi-v2.1",
    labels=["person", "organization", "location"],
)

# Extract entities
text = "John Smith is the CEO of TechCorp in San Francisco."
result = extractor(text=text)
print(result.entities)  # [{"text": "John Smith", "label": "person", ...}, ...]
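
Downstream code often wants entities grouped by label. The sketch below assumes each entity is a dict with at least "text" and "label" keys, as the comment above suggests. Any other fields are unknown here and are ignored. `group_by_label` is a hypothetical helper, and `sample` is hand-written data, not real extractor output.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_label(entities: List[dict]) -> Dict[str, List[str]]:
    """Collect entity texts under their label, preserving input order."""
    grouped = defaultdict(list)
    for entity in entities:
        grouped[entity["label"]].append(entity["text"])
    return dict(grouped)


# Hand-written sample in the assumed shape; not real extractor output.
sample = [
    {"text": "John Smith", "label": "person"},
    {"text": "TechCorp", "label": "organization"},
    {"text": "San Francisco", "label": "location"},
]
by_label = group_by_label(sample)
```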

RAG Pipeline with Reranking

import dspy
from sie_dspy import SIEEmbedder, SIEReranker

class RAGWithReranking(dspy.Module):
    def __init__(self, corpus, embedder, reranker, k=5, rerank_k=3):
        super().__init__()
        self.retriever = dspy.retrievers.Embeddings(
            corpus=corpus,
            embedder=embedder,
            k=k,
        )
        self.reranker = reranker
        self.rerank_k = rerank_k
        self.generate = dspy.ChainOfThought("context, question -> answer")

    def forward(self, question):
        # Retrieve initial candidates
        retrieved = self.retriever(question)

        # Rerank to get most relevant
        reranked = self.reranker(
            query=question,
            passages=retrieved.passages,
            k=self.rerank_k,
        )

        # Generate answer
        context = "\n".join(reranked.passages)
        return self.generate(context=context, question=question)

# Usage
# dspy.ChainOfThought needs a language model; configure one first
# with dspy.configure(lm=...).
embedder = SIEEmbedder(base_url="http://localhost:8080", model="BAAI/bge-m3")
reranker = SIEReranker(base_url="http://localhost:8080")

rag = RAGWithReranking(
    corpus=["...your documents..."],
    embedder=embedder,
    reranker=reranker,
)
result = rag("What is machine learning?")
print(result.answer)

SIE Server

Start the SIE server before using this integration. The examples above connect to http://localhost:8080, so serve on port 8080:

mise run serve -d cpu -p 8080
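
Before running the examples, it can help to confirm that something is accepting connections on the expected port. The sketch below checks only TCP reachability with the standard library. It assumes nothing about SIE's HTTP routes or any health endpoint, and `server_is_listening` is a hypothetical helper name.

```python
import socket


def server_is_listening(
    host: str = "localhost", port: int = 8080, timeout: float = 1.0
) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False
```

A True result means only that a process is listening on that port, not that it is the SIE server or that a model is loaded.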

Testing

# Unit tests (no server required)
pytest

# Integration tests (requires running server)
pytest -m integration
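
One way to exercise pipeline logic without a running server is to swap the reranker for a test double. The sketch below is an assumption about how such a test could look, not the package's own test suite. `FakeReranker` and `build_context` are hypothetical. The double mimics only the call shape used earlier (query, passages, k) and returns an object with a passages attribute.

```python
from types import SimpleNamespace
from typing import List


class FakeReranker:
    """Hypothetical test double: keeps the first k passages, no server needed."""

    def __init__(self) -> None:
        self.calls = []

    def __call__(self, query: str, passages: List[str], k: int):
        # Record the call so a test can assert on how the pipeline used it.
        self.calls.append((query, k))
        return SimpleNamespace(passages=list(passages[:k]))


def build_context(
    question: str, retrieved_passages: List[str], reranker, rerank_k: int
) -> str:
    """Mirror the rerank-then-join step of the RAG pipeline's forward method."""
    reranked = reranker(query=question, passages=retrieved_passages, k=rerank_k)
    return "\n".join(reranked.passages)


fake = FakeReranker()
context = build_context("What is ML?", ["a", "b", "c"], fake, rerank_k=2)
```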
