sie-dspy

SIE integration for DSPy.

Installation

pip install sie-dspy

Features

  • SIEEmbedder: Embedding function for use with dspy.Embedder or dspy.retrievers.Embeddings
  • SIEReranker: Module to rerank passages by relevance to a query
  • SIEExtractor: Module to extract entities from text

Quick Start

Embeddings with FAISS Retriever

import dspy
from sie_dspy import SIEEmbedder

# Create SIE embedder
embedder = SIEEmbedder(
    base_url="http://localhost:8080",
    model="BAAI/bge-m3",
)

# Use with DSPy's built-in Embeddings retriever (FAISS is used for large corpora)
corpus = [
    "Machine learning enables systems to learn from data.",
    "Deep learning uses neural networks with multiple layers.",
    "Natural language processing analyzes human language.",
]

retriever = dspy.retrievers.Embeddings(
    corpus=corpus,
    embedder=embedder,
    k=2,
)

# Retrieve relevant passages
results = retriever("What is deep learning?")
print(results.passages)
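
To make the retrieval step less of a black box, here is a minimal, dependency-free sketch of top-k selection by cosine similarity. It illustrates the general idea only and is not how dspy.retrievers.Embeddings or SIE is implemented; the helper names (cosine, top_k) and the toy 2-D vectors are hypothetical stand-ins for real embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, corpus_vecs, k):
    """Return the indices of the k corpus vectors most similar to the query."""
    scored = sorted(
        enumerate(corpus_vecs),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [index for index, _ in scored[:k]]


# Toy 2-D vectors standing in for real embeddings (hypothetical values).
toy_corpus = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
toy_query = [0.5, 0.9]
print(top_k(toy_query, toy_corpus, k=2))
```

In the real pipeline the vectors would come from the embedder and the corpus would be far larger, which is where an approximate index becomes worthwhile.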

Reranking Retrieved Passages

from sie_dspy import SIEReranker

# Create reranker module
reranker = SIEReranker(
    base_url="http://localhost:8080",
    model="jinaai/jina-reranker-v2-base-multilingual",
)

# Rerank passages
query = "How do neural networks learn?"
passages = [
    "The weather is sunny today.",
    "Neural networks learn through backpropagation.",
    "Deep learning models require large datasets.",
]

result = reranker(query=query, passages=passages, k=2)
print(result.passages)  # Top 2 most relevant passages
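
The reranker call above follows a simple contract: score each passage against the query, sort, keep the top k. The sketch below shows that contract with a deliberately crude word-overlap scorer. The scorer, the helper names (tokenize, overlap_score, rerank), and the toy passages are assumptions made for illustration; a model such as the one configured above scores relevance very differently.

```python
def tokenize(text):
    """Lowercase the text and split it into a set of words, dropping basic punctuation."""
    return {word.strip(".,?!").lower() for word in text.split()}


def overlap_score(query, passage):
    """Hypothetical stand-in scorer: number of query words that appear in the passage."""
    return len(tokenize(query) & tokenize(passage))


def rerank(query, passages, k):
    """Sort passages by descending score and keep the best k."""
    ranked = sorted(passages, key=lambda p: overlap_score(query, p), reverse=True)
    return ranked[:k]


# Toy data for illustration only.
toy_passages = [
    "The weather is sunny today.",
    "Neural networks learn through backpropagation.",
    "Deep neural models need large datasets.",
]
print(rerank("neural networks learn", toy_passages, k=2))
```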

Entity Extraction

from sie_dspy import SIEExtractor

# Create extractor module
extractor = SIEExtractor(
    base_url="http://localhost:8080",
    model="urchade/gliner_multi-v2.1",
    labels=["person", "organization", "location"],
)

# Extract entities
text = "John Smith is the CEO of TechCorp in San Francisco."
result = extractor(text=text)
print(result.entities)  # [{"text": "John Smith", "label": "person", ...}, ...]
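
Extracted entities are often easier to consume once grouped by label. The helper below is a small sketch that assumes each entity is a dict with at least "text" and "label" keys, as the comment above suggests; any other fields in the real output are ignored. The name group_by_label and the sample data are hypothetical.

```python
from collections import defaultdict


def group_by_label(entities):
    """Group entity texts by their label, preserving the order of appearance."""
    grouped = defaultdict(list)
    for entity in entities:
        grouped[entity["label"]].append(entity["text"])
    return dict(grouped)


# Hand-written sample mirroring the assumed shape of result.entities.
sample_entities = [
    {"text": "John Smith", "label": "person"},
    {"text": "TechCorp", "label": "organization"},
    {"text": "San Francisco", "label": "location"},
]
print(group_by_label(sample_entities))
```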

RAG Pipeline with Reranking

import dspy
from sie_dspy import SIEEmbedder, SIEReranker

class RAGWithReranking(dspy.Module):
    def __init__(self, corpus, embedder, reranker, k=5, rerank_k=3):
        super().__init__()
        self.retriever = dspy.retrievers.Embeddings(
            corpus=corpus,
            embedder=embedder,
            k=k,
        )
        self.reranker = reranker
        self.rerank_k = rerank_k
        self.generate = dspy.ChainOfThought("context, question -> answer")

    def forward(self, question):
        # Retrieve initial candidates
        retrieved = self.retriever(question)

        # Rerank to get most relevant
        reranked = self.reranker(
            query=question,
            passages=retrieved.passages,
            k=self.rerank_k,
        )

        # Generate answer
        context = "\n".join(reranked.passages)
        return self.generate(context=context, question=question)

# Usage (assumes a language model is already configured, e.g. with dspy.configure(lm=...))
embedder = SIEEmbedder(base_url="http://localhost:8080", model="BAAI/bge-m3")
reranker = SIEReranker(base_url="http://localhost:8080")

rag = RAGWithReranking(
    corpus=["...your documents..."],
    embedder=embedder,
    reranker=reranker,
)
result = rag("What is machine learning?")
print(result.answer)
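
The retrieve-then-rerank flow can be exercised without a server or a language model by swapping in stand-ins that honor the same call shapes used in forward: retriever(question) and reranker(query=..., passages=..., k=...), each returning an object with a passages attribute. Everything in this sketch (fake_retriever, fake_reranker, build_context, and the toy corpus) is hypothetical and only mirrors those call shapes.

```python
from types import SimpleNamespace

# Toy corpus for illustration only.
TOY_CORPUS = [
    "Bread is baked in an oven.",
    "Machine learning enables systems to learn from data.",
    "Learning to cook takes practice.",
    "Rain is expected tomorrow.",
]


def fake_retriever(question):
    """Stand-in retriever: returns every passage as a candidate."""
    return SimpleNamespace(passages=list(TOY_CORPUS))


def fake_reranker(query, passages, k):
    """Stand-in reranker: orders passages by shared lowercase words with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        passages,
        key=lambda p: len(query_words & set(p.lower().split())),
        reverse=True,
    )
    return SimpleNamespace(passages=ranked[:k])


def build_context(question, retriever, reranker, rerank_k):
    """Run the two-stage funnel and join the surviving passages into one context string."""
    retrieved = retriever(question)
    reranked = reranker(query=question, passages=retrieved.passages, k=rerank_k)
    return "\n".join(reranked.passages)


print(build_context("machine learning basics", fake_retriever, fake_reranker, rerank_k=2))
```

The first stage casts a wide net (k candidates) and the second narrows it (rerank_k), so rerank_k larger than k would gain nothing.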

SIE Server

Start the SIE server before using this integration:

mise run serve -d cpu -p 8080
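
Because the examples above fail if the server is not yet listening, a script may want to wait for the port first. The helper below is a sketch using only the standard library; it checks that a TCP connection can be opened and nothing more. The name wait_for_port is hypothetical, and whether SIE exposes a dedicated health endpoint is not covered here.

```python
import socket
import time


def wait_for_port(host, port, timeout=10.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False after timeout seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For the command above, wait_for_port("localhost", 8080) would be the matching call before constructing SIEEmbedder, SIEReranker, or SIEExtractor.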

Testing

# Unit tests (no server required)
pytest

# Integration tests (requires running server)
pytest -m integration

Download files

Source Distribution

sie_dspy-0.3.0.tar.gz (14.1 kB)

Uploaded Source

Built Distribution

sie_dspy-0.3.0-py3-none-any.whl (7.4 kB)

Uploaded Python 3

File details

Details for the file sie_dspy-0.3.0.tar.gz.

File metadata

  • Download URL: sie_dspy-0.3.0.tar.gz
  • Upload date:
  • Size: 14.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for sie_dspy-0.3.0.tar.gz

  • SHA256: 055d1bf8d58fd692c79df95ee1db136d73ae5086f1ecca974691774f6502182c
  • MD5: 925f502b94c27874d2cec1cabac215ca
  • BLAKE2b-256: 8f9c564b422d317346eb9d32fed2e1111d1255a56de3fa7242c33740e7def17b

Provenance

The following attestation bundles were made for sie_dspy-0.3.0.tar.gz:

Publisher: release-python.yml on superlinked/sie-internal

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file sie_dspy-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: sie_dspy-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 7.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for sie_dspy-0.3.0-py3-none-any.whl

  • SHA256: 50cedd40356730ced542b6a16ab4d7cbb9b38237d83dd472d31658032ba338c1
  • MD5: 70b6cec16d2a6e57c24d755daae1e0d4
  • BLAKE2b-256: 108bc97931cc0796e83774d66d274ae2fe1538a535109bc098bef414ba621246

Provenance

The following attestation bundles were made for sie_dspy-0.3.0-py3-none-any.whl:

Publisher: release-python.yml on superlinked/sie-internal

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
