sie-dspy

SIE integration for DSPy.

Installation

pip install sie-dspy

Features

  • SIEEmbedder: Embedding function for use with dspy.Embedder or dspy.retrievers.Embeddings
  • SIEReranker: Module to rerank passages by relevance to a query
  • SIEExtractor: Module to extract entities from text

Quick Start

Embeddings with FAISS Retriever

import dspy
from sie_dspy import SIEEmbedder

# Create SIE embedder
embedder = SIEEmbedder(
    base_url="http://localhost:8080",
    model="BAAI/bge-m3",
)

# Use with DSPy's built-in embeddings retriever (FAISS-backed for large corpora)
corpus = [
    "Machine learning enables systems to learn from data.",
    "Deep learning uses neural networks with multiple layers.",
    "Natural language processing analyzes human language.",
]

retriever = dspy.retrievers.Embeddings(
    corpus=corpus,
    embedder=embedder,
    k=2,
)

# Retrieve relevant passages
results = retriever("What is deep learning?")
print(results.passages)
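
To build intuition for what an embedding retriever does with the vectors it receives, here is a minimal pure-Python sketch of top-k retrieval by cosine similarity. It is not DSPy's or SIE's implementation: `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model, and `retrieve` is an illustrative helper, not part of either library.

```python
import math
import re
from collections import Counter


def toy_embed(texts):
    """Hypothetical stand-in for an embedder: one word-count vector per text."""
    return [Counter(re.findall(r"[a-z0-9]+", t.lower())) for t in texts]


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, corpus, k=2):
    """Return the k corpus entries most similar to the query."""
    doc_vecs = toy_embed(corpus)
    query_vec = toy_embed([query])[0]
    ranked = sorted(
        range(len(corpus)),
        key=lambda i: cosine(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return [corpus[i] for i in ranked[:k]]


corpus = [
    "Machine learning enables systems to learn from data.",
    "Deep learning uses neural networks with multiple layers.",
    "Natural language processing analyzes human language.",
]
top = retrieve("What is deep learning?", corpus, k=2)
print(top)
```

A real embedding model replaces the word counts with dense vectors, so it can match passages that share meaning rather than exact words.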

Reranking Retrieved Passages

from sie_dspy import SIEReranker

# Create reranker module
reranker = SIEReranker(
    base_url="http://localhost:8080",
    model="jinaai/jina-reranker-v2-base-multilingual",
)

# Rerank passages
query = "How do neural networks learn?"
passages = [
    "The weather is sunny today.",
    "Neural networks learn through backpropagation.",
    "Deep learning models require large datasets.",
]

result = reranker(query=query, passages=passages, k=2)
print(result.passages)  # Top 2 most relevant passages
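
Conceptually, reranking scores each (query, passage) pair and keeps the k highest-scoring passages. The sketch below illustrates that shape in plain Python. The `overlap_score` function is a hypothetical stand-in for a reranker model, and `rerank` is an illustrative helper, not the SIEReranker API.

```python
import re


def tokens(text):
    """Lowercased set of alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(query, passage):
    """Hypothetical relevance score: number of words shared with the query."""
    return len(tokens(query) & tokens(passage))


def rerank(query, passages, k):
    """Sort passages by descending score and keep the top k."""
    ordered = sorted(passages, key=lambda p: overlap_score(query, p), reverse=True)
    return ordered[:k]


passages = [
    "The weather is sunny today.",
    "Neural networks learn through backpropagation.",
    "Deep learning models require large datasets.",
]
best = rerank("How do neural networks learn?", passages, k=2)
print(best)
```

Word overlap misses the third passage because "learning" and "learn" are different tokens. A model-based reranker is meant to capture that kind of relatedness.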

Entity Extraction

from sie_dspy import SIEExtractor

# Create extractor module
extractor = SIEExtractor(
    base_url="http://localhost:8080",
    model="urchade/gliner_multi-v2.1",
    labels=["person", "organization", "location"],
)

# Extract entities
text = "John Smith is the CEO of TechCorp in San Francisco."
result = extractor(text=text)
print(result.entities)  # [{"text": "John Smith", "label": "person", ...}, ...]
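
A common next step is to group the extracted entities by label. The sketch below assumes each entity is a dict with at least `"text"` and `"label"` keys, as the comment above suggests. The sample list is hand-written for illustration, not real extractor output, and `group_by_label` is a hypothetical helper.

```python
from collections import defaultdict


def group_by_label(entities):
    """Map each label to the list of entity texts carrying that label."""
    grouped = defaultdict(list)
    for entity in entities:
        grouped[entity["label"]].append(entity["text"])
    return dict(grouped)


# Hand-written sample in the assumed shape; real results may carry more fields.
sample_entities = [
    {"text": "John Smith", "label": "person"},
    {"text": "TechCorp", "label": "organization"},
    {"text": "San Francisco", "label": "location"},
]
grouped = group_by_label(sample_entities)
print(grouped)
```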

RAG Pipeline with Reranking

import dspy
from sie_dspy import SIEEmbedder, SIEReranker

class RAGWithReranking(dspy.Module):
    def __init__(self, corpus, embedder, reranker, k=5, rerank_k=3):
        super().__init__()
        self.retriever = dspy.retrievers.Embeddings(
            corpus=corpus,
            embedder=embedder,
            k=k,
        )
        self.reranker = reranker
        self.rerank_k = rerank_k
        self.generate = dspy.ChainOfThought("context, question -> answer")

    def forward(self, question):
        # Retrieve initial candidates
        retrieved = self.retriever(question)

        # Rerank to get most relevant
        reranked = self.reranker(
            query=question,
            passages=retrieved.passages,
            k=self.rerank_k,
        )

        # Generate answer
        context = "\n".join(reranked.passages)
        return self.generate(context=context, question=question)

# Usage (assumes a language model has already been set with dspy.configure(lm=...))
embedder = SIEEmbedder(base_url="http://localhost:8080", model="BAAI/bge-m3")
reranker = SIEReranker(base_url="http://localhost:8080")

rag = RAGWithReranking(
    corpus=["...your documents..."],
    embedder=embedder,
    reranker=reranker,
)
result = rag("What is machine learning?")
print(result.answer)

SIE Server

Start the SIE server before using this integration:

mise run serve -d cpu -p 8080
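
Before running the examples, it can help to confirm that something is listening on the server port. The sketch below only checks TCP reachability with the standard library. It assumes the host and port used throughout this README (`localhost:8080`), does not verify that the listener is actually SIE, and `is_port_open` is a hypothetical helper.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    if is_port_open("localhost", 8080):
        print("Something is listening on localhost:8080")
    else:
        print("Nothing is listening on localhost:8080; start the SIE server first")
```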

Testing

# Unit tests (no server required)
pytest

# Integration tests (requires running server)
pytest -m integration
