Easily build powerful retrieval algorithms for AI Agents

Retrievers SDK

Installation

You can install the Retrievers SDK using pip:

pip install mo-retrievers

or using uv:

uv add mo-retrievers

Semantic Search Example

from retrievers import VectorDatabase, MilvusBackend
from retrievers.indexing import DummyEmbedder
from retrievers.context import Text

vdb = VectorDatabase(
    MilvusBackend.from_local("index.db"),
    embedder=DummyEmbedder(),
    payload_class=Text,
)
vdb.create_collection("docs", Text, exists_behavior="replace")
vdb.add_records("docs", [
    Text(text="Modaic makes sharing agents easy."),
    Text(text="Tables can be queried with SQL."),
])
hits = vdb.search("docs", "How do I share agents?", k=1)
print(hits[0].text)

Working with Context

Text and chunking

from retrievers.context import Text

def simple_splitter(text: str):
    step = 500
    for i in range(0, len(text), step):
        yield text[i:i+step]

doc = Text.from_file("README.md")
doc.chunk_text(simple_splitter)
print(len(doc.chunks))
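
The splitter in the example above is an ordinary callable that takes a string and yields substrings. As a sketch of a slightly more careful variant, the function below uses overlapping windows so that text cut at a chunk boundary still appears intact in a neighbouring chunk. It is plain Python with no SDK dependency; the name overlapping_splitter and its size and overlap parameters are illustrative assumptions, not part of the Retrievers SDK.

```python
from typing import Iterator


def overlapping_splitter(text: str, size: int = 500, overlap: int = 50) -> Iterator[str]:
    """Yield windows of at most `size` characters.

    Consecutive windows share `overlap` characters.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    for start in range(0, len(text), step):
        yield text[start:start + size]
        # Stop once a window has reached the end of the text, so the
        # final chunk is not followed by a redundant tail fragment.
        if start + size >= len(text):
            break


chunks = list(overlapping_splitter("abcdefghij", size=4, overlap=1))
print(chunks)
```

Assuming chunk_text accepts any callable shaped like simple_splitter, this function could be passed in its place with its default arguments.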

Tables and SQL queries

from pathlib import Path
from retrievers.context import TableFile

table = TableFile.from_file(
    file_ref="employees.xlsx",
    file=Path("employees.xlsx"),
    file_type="xlsx",
)

print(table.schema_info())
head = table.query("SELECT * FROM this LIMIT 5")
print(head.shape)
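
In the query above the table is referred to by the name this. The README does not say which SQL engine TableFile uses, so the sketch below uses the standard-library sqlite3 module as a stand-in to show the same query pattern against an in-memory table. The column names and rows are made up for illustration.

```python
import sqlite3

# In-memory database with a single table named "this",
# mirroring the name used in the TableFile query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE this (name TEXT, role TEXT, age INTEGER)")

rows = [
    ("Ada", "engineer", 36),
    ("Grace", "engineer", 45),
    ("Linus", "designer", 29),
    ("Margaret", "engineer", 33),
    ("Dennis", "manager", 51),
    ("Barbara", "designer", 41),
    ("Ken", "engineer", 38),
]
conn.executemany("INSERT INTO this VALUES (?, ?, ?)", rows)

# Same query text as in the TableFile example.
head = conn.execute("SELECT * FROM this LIMIT 5").fetchall()
print(len(head), len(head[0]))  # number of rows, number of columns
conn.close()
```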

Query Language

Build structured filters with Prop:

from retrievers.context import Prop

q = (Prop("age") >= 21) & (Prop("role") == "engineer")
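
To show how an expression like the one above can turn into a filter object, here is a minimal sketch of the operator-overloading pattern in plain Python. FieldRef, Condition and AllOf are hypothetical names invented for this illustration. They are not the SDK's classes, and Prop may be implemented quite differently.

```python
import operator
from dataclasses import dataclass
from typing import Any

# Comparison operators supported by this sketch.
_OPS = {">=": operator.ge, "==": operator.eq}


@dataclass(frozen=True)
class Condition:
    """A single comparison such as age >= 21."""
    field: str
    op: str
    value: Any

    def __and__(self, other):
        return AllOf((self, other))

    def matches(self, record: dict) -> bool:
        # A record that lacks the field never matches.
        if self.field not in record:
            return False
        return _OPS[self.op](record[self.field], self.value)


@dataclass(frozen=True)
class AllOf:
    """Conjunction of conditions; true only if every part matches."""
    parts: tuple

    def __and__(self, other):
        return AllOf(self.parts + (other,))

    def matches(self, record: dict) -> bool:
        return all(part.matches(record) for part in self.parts)


class FieldRef:
    """Hypothetical stand-in for a property reference."""

    def __init__(self, name: str):
        self.name = name

    def __ge__(self, value):
        return Condition(self.name, ">=", value)

    def __eq__(self, value):  # returns a Condition rather than a bool
        return Condition(self.name, "==", value)


q = (FieldRef("age") >= 21) & (FieldRef("role") == "engineer")

people = [
    {"age": 30, "role": "engineer"},
    {"age": 19, "role": "engineer"},
    {"age": 40, "role": "designer"},
]
matched = [p for p in people if q.matches(p)]
print(matched)
```

The parentheses around each comparison matter: in Python, & binds more tightly than >= and ==, so without them the expression would be grouped differently.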

Build a simple Indexer/Retriever with modaic

from typing import List, Tuple
import numpy as np
from modaic import Indexer, PrecompiledConfig
from retrievers.indexing import DummyEmbedder, Embedder  # convenient for demos
from retrievers.context import Text

class DocsConfig(PrecompiledConfig):
    index_name: str = "docs"

class DocsIndexer(Indexer):
    config: DocsConfig # ! Important: config must be annotated with the config class

    def __init__(self, config: DocsConfig, embedder: Embedder | None = None):
        super().__init__(config)
        self.embedder = embedder or DummyEmbedder(embedding_dim=128)
        self._records: list[Tuple[np.ndarray, Text]] = []

    def ingest(self, contexts: List[Text]):
        vectors = self.embedder([c.text for c in contexts])
        for v, c in zip(vectors, contexts):
            self._records.append((np.asarray(v), c))

    def retrieve(self, query: str, k: int = 5) -> List[Text]:
        qv = np.asarray(self.embedder(query))
        scored = [(float(np.dot(qv, v)), c) for v, c in self._records]
        scored.sort(key=lambda x: x[0], reverse=True)
        return [c for _, c in scored[:k]]

indexer = DocsIndexer(DocsConfig())
indexer.ingest([Text(text="Modaic makes sharing agents easy."), Text(text="Tables can be queried with SQL.")])
hits = indexer.retrieve("How do I share agents?", k=1)
print(hits[0].text)
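
The retrieve method above ranks by raw dot product. That matches cosine similarity only when the embedder returns unit-length vectors, which this README does not state. The sketch below ranks by cosine similarity instead, in plain Python with no SDK or NumPy dependency. The helper names cosine and top_k and the toy vectors are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float],
          records: List[Tuple[Sequence[float], str]],
          k: int = 5) -> List[str]:
    """Return the labels of the k records most similar to the query."""
    ranked = sorted(records, key=lambda r: cosine(query, r[0]), reverse=True)
    return [label for _, label in ranked[:k]]


records = [
    ([10.0, 0.0], "off-topic, long vector"),
    ([1.0, 1.0], "on-topic, short vector"),
]
query = [1.0, 1.0]

# Raw dot product prefers the long vector (10 vs 2) ...
by_dot = max(records, key=lambda r: sum(x * y for x, y in zip(query, r[0])))[1]
# ... while cosine similarity prefers the vector pointing the same way.
by_cosine = top_k(query, records, k=1)[0]
print(by_dot)
print(by_cosine)
```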

Save and load your retriever:

indexer.push_to_hub("yourname/docs-indexer", commit_message="initial indexer")

from modaic import AutoRetriever
loaded_idx = AutoRetriever.from_precompiled("yourname/docs-indexer")
print(loaded_idx.retrieve("share agents", k=1)[0].text)
