Easily build powerful retrieval algorithms for AI Agents

Retrievers SDK

Installation

You can install the Retrievers SDK using pip:

pip install mo-retrievers

or using uv:

uv add mo-retrievers

Semantic Search Example

from retrievers import VectorDatabase, MilvusBackend
from retrievers.indexing import DummyEmbedder
from retrievers.context import Text

# Local Milvus index in index.db; DummyEmbedder is convenient for demos.
vdb = VectorDatabase(
    MilvusBackend.from_local("index.db"),
    embedder=DummyEmbedder(),
    payload_class=Text,
)
# Replace the "docs" collection if it already exists.
vdb.create_collection("docs", Text, exists_behavior="replace")
vdb.add_records("docs", [
    Text(text="Modaic makes sharing agents easy."),
    Text(text="Tables can be queried with SQL."),
])
hits = vdb.search("docs", "How do I share agents?", k=1)  # top-1 match
print(hits[0].text)

Working with Context

Text and chunking

from retrievers.context import Text

def simple_splitter(text: str):
    step = 500
    for i in range(0, len(text), step):
        yield text[i:i+step]

doc = Text.from_file("README.md")
doc.chunk_text(simple_splitter)
print(len(doc.chunks))
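
The fixed-size splitter above can cut a sentence in half at a chunk boundary. A minimal sketch of an overlapping variant follows, assuming that chunk_text accepts any callable that takes a string and yields strings, as simple_splitter does. The names overlapping_splitter, size, and overlap are hypothetical and not part of the SDK.

```python
from typing import Iterator


def overlapping_splitter(text: str, size: int = 500, overlap: int = 50) -> Iterator[str]:
    """Yield windows of `size` characters; consecutive windows share `overlap` characters."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    for start in range(0, len(text), step):
        yield text[start:start + size]
        # Stop once a window has reached the end of the text,
        # so the tail is not emitted again as a shorter duplicate.
        if start + size >= len(text):
            break


chunks = list(overlapping_splitter("abcdefghij", size=4, overlap=1))
print(chunks)  # ['abcd', 'defg', 'ghij']
```

If chunk_text calls the splitter with the text only, the default size and overlap apply; functools.partial(overlapping_splitter, size=200) would bind other values.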

Tables and SQL queries

from pathlib import Path
from retrievers.context import TableFile

table = TableFile.from_file(
    file_ref="employees.xlsx",
    file=Path("employees.xlsx"),
    file_type="xlsx",
)

print(table.schema_info())
head = table.query("SELECT * FROM this LIMIT 5")
print(head.shape)
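
The query above refers to the table by the name this. To make the SQL itself concrete without depending on the SDK, here is a sketch using Python's built-in sqlite3 module with an in-memory table that is also named this. The columns and rows are made up, and nothing here implies which SQL engine TableFile uses internally.

```python
import sqlite3

# In-memory database with a made-up table named "this" (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "this" (name TEXT, role TEXT, age INTEGER)')
conn.executemany(
    'INSERT INTO "this" VALUES (?, ?, ?)',
    [("Ada", "engineer", 36), ("Grace", "engineer", 45), ("Linus", "manager", 29)],
)

# Same shape of query as in the TableFile example: the first rows of the table.
rows = conn.execute('SELECT * FROM "this" LIMIT 2').fetchall()
print(len(rows))

# A filtered query with an explicit ordering, so the result is deterministic.
names = [r[0] for r in conn.execute(
    'SELECT name FROM "this" WHERE age >= 30 ORDER BY name'
)]
print(names)  # ['Ada', 'Grace']
conn.close()
```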

Query Language

Build structured filters with Prop:

from retrievers.context import Prop

q = (Prop("age") >= 21) & (Prop("role") == "engineer")
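
Expressions like the one above work because Python lets a class overload comparison and bitwise operators. The sketch below shows the general technique with hypothetical Field and Condition classes that evaluate a filter against a plain dict. It is not how Prop is implemented; the real object is presumably translated into a backend filter rather than evaluated in Python.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

Record = Dict[str, Any]


@dataclass(frozen=True)
class Condition:
    """A predicate over a record; `&` and `|` combine predicates."""
    test: Callable[[Record], bool]

    def __and__(self, other: "Condition") -> "Condition":
        return Condition(lambda r: self.test(r) and other.test(r))

    def __or__(self, other: "Condition") -> "Condition":
        return Condition(lambda r: self.test(r) or other.test(r))

    def __call__(self, record: Record) -> bool:
        return self.test(record)


class Field:
    """Hypothetical stand-in for a property reference such as Prop("age")."""

    def __init__(self, name: str):
        self.name = name

    def __ge__(self, value: Any) -> Condition:
        return Condition(lambda r: r[self.name] >= value)

    def __eq__(self, value: Any) -> Condition:  # type: ignore[override]
        return Condition(lambda r: r[self.name] == value)


q = (Field("age") >= 21) & (Field("role") == "engineer")
print(q({"age": 30, "role": "engineer"}))  # True
print(q({"age": 18, "role": "engineer"}))  # False
```

The parentheses around each comparison are required: in Python, & binds more tightly than >= and ==, so leaving them out changes how the expression is grouped.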

Build a simple Indexer/Retriever with modaic

from typing import List, Tuple
import numpy as np
from modaic import Indexer, PrecompiledConfig
from retrievers.indexing import DummyEmbedder, Embedder  # convenient for demos
from retrievers.context import Text

class DocsConfig(PrecompiledConfig):
    index_name: str = "docs"

class DocsIndexer(Indexer):
    config: DocsConfig # ! Important: config must be annotated with the config class

    def __init__(self, config: DocsConfig, embedder: Embedder | None = None):
        super().__init__(config)
        self.embedder = embedder or DummyEmbedder(embedding_dim=128)
        self._records: list[Tuple[np.ndarray, Text]] = []

    def ingest(self, contexts: List[Text]):
        vectors = self.embedder([c.text for c in contexts])
        for v, c in zip(vectors, contexts):
            self._records.append((np.asarray(v), c))

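    def retrieve(self, query: str, k: int = 5) -> List[Text]:
        # Flatten so that a (1, dim) result from the embedder also works.
        qv = np.asarray(self.embedder(query)).reshape(-1)
        # Rank stored records by dot-product similarity, highest first.
        scored = [(float(np.dot(qv, v)), c) for v, c in self._records]
        scored.sort(key=lambda x: x[0], reverse=True)
        return [c for _, c in scored[:k]]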

indexer = DocsIndexer(DocsConfig())
indexer.ingest([Text(text="Modaic makes sharing agents easy."), Text(text="Tables can be queried with SQL.")])
hits = indexer.retrieve("How do I share agents?", k=1)
print(hits[0].text)
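
The retrieve method ranks by raw dot product, which favors vectors with a large norm unless the embedder already returns unit-length vectors. A sketch of cosine-similarity ranking in plain Python follows. The helpers cosine and top_k are hypothetical, and the toy vectors are made up.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query: Sequence[float],
    records: List[Tuple[Sequence[float], str]],
    k: int = 5,
) -> List[str]:
    """Return the payloads of the k records most similar to the query."""
    scored = sorted(records, key=lambda rec: cosine(query, rec[0]), reverse=True)
    return [payload for _, payload in scored[:k]]


records = [
    ([10.0, 10.0], "long vector, 45 degrees off"),
    ([1.0, 0.0], "short vector, same direction"),
]
# A raw dot product would rank the long vector first (10 vs 1);
# cosine similarity ranks by direction only.
print(top_k([1.0, 0.0], records, k=1))  # ['short vector, same direction']
```

In DocsIndexer, a similar effect could be had by dividing each nonzero vector by np.linalg.norm(v) before storing it and doing the same for the query vector.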

Save and load your retriever:

indexer.push_to_hub("yourname/docs-indexer", commit_message="initial indexer")

from modaic import AutoRetriever
loaded_idx = AutoRetriever.from_precompiled("yourname/docs-indexer")
print(loaded_idx.retrieve("share agents", k=1)[0].text)

Download files

Source Distribution

mo_retrievers-0.1.1.tar.gz (51.2 kB view details)

Uploaded Source

Built Distribution

mo_retrievers-0.1.1-py3-none-any.whl (53.0 kB view details)

Uploaded Python 3

File details

Details for the file mo_retrievers-0.1.1.tar.gz.

File metadata

  • Download URL: mo_retrievers-0.1.1.tar.gz
  • Size: 51.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.8.3

File hashes

Hashes for mo_retrievers-0.1.1.tar.gz
Algorithm Hash digest
SHA256 998ed5852880996604e1dc175e4cf2ad4810990d7a8668c410adc71ba1a1ec62
MD5 703f7068e41e14bfc8e8a08929ab2d24
BLAKE2b-256 2d771891725a8b4dea6d32938dbda87ffebeea8683c5fbdf21b5d82a889e8a58

File details

Details for the file mo_retrievers-0.1.1-py3-none-any.whl.

File hashes

Hashes for mo_retrievers-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 050d95fcfbb148891c5f5a7b6020b436fe2a1d7d004ff9b6d42470a6cfe4ff33
MD5 6c62b24dc55e81112dafe17637a0ebaa
BLAKE2b-256 b77267f8a2fba2b8fbfc46ea82226d10ca7211273521b2ed2a0e043c68c4e126
