Hierarchical, search-first tool discovery for LLM agents. Give the model 3 meta-tools instead of a 30k-token catalogue.

SIFT — Search · Inspect · Filter · Trigger

Hierarchical, search-first tool discovery for LLM agents. Give the model 3 meta-tools instead of a 30k-token catalogue — it discovers the rest by navigating. Drop-in for OpenAI function-calling, LangChain, or MCP.

Repo: github.com/Victor-Alves0/SIFT

from sift import Sift

sift = Sift()

@sift.tool("google_workspace.gmail.read",
           description="Read emails from the inbox",
           params={"q": {"type": "string", "default": "is:unread", "desc": "search query"},  # dict form: the default contains ':'
                   "m": "number:o:10:max"},
           returns=["id", "subject", "from", "snippet", "date"])
def gmail_read(q="is:unread", m=10):
    ...  # call the real Gmail API
    return {"id": "1", "subject": "Hi", "from": "a@b.c", "snippet": "...",
            "date": "2026-06-30", "body": "filtered out by the whitelist"}

sift.build_index()

sift.search_tools("read my last email")              # → ranked candidate paths
sift.get_tool_schema("google_workspace.gmail.read")  # → compact TOON schema
sift.execute_tool("google_workspace.gmail.read", {"m": 1})  # → run + filter
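
The `returns` whitelist in the example above can be pictured as a simple key projection. The following is a minimal sketch of that idea, not SIFT's actual filtering code; the helper name `filter_response` is hypothetical.

```python
from typing import Any


def filter_response(result: dict[str, Any], returns: list[str]) -> dict[str, Any]:
    """Keep only whitelisted keys, in whitelist order; skip keys that are absent."""
    return {key: result[key] for key in returns if key in result}


raw = {"id": "1", "subject": "Hi", "from": "a@b.c", "snippet": "...",
       "date": "2026-06-30", "body": "filtered out by the whitelist"}
filtered = filter_response(raw, ["id", "subject", "from", "snippet", "date"])
print(filtered)  # "body" is dropped because it is not in the whitelist
```

Filtering on the way out keeps large fields such as full message bodies out of the model's context unless a tool explicitly whitelists them.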

Why

The model never sees the whole catalogue — only 3 tools. It discovers what it needs by walking category → service → function. The system prompt stays a fixed ~200 tokens whether you have 5 tools or 5,000. Adding a tool is one decorator. Schemas are returned in TOON (one line per tool), and responses are filtered to a per-tool whitelist.

search_tools(q)            → semantic discovery (local embeddings)   [Search]
get_tool_schema(path)      → hierarchical navigation, TOON schema     [Inspect]
execute_tool(path, params) → run + response filtering                 [Trigger + Filter]
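
To make the category → service → function walk concrete, here is a small sketch of a dotted-path hierarchy built from nested dicts. It only illustrates the navigation idea; `build_tree` and `children` are hypothetical helpers, and SIFT's registry may be structured differently.

```python
def build_tree(paths: list[str]) -> dict:
    """Turn dotted tool paths into a nested dict: category -> service -> function."""
    tree: dict = {}
    for path in paths:
        node = tree
        for part in path.split("."):
            node = node.setdefault(part, {})
    return tree


def children(tree: dict, prefix: str = "") -> list[str]:
    """List the names directly under a dotted prefix ('' means the root)."""
    node = tree
    if prefix:
        for part in prefix.split("."):
            node = node[part]
    return sorted(node)


tree = build_tree([
    "google_workspace.gmail.read",
    "google_workspace.gmail.send",
    "google_workspace.calendar.list",
    "web.search.query",
])
print(children(tree))                           # categories
print(children(tree, "google_workspace"))       # services in one category
print(children(tree, "google_workspace.gmail")) # functions in one service
```

Each step returns only one level of names, which is why the model never needs the full catalogue in its context.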

Install

pip install sift-tools                 # core (local embeddings, no API key)
pip install "sift-tools[langchain]"    # + LangChain adapter
pip install "sift-tools[mcp]"          # + MCP server adapter
pip install "sift-tools[all,dev]"      # everything + test tooling

Embeddings run locally via fastembed (ONNX) — no embedding API key needed. Swap in any embedder with an embed(texts) -> list[vector] method.
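
As an illustration of the `embed(texts) -> list[vector]` shape, here is a toy embedder that needs no model download. It is a sketch only: `HashingEmbedder` is a hypothetical class, its hashed bag-of-words vectors are far weaker than a real embedding model, and how a custom embedder is handed to `Sift` is not shown here because the text above does not specify it.

```python
import hashlib
import math


class HashingEmbedder:
    """Toy embedder: hashed bag-of-words, L2-normalised. For illustration only."""

    def __init__(self, dim: int = 64) -> None:
        self.dim = dim

    def embed(self, texts: list[str]) -> list[list[float]]:
        vectors = []
        for text in texts:
            vec = [0.0] * self.dim
            for token in text.lower().split():
                digest = hashlib.sha256(token.encode("utf-8")).digest()
                bucket = int.from_bytes(digest[:4], "big") % self.dim
                vec[bucket] += 1.0
            # Avoid dividing by zero for empty input.
            norm = math.sqrt(sum(v * v for v in vec)) or 1.0
            vectors.append([v / norm for v in vec])
        return vectors


vecs = HashingEmbedder(dim=32).embed(["read my last email", ""])
print(len(vecs), len(vecs[0]))
```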

Bring your own model (provider-agnostic)

The core is LLM-agnostic — it never calls a model itself. It hands you the 3 tool specs + a system prompt, and sift.dispatch(name, args) executes whatever tool call your model emits. Wire it to any provider:

# 1) OpenAI-compatible (OpenAI, OpenRouter, DeepSeek, Together, Groq, Mistral,
#    and LOCAL servers: Ollama / LM Studio / vLLM) — works out of the box
from openai import OpenAI
from sift.adapters.openai import run_agent

# pick ONE client; the lines below are alternatives, not a sequence
client = OpenAI()                                              # OpenAI
# client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")  # Ollama, local
# client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key=KEY)    # OpenRouter (KEY = your API key)
run_agent(sift, client, "gpt-4o-mini", "what's my last email?")

# 2) Native Anthropic (Messages API)
import anthropic
from sift.adapters.anthropic import run_agent as run_claude
run_claude(sift, anthropic.Anthropic(), "claude-haiku-4-5", "what's my last email?")

# 3) LangChain (Anthropic, Gemini, Cohere, Bedrock, Ollama, ...)
agent_tools = sift.langchain_tools()        # plug into any LangChain agent

# 4) Expose SIFT itself as an MCP server (Claude Desktop, IDEs, ...)
sift.serve_mcp()

# 5) Any other SDK — the universal primitive:
specs  = sift.openai_tools()                # give your model the 3 tool specs
system = sift.system_prompt
answer = sift.dispatch(name, arguments)     # run a tool call -> string back

Provider / path	How	Status
OpenAI-compatible (incl. local Ollama/vLLM)	`openai_tools()` + `dispatch()` / `adapters.openai.run_agent`	✅ live-tested
Native Anthropic	`adapters.anthropic.run_agent`	✅ unit + offline-tested
LangChain	`langchain_tools()`	✅ live-tested
MCP clients	`serve_mcp()`	✅
No native tool calling (base/small models)	`adapters.prompted`	✅ live-tested

Weak or no-tool-calling models (Llama 3B, base models, …)

dispatch is format-agnostic, so any text model can drive SIFT via a prompted JSON protocol — no native function calling required:

from sift.adapters.prompted import run_agent, single_decision

def generate(prompt: str) -> str:      # wrap ANY text model (HF, llama.cpp, Ollama)
    return my_model(prompt)

run_agent(sift, generate, "what's my last email?")     # text-protocol tool loop
single_decision(sift, generate, "read my last email")  # 1 decision, for the weakest models

For small local models, constrain the decoder so output is always parseable:

sift.tool_call_schema()   # JSON Schema -> Outlines / LM Format Enforcer / vLLM guided_json
sift.json_gbnf()          # GBNF grammar -> llama.cpp

SIFT's tiny 3-tool surface actually helps weak models (less to get lost in). Realistic floor is ~1–3B params; sub-1B models (OPT-350M) can be interfaced but are too small to follow the format reliably.
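
A prompted JSON protocol mostly comes down to pulling a tool call out of free-form model text. The sketch below shows one way to do that with the standard library. The key names `tool` and `arguments` and the helper `parse_tool_call` are assumptions for illustration; the wire format actually used by `sift.adapters.prompted` is not documented here.

```python
import json


def parse_tool_call(text: str):
    """Return (name, arguments) from the first JSON object that has a 'tool' key, else None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            obj, _ = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None  # not valid JSON at this brace; try the next one
        if isinstance(obj, dict) and isinstance(obj.get("tool"), str):
            return obj["tool"], obj.get("arguments") or {}
        start = text.find("{", start + 1)
    return None


reply = 'Sure! {"tool": "search_tools", "arguments": {"q": "read my last email"}} Done.'
print(parse_tool_call(reply))
```

Scanning for the first decodable object tolerates chatty models that wrap the JSON in prose; constrained decoding, as described above, removes the need for this tolerance.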

Import an existing ecosystem

from sift.importers.openapi import register_openapi
from sift.importers.mcp import import_mcp_stdio, register_listing

register_openapi(sift, spec, category="acme")                    # OpenAPI 3.x
# import_mcp_stdio is awaited, so call it from inside an async function
await import_mcp_stdio(sift, "npx", ["-y", "@modelcontextprotocol/server-github"],
                       category="integrations", service="github")  # MCP server

Each operation/tool becomes a node in the hierarchy — instantly searchable.
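
One way to picture how OpenAPI operations could map onto hierarchy nodes is sketched below. The naming rule (first tag as the service, `operationId` as the function name) is an assumption made for this example; `register_openapi` may derive paths differently, and `openapi_to_paths` is a hypothetical helper.

```python
HTTP_METHODS = {"get", "post", "put", "patch", "delete"}


def openapi_to_paths(spec: dict, category: str) -> dict[str, str]:
    """Map each OpenAPI operation to a dotted path -> summary (illustrative rule only)."""
    nodes: dict[str, str] = {}
    for route, item in spec.get("paths", {}).items():
        for method, op in item.items():
            if method.lower() not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters"
            service = (op.get("tags") or ["default"])[0]
            fallback = f"{method.lower()}_{route.strip('/').replace('/', '_')}"
            name = op.get("operationId") or fallback
            nodes[f"{category}.{service}.{name}"] = op.get("summary", "")
    return nodes


spec = {"paths": {
    "/users": {"get": {"operationId": "list_users", "tags": ["users"], "summary": "List users"}},
    "/health": {"get": {"summary": "Health check"}},
}}
nodes = openapi_to_paths(spec, category="acme")
print(nodes)
```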

Per-model scoping (`allowedTools`) & response projection

Built for hubs like OpenWebUI: build the catalogue once, then give each model a scoped view of which tools it may see/run, and trim what each tool returns.

# pick tools for this model (globs over the dotted path); reuses the built index
view = sift.scope(allow=["google_workspace.gmail.*", "web.search.*"],
                  deny=["*.delete", "*.send"])
view.dispatch("search_tools", {"q": "read my last email"})  # only allowed tools
view.execute_tool("crm.contacts.delete", {})                # PermissionError (deny wins)

# trim a verbose tool's result so each call costs fewer tokens (great for MCPs):
sift.set_response("google_workspace.gmail.query",
                  transform=lambda r: {"ids": [m["id"] for m in r["messages"]]})
sift.set_response("google_workspace.gmail.read", returns=["id", "subject", "from"])
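
The allow/deny semantics shown above (globs over the dotted path, deny wins) can be sketched with the standard library's `fnmatch`. This is an illustration of the rule, not the code in `scope.py`; `is_allowed` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def is_allowed(path: str, allow: list[str], deny: list[str]) -> bool:
    """A path is visible only if it matches an allow glob and no deny glob."""
    if any(fnmatchcase(path, pattern) for pattern in deny):
        return False  # deny always wins
    return any(fnmatchcase(path, pattern) for pattern in allow)


allow = ["google_workspace.gmail.*", "web.search.*"]
deny = ["*.delete", "*.send"]

for path in ["google_workspace.gmail.read", "google_workspace.gmail.send",
             "crm.contacts.delete", "web.search.query"]:
    print(path, is_allowed(path, allow, deny))
```

Checking deny first means a broad allow glob such as `google_workspace.gmail.*` can never re-enable a risky action like `send`.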

Idle cost: when a tool isn't used (the user just says "hi"), SIFT adds only the ~480-token fixed surface (system prompt + 3 meta-tool specs) — independent of catalogue size, and ~free across a conversation with prompt caching. A flat catalogue instead injects every schema each turn (~2.4k tokens at 25 tools, ~95k at 1,000).
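
The idle-cost comparison is simple arithmetic. The sketch below assumes a flat catalogue grows linearly at the per-tool rate implied by the 25-tool figure (about 96 tokens per schema); that linear model is an assumption, and it gives roughly 96k tokens at 1,000 tools, close to the ~95k quoted above.

```python
import math

FIXED_SURFACE = 480        # system prompt + 3 meta-tool specs (figure from the text)
PER_TOOL = 2_400 / 25      # ~96 tokens per schema, derived from the 25-tool figure


def flat_cost(n_tools: int) -> int:
    """Tokens injected per turn when every schema is sent (linear assumption)."""
    return round(n_tools * PER_TOOL)


def sift_idle_cost(n_tools: int) -> int:
    """Tokens injected per turn by the fixed surface, independent of catalogue size."""
    return FIXED_SURFACE


break_even = math.ceil(FIXED_SURFACE / PER_TOOL)
for n in (5, 25, 1_000):
    print(n, flat_cost(n), sift_idle_cost(n))
print("break-even at", break_even, "tools")
```

Under these assumptions the fixed surface costs about as much as a five-tool flat catalogue, and the gap widens linearly from there.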

Hybrid retrieval & reranking

Discovery fuses embeddings + BM25 with Reciprocal Rank Fusion (semantics + exact terms), and an optional cross-encoder reranker sharpens the final order:

sift = Sift(retrieval="hybrid")          # default; also "embedding" or "bm25"

from sift.rerank import FastEmbedReranker
sift = Sift(reranker=FastEmbedReranker())  # opt-in cross-encoder rerank
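
For readers unfamiliar with Reciprocal Rank Fusion, here is a minimal sketch of the technique: each ranked list contributes 1 / (k + rank) per item, and items are sorted by the summed score. The constant `k = 60` is the value commonly used in the literature; whether SIFT uses the same constant is an assumption, and the tool names are made up.

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists with Reciprocal Rank Fusion."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; break ties alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


embedding_rank = ["gmail.read", "gmail.search", "calendar.list"]
bm25_rank = ["gmail.read", "drive.read", "gmail.search"]
fused = rrf([embedding_rank, bm25_rank])
print(fused)
```

Because RRF uses only ranks, it needs no score calibration between cosine similarities and BM25 scores.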

retrieval="bm25" needs no model download at all. Set a relevance floor so discovery returns nothing (an explicit "no matching tools") instead of the nearest-but-irrelevant tool when the catalogue doesn't cover the request:

sift = Sift(min_score=0.3)   # cosine floor (tune per embedding model)
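
The effect of a cosine floor can be sketched as follows: candidates scoring below the threshold are dropped, so an uncovered request yields an empty list rather than the nearest irrelevant tool. The two-dimensional vectors and the `search` helper are hypothetical and stand in for real embeddings.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query_vec, tool_vecs, min_score=0.3, top_k=3):
    """Return up to top_k tool paths whose cosine score reaches the floor."""
    scored = [(cosine(query_vec, vec), path) for path, vec in tool_vecs.items()]
    hits = sorted((pair for pair in scored if pair[0] >= min_score), reverse=True)
    return [path for _, path in hits[:top_k]]


tool_vecs = {"gmail.read": [1.0, 0.0], "web.search": [0.0, 1.0]}
print(search([1.0, 0.1], tool_vecs))    # only the close match survives
print(search([-1.0, -1.0], tool_vecs))  # nothing clears the floor
```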

Code mode (compose many tools in one turn)

Instead of one round-trip per tool, let the model write a snippet that orchestrates tools in a single turn (collapses multi-turn overhead):

tools  = sift.code_tools()          # search_tools + run_code
system = sift.code_system_prompt
# in the loop, run_code executes:  call(path, **params), search(q), schema(path)
sift.run_code("output = call('google_workspace.gmail.read', m=1)")

The snippet runs in a constrained namespace (no imports/file/eval). It is not a hardened sandbox — use code mode with trusted catalogues.
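
To show what a constrained namespace can look like in practice, here is a sketch that rejects imports and a few dangerous names via the AST, then executes the snippet with a tiny builtins table. It is not the code in `codemode.py`, and, like the real thing, it is not a hardened sandbox; `run_snippet` and the allowed builtins are assumptions.

```python
import ast

SAFE_BUILTINS = {"len": len, "sorted": sorted, "min": min, "max": max, "sum": sum}
BLOCKED_NAMES = {"eval", "exec", "open", "compile", "__import__"}


def run_snippet(source: str, call):
    """Run a snippet with only `call` and a few builtins; return its `output` variable."""
    tree = ast.parse(source, mode="exec")
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            raise ValueError("imports are not allowed")
        if isinstance(node, ast.Name) and node.id in BLOCKED_NAMES:
            raise ValueError(f"{node.id} is not allowed")
        if isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            raise ValueError("dunder attribute access is not allowed")
    namespace = {"__builtins__": SAFE_BUILTINS, "call": call}
    exec(compile(tree, "<snippet>", "exec"), namespace)
    return namespace.get("output")


def fake_call(path, **params):
    # Stand-in for a real tool dispatcher.
    return {"path": path, "params": params}


print(run_snippet("output = call('google_workspace.gmail.read', m=1)", fake_call))
```

Checks like these stop accidental misuse but not a determined attacker, which is why code mode is best kept to trusted catalogues.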

Evaluate

from sift.bench import Task, run_filter, token_report
print(token_report(sift.registry).format())     # TOON vs JSON token savings
print(run_filter(sift, tasks, top_k=3).format()) # filter-level metrics (no LLM cost)

from sift.evalsuite import Case, bfcl_style       # BFCL-style function-call accuracy
print(bfcl_style(call_model, sift.registry, cases).format())

from sift.agentbench import build_catalog, run_flat, run_sift  # SIFT vs flat baseline

Filter-level metrics (à la ToolMenuBench): gold next-tool exposure, no-visible-tool rate, average visible tools, MRR, risky-tool exposure, unauthorized risky exposure. (tau-bench's stateful environment is out of scope — it's an external harness.)
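
Two of the metrics listed above are easy to state in code. The sketch below computes mean reciprocal rank and gold-tool exposure from ranked candidate lists; it illustrates the definitions under the assumption of one gold tool per task and is not the implementation in `sift.bench`.

```python
def mean_reciprocal_rank(rankings: list[list[str]], gold: list[str]) -> float:
    """Average of 1/rank of the gold tool; a miss contributes 0."""
    if not gold:
        return 0.0
    total = 0.0
    for ranking, target in zip(rankings, gold):
        if target in ranking:
            total += 1.0 / (ranking.index(target) + 1)
    return total / len(gold)


def gold_exposure(rankings: list[list[str]], gold: list[str], top_k: int = 3) -> float:
    """Fraction of tasks whose gold tool appears among the top_k visible tools."""
    if not gold:
        return 0.0
    hits = sum(target in ranking[:top_k] for ranking, target in zip(rankings, gold))
    return hits / len(gold)


rankings = [["a", "b"], ["b", "a"], ["c"]]
gold = ["a", "a", "x"]
print(mean_reciprocal_rank(rankings, gold))
print(gold_exposure(rankings, gold))
```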

Schema format

A param is either the compact string "<type>:<req>:<default>:<description>" (req is n for required, o for optional) or the structured dict form, which you need whenever a default contains a colon (e.g. a Gmail is:unread query):

params={
    "m": "number:o:10:max results",                                  # compact
    "q": {"type": "string", "default": "is:unread", "desc": "query"},  # structured
}

returns is the response whitelist. risk=True flags high-impact actions (send/delete) — surfaced as |risk in TOON so the agent can confirm first.
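
The reason a colon in a default forces the dict form becomes clear once the compact string is split on colons. The parser below is a sketch under assumptions: `parse_param` is hypothetical, defaults are left as strings, and treating a dict with a `default` as optional is a guess, since the dict form above shows no explicit required flag.

```python
def parse_param(spec) -> dict:
    """Parse a compact '<type>:<req>:<default>:<description>' string or a structured dict."""
    if isinstance(spec, dict):
        return {"type": spec["type"],
                "required": "default" not in spec,  # assumption: a default implies optional
                "default": spec.get("default"),
                "desc": spec.get("desc", "")}
    type_, req, default, desc = spec.split(":", 3)
    return {"type": type_, "required": req == "n",
            "default": default or None, "desc": desc}


print(parse_param("number:o:10:max results"))
print(parse_param({"type": "string", "default": "is:unread", "desc": "query"}))
# A colon inside the default is split in the wrong place in compact form:
print(parse_param("string:o:is:unread:search query"))
```

In the last call the default comes back as `is` and the description as `unread:search query`, which is exactly the ambiguity the dict form avoids.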

Make imported tools runnable

Importers populate the hierarchy for discovery; bind an executor to also run them:

from sift.importers.openapi import register_openapi, httpx_request
register_openapi(sift, spec, category="acme",
                 request=httpx_request("https://api.acme.com"))

from sift.importers.mcp import register_listing
register_listing(sift, listing, category="integrations", service="github",
                 executor=lambda name, params: my_mcp_proxy(name, params))

For a live MCP server, connect_mcp_stdio launches it, registers its tools AND binds execution (keeps the session open) in one call:

from sift.importers import connect_mcp_stdio
proxy = connect_mcp_stdio(sift, "npx", ["-y", "@modelcontextprotocol/server-github"],
                          category="integrations", service="github")
sift.build_index()
# ... imported MCP tools now run out of the box ...
proxy.close()

Deploy as a server

Run SIFT as a standalone server: the hub (OpenWebUI, IDEs, …) connects to SIFT, and you wire your tools, MCP servers, and OpenAPI specs into SIFT, so the hub gets a single endpoint for everything.

# OpenAPI HTTP server (OpenWebUI "tool server", REST clients)
python examples/serve_http.py            # OpenAPI at /openapi.json, docs at /docs

# MCP server
python examples/serve_mcp.py             # stdio (Claude Desktop)
python examples/serve_mcp.py sse         # HTTP/SSE (remote)

# Docker (OpenAPI server)
docker build -t sift-server .
docker run -p 8000:8000 -e SIFT_API_KEY=secret sift-server

Set SIFT_API_KEY to require Authorization: Bearer <key>. Pass a scope= to build_app / serve_http to expose only a subset of tools per server. Customize examples/serve_http.py with your own @sift.tool functions and importers.

OpenWebUI: add the server URL under Tools → OpenAPI tool server. (For MCP, bridge via mcpo or OpenWebUI's MCP support.) The model then sees just the 3 meta-tools and discovers your catalogue through them.

Repo layout

src/sift/            the Python library (the product)
  registry.py        hierarchy + navigation
  toon.py            TOON codec
  embeddings.py      local fastembed backend
  retrieval.py       BM25 + RRF (hybrid search)
  rerank.py          optional cross-encoder reranker
  gateway.py         the 3 meta-tools + hybrid search + filtering + cache
  scope.py           per-model allow/deny tool scoping (allowedTools)
  metatools.py       canonical tool specs + system prompt
  codemode.py        run_code: orchestrate tools in one turn (constrained namespace, not a hardened sandbox)
  constrain.py       JSON schema / GBNF for constrained decoders
  http_server.py     OpenAPI HTTP tool server (serve_http)
  adapters/          openai · anthropic · langchain · mcp_server · prompted
  importers/         mcp · openapi · mcp_proxy (live MCP execution)
  bench.py           filter-level metrics + token report
  agentbench.py      SIFT vs flat-catalogue benchmark
  evalsuite.py       BFCL-style function-call accuracy
examples/            quickstart, live smokes, serve_http / serve_mcp
tests/               pytest suite (offline, deterministic)
.github/workflows/   CI (lint+test) and PyPI publish
Dockerfile           containerized OpenAPI server
core/   (reference)  a Go implementation of the same gateway (optional backend)

License

MIT.

Project details

Maintainers

victoralves0

Release history

This version

0.1.0

Jun 30, 2026

Download files

Source Distribution

sift_tools-0.1.0.tar.gz (72.9 kB)

Uploaded Jun 30, 2026 Source

Built Distribution

sift_tools-0.1.0-py3-none-any.whl (49.7 kB)

Uploaded Jun 30, 2026 Python 3

File details

Details for the file sift_tools-0.1.0.tar.gz.

File metadata

Download URL: sift_tools-0.1.0.tar.gz
Upload date: Jun 30, 2026
Size: 72.9 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for sift_tools-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`fef2e9d7cfd0953aeb6bcde1bb4535f0056f2a5411b116c8549a91a078b46839`
MD5	`3e3b18bd3b1ac802fae08a508c94a8fb`
BLAKE2b-256	`3eea748a528016f945fc4b2347daaa3ea1fb31bc68ef0bdb07305348e5210177`

Provenance

The following attestation bundles were made for sift_tools-0.1.0.tar.gz:

Publisher: publish.yml on Victor-Alves0/SIFT

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: sift_tools-0.1.0.tar.gz
- Subject digest: fef2e9d7cfd0953aeb6bcde1bb4535f0056f2a5411b116c8549a91a078b46839
- Sigstore transparency entry: 2024171504
- Sigstore integration time: Jun 30, 2026
Source repository:
- Permalink: Victor-Alves0/SIFT@bdf41f086ad71a6d7d915270b204dc16c6b83ffe
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/Victor-Alves0
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@bdf41f086ad71a6d7d915270b204dc16c6b83ffe
- Trigger Event: push

File details

Details for the file sift_tools-0.1.0-py3-none-any.whl.

File metadata

Download URL: sift_tools-0.1.0-py3-none-any.whl
Upload date: Jun 30, 2026
Size: 49.7 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for sift_tools-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`8d836b8ffc27dde3160a14f19a5d06dd6dc43374f9814c9f1aa679e3d83918b5`
MD5	`0c06ac062c3bb7bfb8bf6fef11d95bfc`
BLAKE2b-256	`64dbfe68a7b1756f538285cec3d6354c3132689f28ef9bb93be01bcd9cf25083`

Provenance

The following attestation bundles were made for sift_tools-0.1.0-py3-none-any.whl:

Publisher: publish.yml on Victor-Alves0/SIFT

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: sift_tools-0.1.0-py3-none-any.whl
- Subject digest: 8d836b8ffc27dde3160a14f19a5d06dd6dc43374f9814c9f1aa679e3d83918b5
- Sigstore transparency entry: 2024171603
- Sigstore integration time: Jun 30, 2026
Source repository:
- Permalink: Victor-Alves0/SIFT@bdf41f086ad71a6d7d915270b204dc16c6b83ffe
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/Victor-Alves0
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@bdf41f086ad71a6d7d915270b204dc16c6b83ffe
- Trigger Event: push
