Reliability gateway for schema-stable, secret-safe, pagination-complete agent JSON.
Sift
Reliability gateway for AI tool output: schema-stable, secret-safe, pagination-complete JSON.
Sift is a drop-in reliability layer for MCP and CLI tool output. It persists full payloads as artifacts, returns either the inline payload (`full`) or a compact reference (`schema_ref`), and lets agents query what they need by running Python code over the stored data.
Benchmark summary: on 103 factual questions across 12 real JSON datasets, Sift improved accuracy from 33.0% to 99.0% while cutting input tokens by 95.4% (10,757,230 -> 489,655). Full details: benchmarks/README.md.
How it works
                         ┌─────────────────────┐
MCP tool call ──────────▶│                     │──────────▶ Upstream MCP server
CLI command   ──────────▶│        Sift         │──────────▶ Shell/API command
                         │                     │
                         │   ┌─────────────┐   │
                         │   │  Artifacts  │   │
                         │   │  (SQLite)   │   │
                         │   └─────────────┘   │
                         └─────────────────────┘
                                    │
                                    ▼
                      Small output -> `full` inline
                      Large output -> `schema_ref`
                      Agent queries artifacts with code
Flow:
- Execute upstream tool/command and capture JSON.
- Persist full output as an artifact in SQLite and deterministically map schema/root hints.
- Return `full` (small) or `schema_ref` (large/paginated).
- Continue pages explicitly until `pagination.retrieval_status == COMPLETE`.
- Run focused Python queries on one artifact or the full pagination chain.
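The persist-then-choose-a-mode steps above can be sketched in a few lines of Python. This is an illustration only, not Sift's implementation: the `artifacts` table layout, the `INLINE_LIMIT_BYTES` threshold, the `capture` helper, and the response keys (`response_mode`, `fields`) are all assumptions made for the example.

```python
import json
import sqlite3
import uuid

# Hypothetical size threshold; the real selection rule is not described here.
INLINE_LIMIT_BYTES = 2_000


def capture(conn, payload, inline_limit=INLINE_LIMIT_BYTES):
    """Persist the payload, then return it inline or as a compact reference."""
    body = json.dumps(payload, sort_keys=True)
    artifact_id = uuid.uuid4().hex
    conn.execute(
        "INSERT INTO artifacts (id, body) VALUES (?, ?)", (artifact_id, body)
    )
    conn.commit()
    if len(body.encode("utf-8")) <= inline_limit:
        # Small output: hand the whole payload back.
        return {"response_mode": "full", "artifact_id": artifact_id, "payload": payload}
    # Large output: describe the stored rows instead of returning them.
    sample = payload[0] if isinstance(payload, list) and payload else payload
    fields = sorted(sample) if isinstance(sample, dict) else []
    return {"response_mode": "schema_ref", "artifact_id": artifact_id, "fields": fields}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artifacts (id TEXT PRIMARY KEY, body TEXT NOT NULL)")

small = capture(conn, [{"name": "a"}])
large = capture(conn, [{"name": f"pod-{i}", "phase": "Running"} for i in range(500)])
print(small["response_mode"], large["response_mode"], large["fields"])
```

Both payloads end up in SQLite in full; only the amount of data returned to the caller differs.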
Main MCP pain points
These recur across MCP client issue trackers and production protocol usage:
- Large tool definitions and large tool results consume context quickly.
- Upstream API pagination often sits outside MCP list-cursor flows, so agents can stop early and answer on partial data.
- Tool output shape differs across servers, which makes follow-up parsing brittle.
- Tool output is untrusted input and can contain sensitive values that should not re-enter model context.
- Raw outputs scroll away in chat history, so provenance and reproducibility degrade across multi-step runs.
Background and references: docs/why.md.
What Sift adds (without changing upstream servers)
- Artifact-backed outputs: keep full data out of prompt context while preserving it losslessly.
- Schema-aware references: `schema_ref` returns query guidance for stable follow-up analysis.
- Exact structured retrieval: run Python against stored artifacts via `artifact(action="query", query_kind="code", ...)` (MCP) or `sift-gateway code` (CLI) instead of relying on prompt-sized payloads.
- Explicit pagination contract: continue with `artifact(action="next_page")` or `run --continue-from`.
- Completion signaling: do not stop until `pagination.retrieval_status == COMPLETE`.
- Pagination-chain analysis: query one artifact or all related pages (`scope="all_related"`; CLI default).
- Outbound secret redaction: enabled by default, applied before output returns to the model.
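To make the redaction idea concrete, here is a minimal sketch of key-based redaction over a JSON-like structure. It is not Sift's redaction logic, which is not described here; the `SENSITIVE_KEY` pattern, the `redact` helper, and the `[REDACTED]` placeholder are assumptions chosen for the example.

```python
import re

# Hypothetical key pattern; a real policy would likely be broader and configurable.
SENSITIVE_KEY = re.compile(r"(token|secret|password|api[_-]?key)", re.IGNORECASE)


def redact(value):
    """Return a copy of value with sensitive-looking keys masked."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if SENSITIVE_KEY.search(key) else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


raw = {"user": "a", "api_key": "x", "nested": [{"password": "p", "ok": 1}]}
print(redact(raw))
```

The function returns a new structure and leaves `raw` untouched, so the stored copy stays lossless while the returned copy is masked.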
MCP vs CLI positioning
- MCP: Sift is a reliability gateway for mirrored tool calls and artifact-based follow-up queries.
- CLI/OpenClaw: same artifact contract for command output (`sift-gateway run` + `sift-gateway code`).
- CLI pitfall: ad-hoc extraction can silently scope analysis to partial data (for example, inspecting only one row).
- CLI note: for one-off local extraction, plain `jq` can be enough. Sift is for repeatable, pagination-complete, policy-controlled workflows.
60-second quickstart
MCP clients
pipx install sift-gateway
sift-gateway init --from claude
Restart your MCP client, then use mirrored tools normally.
Supported --from shortcuts: claude, claude-code, cursor, vscode, windsurf, zed, auto, or an explicit config path.
CLI flow
# 1) Capture JSON output as an artifact
sift-gateway run --json -- kubectl get pods -A -o json
# 2) Query artifact data with Python
sift-gateway code --json <artifact_id> '$' --code "def run(data, schema, params): return {'rows': len(data)}"
Use $ when rows are at root. If nested, use metadata.usage.root_path from run --json (or metadata.queryable_roots in MCP schema_ref).
Pagination continuation
sift-gateway run --json --continue-from <artifact_id> -- <next-command-with-next-params-applied>
Do not claim completion until pagination.retrieval_status == COMPLETE.
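The stop condition can be sketched as a loop in Python. Only the `pagination.retrieval_status == COMPLETE` check comes from the contract above. The `fetch_all` and `fake_fetch` helpers, the `rows` and `next_cursor` fields, and the `PARTIAL` status value are hypothetical stand-ins for whatever the upstream command returns.

```python
def fetch_all(fetch_page, max_pages=100):
    """Collect rows page by page until the status reports COMPLETE."""
    rows, cursor = [], None
    for _ in range(max_pages):
        page = fetch_page(cursor)
        rows.extend(page["rows"])
        pagination = page["pagination"]
        if pagination["retrieval_status"] == "COMPLETE":
            return rows
        cursor = pagination["next_cursor"]  # hypothetical field name
    raise RuntimeError("stopped before retrieval_status == COMPLETE")


def fake_fetch(cursor):
    """Stand-in upstream: five rows served two at a time."""
    start = cursor or 0
    end = min(start + 2, 5)
    done = end == 5
    return {
        "rows": list(range(start, end)),
        "pagination": {
            "retrieval_status": "COMPLETE" if done else "PARTIAL",
            "next_cursor": None if done else end,
        },
    }


all_rows = fetch_all(fake_fetch)
print(all_rows)
```

Raising when the page budget runs out, rather than returning what was collected, keeps a partial result from being mistaken for a complete one.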
Python codegen over all pages
For complex questions, generate Python once and run it over the entire pagination chain:
sift-gateway code --json --scope all_related <artifact_id> '$' --file ./analysis.py
CLI default is --scope all_related. Use --scope single for anchor-only analysis.
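As an illustration, an `analysis.py` for the `kubectl get pods` capture might look like the sketch below. The `run(data, schema, params)` signature follows the inline example earlier. That `data` arrives as a list of pod objects with a `status.phase` field is an assumption about the chosen root path, and the returned keys are made up for the example.

```python
from collections import Counter


def run(data, schema, params):
    """Count pods per phase; assumes data is a list of pod-like dicts."""
    phases = Counter(
        row.get("status", {}).get("phase", "Unknown")
        for row in data
        if isinstance(row, dict)
    )
    return {"rows": len(data), "by_phase": dict(phases)}


if __name__ == "__main__":
    # Local check with hand-written rows; not part of the query itself.
    sample = [
        {"status": {"phase": "Running"}},
        {"status": {"phase": "Running"}},
        {"status": {"phase": "Pending"}},
    ]
    print(run(sample, schema=None, params={}))
```

Because the function only aggregates over whatever rows it receives, the same file works for a single artifact or for the whole chain.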
Benchmarks
Tier 1 result (claude-sonnet-4-6):
| Condition | Accuracy | Input Tokens |
|---|---|---|
| Baseline (context-stuffed) | 34/103 (33.0%) | 10,757,230 |
| Sift | 102/103 (99.0%) | 489,655 |
That is +66.0 points accuracy with 95.4% fewer input tokens on the same question set.
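The headline figures follow directly from the table. A short check, using only the numbers reported above:

```python
baseline_correct, sift_correct, questions = 34, 102, 103
baseline_tokens, sift_tokens = 10_757_230, 489_655

baseline_acc = 100 * baseline_correct / questions
sift_acc = 100 * sift_correct / questions
token_reduction = 100 * (1 - sift_tokens / baseline_tokens)

print(f"{baseline_acc:.1f}% -> {sift_acc:.1f}% (+{sift_acc - baseline_acc:.1f} points)")
print(f"{token_reduction:.1f}% fewer input tokens")
```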
Methodology, scripts, and Tier 2 autonomous-agent results: benchmarks/README.md.
Documentation library
Start here: docs/README.md
Getting started
- Quick Start
- Installation
- Your first artifact (CLI)
- Your first artifact (MCP)
- Adding MCP servers after initial setup
- Troubleshooting
Core contracts
- API Contracts
- Mirrored Response Contract (`full` vs `schema_ref`)
- Response Mode Selection
- Pagination Metadata
- Code Query Contract
- CLI output contract
- CLI default scope (`all_related`)
Operations and security
- Deployment Guide
- Authentication tokens
- Outbound secret redaction
- Configuration Reference
- Code query runtime
- Error Contract
- Security policy
Patterns and deep dives
- Recipes
- Pagination chain (CLI)
- Pagination chain (MCP)
- Architecture
- Pagination model
- Observability
- Why Sift exists
- OpenClaw integration pack
- Upstream registration design
Security
See SECURITY.md for threat model and hardening guidance.
Development
git clone https://github.com/lourencomaciel/sift-gateway.git
cd sift-gateway
uv sync --extra dev
uv run python -m pytest tests/unit/ -q
Full contributor workflow: CONTRIBUTING.md
License
MIT - see LICENSE.