Keep big MCP responses out of your context window. Query them.

These details have been verified by PyPI

Project links

GitHub Statistics

Maintainers

zmaciel

These details have not been verified by PyPI

Project description

Sift

Reliability gateway - Schema-stable, secret-safe, pagination-complete JSON for AI agents.

Sift is built for workflows where data correctness matters as much as model quality: enterprise automation, research pipelines, and long-running agent sessions.

For one-off CLI tasks, plain jq or Python can be enough. Sift adds value when you need guarantees: consistent schema handling, secret redaction before data re-enters model context, explicit pagination continuation, and an auditable artifact history.

Sift stores tool output as artifacts, infers schema metadata, and returns either the inline payload (full) or an artifact reference (schema_ref). In schema_ref mode, the response carries either a representative sample_item preview or, as a fallback, the more verbose schemas.

Keeping large payloads out of prompt context is still a core benefit, but it is one outcome of these guarantees rather than the only goal. See Why Sift exists for research and ecosystem context.

Sift works with MCP clients (Claude Desktop, Claude Code, Cursor, VS Code, Windsurf, Zed) and CLI agents (OpenClaw, terminal automation). Same artifact store, same query interface, two entry points.

                           ┌─────────────────────┐
  MCP tool call ──────────▶│                     │──────────▶ Upstream MCP Server
  CLI command   ──────────▶│        Sift         │──────────▶ Shell command
                           │                     │
                           │   ┌─────────────┐   │
                           │   │  Artifacts  │   │
                           │   │  (SQLite)   │   │
                           │   └─────────────┘   │
                           └─────────────────────┘
                                     │
                                     ▼
                           Small output? return inline
                           Large output? return schema reference
                           Agent queries what it needs via code

Quick start

MCP agents

pipx install sift-gateway
sift-gateway init --from claude

Restart your MCP client. Sift mirrors upstream tools, persists outputs as artifacts, and returns either the full payload (for small responses) or a schema reference (for large responses). The agent can query stored artifacts with artifact(action="query", query_kind="code", ...).

--from shortcuts: claude, claude-code, cursor, vscode, windsurf, zed, auto, or an explicit path.

CLI agents (OpenClaw, terminal automation)

pipx install sift-gateway
sift-gateway run -- kubectl get pods -A -o json

Use this flow when you need reproducibility and policy controls on command output (not just ad-hoc extraction). Large output is stored and returned as an artifact ID plus schema_ref metadata. You can then query the stored artifact with code:

sift-gateway code <artifact_id> '$.items' --code "def run(data, schema, params): return {'rows': len(data)}"

Another capture example:

sift-gateway run -- curl -s api.example.com/events

For OpenClaw, see the OpenClaw Integration Pack.

Example workflow

You ask an agent to check what is failing in prod:

datadog.list_monitors(tag="service:payments")

Without Sift, 70 KB of monitor configs and metadata can go straight into context. That is about 18,000 tokens before the next tool call.
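
The token figure is consistent with a common rule of thumb of roughly four bytes of JSON per token. That ratio is an assumption used here for illustration, not something Sift measures, and real tokenizers vary by model and content. A minimal sketch of the arithmetic:

```python
# Rough context-cost estimate. BYTES_PER_TOKEN is an assumed rule of thumb,
# not a value taken from Sift or from any specific tokenizer.
BYTES_PER_TOKEN = 4


def estimate_tokens(payload_bytes: int, bytes_per_token: int = BYTES_PER_TOKEN) -> int:
    """Return an approximate token count for a payload of the given size."""
    return payload_bytes // bytes_per_token


payload_bytes = 70 * 1024  # the 70 KB monitor listing from the example
print(estimate_tokens(payload_bytes))  # 17920, i.e. roughly 18,000 tokens
```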

With Sift, the agent gets a schema reference:

{
  "response_mode": "schema_ref",
  "artifact_id": "art_9b2c...",
  "sample_item": {
    "name": "Payments monitor",
    "status": "Alert",
    "type": "query alert"
  },
  "sample_item_source_index": 0,
  "sample_item_count": 120
}

If a representative sample cannot be produced for the result set, the schema_ref response carries schemas instead of sample_item.
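
As a sketch of how a client might branch on such a response, the snippet below parses the example payload and summarizes it. The describe helper is hypothetical, and the "full" mode value and the list shape of the schemas fallback are assumptions; only the sample_item fields come from the example.

```python
import json

# Response copied from the example; only the handling logic is new.
raw = """
{
  "response_mode": "schema_ref",
  "artifact_id": "art_9b2c...",
  "sample_item": {"name": "Payments monitor", "status": "Alert", "type": "query alert"},
  "sample_item_source_index": 0,
  "sample_item_count": 120
}
"""


def describe(response: dict) -> str:
    """Summarize what an agent learns from a Sift response (hypothetical helper)."""
    if response.get("response_mode") == "full":
        return "inline payload returned"
    if "sample_item" in response:
        fields = ", ".join(sorted(response["sample_item"]))
        return (f"{response['sample_item_count']} items in "
                f"{response['artifact_id']}, fields: {fields}")
    # Assumed fallback shape: a "schemas" key holding a list of schemas.
    return f"schema fallback with {len(response.get('schemas', []))} schema(s)"


print(describe(json.loads(raw)))
# 120 items in art_9b2c..., fields: name, status, type
```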

The agent can then run a focused query:

artifact(
    action="query",
    query_kind="code",
    artifact_id="art_9b2c...",
    root_path="$.monitors",
    code="def run(data, schema, params): return [m for m in data if m.get('status') == 'Alert']",
)

In this example, two calls use about 400 tokens and still leave room for follow-up steps.
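
To see what the query function itself computes, it can be exercised locally before sending it to Sift. The monitor records below are made up and only mirror the fields of the sample item; the function body is the one from the call above.

```python
# Hypothetical monitor records shaped like the sample item in the example.
monitors = [
    {"name": "Payments monitor", "status": "Alert", "type": "query alert"},
    {"name": "Payments latency", "status": "OK", "type": "metric alert"},
    {"name": "Payments errors", "status": "Alert", "type": "query alert"},
]


# Same function body as in the artifact(...) call.
def run(data, schema, params):
    return [m for m in data if m.get("status") == "Alert"]


alerts = run(monitors, schema=None, params={})
print([m["name"] for m in alerts])  # ['Payments monitor', 'Payments errors']
```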

How it works

Sift runs one processing pipeline for MCP and CLI:

1. Execute the tool call or command.
2. Parse JSON output.
3. Detect pagination from the raw response.
4. Redact sensitive values (enabled by default).
5. Persist the artifact to SQLite.
6. Map the schema (field types, sample values, cardinality).
7. Choose response mode: full (inline) or schema_ref (sample preview or schema fallback).
8. Return the artifact-centric response.
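
The parse, persist, and schema-mapping stages can be pictured with a small stand-alone sketch. The table layout, the artifact ID format, and the helper names are assumptions made for illustration; Sift's actual storage schema is not described here.

```python
import json
import sqlite3
import uuid

# Illustrative only: table layout, ID format, and helper names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artifacts (id TEXT PRIMARY KEY, payload TEXT NOT NULL)")


def persist(payload: dict) -> str:
    """Store a parsed JSON payload and return a new artifact ID."""
    artifact_id = f"art_{uuid.uuid4().hex[:8]}"
    conn.execute("INSERT INTO artifacts VALUES (?, ?)",
                 (artifact_id, json.dumps(payload)))
    return artifact_id


def map_schema(items: list[dict]) -> dict[str, set[str]]:
    """Collect the set of Python type names seen for each field."""
    schema: dict[str, set[str]] = {}
    for item in items:
        for key, value in item.items():
            schema.setdefault(key, set()).add(type(value).__name__)
    return schema


raw_output = '{"items": [{"id": 1, "name": "a"}, {"id": 2, "name": null}]}'
payload = json.loads(raw_output)       # parse JSON output
artifact_id = persist(payload)         # persist the artifact
schema = map_schema(payload["items"])  # map the schema (field types only)
print(artifact_id.startswith("art_"), sorted(schema["name"]))
# True ['NoneType', 'str']
```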

Response mode selection

Sift chooses between inline and reference automatically:

If the response has upstream pagination: always schema_ref.
If the full response exceeds the configured cap (default 8 KB): schema_ref.
If the schema reference is at least 50% smaller than full: schema_ref.
Otherwise: full (inline payload).
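
These rules can be read as a small decision function. The sketch below assumes the rules are checked in the order listed and that sizes are compared in bytes; the function and parameter names are hypothetical.

```python
DEFAULT_CAP_BYTES = 8192  # default inline cap (8 KB)


def choose_response_mode(full_bytes: int, ref_bytes: int, has_pagination: bool,
                         cap_bytes: int = DEFAULT_CAP_BYTES) -> str:
    """Return "schema_ref" or "full" following the listed rules (assumed order)."""
    if has_pagination:
        return "schema_ref"
    if full_bytes > cap_bytes:
        return "schema_ref"
    # "At least 50% smaller" means the reference is at most half the full size.
    if ref_bytes * 2 <= full_bytes:
        return "schema_ref"
    return "full"


print(choose_response_mode(70_000, 400, has_pagination=False))  # schema_ref
print(choose_response_mode(2_000, 1_500, has_pagination=False))  # full
print(choose_response_mode(2_000, 900, has_pagination=False))    # schema_ref
print(choose_response_mode(500, 400, has_pagination=True))       # schema_ref
```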

Pagination

When upstream tools or APIs paginate, Sift handles continuation explicitly.

MCP:

artifact(action="next_page", artifact_id="art_9b2c...")

CLI:

sift-gateway run --continue-from art_9b2c... -- gh api repos/org/repo/pulls --after NEXT_CURSOR

Each page creates a new artifact linked to the previous one through lineage metadata. The agent can run code queries across the full chain.
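
Conceptually, the lineage forms a linked list of pages. The sketch below walks such a chain with made-up records; the parent field name and the in-memory dictionary are assumptions, not Sift's actual lineage format.

```python
# Hypothetical lineage records: each artifact points at the page before it.
artifacts = {
    "art_p1": {"parent": None, "items": [1, 2]},
    "art_p2": {"parent": "art_p1", "items": [3, 4]},
    "art_p3": {"parent": "art_p2", "items": [5]},
}


def chain_from(last_id: str) -> list[str]:
    """Walk parent links back to the first page, then return oldest-first."""
    chain = []
    current = last_id
    while current is not None:
        chain.append(current)
        current = artifacts[current]["parent"]
    return chain[::-1]


pages = chain_from("art_p3")
all_items = [item for page in pages for item in artifacts[page]["items"]]
print(pages)      # ['art_p1', 'art_p2', 'art_p3']
print(all_items)  # [1, 2, 3, 4, 5]
```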

Code queries

Both MCP and CLI agents can analyze stored artifacts with Python.

MCP:

artifact(
    action="query",
    query_kind="code",
    artifact_id="art_123",
    root_path="$.items",
    code="def run(data, schema, params): return {'count': len(data)}",
)

CLI:

# Function mode
sift-gateway code art_123 '$.items' --code "def run(data, schema, params): return {'count': len(data)}"

# File mode
sift-gateway code art_123 '$.items' --file ./analysis.py
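
For file mode, a plausible analysis.py is sketched below. It assumes the file defines the same run(data, schema, params) entry point as function mode and that params arrives as a dictionary; both are assumptions, and the field names are invented.

```python
# analysis.py (hypothetical contents)
from collections import Counter


def run(data, schema, params):
    """Count items per value of a field chosen through params."""
    field = params.get("field", "status")
    counts = Counter(item.get(field) for item in data)
    return {"count": len(data), "by_" + field: dict(counts)}


if __name__ == "__main__":
    # Local smoke test with made-up rows; Sift would supply the real data.
    rows = [{"status": "Running"}, {"status": "Pending"}, {"status": "Running"}]
    print(run(rows, schema=None, params={}))
```

Only collections is imported, which is on the default allowlist.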

Multi-artifact query example:

artifact(
    action="query",
    query_kind="code",
    artifact_ids=["art_users", "art_orders"],
    root_paths={"art_users": "$.users", "art_orders": "$.orders"},
    code="""
def run(artifacts, schemas, params):
    users = {u["id"]: u["name"] for u in artifacts["art_users"]}
    return [{"user": users.get(o["user_id"]), "amount": o["amount"]}
            for o in artifacts["art_orders"]]
""",
)
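
The join logic in this query can be checked locally with stand-in data. The rows below are made up; the run function is the one from the query, and the last order deliberately references a user that does not exist.

```python
# Made-up rows standing in for the two stored artifacts.
artifacts = {
    "art_users": [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Lin"}],
    "art_orders": [
        {"user_id": 2, "amount": 30},
        {"user_id": 1, "amount": 12},
        {"user_id": 9, "amount": 5},  # no matching user
    ],
}


def run(artifacts, schemas, params):
    users = {u["id"]: u["name"] for u in artifacts["art_users"]}
    return [{"user": users.get(o["user_id"]), "amount": o["amount"]}
            for o in artifacts["art_orders"]]


result = run(artifacts, schemas={}, params={})
print(result)
# [{'user': 'Lin', 'amount': 30}, {'user': 'Ada', 'amount': 12}, {'user': None, 'amount': 5}]
```

Because the lookup uses users.get, an order with an unknown user_id yields None rather than raising KeyError.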

Import allowlist

Code queries run with a configurable import allowlist. Default allowed import roots include math, json, re, collections, statistics, heapq, numpy, pandas, jmespath, datetime, itertools, functools, operator, decimal, csv, io, string, textwrap, copy, typing, dataclasses, enum, fractions, bisect, random, base64, and urllib.parse. Third-party modules are usable only when installed in Sift's runtime environment.

Install additional packages:

sift-gateway install scipy matplotlib

Security

Code queries use AST validation, an import allowlist, timeout enforcement, and memory limits. This is not a full OS-level sandbox.
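
To illustrate how AST validation and an import allowlist can work together, here is a minimal sketch built on Python's standard ast module. It is not Sift's implementation: the function name and the reduced allowlist are assumptions, and the check covers only import statements.

```python
import ast

# Small subset of the default allowlist, for illustration.
ALLOWED_ROOTS = {"math", "json", "re", "collections", "statistics"}


def disallowed_imports(source: str, allowed: set[str] = ALLOWED_ROOTS) -> list[str]:
    """Return root module names imported by `source` that are not allowed."""
    bad = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            roots = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # "from . import x" has module=None; report it as disallowed.
            roots = [(node.module or "<relative>").split(".")[0]]
        else:
            continue
        bad.extend(root for root in roots if root not in allowed)
    return bad


ok = "import math\ndef run(data, schema, params): return math.sqrt(len(data))"
blocked = "import os\nfrom subprocess import run as sh\ndef run(d, s, p): return 1"
print(disallowed_imports(ok))       # []
print(disallowed_imports(blocked))  # ['os', 'subprocess']
```

A static check like this does not catch dynamic imports such as __import__ or importlib calls, which is one reason this kind of validation is not a substitute for an OS-level sandbox.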

Outbound secret redaction is enabled by default to reduce accidental leakage of API keys from upstream tool responses.
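
The general idea of redacting before data re-enters model context can be sketched as a recursive walk over the parsed JSON. The key-name heuristic, the bearer-token pattern, and the mask string below are all assumptions for illustration; Sift's actual detection rules are not described here.

```python
import re

# Assumed heuristics: key names that look secret-bearing, plus a bearer-token pattern.
SECRET_KEY_RE = re.compile(r"(api[_-]?key|token|secret|password)", re.IGNORECASE)
BEARER_RE = re.compile(r"Bearer\s+[A-Za-z0-9._\-]+")
MASK = "[REDACTED]"


def redact(value):
    """Return a copy of a JSON-like value with likely secrets masked."""
    if isinstance(value, dict):
        return {
            key: MASK if SECRET_KEY_RE.search(key) and isinstance(item, str)
            else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    if isinstance(value, str):
        return BEARER_RE.sub(MASK, value)
    return value


response = {
    "user": "ada",
    "api_key": "sk-123",
    "headers": ["Authorization: Bearer abc.def"],
}
print(redact(response))
# {'user': 'ada', 'api_key': '[REDACTED]', 'headers': ['Authorization: [REDACTED]']}
```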

See SECURITY.md for the full security policy.

Configuration

Env var	Default	Description
`SIFT_GATEWAY_DATA_DIR`	`.sift-gateway`	Root data directory
`SIFT_GATEWAY_PASSTHROUGH_MAX_BYTES`	`8192`	Inline response cap
`SIFT_GATEWAY_SECRET_REDACTION_ENABLED`	`true`	Redact secrets from tool output
`SIFT_GATEWAY_AUTH_TOKEN`	unset	Required for non-local HTTP binds

Full reference: docs/config.md
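
A wrapper script that needs the same settings could read them as follows. The variable names and defaults come from the table; the load_settings helper and the way the boolean is parsed are assumptions, so check docs/config.md for the accepted values.

```python
import os

# Defaults copied from the configuration table.
DEFAULTS = {
    "SIFT_GATEWAY_DATA_DIR": ".sift-gateway",
    "SIFT_GATEWAY_PASSTHROUGH_MAX_BYTES": "8192",
    "SIFT_GATEWAY_SECRET_REDACTION_ENABLED": "true",
}


def load_settings(env=os.environ) -> dict:
    """Read Sift-related environment variables, falling back to the defaults."""
    def get(name: str) -> str:
        return env.get(name, DEFAULTS[name])

    return {
        "data_dir": get("SIFT_GATEWAY_DATA_DIR"),
        "passthrough_max_bytes": int(get("SIFT_GATEWAY_PASSTHROUGH_MAX_BYTES")),
        # Assumed parsing: only the literal "true" (any case) enables redaction.
        "secret_redaction_enabled":
            get("SIFT_GATEWAY_SECRET_REDACTION_ENABLED").lower() == "true",
        "auth_token": env.get("SIFT_GATEWAY_AUTH_TOKEN"),  # unset by default
    }


print(load_settings({"SIFT_GATEWAY_PASSTHROUGH_MAX_BYTES": "16384"}))
```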

Documentation

Doc	Covers
Why Sift Exists	Research and ecosystem context
Quick Start	Install, init, first artifact
Recipes	Practical usage patterns
OpenClaw Pack	OpenClaw skill, quickstart, templates
API Contracts	MCP + CLI public contract
Configuration	All settings and env vars
Deployment	Transport modes, auth, ops
Errors	Error codes and troubleshooting
Observability	Structured logging and metrics
Architecture	Design and invariants

Benchmarks

The Tier 1 benchmark compares Sift's schema_ref + codegen approach against naive full-JSON context stuffing across 12 real-world datasets and 103 factual questions.

Model	Condition	Accuracy	Input Tokens	Token Reduction
claude-sonnet-4-6	Baseline	31/103 (30.1%)	10,757,230	—
claude-sonnet-4-6	Sift	99/103 (96.1%)	501,639	95.3%
claude-opus-4-6	Baseline	34/103 (33.0%)	10,757,230	—
claude-opus-4-6	Sift	98/103 (95.1%)	508,496	95.3%

Sift reaches 95 to 96% accuracy on both models while using about 95% fewer input tokens. The baseline struggles on large and deeply nested datasets (earthquakes, laureates, products), where payloads crowd the context window; Sift handles them through artifact queries.
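
The percentages in the table can be recomputed from the raw counts. The snippet below uses only the numbers reported above.

```python
# (model, baseline correct, Sift correct, baseline input tokens, Sift input tokens)
rows = [
    ("claude-sonnet-4-6", 31, 99, 10_757_230, 501_639),
    ("claude-opus-4-6", 34, 98, 10_757_230, 508_496),
]
TOTAL_QUESTIONS = 103

for model, base_ok, sift_ok, base_tokens, sift_tokens in rows:
    reduction = 1 - sift_tokens / base_tokens
    print(f"{model}: {base_ok / TOTAL_QUESTIONS:.1%} -> "
          f"{sift_ok / TOTAL_QUESTIONS:.1%}, {reduction:.1%} fewer input tokens")
# claude-sonnet-4-6: 30.1% -> 96.1%, 95.3% fewer input tokens
# claude-opus-4-6: 33.0% -> 95.1%, 95.3% fewer input tokens
```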

python benchmarks/tier1/fetch_data.py
python benchmarks/tier1/harness.py --model claude-sonnet-4-6

# or with uv (recommended for clean clones)
uv run python benchmarks/tier1/fetch_data.py
uv run python benchmarks/tier1/harness.py --model claude-sonnet-4-6

See benchmarks/tier1/ for the full suite and per-dataset breakdown.

Development

git clone https://github.com/lourencomaciel/sift-gateway.git
cd sift-gateway
uv sync --extra dev

uv run python -m pytest tests/unit/ -q
uv run python -m ruff check src tests
uv run python -m mypy src

See CONTRIBUTING.md for the full development guide.

License

MIT - see LICENSE.

Project details

Release history

0.4.4

Mar 28, 2026

0.4.3

Mar 7, 2026

0.4.2

Mar 5, 2026

0.4.1

Mar 5, 2026

0.4.0

Mar 4, 2026

0.3.9

Mar 3, 2026

0.3.8

Mar 3, 2026

0.3.7

Mar 3, 2026

0.3.6

Mar 3, 2026

0.3.5

Mar 3, 2026

This version

0.3.4

Mar 3, 2026

0.3.3

Mar 2, 2026

0.3.2

Mar 2, 2026

0.3.1

Feb 27, 2026

0.3.0

Feb 21, 2026

0.2.8

Feb 20, 2026

0.2.7

Feb 20, 2026

0.2.6

Feb 20, 2026

0.2.5

Feb 20, 2026

0.2.4

Feb 19, 2026

0.2.3

Feb 19, 2026

0.2.2

Feb 19, 2026

Download files

Source Distribution

sift_gateway-0.3.4.tar.gz (278.5 kB view details)

Uploaded Mar 3, 2026 Source

Built Distribution

sift_gateway-0.3.4-py3-none-any.whl (358.2 kB view details)

Uploaded Mar 3, 2026 Python 3

File details

Details for the file sift_gateway-0.3.4.tar.gz.

File metadata

Download URL: sift_gateway-0.3.4.tar.gz
Upload date: Mar 3, 2026
Size: 278.5 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sift_gateway-0.3.4.tar.gz
Algorithm	Hash digest
SHA256	`d3bbb3fd1f248bc6fe0e3dc522da00434df6e325595fc6a64fa505464811e677`
MD5	`97811da391b2261aed6455778676e961`
BLAKE2b-256	`600c6646c6e5c4dd402a781a5821523a07ab9b7935e43907c77c0324ce9b1184`
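
A downloaded file can be checked against the published SHA256 digest with Python's standard hashlib module. The helper names below are hypothetical, and the file path in the usage note is assumed to be wherever you saved the download.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for sift_gateway-0.3.4.tar.gz.
EXPECTED_SHA256 = "d3bbb3fd1f248bc6fe0e3dc522da00434df6e325595fc6a64fa505464811e677"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large downloads are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Return True when the file's SHA256 digest matches the expected value."""
    return sha256_of(path) == expected.lower()
```

Usage, assuming the archive is in the current directory: verify(Path("sift_gateway-0.3.4.tar.gz")).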

Provenance

The following attestation bundles were made for sift_gateway-0.3.4.tar.gz:

Publisher: release.yml on lourencomaciel/sift-gateway

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: sift_gateway-0.3.4.tar.gz
- Subject digest: d3bbb3fd1f248bc6fe0e3dc522da00434df6e325595fc6a64fa505464811e677
- Sigstore transparency entry: 1013316196
- Sigstore integration time: Mar 3, 2026
Source repository:
- Permalink: lourencomaciel/sift-gateway@2311966c56afe20b80bdcfde2d017ed1e7260960
- Branch / Tag: refs/tags/v0.3.4
- Owner: https://github.com/lourencomaciel
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@2311966c56afe20b80bdcfde2d017ed1e7260960
- Trigger Event: push

File details

Details for the file sift_gateway-0.3.4-py3-none-any.whl.

File metadata

Download URL: sift_gateway-0.3.4-py3-none-any.whl
Upload date: Mar 3, 2026
Size: 358.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sift_gateway-0.3.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`4638cdb1b2360a10ccf39c17706c3b92fe91ec93e2f073f2cab67a07b71205a3`
MD5	`9c9d22b5d84ac367cfab6fd293a87d03`
BLAKE2b-256	`8fbf4aab370c1736958637db1e764f79699a755e40644878210f52de91695065`

Provenance

The following attestation bundles were made for sift_gateway-0.3.4-py3-none-any.whl:

Publisher: release.yml on lourencomaciel/sift-gateway

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: sift_gateway-0.3.4-py3-none-any.whl
- Subject digest: 4638cdb1b2360a10ccf39c17706c3b92fe91ec93e2f073f2cab67a07b71205a3
- Sigstore transparency entry: 1013316237
- Sigstore integration time: Mar 3, 2026
Source repository:
- Permalink: lourencomaciel/sift-gateway@2311966c56afe20b80bdcfde2d017ed1e7260960
- Branch / Tag: refs/tags/v0.3.4
- Owner: https://github.com/lourencomaciel
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@2311966c56afe20b80bdcfde2d017ed1e7260960
- Trigger Event: push

sift-gateway 0.3.4
