Intelligent file indexing and search system

FileSift

A local, open-source utility that helps AI coding agents intelligently search and understand codebases.


FileSift lets your AI coding agent search across a codebase based on what code does, rather than what it looks like. Instead of sifting through entire files after a grep, your agent can jump straight to the most relevant code using natural language queries like "authentication middleware" or "database connection pooling". Everything runs locally on your machine — your code never leaves your environment.

Key benefits:

  • Smarter search — hybrid keyword + semantic search finds code by intent, not just string matching
  • Less context wasted — agents get pointed to the right files immediately, saving token budget on exploration

Installation

pip install filesift

Usage

There are three ways to use FileSift, depending on your workflow:

1. CLI

The most straightforward approach. Good for testing queries, managing indexes, and configuring settings.

# Index a project
filesift index /path/to/your/project

# Search for files by what they do
filesift find "authentication and session handling"

# Search in a specific directory
filesift find "retry logic for API calls" --path /path/to/project
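If you want to call the CLI from a script rather than a shell, a thin wrapper can help. The following is a minimal sketch that uses only the find subcommand and the --path flag shown above. The function names build_find_command and run_find are hypothetical, and the sketch makes no assumption about the format of the output, so it returns raw stdout.

```python
import subprocess
from typing import List, Optional


def build_find_command(query: str, path: Optional[str] = None) -> List[str]:
    """Build the argv for `filesift find` using only the documented flag."""
    cmd = ["filesift", "find", query]
    if path is not None:
        cmd += ["--path", path]
    return cmd


def run_find(query: str, path: Optional[str] = None) -> str:
    """Run the search and return raw stdout (output format is not assumed)."""
    result = subprocess.run(
        build_find_command(query, path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Passing the command as a list avoids shell quoting problems when a query contains spaces or quotes.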

2. MCP Server

Installing FileSift also provides a filesift-mcp command — a lightweight MCP server that exposes indexing and search as tools over STDIO. This works with most popular coding agents including Claude Code, Cursor, Copilot, and more.

Add it to your agent's MCP configuration:

{
  "mcpServers": {
    "filesift": {
      "command": "filesift-mcp"
    }
  }
}
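If your agent already has an MCP configuration file with other servers in it, you may prefer to merge the entry in rather than overwrite the file. Here is a small sketch of that, assuming the file is plain JSON with a top-level mcpServers object as in the snippet above. The helper name add_filesift_server is hypothetical, and the location of the config file depends on your agent, so it is passed in as a parameter.

```python
import json
from pathlib import Path


def add_filesift_server(config_path: Path) -> dict:
    """Add the filesift entry to an MCP config file, keeping existing servers."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    # Create the mcpServers object if it is missing, then add or replace
    # only the "filesift" entry.
    servers = config.setdefault("mcpServers", {})
    servers["filesift"] = {"command": "filesift-mcp"}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```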

The MCP server exposes four tools:

  • filesift_search — search an indexed codebase by natural language query
  • filesift_find_related — find files related to a given file via imports and semantic similarity
  • filesift_index — index a directory to enable searching
  • filesift_status — check indexing status of a directory

3. Skills

FileSift ships with a search-codebase skill that can be installed directly into your coding agent's skill directory. This lets the agent interact with the FileSift CLI through bash, without requiring MCP support.

# Install for Claude Code (default)
filesift skill install

# Install for other agents
filesift skill install --agent cursor
filesift skill install --agent copilot
filesift skill install --agent codex

Supported agents: claude, codex, cursor, copilot, gemini, roo, windsurf.

How It Works

FileSift uses a daemonized embedding model to keep searches fast. At its core, it generates embeddings from code descriptions and performs searches against small vector stores called indexes.

  1. Indexing: filesift index first builds a fast keyword/structural index (completes in seconds), then triggers background semantic indexing that generates embeddings for each file.

  2. Daemon: A background daemon loads indexes into memory and automatically shuts down after a configurable period of inactivity. After the first cold-start search, subsequent searches are near-instant.

  3. Search: Queries are matched using both keyword (BM25) and semantic (FAISS vector) search, and the two rankings are then combined via Reciprocal Rank Fusion.

Indexes are stored in a .filesift directory within each indexed project.
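To make the fusion step in the search stage concrete, here is a minimal sketch of Reciprocal Rank Fusion in general. It is not FileSift's own implementation. Each result earns 1 / (k + rank) from every ranking it appears in, and the scores are summed. The constant k = 60 is a commonly used default and an assumption here; the file names are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists: each item scores sum(1 / (k + rank)), rank from 1."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Sort by descending score; break ties by name for a deterministic order.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical rankings from the two retrievers.
keyword_hits = ["auth.py", "session.py", "utils.py"]         # e.g. BM25 order
semantic_hits = ["session.py", "middleware.py", "auth.py"]   # e.g. vector order

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)  # ['session.py', 'auth.py', 'middleware.py', 'utils.py']
```

Because the formula uses only ranks, it can combine BM25 scores and vector distances without normalizing them onto a common scale.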

Configuration

FileSift uses a TOML configuration file, manageable via the CLI:

# View all settings
filesift config list --all

# Set a value
filesift config set search.MAX_RESULTS 20
filesift config set daemon.INACTIVITY_TIMEOUT 600

# Manage ignore patterns
filesift config add-ignore "node_modules" ".venv"
filesift config list-ignore

Configuration sections: search, indexing, daemon, models, paths.

Contributing

Contributions are welcome! To get started:

git clone https://github.com/roshunsunder/filesift.git
cd filesift
pip install -e .

To submit changes:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/my-feature)
  3. Commit your changes and open a pull request

License

Apache 2.0 — see LICENSE for details.
