# lgrep

A local-first semantic search CLI tool for code and text files. Search your codebase using natural language queries powered by AI embeddings.
## Installation

### From PyPI

```bash
# Using pip
pip install lgrep-cli

# Using uv
uv pip install lgrep-cli

# With OpenAI support (optional)
pip install "lgrep-cli[openai]"
```
### From Source

#### Using uv (recommended)

uv is a fast Python package manager written in Rust.

```bash
# Install uv if you don't have it
curl -LsSf https://astral.sh/uv/install.sh | sh

# Create a virtual environment and install
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
uv pip install -e .

# With OpenAI support (optional)
uv pip install -e ".[openai]"

# With development tools
uv pip install -e ".[dev]"
```

Or use `uv run` to run without activating the venv:

```bash
uv run lgrep index .
uv run lgrep search "your query"
```
#### Using pip

```bash
# Install in editable mode
pip install -e .

# With OpenAI support (optional)
pip install -e ".[openai]"

# With development tools
pip install -e ".[dev]"
```
### Using Docker

```bash
# Build the image
docker build -t lgrep .

# Run lgrep on your project (mount your code to /workspace)
docker run -v $(pwd):/workspace lgrep index .
docker run -v $(pwd):/workspace lgrep search "your query"

# Persist the index between runs
docker run -v $(pwd):/workspace -v lgrep-data:/workspace/.lgrep lgrep index .
docker run -v $(pwd):/workspace -v lgrep-data:/workspace/.lgrep lgrep search "database queries"

# Using OpenAI embeddings
docker run -v $(pwd):/workspace -e OPENAI_API_KEY=$OPENAI_API_KEY lgrep search "your query" --provider openai
```
## Quick Start

```bash
# 1. Index your project
lgrep index .

# 2. Search semantically
lgrep search "database connection handling"
```
## Multi-Repository Search

Search across multiple repositories (local folders or GitHub repos):

```bash
# Add repositories to the global index
lgrep repo add https://github.com/anthropics/anthropic-cookbook
lgrep repo add ~/Projects/my-project

# Search across all repositories
lgrep search "embeddings" --global

# Search a specific repository
lgrep search "authentication" --repo anthropic-cookbook

# Search only agent files (README.md, AGENTS.md, CLAUDE.md)
lgrep search "usage instructions" --agent-files
```
## Commands

### Index

Index a directory for semantic search:

```bash
# Index current directory
lgrep index

# Index a specific directory
lgrep index /path/to/project

# Clear existing index and reindex
lgrep index --clear

# Quiet mode (no progress output)
lgrep index --quiet
```
### Search

Search for semantically similar content:

```bash
# Basic search
lgrep search "error handling in API routes"

# Limit results
lgrep search "authentication" --limit 5

# Set minimum similarity score (0-1)
lgrep search "logging" --min-score 0.7

# Show context lines before/after matches
lgrep search "database queries" --context 3

# List only matching files (no content)
lgrep search "config parsing" --files

# Filter by file pattern
lgrep search "tests" --file "test_*.py"

# Hide content snippets
lgrep search "imports" --no-content
```
### Watch

Watch a directory and automatically reindex on changes:

```bash
# Watch current directory
lgrep watch

# Watch a specific directory
lgrep watch /path/to/project

# Press Ctrl+C to stop
```
### Status

Show index statistics:

```bash
lgrep status
```
### Config

Manage configuration:

```bash
# Initialize a new config file
lgrep config init

# Show current configuration
lgrep config show

# Show config file path
lgrep config path
```
### Repo (Multi-Repository Management)

Manage repositories for global cross-repository search:

```bash
# Add a GitHub repository (clones and indexes automatically)
lgrep repo add https://github.com/owner/repo

# Add a local folder
lgrep repo add /path/to/project

# Add a local folder with GitHub URL metadata
lgrep repo add /path/to/project --url https://github.com/owner/repo

# Specify a branch for GitHub repos
lgrep repo add https://github.com/owner/repo --branch develop

# List all registered repositories
lgrep repo list

# Show detailed info about a repository (including agent files)
lgrep repo info my-repo

# Sync a repository (git pull for remote repos; re-index for all)
lgrep repo sync my-repo

# Sync all repositories
lgrep repo sync

# Remove a repository from the index
lgrep repo remove my-repo
```
### Global Search Options

When searching across multiple repositories:

```bash
# Search all registered repositories
lgrep search "query" --global

# Search a specific repository by name or ID
lgrep search "query" --repo my-repo

# Search only agent files (README.md, AGENTS.md, CLAUDE.md)
lgrep search "query" --agent-files

# Combine filters
lgrep search "authentication" --repo my-repo --agent-files
```
Agent files are automatically detected and flagged during indexing. These are files commonly used by AI agents to understand a project:
- `README.md` - Project documentation
- `AGENTS.md` - Agent-specific instructions
- `CLAUDE.md` - Claude-specific instructions
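As a rough sketch, this detection amounts to a filename check. The function below is only an illustration of the idea; the name `is_agent_file` and the exact matching rules are assumptions, not lgrep's actual implementation:

```python
from pathlib import Path

# Filenames flagged as agent files during indexing (per the list above)
AGENT_FILES = {"README.md", "AGENTS.md", "CLAUDE.md"}

def is_agent_file(path: str) -> bool:
    """Hypothetical check: a file counts as an agent file if its basename matches."""
    return Path(path).name in AGENT_FILES

print(is_agent_file("docs/AGENTS.md"))  # True
print(is_agent_file("src/main.py"))     # False
```

A chunk whose source file passes such a check would be tagged in the index, which is what lets `--agent-files` narrow results to just those files.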
## Configuration

Configuration is stored in `.lgrep/config.toml`. Create one with `lgrep config init` or manually:

```toml
[embedding]
provider = "local"                # "local" (fastembed), "sentence-transformers", or "openai"
model = "BAAI/bge-small-en-v1.5"  # Default fastembed model

[embedding.openai]
api_key = "${OPENAI_API_KEY}"
model = "text-embedding-3-small"

[index]
chunk_size = 512
chunk_overlap = 50
include = ["**/*.py", "**/*.ts", "**/*.js", "**/*.md", "**/*.txt"]
exclude = ["node_modules", ".git", "__pycache__", ".venv", "venv"]

[search]
default_limit = 10
min_score = 0.5
```
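The `chunk_size` and `chunk_overlap` settings control how files are split before embedding: consecutive chunks share some content so that matches near a chunk boundary are not lost. A minimal sketch of overlapping chunking (an illustration of the idea under those two parameters, not lgrep's actual splitter):

```python
def chunk_text(units, chunk_size=512, chunk_overlap=50):
    """Yield overlapping windows over `units` (e.g. tokens or characters).

    Each window holds up to `chunk_size` items and shares `chunk_overlap`
    items with the previous window.
    """
    step = chunk_size - chunk_overlap
    for start in range(0, len(units), step):
        yield units[start:start + chunk_size]
        if start + chunk_size >= len(units):
            break

chunks = list(chunk_text(list(range(1000))))
print(len(chunks))                         # 3 windows over 1000 units
print(chunks[0][-50:] == chunks[1][:50])   # True: adjacent chunks overlap by 50
```

A larger overlap costs more storage and embedding time but makes boundary-straddling matches more likely to survive in at least one chunk.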
## Embedding Providers

### Local (default) - FastEmbed

Uses FastEmbed with the ONNX runtime. Lightweight (~50 MB), with no PyTorch or NVIDIA dependencies.

```text
# Available models
BAAI/bge-small-en-v1.5                  # 384 dims, ~50MB (default)
BAAI/bge-base-en-v1.5                   # 768 dims, ~100MB
sentence-transformers/all-MiniLM-L6-v2  # 384 dims
```
### Sentence-Transformers (optional)

For GPU acceleration or different models. Requires PyTorch.

```bash
# Install with sentence-transformers support
pip install -e ".[sentence-transformers]"

# Use it
lgrep index --provider sentence-transformers
lgrep search "query" --provider sentence-transformers
```
### OpenAI API

For cloud-based embeddings:

```bash
# Set your API key
export OPENAI_API_KEY=your-key-here

# Use the OpenAI provider
lgrep index --provider openai
lgrep search "your query" --provider openai
```

Or configure it in `.lgrep/config.toml`:

```toml
[embedding]
provider = "openai"
```
## Supported File Types

- Python: `.py`
- JavaScript/TypeScript: `.js`, `.ts`, `.tsx`, `.jsx`
- Go: `.go`
- Rust: `.rs`
- Java: `.java`
- C/C++: `.c`, `.cpp`, `.h`, `.hpp`
- Markdown: `.md`
- Text: `.txt`
- Config: `.json`, `.yaml`, `.yml`, `.toml`
- And more...
## How It Works

1. Indexing: Files are read, split into chunks (preserving line numbers), and converted to embeddings using FastEmbed (default), sentence-transformers, or OpenAI.
2. Storage: Embeddings are stored locally using ChromaDB in `.lgrep/index/`.
3. Search: Your query is converted to an embedding and compared against stored embeddings using cosine similarity.
4. Results: Matching chunks are returned with file paths, line numbers, and similarity scores.
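Steps 3 and 4 can be illustrated with plain Python. This is a self-contained sketch: in lgrep the vector comparison is handled by ChromaDB, and the toy 3-dimensional vectors below are made up for the example (real embeddings have 384+ dimensions):

```python
import math

def cosine_similarity(a, b):
    """dot(a, b) / (|a| * |b|): 1.0 for identical directions, 0.0 for orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy index: chunk embedding keyed by (file path, start line)
index = {
    ("db.py", 10):   [0.9, 0.1, 0.0],
    ("auth.py", 42): [0.1, 0.9, 0.2],
}

query = [0.8, 0.2, 0.1]  # embedding of the search query
results = sorted(index.items(),
                 key=lambda item: cosine_similarity(query, item[1]),
                 reverse=True)
print(results[0][0])  # ('db.py', 10) — the closest chunk
```

The `--min-score` option from the Search command corresponds to dropping results whose similarity falls below the threshold before they are returned.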
## Data Storage

### Local Project Storage

Per-project data is stored in the `.lgrep/` directory within your project:

```text
your-project/
└── .lgrep/
    ├── config.toml   # Configuration
    ├── index/        # ChromaDB vector store
    └── cache/        # Embedding cache
```

### Global Multi-Repository Storage

Multi-repository data is stored in `~/.lgrep/`:

```text
~/.lgrep/
├── repos.toml        # Repository registry
├── repos/            # Cloned GitHub repositories
│   ├── a1b2c3d4/     # Repo ID (hash of URL)
│   └── ...
└── index/            # Global ChromaDB index (all repos)
```
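The repo IDs under `repos/` could be derived roughly like this. The source only says the ID is a hash of the URL, so the choice of SHA-256 and the 8-hex-character truncation here are assumptions for illustration:

```python
import hashlib

def repo_id(url: str) -> str:
    """Derive a short, stable directory name from a repository URL.

    Truncated SHA-256 is an assumption; any stable hash would work.
    """
    return hashlib.sha256(url.encode("utf-8")).hexdigest()[:8]

print(repo_id("https://github.com/owner/repo"))  # an 8-hex-digit ID like 'a1b2c3d4'
```

The useful property is stability: the same URL always maps to the same directory, so re-adding or syncing a repository reuses its existing clone.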
## Running Tests

```bash
# Using uv
uv pip install -e ".[dev]"
uv run pytest tests/

# Using pip
pip install -e ".[dev]"
pytest tests/
```
## License

Apache 2.0