Local-first CLI code intelligence tool with LangChain-powered RAG
CodeSage
Local-first code intelligence CLI. Search, analyze, and chat with your codebase using natural language — all running on your machine via Ollama.
Install
# macOS
brew install pipx && pipx ensurepath
# Linux
python3 -m pip install --user pipx && python3 -m pipx ensurepath
pipx install pycodesage
Optional features
# Tree-sitter AST analysis for JS, TS, Go, Rust (recommended)
pipx inject pycodesage "pycodesage[multi-language]"
# MCP server for Claude Desktop / Cursor / Windsurf
pipx inject pycodesage "pycodesage[mcp]"
# Both at once
pipx inject pycodesage "pycodesage[multi-language,mcp]"
Without multi-language, review checks for non-Python files still run but use text/regex heuristics instead of real AST parsing.
Requirements
Ollama must be running:
ollama pull qwen2.5-coder:7b # LLM
ollama pull nomic-embed-text # embeddings (fast, lightweight)
ollama serve
Alternatively, use any Ollama-compatible model. qwen3-embedding gives significantly better semantic search if you have the RAM.
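Before running CodeSage, it can help to confirm that Ollama is reachable and that both models are pulled. The sketch below is not part of CodeSage. It queries Ollama's `GET /api/tags` endpoint, which lists local models, and the helper names `installed_models` and `missing_models` are hypothetical.

```python
import json
import urllib.request

# Models named in the Requirements section above.
REQUIRED_MODELS = ["qwen2.5-coder:7b", "nomic-embed-text"]


def missing_models(installed, required=REQUIRED_MODELS):
    """Return the required models that are not in the installed list.

    Ollama reports untagged pulls as "<name>:latest", so a required name
    without a tag is also matched against that form.
    """
    installed_set = set(installed)
    missing = []
    for name in required:
        candidates = {name} if ":" in name else {name, f"{name}:latest"}
        if not candidates & installed_set:
            missing.append(name)
    return missing


def installed_models(base_url="http://localhost:11434"):
    """Ask a running Ollama server for its local models via GET /api/tags."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        payload = json.load(resp)
    return [m["name"] for m in payload.get("models", [])]
```

Calling `missing_models(installed_models())` returns an empty list when both models are available. If the server is not running, `installed_models()` raises an `OSError` subclass such as `urllib.error.URLError`.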
Quick start
cd your-project
codesage init # detect languages, write .codesage/config.yaml
codesage index # parse files, build vector + graph index
codesage chat # ask questions about your code
codesage review # review uncommitted changes
Commands
codesage init
Detects languages in the project and creates .codesage/config.yaml. Safe to re-run — it won't overwrite an existing config.
codesage init
codesage init --model llama3.1 # use a different LLM
codesage init --embedding-model qwen3-embedding
codesage index
Parses source files, generates embeddings, and builds the call graph. Only processes changed files by default.
codesage index # incremental (changed files only)
codesage index --full # reindex everything from scratch
codesage index --clear # wipe the index first, then reindex
codesage index --no-learn # skip pattern learning
Index data lives in .codesage/ — SQLite for metadata, LanceDB for vectors, KuzuDB for the call graph.
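To illustrate what "changed files only" can mean in practice, here is a minimal sketch of incremental change detection using content hashes. It is an assumption for illustration: this README does not say how CodeSage detects changes, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """SHA-256 of a file's bytes, used as its change fingerprint."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def scan(root, suffixes=(".py",)):
    """Build the {relative path: digest} map for matching files under root."""
    root = Path(root)
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in root.rglob("*")
        if p.is_file() and p.suffix in suffixes
    }


def plan_incremental_index(current, previous):
    """Compare {path: digest} maps and decide what to (re)index.

    Returns (to_index, to_remove): paths that are new or changed, and
    paths that were indexed before but no longer exist.
    """
    to_index = sorted(p for p, d in current.items() if previous.get(p) != d)
    to_remove = sorted(p for p in previous if p not in current)
    return to_index, to_remove
```

With this scheme, a full reindex is the special case where the previous map is empty, so every current file is scheduled.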
codesage chat
Interactive session. Ask anything in plain English, or use slash commands:
/search <query>     semantic code search with RRF fusion
/deep <query>       multi-strategy deep analysis
/plan <task>        generate an implementation plan
/similar <name>     find similar functions/classes
/patterns [query]   show learned patterns from this codebase
/review [file]      review code changes with LLM
/security [path]    security vulnerability scan
/impact <name>      blast radius: who calls this, what breaks
/mode <mode>        switch to brainstorm / implement / review
/context            show or adjust context window settings
/stats              index statistics
/export [file]      save conversation to file
/clear              clear chat history
/help               show all commands
Natural language questions work too — you don't have to use slash commands.
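The `/search` command mentions RRF (reciprocal rank fusion), a way to merge several ranked result lists into one. The sketch below shows the general technique only. The result ids, the two retrievers, and the constant `k=60` are assumptions for illustration, not CodeSage's actual settings.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one list, best first.

    Each id scores sum(1 / (k + rank)) over the lists that contain it,
    with rank starting at 1. k=60 is a commonly used default for RRF.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Sort by score descending; break ties by id for a stable result.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical result ids from two retrievers over the same index.
vector_hits = ["auth.login", "auth.logout", "db.connect"]
keyword_hits = ["auth.login", "db.connect", "utils.hash"]
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

Items that appear in both lists rise above items that rank well in only one, which is why `db.connect` ends up ahead of `auth.logout` here.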
codesage review
Reviews uncommitted changes. Combines static analysis, pattern deviation detection, and (in full mode) semantic similarity search.
codesage review # all uncommitted changes, fast mode
codesage review --staged # staged changes only (good for pre-commit)
codesage review --mode full # add semantic similarity + LLM synthesis
codesage review --severity warning # block on warnings, not just high+critical
codesage review --format json # JSON output for CI pipelines
codesage review --format sarif # SARIF for GitHub Advanced Security
codesage review --verbose # show timing and suppression details
codesage review path/to/subdir # limit to a subdirectory
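For CI use, a SARIF report can be summarized with a few lines of Python. This sketch relies only on the standard SARIF 2.1.0 layout (`runs[].results[].level`). How CodeSage maps its own severities to SARIF levels is not documented here, so the choice of blocking levels is an assumption, as are the function names.

```python
import json
from collections import Counter


def sarif_level_counts(sarif):
    """Count results per SARIF level across all runs in a parsed log."""
    counts = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # Simplification: a missing level is treated as "warning",
            # ignoring any rule-level default configuration.
            counts[result.get("level", "warning")] += 1
    return counts


def should_fail(sarif, blocking_levels=("error",)):
    """True if any result has one of the blocking levels."""
    counts = sarif_level_counts(sarif)
    return any(counts[level] > 0 for level in blocking_levels)


def load_sarif(path):
    """Parse a SARIF file from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

A CI step could save the report to a file, call `should_fail(load_sarif(path))`, and exit non-zero when it returns `True`.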
What it checks:
| Category | Rules |
|---|---|
| Python static | Long functions, high complexity, deep nesting, too many params, god classes, missing return types, magic numbers |
| Rust/Go/JS/TS static | Cyclomatic complexity, long functions, deep nesting, param count, naming conventions (AST-based with multi-language; text/regex heuristics without it) |
| Security | Hardcoded secrets, SQL injection, eval/exec, unsafe deserialization, weak crypto, XSS sinks |
| Patterns | Deviations from your codebase's own learned patterns |
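To make the Python static checks concrete, here is a minimal sketch of two of them (long functions and too many parameters) built on the standard `ast` module. The thresholds and the function name are hypothetical. CodeSage's real rules and limits may differ.

```python
import ast

# Hypothetical thresholds; CodeSage's real limits are not documented here.
MAX_PARAMS = 5
MAX_LINES = 50


def check_functions(source, max_params=MAX_PARAMS, max_lines=MAX_LINES):
    """Return (function name, line, message) tuples for oversized functions."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        args = node.args
        n_params = len(args.posonlyargs) + len(args.args) + len(args.kwonlyargs)
        if n_params > max_params:
            findings.append((node.name, node.lineno, f"{n_params} parameters"))
        # end_lineno is available on Python 3.8 and later.
        n_lines = node.end_lineno - node.lineno + 1
        if n_lines > max_lines:
            findings.append((node.name, node.lineno, f"{n_lines} lines"))
    return findings
```

Working on the parsed tree rather than raw text is what lets a check like this ignore comments and string contents, which is the advantage of AST-based review over regex heuristics.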
Suppress a finding inline: # codesage:ignore GEN-LONG-LINE or # codesage:ignore-next-line.
Suppress a file: add it to .codesageignore.
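The inline suppression comments can be pictured as a small pre-pass over the source. The sketch below assumes that `# codesage:ignore RULE-ID` applies to its own line, that omitting the rule id suppresses every rule on that line, and that `# codesage:ignore-next-line` applies to the following line. These semantics and the helper names are assumptions, not confirmed CodeSage behavior.

```python
import re

# Matches "# codesage:ignore RULE-ID" and "# codesage:ignore-next-line".
_IGNORE = re.compile(
    r"#\s*codesage:(ignore-next-line|ignore)\b(?:[ \t]+([A-Z0-9-]+))?"
)


def suppressed_lines(source):
    """Map 1-based line numbers to suppressed rule ids ("*" means all rules)."""
    suppressed = {}
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = _IGNORE.search(line)
        if not match:
            continue
        directive, rule = match.groups()
        target = lineno + 1 if directive == "ignore-next-line" else lineno
        suppressed.setdefault(target, set()).add(rule or "*")
    return suppressed


def is_suppressed(suppressed, lineno, rule_id):
    """True if rule_id is suppressed on the given line."""
    rules = suppressed.get(lineno, set())
    return "*" in rules or rule_id in rules
```

A reviewer built this way would compute the map once per file and drop any finding for which `is_suppressed` returns `True`.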
codesage hook
Installs a git pre-commit hook that runs codesage review --staged before each commit.
codesage hook install # install the hook
codesage hook uninstall # remove it
codesage hook status # check if installed
The hook blocks commits with findings at high severity or above. Bypass when needed: git commit --no-verify.
If you use the pre-commit framework instead, this repo includes a .pre-commit-hooks.yaml:
repos:
  - repo: https://github.com/keshavashiya/codesage
    rev: v0.3.1
    hooks:
      - id: codesage-review
codesage mcp
MCP server for AI IDE integration. Always runs in global mode — all projects you've indexed with codesage index are available through a single server.
codesage mcp serve # stdio (default, for IDE use)
codesage mcp serve -t sse -p 8080 # HTTP/SSE for multi-client setups
codesage mcp setup # print IDE config
codesage mcp test # smoke-test all tools
MCP Setup
Run codesage mcp setup to get the config, or add this to your IDE:
{
  "mcpServers": {
    "codesage": {
      "command": "codesage",
      "args": ["mcp", "serve"]
    }
  }
}
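If your IDE already has an MCP config file with other servers, the entry above needs to be merged rather than pasted over the file. A minimal sketch of that merge follows. The config file location depends on the IDE and is not given in this README, so the `path` argument is left to the caller, and both function names are hypothetical.

```python
import json
from pathlib import Path


def add_codesage_server(config):
    """Return a copy of an IDE MCP config with the codesage entry added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    # Same entry as shown in the JSON snippet above.
    servers["codesage"] = {"command": "codesage", "args": ["mcp", "serve"]}
    merged["mcpServers"] = servers
    return merged


def update_config_file(path):
    """Read, merge, and rewrite a JSON config file in place."""
    path = Path(path)
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_codesage_server(config), indent=2) + "\n")
```

The merge keeps any existing servers and unrelated settings, and it does not mutate the dictionary passed in.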
Available MCP tools (12)
| Tool | What it does |
|---|---|
| list_projects | List all indexed projects (global mode only) |
| get_developer_profile | Your coding style and learned patterns |
| search_code | Semantic search with confidence scoring |
| get_file_context | File content with definitions and security notes |
| get_stats | Index stats: files, elements, languages |
| review_code | Run a code review on a file or diff |
| analyze_security | Security vulnerability scan |
| explain_concept | How is X implemented in this codebase? |
| suggest_approach | Implementation guidance for a task |
| trace_flow | Callers and callees through the call graph |
| find_examples | Usage examples for a function or pattern |
| recommend_pattern | Patterns from your codebase's memory |
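The trace_flow tool and the /impact chat command both rely on walking the call graph. As a rough illustration of a "who calls this" query, the sketch below runs a breadth-first search over reversed call edges held in a plain dictionary. The graph data and function name are hypothetical, and how CodeSage actually queries KuzuDB is not shown in this README.

```python
from collections import deque


def transitive_callers(calls, target, max_depth=None):
    """Return {caller: distance} for everything that can reach target.

    calls maps each function name to the names it calls directly.
    """
    # Invert the edges so we can walk from callee to caller.
    callers = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)

    distances = {}
    queue = deque([(target, 0)])
    while queue:
        name, depth = queue.popleft()
        if max_depth is not None and depth >= max_depth:
            continue
        for caller in sorted(callers.get(name, ())):
            # The visited check also keeps recursive cycles from looping.
            if caller not in distances and caller != target:
                distances[caller] = depth + 1
                queue.append((caller, depth + 1))
    return distances
```

The distance gives a simple notion of blast radius: direct callers are at distance 1, and larger distances mark code that is affected only indirectly.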
Configuration
codesage init creates the configuration file at .codesage/config.yaml. The most useful fields:
project_name: my-project
languages:
  - python # auto-detected
llm:
  provider: ollama
  model: qwen2.5-coder:7b
  embedding_model: nomic-embed-text
  base_url: http://localhost:11434
exclude_dirs:
  - node_modules
  - venv
  - .git
Full configuration reference
llm:
  provider: ollama # ollama | openai | anthropic
  model: qwen2.5-coder:7b
  embedding_model: nomic-embed-text
  base_url: http://localhost:11434
  temperature: 0.3
  max_tokens: 500
  request_timeout: 30.0
storage:
  vector_backend: lancedb
  use_graph: true # enable call graph (KuzuDB)
security:
  enabled: true
  severity_threshold: medium
  block_on_critical: true
memory:
  enabled: true
  learn_on_index: true # learn patterns during indexing
  min_pattern_confidence: 0.5
performance:
  embedding_batch_size: 200
  embedding_cache_size: 1000
  cache_enabled: true
Language support
| Language | Indexing | Static review | Call graph |
|---|---|---|---|
| Python | built-in | ✓ | ✓ |
| Rust | multi-language | ✓ AST-based | ✓ |
| Go | multi-language | ✓ AST-based | ✓ |
| TypeScript | multi-language | ✓ AST-based | ✓ |
| JavaScript | multi-language | ✓ AST-based | ✓ |
Install pycodesage[multi-language] for Rust/Go/JS/TS. Without it, those files are still indexed and reviewed using text/regex heuristics.
Using OpenAI or Anthropic instead of Ollama
pipx inject pycodesage "pycodesage[openai]"
# or
pipx inject pycodesage "pycodesage[anthropic]"
Then set in .codesage/config.yaml:
llm:
  provider: openai
  model: gpt-4o
Development
git clone https://github.com/keshavashiya/codesage.git
cd codesage
python3 -m venv venv && source venv/bin/activate
pip install -e ".[dev,multi-language,mcp]"
pytest tests/ -v
License
MIT