rootset
In-process code intelligence database for AI coding tools.
Parses repository structure with tree-sitter, resolves symbols via LSP, builds a call graph with rustworkx, and indexes everything into a single SQLite file. Exposes multiple retrieval strategies — full-text BM25, vector ANN, structural S-expression queries, graph traversal, and RRF hybrid fusion — through a single async API.
Supported languages: Python, TypeScript, JavaScript, Rust, Go.
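As an illustration of the hybrid fusion step named above, here is a minimal sketch of reciprocal rank fusion (RRF) over two ranked result lists. It is not rootset's implementation: the function name `rrf_fuse`, the constant `k=60`, and the sample symbol names are assumptions chosen for the example.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked lists: each item scores sum(1 / (k + rank)) over the lists it appears in."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by name so the order is deterministic.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical result lists from a full-text retriever and a vector retriever.
text_hits = ["pkg.build", "pkg.parse", "pkg.index"]
vector_hits = ["pkg.index", "pkg.build", "pkg.load"]
fused = rrf_fuse([text_hits, vector_hits])
```

Items that rank well in both lists rise to the top, and no score normalization between retrievers is needed because only ranks are used.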
Install
pip install rootset
Optional extras:
pip install rootset[voyage] # Voyage AI embeddings (voyage-code-3)
pip install rootset[openai] # OpenAI embeddings (text-embedding-3-small)
The default embedding provider is nomic-ai/nomic-embed-code via sentence-transformers, which runs locally with no API key.
Configuration
Settings is a plain dataclass — constructing it explicitly never reads env vars:
from rootset import Settings
settings = Settings(db_path="/path/to/index.db")
To load from ROOTSET_* environment variables or a .env file, use get_settings():
from rootset import get_settings
settings = get_settings() # reads env, result is cached
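The split between an explicit dataclass and a cached environment loader can be sketched as follows. This is a stand-in, not rootset's code: `DemoSettings`, `load_demo_settings`, and the exact variable names (`ROOTSET_DB_PATH` and so on) are assumptions, and .env file handling is omitted.

```python
import os
from dataclasses import dataclass
from functools import lru_cache


@dataclass
class DemoSettings:
    # Hypothetical mirror of a few documented fields; constructing it never touches os.environ.
    db_path: str = "rootset.db"
    embedding_provider: str = "local"
    reindex_debounce_seconds: float = 1.5


@lru_cache(maxsize=1)
def load_demo_settings() -> DemoSettings:
    """Read ROOTSET_*-style variables once; later calls return the same cached object."""
    env = os.environ
    return DemoSettings(
        db_path=env.get("ROOTSET_DB_PATH", "rootset.db"),
        embedding_provider=env.get("ROOTSET_EMBEDDING_PROVIDER", "local"),
        reindex_debounce_seconds=float(env.get("ROOTSET_REINDEX_DEBOUNCE_SECONDS", "1.5")),
    )
```

With this shape, tests can build settings explicitly and stay isolated from the environment, while application code calls the loader once and shares the result.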
Key settings:
| Field | Default | Description |
|---|---|---|
| db_path | rootset.db | Path to the SQLite index file |
| embedding_provider | local | local, voyage, or openai |
| voyage_api_key | None | Activates Voyage provider if set |
| openai_api_key | None | Activates OpenAI provider if set |
| reindex_debounce_seconds | 1.5 | Debounce delay for notify_file_changed |
Usage
import asyncio
from pathlib import Path

from rootset import Repository, SearchMode, Settings


async def main():
    repo = Repository(Settings(db_path="myindex.db"))
    async with repo:
        # Index a repository
        stats = await repo.index("/path/to/project")
        print(stats)  # IndexStats(files_indexed=..., symbols_indexed=..., ...)

        # Search
        results = await repo.search("function that builds call graph")
        for r in results:
            print(r.symbol.qualified_name, r.score)

        # Explicit search mode
        results = await repo.search("build", mode=SearchMode.TEXT)
        results = await repo.search("build", mode=SearchMode.SEMANTIC)
        results = await repo.search("build", mode=SearchMode.HYBRID)

        # Symbol lookup
        sym = await repo.get_symbol("mymodule.MyClass.my_method")

        # Call graph traversal
        callers = await repo.find_callers("GraphIndexer.build", depth=2)
        callees = await repo.find_callees("GraphIndexer.build", depth=1)

        # Rich context for a symbol
        ctx = await repo.get_context("Repository.index")
        print(ctx.callers, ctx.callees, ctx.import_chain)

        # S-expression structural query
        symbols = await repo.structural_query(
            "(function_definition name: (identifier) @name)", language="python"
        )

        # LLM reranking over hybrid results
        results = await repo.reasoning_search("parse and index source files")


asyncio.run(main())
Incremental reindex
After writing a file, notify the repository to schedule a debounced re-index:
# synchronous — safe to call from a write_file tool handler
repo.notify_file_changed("/path/to/project/src/foo.py")
Rapid successive calls for the same path cancel the previous pending reindex; only one fires after reindex_debounce_seconds.
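The cancel-and-reschedule behaviour described above can be sketched with asyncio timers. This is an illustration under assumptions, not rootset's internals: the `Debouncer` class, its callback signature, and the short delay are hypothetical, and `notify` must be called on the event loop's thread.

```python
import asyncio
from typing import Callable


class Debouncer:
    """Keeps at most one pending timer per path; a new notify replaces the old timer."""

    def __init__(self, delay: float, callback: Callable[[str], None]) -> None:
        self._delay = delay
        self._callback = callback
        self._pending: dict[str, asyncio.TimerHandle] = {}

    def notify(self, path: str) -> None:
        # Synchronous: cancel any timer already waiting for this path, then start a fresh one.
        previous = self._pending.pop(path, None)
        if previous is not None:
            previous.cancel()
        loop = asyncio.get_running_loop()
        self._pending[path] = loop.call_later(self._delay, self._fire, path)

    def _fire(self, path: str) -> None:
        self._pending.pop(path, None)
        self._callback(path)


async def demo() -> list[str]:
    fired: list[str] = []
    debouncer = Debouncer(0.05, fired.append)
    for _ in range(3):
        debouncer.notify("src/foo.py")  # three rapid writes to the same file
    debouncer.notify("src/bar.py")  # a different path gets its own timer
    await asyncio.sleep(0.2)  # wait long enough for both timers to fire
    return fired
```

Running `asyncio.run(demo())` records each path once, even though `src/foo.py` was notified three times.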
Public API
Repository
| Method | Returns | Description |
|---|---|---|
| index(path, *, incremental=True) | IndexStats | Index a repository. Skips unchanged files when incremental=True. |
| search(query, *, top_k, mode) | list[SearchResult] | Search indexed symbols. Default mode is HYBRID. |
| reasoning_search(query, *, top_k) | list[SearchResult] | Hybrid search followed by LLM reranking. |
| get_symbol(qualified_name) | Symbol \| None | Exact qualified-name lookup. |
| find_callers(symbol_name, depth) | list[Symbol] | BFS inbound from call graph. |
| find_callees(symbol_name, depth) | list[Symbol] | BFS outbound from call graph. |
| get_context(symbol_name) | CodeContext | Callers, callees, related symbols, import chain. |
| structural_query(pattern, language) | list[Symbol] | Tree-sitter S-expression query across all indexed files. |
| notify_file_changed(path) | None | Schedule debounced single-file reindex (synchronous). |
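To make the inbound and outbound BFS descriptions for find_callers and find_callees concrete, here is a minimal sketch of a depth-limited breadth-first traversal over a call graph held in a plain dict. It does not use rustworkx and is not the library's code: the helper names, the sample graph, and the reading of depth as "maximum number of call edges from the start symbol" are assumptions.

```python
from collections import deque


def bfs_within_depth(edges: dict[str, list[str]], start: str, depth: int) -> list[str]:
    """Return nodes reachable from start in at most `depth` hops, in BFS order."""
    seen = {start}
    found: list[str] = []
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == depth:
            continue  # do not expand past the depth limit
        for neighbor in edges.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                found.append(neighbor)
                queue.append((neighbor, dist + 1))
    return found


def invert(edges: dict[str, list[str]]) -> dict[str, list[str]]:
    """Flip caller -> callee edges so the same BFS walks inbound edges."""
    inverted: dict[str, list[str]] = {}
    for caller, callees in edges.items():
        for callee in callees:
            inverted.setdefault(callee, []).append(caller)
    return inverted


# Hypothetical call graph: caller -> list of callees.
calls = {
    "main": ["index"],
    "index": ["parse", "build"],
    "build": ["add_edge"],
}
callees = bfs_within_depth(calls, "index", depth=1)
callers = bfs_within_depth(invert(calls), "add_edge", depth=2)
```

Outbound traversal follows the edges as stored; inbound traversal runs the same search over the inverted graph.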
SearchMode
TEXT · SEMANTIC · STRUCTURAL · GRAPH · HYBRID · REASONING
Return types
- Symbol: id, qualified_name, kind (SymbolKind), line_start, line_end, signature, docstring, content
- SearchResult: symbol, score, search_type, explanation
- CodeContext: symbol, definition_file, callers, callees, related_symbols, import_chain
- IndexStats: files_indexed, symbols_indexed, call_edges_indexed, import_edges_indexed, files_skipped
License
MIT