rootset

In-process code intelligence database for AI coding tools.

Parses repository structure with tree-sitter, resolves symbols via LSP, builds a call graph with rustworkx, and indexes everything into a single SQLite file. Exposes multiple retrieval strategies through a single async API: full-text BM25, vector approximate-nearest-neighbor (ANN) search, structural S-expression queries, graph traversal, and hybrid fusion via reciprocal rank fusion (RRF).
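To illustrate the hybrid fusion step, here is a minimal sketch of reciprocal rank fusion over two ranked lists. It is not rootset's implementation: the function name rrf_fuse, the constant k=60 (a commonly used default for RRF), and the sample symbol names are all assumptions made for the example.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of ids with reciprocal rank fusion.

    Each id scores sum(1 / (k + rank)) over the lists it appears in,
    where rank is 1-based. k=60 is an assumed, conventional default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical result lists from a text search and a vector search.
text_hits = ["a.parse", "b.index", "c.build"]
vector_hits = ["c.build", "a.parse", "d.walk"]

fused = rrf_fuse([text_hits, vector_hits])
for name, score in fused:
    print(f"{name}: {score:.5f}")
```

Symbols that rank well in both lists rise to the top, and the fusion needs only ranks, so the BM25 and vector scores never have to be put on a common scale.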

Supported languages: Python, TypeScript, JavaScript, Rust, Go.


Install

pip install rootset

Optional extras:

pip install "rootset[voyage]"   # Voyage AI embeddings (voyage-code-3)
pip install "rootset[openai]"   # OpenAI embeddings (text-embedding-3-small)

The default embedding provider is nomic-ai/nomic-embed-code via sentence-transformers, which runs locally with no API key.


Configuration

Settings is a plain dataclass; constructing it explicitly never reads environment variables:

from rootset import Settings

settings = Settings(db_path="/path/to/index.db")

To load from ROOTSET_* environment variables or a .env file, use get_settings():

from rootset import get_settings

settings = get_settings()  # reads env, result is cached
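The split between an explicit constructor and a cached environment loader can be sketched as follows. This is an illustration only, not rootset's code: DemoSettings and get_demo_settings are hypothetical names, only three of the fields are modelled, and .env file handling is left out.

```python
import os
from dataclasses import dataclass, fields
from functools import lru_cache


@dataclass
class DemoSettings:
    """Hypothetical stand-in for Settings; constructing it reads no env vars."""
    db_path: str = "rootset.db"
    embedding_provider: str = "local"
    reindex_debounce_seconds: float = 1.5


@lru_cache(maxsize=1)
def get_demo_settings() -> DemoSettings:
    """Build settings from ROOTSET_* variables once, then reuse the result."""
    overrides = {}
    for f in fields(DemoSettings):
        raw = os.environ.get(f"ROOTSET_{f.name.upper()}")
        if raw is not None:
            # Coerce the string to the type of the field's default value.
            overrides[f.name] = type(f.default)(raw)
    return DemoSettings(**overrides)


os.environ["ROOTSET_DB_PATH"] = "/tmp/demo-index.db"
print(get_demo_settings().db_path)   # value taken from the environment
print(DemoSettings().db_path)        # explicit construction ignores it
```

Because the loader is cached, later changes to the environment are not picked up until the cache is cleared, which is worth keeping in mind in tests.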

Key settings:

Field                     Default     Description
------------------------  ----------  --------------------------------------
db_path                   rootset.db  Path to the SQLite index file
embedding_provider        local       local, voyage, or openai
voyage_api_key            None        Activates Voyage provider if set
openai_api_key            None        Activates OpenAI provider if set
reindex_debounce_seconds  1.5         Debounce delay for notify_file_changed

Usage

import asyncio
from rootset import Repository, SearchMode, Settings

async def main():
    repo = Repository(Settings(db_path="myindex.db"))

    async with repo:
        # Index a repository
        stats = await repo.index("/path/to/project")
        print(stats)  # IndexStats(files_indexed=..., symbols_indexed=..., ...)

        # Search
        results = await repo.search("function that builds call graph")
        for r in results:
            print(r.symbol.qualified_name, r.score)

        # Explicit search mode
        results = await repo.search("build", mode=SearchMode.TEXT)
        results = await repo.search("build", mode=SearchMode.SEMANTIC)
        results = await repo.search("build", mode=SearchMode.HYBRID)

        # Symbol lookup
        sym = await repo.get_symbol("mymodule.MyClass.my_method")

        # Call graph traversal
        callers = await repo.find_callers("GraphIndexer.build", depth=2)
        callees = await repo.find_callees("GraphIndexer.build", depth=1)

        # Rich context for a symbol
        ctx = await repo.get_context("Repository.index")
        print(ctx.callers, ctx.callees, ctx.import_chain)

        # S-expression structural query
        symbols = await repo.structural_query(
            "(function_definition name: (identifier) @name)", language="python"
        )

        # LLM reranking over hybrid results
        results = await repo.reasoning_search("parse and index source files")

asyncio.run(main())

Incremental reindex

After writing a file, notify the repository to schedule a debounced reindex:

# synchronous — safe to call from a write_file tool handler
repo.notify_file_changed("/path/to/project/src/foo.py")

Rapid successive calls for the same path cancel the previous pending reindex; only one fires after reindex_debounce_seconds.
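The cancel-and-reschedule behaviour described above can be sketched with one timer per path. This is a thread-based illustration under assumptions, not how rootset schedules work internally: the Debouncer class and its notify method are hypothetical names, and the delay is shortened so the example finishes quickly.

```python
import threading
import time


class Debouncer:
    """Run callback(path) once per burst of notify(path) calls (hypothetical)."""

    def __init__(self, delay, callback):
        self._delay = delay
        self._callback = callback
        self._timers = {}              # path -> pending threading.Timer
        self._lock = threading.Lock()

    def notify(self, path):
        """Synchronous: cancel any pending timer for path and start a new one."""
        with self._lock:
            pending = self._timers.pop(path, None)
            if pending is not None:
                pending.cancel()
            timer = threading.Timer(self._delay, lambda: self._fire(path, timer))
            timer.daemon = True
            self._timers[path] = timer
            timer.start()

    def _fire(self, path, timer):
        with self._lock:
            # A newer notify() may have superseded this timer; if so, do nothing.
            if self._timers.get(path) is not timer:
                return
            del self._timers[path]
        self._callback(path)


reindexed = []
debouncer = Debouncer(0.05, reindexed.append)
for _ in range(5):
    debouncer.notify("src/foo.py")     # a rapid burst for one path
time.sleep(0.3)
print(reindexed)                       # one entry for src/foo.py
```

Keeping a separate timer per path means a burst of writes to one file does not delay or cancel a pending reindex of another.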


Public API

Repository

Method                            Returns             Description
--------------------------------  ------------------  -----------------------------------------------------------------
index(path, *, incremental=True)  IndexStats          Index a repository. Skips unchanged files when incremental=True.
search(query, *, top_k, mode)     list[SearchResult]  Search indexed symbols. Default mode is HYBRID.
reasoning_search(query, *, top_k) list[SearchResult]  Hybrid search followed by LLM reranking.
get_symbol(qualified_name)        Symbol | None       Exact qualified-name lookup.
find_callers(symbol_name, depth)  list[Symbol]        BFS inbound from call graph.
find_callees(symbol_name, depth)  list[Symbol]        BFS outbound from call graph.
get_context(symbol_name)          CodeContext         Callers, callees, related symbols, import chain.
structural_query(pattern, language) list[Symbol]      Tree-sitter S-expression query across all indexed files.
notify_file_changed(path)         None                Schedule debounced single-file reindex (synchronous).
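The depth-limited BFS behind find_callers and find_callees can be sketched over a plain edge list. This is a simplified illustration, not rootset's code (which builds its graph with rustworkx): the function bfs_neighbors, its inbound flag, and the sample call edges are assumptions for the example.

```python
from collections import deque


def bfs_neighbors(edges, start, depth, inbound=False):
    """Return symbols reachable from start within depth hops, in BFS order.

    edges is a list of (caller, callee) pairs. inbound=True follows edges
    backwards (callers); inbound=False follows them forwards (callees).
    """
    adjacency = {}
    for caller, callee in edges:
        src, dst = (callee, caller) if inbound else (caller, callee)
        adjacency.setdefault(src, []).append(dst)

    seen = {start}
    found = []
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == depth:
            continue                   # do not expand past the depth limit
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                found.append(nxt)
                queue.append((nxt, dist + 1))
    return found


# Hypothetical call edges: caller -> callee.
calls = [("cli", "main"), ("main", "index"), ("index", "parse"), ("index", "build")]

print(bfs_neighbors(calls, "build", depth=2, inbound=True))   # callers of build
print(bfs_neighbors(calls, "index", depth=1))                 # callees of index
```

The seen set keeps recursive or mutually recursive functions from being visited twice, so the traversal terminates on cyclic call graphs.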

SearchMode

TEXT · SEMANTIC · STRUCTURAL · GRAPH · HYBRID · REASONING

Return types

  • Symbol: id, qualified_name, kind (SymbolKind), line_start, line_end, signature, docstring, content
  • SearchResult: symbol, score, search_type, explanation
  • CodeContext: symbol, definition_file, callers, callees, related_symbols, import_chain
  • IndexStats: files_indexed, symbols_indexed, call_edges_indexed, import_edges_indexed, files_skipped

License

MIT
