Local-first codebase intelligence with semantic search, multi-hop research, and 12-language AST support

Sia Code

Local-first codebase search with semantic understanding and multi-hop code discovery.

Features

  • Semantic Search - Natural language queries with OpenAI embeddings (auto-fallback to lexical)
  • Multi-Hop Research - Automatically discover code relationships and call graphs
  • Natural Language Questions - Ask "How does X work?" and get relevant code
  • Project Auto-Detection - Automatic language detection and indexing strategy
  • Tiered Search - Filter by project code, dependencies, or both
  • 12 Languages - Python, JS/TS, Go, Rust, Java, C/C++, C#, Ruby, PHP (full AST support)
  • Interactive Mode - Live search with result navigation and export
  • Watch Mode - Auto-reindex on file changes
  • Portable - Single .mv2 file storage, no database required

Installation

# From PyPI (recommended)
pip install sia-code

# Or with uv
uv tool install sia-code

# Or from source
uv tool install git+https://github.com/DxTa/sia-code.git

# Try without installing (ephemeral run)
uvx sia-code --version
uvx sia-code search "authentication logic"

# Verify installation
sia-code --version

Quick Start

# Initialize and index
sia-code init
sia-code index .

# Search
sia-code search "authentication logic"           # Semantic search
sia-code search --regex "def.*login"             # Regex search

# Multi-hop research (discover relationships)
sia-code research "how does the API handle errors?"

# Check index health
sia-code status

Commands

| Command | Description |
| --- | --- |
| sia-code init | Initialize index in current directory |
| sia-code init --dry-run | Preview project analysis without indexing |
| sia-code index . | Index codebase (first time) |
| sia-code index --update | Re-index only changed files (10x faster) |
| sia-code index --clean | Full rebuild from scratch |
| sia-code index --watch | Auto-reindex on file changes |
| sia-code index --parallel | Use parallel processing (100+ files) |
| sia-code search "query" | Semantic or regex search |
| sia-code search --no-deps | Exclude dependency code from results |
| sia-code search --deps-only | Show only dependency code |
| sia-code research "question" | Multi-hop code discovery |
| sia-code research --hops N | Set max relationship depth (default: 2) |
| sia-code research --graph | Show call graph visualization |
| sia-code interactive | Live search mode with result navigation |
| sia-code status | Index health and staleness metrics |
| sia-code compact | Remove stale chunks when index grows |
| sia-code config show | View configuration |
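
The --hops option bounds how far the research command follows relationships. As a rough illustration of what a depth limit means, here is a minimal breadth-first traversal over a call graph. The graph, the symbol names, and the multi_hop function are hypothetical; this is a sketch of the general idea, not sia-code's implementation.

```python
from collections import deque

# Hypothetical call graph: each symbol maps to the symbols it calls.
CALL_GRAPH = {
    "handle_request": ["authenticate", "render"],
    "authenticate": ["load_user"],
    "load_user": ["query_db"],
    "render": [],
}

def multi_hop(graph, start, max_hops=2):
    """Return {symbol: hop_distance} for symbols within max_hops of start."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # depth limit reached, do not expand further
        for callee in graph.get(node, ()):
            if callee not in seen:
                seen[callee] = seen[node] + 1
                queue.append(callee)
    return seen

if __name__ == "__main__":
    for symbol, hops in multi_hop(CALL_GRAPH, "handle_request").items():
        print(hops, symbol)
```

With the default of two hops, query_db (three calls away from handle_request) is not reached; raising max_hops to 3 includes it.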

Configuration

Semantic search requires an OpenAI API key. Setting one is optional:

export OPENAI_API_KEY=sk-your-key-here
sia-code init
sia-code index .

Without an API key: Searches automatically fall back to lexical/regex mode. No crashes.
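
The fallback behaviour can be pictured as a simple check on the environment. The sketch below is an assumption about the shape of that decision, not code taken from sia-code; pick_search_mode is a hypothetical helper, and only the OPENAI_API_KEY variable name comes from this README.

```python
import os

def pick_search_mode(env=None):
    """Return "semantic" when an API key is set, otherwise "lexical"."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    return "semantic" if key else "lexical"

if __name__ == "__main__":
    print(pick_search_mode())
```

Passing a plain dict instead of the real environment makes the decision easy to check in isolation.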

Edit config at .sia-code/config.json to:

  • Change embedding model (openai-small, openai-large, bge-small)
  • Exclude patterns (node_modules/, __pycache__/, etc.)
  • Adjust chunk sizes

View config: sia-code config show

Output Formats

sia-code search "query" --format json            # JSON output
sia-code search "query" --format table           # Rich table
sia-code search "query" --format csv             # CSV for Excel
sia-code search "query" --output results.json    # Save to file
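
When calling the CLI from a script, it can help to assemble the argument list in one place. The helper below is a hypothetical convenience wrapper that uses only the search flags shown in this README (--regex, --no-deps, --deps-only, --format, --output). Whether these flags can all be combined in one invocation is an assumption.

```python
import shlex

def build_search_command(query, *, regex=False, scope="all", fmt=None, output=None):
    """Build an argv list for `sia-code search` from documented flags."""
    if scope not in ("all", "no-deps", "deps-only"):
        raise ValueError(f"unknown scope: {scope!r}")
    argv = ["sia-code", "search"]
    if regex:
        argv.append("--regex")
    argv.append(query)
    if scope != "all":
        argv.append(f"--{scope}")  # --no-deps or --deps-only
    if fmt:
        argv += ["--format", fmt]
    if output:
        argv += ["--output", output]
    return argv

if __name__ == "__main__":
    argv = build_search_command("authentication logic", fmt="json")
    print(shlex.join(argv))
```

The resulting list can be passed to subprocess.run(argv, check=True) once sia-code is installed and the project is indexed.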

Supported Languages

Full AST Support (12): Python, JavaScript, TypeScript, JSX, TSX, Go, Rust, Java, C, C++, C#, Ruby, PHP

Recognized: Kotlin, Groovy, Swift, Bash, Vue, Svelte, and more (indexed as text)
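
To give a feel for the difference between AST-aware indexing and indexing as text, the sketch below uses Python's built-in ast module as a stand-in for Tree-sitter. The sample source and both helper functions are hypothetical, and sia-code's actual chunking is not shown here.

```python
import ast

SOURCE = '''
import os

def login(user):
    return user.is_active

class Session:
    def close(self):
        pass
'''

def ast_chunks(source):
    """Return (name, text) for each top-level function or class."""
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            text = "\n".join(lines[node.lineno - 1 : node.end_lineno])
            chunks.append((node.name, text))
    return chunks

def text_chunks(source, size=3):
    """Split source into fixed-size line windows, ignoring syntax."""
    lines = source.splitlines()
    return ["\n".join(lines[i : i + size]) for i in range(0, len(lines), size)]

if __name__ == "__main__":
    for name, text in ast_chunks(SOURCE):
        print(f"--- {name} ---\n{text}")
```

The AST-based chunks line up with whole definitions, while the fixed-size windows can cut a function or class in the middle.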

Troubleshooting

| Issue | Solution |
| --- | --- |
| No API key warning | Normal; searches fall back to lexical mode |
| Index growing large | Run sia-code compact to remove stale chunks |
| Slow indexing | Use sia-code index --update for incremental re-indexing |
| Stale search results | Run sia-code index --clean to rebuild |
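
Incremental re-indexing generally depends on knowing which files changed since the last run. One common way to detect that is to compare content hashes against a saved snapshot, sketched below. How sia-code actually tracks changes is not described in this README, so file_digest and changed_files are purely illustrative.

```python
import hashlib
from pathlib import Path

def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def changed_files(paths, previous):
    """Compare files against a previous {path: digest} snapshot.

    Returns (changed_paths, current_snapshot).
    """
    current = {str(p): file_digest(p) for p in paths}
    changed = sorted(p for p, digest in current.items() if previous.get(p) != digest)
    return changed, current

if __name__ == "__main__":
    files = sorted(Path(".").rglob("*.py"))
    changed, snapshot = changed_files(files, previous={})
    print(f"{len(changed)} file(s) would be re-indexed")
```

On a first run the snapshot is empty, so every file counts as changed; later runs only report files whose contents differ.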

How It Works

  1. Parse - Tree-sitter generates AST for each file
  2. Chunk - cAST algorithm creates semantic chunks (functions, classes)
  3. Embed - Optional OpenAI embeddings for semantic search
  4. Store - Single portable .mv2 file with Memvid
  5. Search - Hybrid BM25 + vector similarity
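
The final step combines a lexical score with a vector score. The sketch below shows one simple way to do that: a small BM25 scorer, cosine similarity over toy vectors, and a weighted sum. The weighting scheme, the alpha parameter, and the example documents and vectors are all assumptions made for illustration; the fusion method sia-code uses is not stated here.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_rank(query, query_vec, docs, doc_vecs, alpha=0.5):
    """Rank document indices by alpha * lexical + (1 - alpha) * vector score."""
    if not docs:
        return []
    lexical = bm25_scores(query, docs)
    top = max(lexical) or 1.0  # normalize BM25 into [0, 1]
    combined = [
        alpha * lexical[i] / top + (1 - alpha) * cosine(query_vec, doc_vecs[i])
        for i in range(len(docs))
    ]
    return sorted(range(len(docs)), key=lambda i: combined[i], reverse=True)

if __name__ == "__main__":
    docs = ["def login user password", "def render template", "def logout user"]
    vecs = [[1.0, 0.0], [0.0, 1.0], [0.8, 0.6]]  # toy embeddings
    print(hybrid_rank("login", [1.0, 0.0], docs, vecs))
```

In this toy example the third document contains no query term but still ranks above the second because its vector is close to the query vector, which is the kind of result a purely lexical search would miss.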

License

MIT

