Local-first codebase intelligence with semantic search, multi-hop research, and 12-language AST support

Sia Code

v0.2 - Local-first codebase search with semantic understanding and multi-hop code discovery.

Features

  • Semantic Search - Natural language queries with OpenAI embeddings (auto-fallback to lexical)
  • Multi-Hop Research - Automatically discover code relationships and call graphs
  • 12 Languages - Python, JS/TS, Go, Rust, Java, C/C++, C#, Ruby, PHP (full AST support)
  • Interactive Mode - Live search with result navigation and export
  • Watch Mode - Auto-reindex on file changes
  • Portable - Single .mv2 file storage, no database required

Installation

# From PyPI (recommended)
pip install sia-code

# Or with uv
uv tool install sia-code

# Or from source
uv tool install git+https://github.com/DxTa/sia-code.git

# Verify installation
sia-code --version

Quick Start

# Initialize and index
sia-code init
sia-code index .

# Search
sia-code search "authentication logic"           # Semantic search
sia-code search --regex "def.*login"             # Regex search

# Multi-hop research (discover relationships)
sia-code research "how does the API handle errors?"

# Check index health
sia-code status

Commands

Command                         Description
sia-code init                   Initialize index in current directory
sia-code index .                Index codebase (first time)
sia-code index --update         Re-index only changed files (10x faster)
sia-code index --clean          Full rebuild from scratch
sia-code index --watch          Auto-reindex on file changes
sia-code search "query"         Semantic or regex search
sia-code research "question"    Multi-hop code discovery with --graph
sia-code interactive            Live search mode with result navigation
sia-code status                 Index health and staleness metrics
sia-code compact                Remove stale chunks when index grows
sia-code config show            View configuration

Configuration

Semantic search requires an OpenAI API key. The key is optional:

export OPENAI_API_KEY=sk-your-key-here
sia-code init
sia-code index .

Without an API key, searches automatically fall back to lexical/regex mode, and nothing crashes.
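
The fallback rule can be pictured with a small sketch. This is not sia-code's own code: the function name choose_search_mode and the mode labels "semantic" and "lexical" are hypothetical, and the only thing taken from this page is that the OPENAI_API_KEY environment variable decides whether embeddings are available.

```python
import os


def choose_search_mode(env=None):
    """Pick a search mode from the environment (hypothetical helper).

    Returns "semantic" when OPENAI_API_KEY is set to a non-empty value,
    otherwise "lexical". The labels are illustrative, not sia-code's.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    return "semantic" if key else "lexical"


if __name__ == "__main__":
    # Reports which mode the current shell environment would select.
    print(choose_search_mode())
```

Passing a plain dict as env makes the rule easy to check without touching the real environment.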

Edit config at .sia-code/config.json to:

  • Change embedding model (openai-small, openai-large, bge-small)
  • Exclude patterns (node_modules/, __pycache__/, etc.)
  • Adjust chunk sizes

View config: sia-code config show
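
To inspect the config file from a script, a minimal sketch along these lines may help. It assumes only what is stated above, namely that the file lives at .sia-code/config.json and contains JSON. The helper name load_config is hypothetical, and no particular key names are assumed, since the schema is not documented here.

```python
import json
from pathlib import Path


def load_config(project_root="."):
    """Read .sia-code/config.json under project_root and return the parsed JSON.

    Hypothetical helper: it makes no assumption about which keys exist.
    """
    path = Path(project_root) / ".sia-code" / "config.json"
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)


if __name__ == "__main__":
    config = load_config()
    # List the top-level settings without assuming their names.
    for key in sorted(config):
        print(key)
```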

Output Formats

sia-code search "query" --format json            # JSON output
sia-code search "query" --format table           # Rich table
sia-code search "query" --format csv             # CSV for Excel
sia-code search "query" --output results.json    # Save to file
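
The JSON format is convenient for scripting. The sketch below shells out to the CLI and decodes the result. It assumes that --format json writes a single JSON document to standard output and nothing else, which this page does not confirm. The helper names are hypothetical, and the shape of the decoded data is deliberately left unspecified.

```python
import json
import subprocess


def build_search_command(query, fmt="json"):
    """Build the argument list for a sia-code search (flags as listed above)."""
    return ["sia-code", "search", query, "--format", fmt]


def run_search(query):
    """Run the search and decode stdout as JSON.

    Requires sia-code on PATH and an existing index. Assumes stdout holds
    only JSON; raises CalledProcessError if the command exits non-zero.
    """
    proc = subprocess.run(
        build_search_command(query),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(proc.stdout)


if __name__ == "__main__":
    results = run_search("authentication logic")
    print(type(results).__name__)
```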

Supported Languages

Full AST Support (12): Python, JavaScript, TypeScript, JSX, TSX, Go, Rust, Java, C, C++, C#, Ruby, PHP

Recognized: Kotlin, Groovy, Swift, Bash, Vue, Svelte, and more (indexed as text)

Troubleshooting

Issue                   Solution
No API key warning      Normal - searches fall back to lexical mode
Index growing large     Run sia-code compact to remove stale chunks
Slow indexing           Use sia-code index --update for incremental indexing
Stale search results    Run sia-code index --clean to rebuild

How It Works

  1. Parse - Tree-sitter generates AST for each file
  2. Chunk - cAST algorithm creates semantic chunks (functions, classes)
  3. Embed - Optional OpenAI embeddings for semantic search
  4. Store - Single portable .mv2 file with Memvid
  5. Search - Hybrid BM25 + vector similarity
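
The Parse and Chunk steps can be illustrated with a rough sketch. It uses Python's built-in ast module as a stand-in for Tree-sitter and simply emits one chunk per top-level function or class. It is not the cAST algorithm and not sia-code's implementation; the function name chunk_python_source and the chunk fields are hypothetical.

```python
import ast


def chunk_python_source(source):
    """Split Python source into one chunk per top-level function or class.

    Stand-in for AST-based chunking: uses the stdlib ast module rather than
    Tree-sitter, and ignores nested definitions and module-level statements.
    """
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # lineno and end_lineno are 1-based and inclusive (Python 3.8+).
            text = "\n".join(lines[node.lineno - 1 : node.end_lineno])
            chunks.append(
                {
                    "name": node.name,
                    "start": node.lineno,
                    "end": node.end_lineno,
                    "text": text,
                }
            )
    return chunks


SAMPLE = '''import os

def login(user):
    return user

class Session:
    def close(self):
        pass
'''

if __name__ == "__main__":
    for chunk in chunk_python_source(SAMPLE):
        print(chunk["name"], chunk["start"], chunk["end"])
```

Each chunk keeps its name and line range, which is the kind of metadata a search result would need in order to point back at the source file.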

License

MIT
