Local semantic search CLI tool for codebases using embeddings

Odino: Local Semantic Search CLI

A fast local semantic search tool that helps you find code using natural language queries. After a one-time model download, everything runs locally with the embeddinggemma-300m model; no internet connection is needed for indexing or search.

Quick Start

Install Odino directly from PyPI:

pip install odino

Or install from source:

git clone https://github.com/cesp99/odino.git
cd odino
pip install -e .

For detailed installation instructions, including uninstallation and troubleshooting, see INSTALL.md.

Usage

Index your codebase

# Index current directory
odino index .

# Index specific directory
odino index /path/to/project

# Index with custom model (optional)
odino index /path/to/project --model <your-own-model>

Search your code

# Basic search (returns 2 results by default)
odino -q "function that handles user authentication"

# Search with custom number of results
odino -q "database connection" -r 10

# Search specific file types
odino -q "error handling" --include "*.py"

Check status

odino status

Examples

Find authentication code:

odino -q "user login function"

Search for database queries:

odino -q "sql select statement" --include "*.sql"

Find error handling patterns:

odino -q "try catch exception handling"
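If you want to call these searches from a script, the sketch below assembles the command line in Python. It uses only the flags shown above (-q, -r, --include). The helper names build_search_command and run_search are made up for this example and are not part of Odino, and nothing is assumed about the format of the output.

```python
import shlex
import subprocess
from typing import List, Optional


def build_search_command(query: str,
                         results: Optional[int] = None,
                         include: Optional[str] = None) -> List[str]:
    """Build an argument list for `odino -q` using only the documented flags."""
    cmd = ["odino", "-q", query]
    if results is not None:
        cmd += ["-r", str(results)]
    if include is not None:
        cmd += ["--include", include]
    return cmd


def run_search(query: str, **kwargs) -> str:
    """Run the search and return Odino's raw output (format not assumed here)."""
    completed = subprocess.run(
        build_search_command(query, **kwargs),
        capture_output=True, text=True, check=True,
    )
    return completed.stdout


# Show the command that would be executed, without actually running it.
print(shlex.join(build_search_command("sql select statement", results=5, include="*.sql")))
```

Passing the arguments as a list avoids shell quoting problems with patterns such as "*.sql".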

Project Structure

odino/
├── odino/
│   ├── __init__.py
│   ├── cli.py              # CLI entry point
│   ├── indexer.py          # File indexing logic
│   ├── searcher.py         # Semantic search implementation
│   └── utils.py            # Utility functions
├── pyproject.toml          # Project configuration
├── README.md              # This file
└── .odinoignore           # Default ignore patterns

Configuration

Odino creates a .odino/ directory in your project root with:

  • config.json - Configuration settings
  • chroma_db/ - Vector database storage
  • indexed_files.json - File tracking metadata

Default configuration:

{
  "model_name": "EmmanuelEA/eea-embedding-gemma",
  "chunk_size": 512,
  "chunk_overlap": 50,
  "max_results": 2,
  "embedding_batch_size": 32,
  "device_preference": "auto"
}
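Since config.json is plain JSON, it can be edited by hand or from a small script. The following is a minimal sketch, assuming the file contains exactly the keys listed above; apply_overrides and update_config are hypothetical helpers, not functions shipped with Odino.

```python
import json
from pathlib import Path

# Default values copied from the README; treat them as illustrative.
DEFAULT_CONFIG = {
    "model_name": "EmmanuelEA/eea-embedding-gemma",
    "chunk_size": 512,
    "chunk_overlap": 50,
    "max_results": 2,
    "embedding_batch_size": 32,
    "device_preference": "auto",
}


def apply_overrides(settings: dict, **overrides) -> dict:
    """Return a copy of `settings` with `overrides` applied, rejecting unknown keys."""
    unknown = set(overrides) - set(settings)
    if unknown:
        # Refuse keys that are not already present, so typos are not silently added.
        raise KeyError(f"unknown config keys: {sorted(unknown)}")
    merged = dict(settings)
    merged.update(overrides)
    return merged


def update_config(project_root: str, **overrides) -> dict:
    """Read .odino/config.json, apply overrides, and write the result back."""
    config_path = Path(project_root) / ".odino" / "config.json"
    settings = json.loads(config_path.read_text(encoding="utf-8"))
    merged = apply_overrides(settings, **overrides)
    config_path.write_text(json.dumps(merged, indent=2) + "\n", encoding="utf-8")
    return merged
```

For example, update_config(".", max_results=5) would raise the default number of results for the project in the current directory.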

How It Works

  1. Indexing: Scans your codebase, chunks files, and generates embeddings using the embeddinggemma-300m model
  2. Storage: Saves embeddings locally in a ChromaDB vector database
  3. Search: Converts your natural language query to embeddings and finds semantically similar code
  4. Results: Displays file paths, similarity scores, and code snippets
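To make the chunking and search steps more concrete, here is a simplified sketch in plain Python. It is not Odino's implementation. It assumes chunk_size and chunk_overlap are measured in characters (the real unit may be tokens or lines) and that similarity is cosine similarity (the metric used by the ChromaDB index is not stated here). The function names are made up for this example.

```python
import math
from typing import List, Sequence, Tuple


def chunk_text(text: str, chunk_size: int = 512, chunk_overlap: int = 50) -> List[str]:
    """Split text into windows of chunk_size that overlap by chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_chunks(query_vec: Sequence[float],
                chunk_vecs: Sequence[Sequence[float]],
                max_results: int = 2) -> List[Tuple[int, float]]:
    """Return (chunk index, score) pairs for the most similar chunks, best first."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:max_results]
```

The overlap means a function that straddles a chunk boundary still appears in full in at least one chunk, as long as it is shorter than the overlap region plus the rest of the window.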

Features

  • Local Processing: Indexing and search run offline once the model has been downloaded
  • Fast Indexing: embeddinggemma-300m model optimized for speed
  • Smart Chunking: Handles large files by splitting into manageable chunks
  • Beautiful Output: Rich console formatting with syntax highlighting
  • Incremental Updates: Only reindexes changed files
  • Flexible Filtering: Search by file type, limit results, custom patterns
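Incremental updates are commonly implemented by comparing content hashes between runs. The sketch below illustrates that idea only. How Odino actually detects changes, and the layout of indexed_files.json, are not described here, so the path-to-digest mapping and all function names are assumptions.

```python
import hashlib
from pathlib import Path
from typing import Dict, List, Tuple


def digest(data: bytes) -> str:
    """Content hash used to detect modified files."""
    return hashlib.sha256(data).hexdigest()


def snapshot(root: Path) -> Dict[str, str]:
    """Map each file under root (relative path) to the digest of its contents."""
    return {
        str(p.relative_to(root)): digest(p.read_bytes())
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff_snapshots(current: Dict[str, str],
                   previous: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Return (paths to index again, paths to drop from the index)."""
    to_index = sorted(p for p, h in current.items() if previous.get(p) != h)
    to_remove = sorted(p for p in previous if p not in current)
    return to_index, to_remove
```

Only the paths in the first list need new embeddings; unchanged files keep their stored vectors.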

Advanced Usage

Custom Ignore Patterns

Create a .odinoignore file in your project root:

# Ignore specific directories
build/
dist/
node_modules/

# Ignore file patterns
*.log
*.tmp
*.cache
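To preview roughly which files such a file would exclude, the sketch below applies a simplified matching rule using Python's fnmatch. The exact matching semantics of .odinoignore (for example, whether it follows full gitignore rules) are not specified here, so this is an approximation; parse_ignore_file and is_ignored are hypothetical names.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath
from typing import Iterable, List


def parse_ignore_file(text: str) -> List[str]:
    """Return the patterns in an ignore file, skipping blank lines and # comments."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def is_ignored(rel_path: str, patterns: Iterable[str]) -> bool:
    """Simplified check: 'name/' matches a parent directory, anything else the file name."""
    path = PurePosixPath(rel_path)
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: ignore anything located under a directory with that name.
            if pattern.rstrip("/") in path.parts[:-1]:
                return True
        elif fnmatch(path.name, pattern):
            return True
    return False
```

With the patterns above, is_ignored("build/out.o", patterns) and is_ignored("src/app.log", patterns) are both true, while "src/main.py" is kept.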

Force Reindex

odino index . --force

Troubleshooting

Model Download Issues

The embeddinggemma-300m model downloads automatically on first use. Ensure you have:

  • Stable internet connection for initial download
  • Sufficient disk space (~300MB for model)

Permission Errors

Make sure you have read permissions for files you want to index and write permissions for the .odino/ directory.

Memory Issues

For very large codebases, consider:

  • Reducing chunk_size in .odino/config.json
  • Excluding large directories with .odinoignore
  • Indexing in batches

MPS (Apple Silicon) Memory Issues

If you encounter MPS backend out of memory errors on Apple Silicon:

  1. Reduce batch size in your .odino/config.json:
{
  "embedding_batch_size": 16,
  "device_preference": "auto"
}
  2. Force CPU usage for stable processing:
{
  "device_preference": "cpu"
}
  3. Use smaller batch sizes if memory issues persist:
{
  "embedding_batch_size": 8
}

The system automatically handles MPS memory management with:

  • Automatic batch processing in configurable sizes
  • MPS memory clearing after each batch
  • Automatic CPU fallback when MPS runs out of memory
  • Smart device selection based on availability

For advanced memory management configuration and more detailed troubleshooting, see MEMORY_MANAGEMENT.md.
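The idea of batch processing with a fallback can be illustrated with a short, dependency-free sketch. This is not Odino's code: instead of switching to CPU, it halves the batch size whenever the embedding call raises RuntimeError (the exception type PyTorch uses for MPS out-of-memory errors). embed_in_batches and embed_fn are hypothetical names.

```python
from typing import Callable, List, Sequence


def embed_in_batches(items: Sequence,
                     embed_fn: Callable[[Sequence], List],
                     batch_size: int = 32,
                     min_batch_size: int = 1) -> List:
    """Embed items batch by batch, halving the batch size after a RuntimeError."""
    results: List = []
    start = 0
    while start < len(items):
        batch = items[start:start + batch_size]
        try:
            results.extend(embed_fn(batch))
        except RuntimeError:
            if batch_size <= min_batch_size:
                raise  # cannot shrink further; let the caller decide what to do
            # Retry the same position with a smaller batch.
            batch_size = max(min_batch_size, batch_size // 2)
            continue
        start += len(batch)
    return results
```

Here embed_fn stands in for whatever function turns a list of chunks into a list of vectors. Lowering embedding_batch_size in the configuration has a similar effect to what this loop does automatically.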

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request

License

This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.
