Skip to main content

LLM-Powered Code Snippet Manager with vector search

Project description

SnipVault

Because grep isn't enough anymore

A code snippet manager that actually understands what you're looking for. Search your snippets using natural language, not exact keyword matches.

Built with PostgreSQL for storage, Pinecone for vector search, and Google Gemini/OpenAI for embeddings and query understanding.

What it does

You know that moment when you remember writing some useful code but can't find it? SnipVault fixes that.

Instead of searching for exact keywords, you can search like: "that react hook that stores data in localStorage" and it'll find your useLocalStorage hook even if you never used those exact words in the snippet.

The search combines semantic understanding (via vector embeddings) with traditional keyword matching. It also has fuzzy search for typos, ranking based on relevance/recency, and can show you related snippets you might have forgotten about.

Features

Search & Discovery:

  • Natural language search - describe what you need, get what you meant
  • Hybrid search combining semantic + keyword matching
  • Automatic typo correction
  • Related snippet suggestions
  • Smart ranking that considers relevance, recency, and quality

Managing Snippets:

  • Standard CRUD operations
  • Interactive mode for multi-line code (opens your $EDITOR)
  • Bulk indexing - point it at a directory and it'll extract functions/classes
  • Auto-tagging based on code content
  • Export/Import as JSON or Markdown
  • Copy snippets directly to clipboard

AI Providers:

  • Google Gemini (default) - text-embedding-004 + gemini-2.5-flash
  • OpenAI - text-embedding-3-small/large + GPT-4o-mini
  • Local models - 5 different sentence-transformers models, fully offline
  • Automatic fallback from cloud to local
  • Smart caching to save on API costs (reduces calls by 80%+)

Other Stuff:

  • GitHub integration - import entire repos or individual gists
  • Usage analytics and API cost tracking
  • Backup/restore with vectors included
  • Works with PostgreSQL or SQLite
  • Connection pooling and caching for performance

Installation

From PyPI (easiest):

pip install snipvault
snipvault init

Docker:

git clone https://github.com/yourusername/snipvault.git
cd snipvault
docker-compose up -d
docker exec -it snipvault-app snipvault init

From source:

git clone https://github.com/yourusername/snipvault.git
cd snipvault
pip install -r requirements.txt
python main.py init

Configuration

You need to set up a few things before using it:

  1. Database - Either PostgreSQL or SQLite

    For PostgreSQL:

    sudo apt-get install postgresql
    createdb snipvault
    

    For SQLite: nothing to do, it'll create the file automatically

  2. Environment variables - Create a .env file:

    # If using PostgreSQL
    POSTGRES_HOST=localhost
    POSTGRES_DB=snipvault
    POSTGRES_USER=your_user
    POSTGRES_PASSWORD=your_password
    
    # For vector search (optional, can use local embeddings)
    PINECONE_API_KEY=your_key
    PINECONE_ENVIRONMENT=us-east-1-aws
    
    # Pick one AI provider (or use local models)
    GEMINI_API_KEY=your_key     # Get from https://ai.google.dev/
    OPENAI_API_KEY=your_key     # Get from https://platform.openai.com/
    
    # For GitHub features (optional)
    GITHUB_TOKEN=your_token
    
  3. Config file (optional) - Edit ~/.snipvault/config.yaml:

    embeddings:
      provider: gemini  # or openai, or local
    
    llm:
      provider: gemini
    
    cache:
      enabled: true
      ttl: 86400
    
    database:
      backend: postgresql  # or sqlite
    

Running fully local: Set provider: local in the config and don't worry about API keys. Models will download automatically (~100MB) on first use.

Usage

Add a snippet:

# Interactive mode (opens your editor)
snipvault add --interactive

# Or inline
snipvault add \
  --title "FizzBuzz" \
  --code "for i in range(1,101): print('Fizz'*(i%3==0)+'Buzz'*(i%5==0) or i)" \
  --lang python \
  --tags algorithm,fizzbuzz

Search:

# Simple search
snipvault search "payment processing API"

# With filters
snipvault search "sorting algorithm" --lang python --tags algorithm

# Hybrid mode (semantic + exact keywords)
snipvault search "react hooks" --hybrid

# Pagination
snipvault search "database" --top 20 --page 2

List everything:

snipvault list
snipvault list --verbose  # shows full code

View/edit/delete:

snipvault show 5
snipvault show 5 --copy           # copy to clipboard
snipvault update 5 --edit         # edit in $EDITOR
snipvault update 5 --title "New"
snipvault delete 5

Bulk operations:

# Index an entire directory
snipvault index ~/projects/myapp

# With auto-tagging
snipvault index ~/code --auto-tag --exclude "node_modules,venv"

# Export/import
snipvault export snippets.json
snipvault export snippets.md --format markdown
snipvault import snippets.json

GitHub integration:

snipvault github-import user/repo
snipvault github-import user/repo --path src/utils

snipvault gist list
snipvault gist import gist_id
snipvault gist export 5 --public

Stats and backup:

snipvault stats
snipvault stats --days 30 --show-costs

snipvault backup create
snipvault backup create --include-vectors
snipvault backup restore backup-2025-11-09.tar.gz

How it works

When you add a snippet:

  1. Saves metadata (title, code, language, tags) to PostgreSQL/SQLite
  2. Generates a 768-dimension vector embedding of the combined text
  3. Stores the embedding in Pinecone with the snippet ID

When you search:

  1. Your query gets enhanced by the LLM (adds synonyms, related terms)
  2. Enhanced query converts to a 768-dim embedding
  3. Pinecone finds the most similar vectors using cosine similarity
  4. PostgreSQL fetches the full snippet details
  5. Results get ranked and displayed with syntax highlighting

The cache layer sits in front of the embedding API, so repeated queries are basically free.

Architecture

User Query
    ↓
Gemini LLM (enhance query)
    ↓
Gemini Embeddings (768 dims)
    ↓
Pinecone (vector similarity)
    ↓ (returns snippet IDs)
PostgreSQL (fetch metadata)
    ↓
Display with Rich

Performance

  • Add snippet: ~200ms (including embedding generation)
  • Search (cached): ~50ms
  • Search (API): ~500ms
  • Index 100 files: ~30s

Cache hit rate is around 85% for common queries, which saves about 80% of API costs.

Supports 60+ languages including Python, JavaScript, TypeScript, Java, C++, C#, Go, Rust, Ruby, PHP, Swift, Kotlin, Scala, SQL, and more.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

snipvault-1.0.0.tar.gz (80.7 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

snipvault-1.0.0-py3-none-any.whl (91.8 kB view details)

Uploaded Python 3

File details

Details for the file snipvault-1.0.0.tar.gz.

File metadata

  • Download URL: snipvault-1.0.0.tar.gz
  • Upload date:
  • Size: 80.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for snipvault-1.0.0.tar.gz
Algorithm Hash digest
SHA256 f66f4b1b8add4d5305d2fe2afe3a78317dfc3867d1d17cf02b7531d78ae5475d
MD5 66ca2153986d49c7dedd145bed496e50
BLAKE2b-256 e87089dd703e25e83f98fe176644767fb37cdc95dcab887ff83d535829b5d205

See more details on using hashes here.

File details

Details for the file snipvault-1.0.0-py3-none-any.whl.

File metadata

  • Download URL: snipvault-1.0.0-py3-none-any.whl
  • Upload date:
  • Size: 91.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for snipvault-1.0.0-py3-none-any.whl
Algorithm Hash digest
SHA256 dbd389535074b54c784c51f2907332c440b3db643e7b1a873ed0b58f5d307ddc
MD5 a877546c0a6cf3c748768043df0332c9
BLAKE2b-256 06c3b1b4a52a0598f5d791c708ab9b94e5246ad33e456be44f2991b032735a9d

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page