
Sync your git repository to a searchable index. Fully based on git diff. No extra DB needed.


MinSync


Git diff-based incremental vector index for any repository. No external DB required.

What it does

MinSync watches your git repository and keeps a local vector index in sync. It uses git diff to detect changes and only re-embeds what actually changed — no full re-indexing needed.

  • Indexes all git-tracked files (respects .gitignore automatically)
  • Incremental sync via git diff — only changed chunks get re-embedded
  • Deterministic chunk IDs — same content always produces the same ID
  • Crash-safe — interrupted syncs recover automatically
  • Embedded vector DB (zvec) — no server, just a local file in .minsync/
  • .minsyncignore for excluding files you don't want indexed
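
To illustrate what "deterministic chunk IDs" means, here is a minimal sketch. It assumes the ID is a SHA-256 over the repo ID, the file path, and a hash of the chunk text; the function name `chunk_id` and the NUL separator are illustrative choices, and the exact set of fields MinSync hashes may differ.

```python
import hashlib


def chunk_id(repo_id: str, path: str, content: str) -> str:
    """Derive a stable ID from the repo, the file path, and the chunk text."""
    # Hash the content first so the ID input stays small for large chunks.
    content_hash = hashlib.sha256(content.encode("utf-8")).hexdigest()
    # A NUL separator keeps ("ab", "c") and ("a", "bc") from colliding.
    key = "\0".join((repo_id, path, content_hash))
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


# Hypothetical inputs: identical content yields the same ID, edited content a new one.
first = chunk_id("my-repo", "docs/auth.md", "# Login\nUse OAuth.")
second = chunk_id("my-repo", "docs/auth.md", "# Login\nUse OAuth.")
edited = chunk_id("my-repo", "docs/auth.md", "# Login\nUse SSO.")
```

Because the ID depends only on its inputs, a chunk whose text has not changed can be recognized without comparing embeddings or storing extra state.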

Install

pip install minsync[zvec]

Or with uv:

uv add "minsync[zvec]"

For development (editable install):

uv add --editable "/path/to/MinSync[zvec]"

Embedding model setup

OpenAI (default):

pip install langchain-openai
export OPENAI_API_KEY="sk-..."

HuggingFace (local, no API key):

pip install langchain-huggingface sentence-transformers

Quick start

cd your-repo

# Initialize
minsync init
# or with HuggingFace:
minsync init --embedder "huggingface:sentence-transformers/all-MiniLM-L6-v2"

# Build initial index
minsync sync

# Search
minsync query "authentication flow" --k 5

# After pulling new changes
git pull
minsync sync          # only re-embeds changed files

# Check index health
minsync verify

CLI commands

Command                 Description
minsync init            Initialize .minsync/ in current git repo
minsync sync            Incremental sync (or --full for rebuild)
minsync query "text"    Semantic search over indexed content
minsync status          Show sync status (up-to-date / behind / interrupted)
minsync check           Verify environment and dependencies
minsync verify          Check index consistency (with --fix to repair)

Use minsync <command> --help for full option details.

Python API

from minsync import MinSync
from pathlib import Path

ms = MinSync(repo_path=Path("/path/to/repo"))
ms.init()
ms.sync()
results = ms.query("search text", k=10)

# Custom components (MyChunker, MyEmbedder, and MyStore are your own classes)
ms = MinSync(
    repo_path=Path("/path/to/repo"),
    chunker=MyChunker(),        # implements Chunker protocol
    embedder=MyEmbedder(),      # implements Embedder protocol
    vector_store=MyStore(),     # implements VectorStore protocol
)

.minsyncignore

Works like .gitignore. Add patterns for git-tracked files you don't want indexed:

# Build artifacts
dist/
blog/

# Attachments
attachments/
*.png
*.pdf

# Config files
pyproject.toml
uv.lock
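
For intuition about how such patterns filter the tracked file list, here is a simplified sketch. It is not MinSync's matcher and does not implement full .gitignore semantics (no negation, no anchored patterns); `load_patterns` and `is_ignored` are hypothetical helpers built on the standard library's `fnmatch`.

```python
from fnmatch import fnmatch


def load_patterns(text: str) -> list[str]:
    """Return the non-empty, non-comment lines of an ignore file."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def is_ignored(path: str, patterns: list[str]) -> bool:
    """Simplified matching: 'name/' matches a directory, anything else is a glob."""
    parts = path.split("/")
    for pattern in patterns:
        if pattern.endswith("/"):
            # "dist/" ignores anything under a directory named "dist".
            if pattern.rstrip("/") in parts[:-1]:
                return True
        elif fnmatch(parts[-1], pattern) or fnmatch(path, pattern):
            return True
    return False


patterns = load_patterns("# Build artifacts\ndist/\n\n*.png\nuv.lock\n")
tracked = ["README.md", "dist/app.js", "docs/logo.png", "uv.lock", "src/main.py"]
indexed = [p for p in tracked if not is_ignored(p, patterns)]
```

With these sample patterns, only `README.md` and `src/main.py` remain in `indexed`.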

How it works

  1. git diff detects changed files since last sync
  2. Changed files are re-chunked (markdown heading-based by default)
  3. Each chunk gets a deterministic ID from sha256(repo_id + path + content_hash + ...)
  4. Only chunks with new IDs get embedded — unchanged chunks skip the API call
  5. Stale chunks (old content) are automatically swept
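
Step 1 can be sketched with plain git. The example below assumes the last synced commit hash is stored somewhere and compares it to HEAD with `git diff --name-status`; `parse_name_status` and `changed_since` are hypothetical names, and paths with unusual characters (which git quotes unless `-z` is used) are not handled.

```python
import subprocess


def parse_name_status(output: str) -> dict[str, list[str]]:
    """Group paths from `git diff --name-status` output by what a sync must do."""
    changes: dict[str, list[str]] = {"reindex": [], "remove": []}
    for line in output.splitlines():
        if not line.strip():
            continue
        status, *paths = line.split("\t")
        if status.startswith("D"):
            changes["remove"].append(paths[0])
        elif status.startswith(("R", "C")):
            # Renames and copies list the old path, then the new path.
            if status.startswith("R"):
                changes["remove"].append(paths[0])
            changes["reindex"].append(paths[1])
        else:
            # A (added), M (modified), T (type change): re-chunk the file.
            changes["reindex"].append(paths[0])
    return changes


def changed_since(repo: str, last_synced_commit: str) -> dict[str, list[str]]:
    """Run git and classify changes between the last synced commit and HEAD."""
    result = subprocess.run(
        ["git", "-C", repo, "diff", "--name-status", last_synced_commit, "HEAD"],
        capture_output=True, text=True, check=True,
    )
    return parse_name_status(result.stdout)


# Sample output in the format git prints: status, tab, path(s).
sample = "M\tdocs/auth.md\nA\tdocs/sso.md\nD\tdocs/old.md\nR100\tnotes/a.md\tnotes/b.md\n"
changes = parse_name_status(sample)
```

A rename is treated as a removal of the old path plus a re-index of the new one, which fits IDs that include the file path.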

All state lives in .minsync/ — delete it to start fresh.
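
Steps 2, 4, and 5 reduce to chunking a file and comparing two sets of IDs. The sketch below is an illustration under stated assumptions, not MinSync's implementation: `chunk_by_heading` and `plan_sync` are hypothetical names, the chunker splits on any line that starts with a markdown heading marker (including such lines inside code fences), and a plain SHA-256 of the chunk text stands in for the real chunk ID.

```python
import hashlib
import re

HEADING = re.compile(r"^#{1,6}\s")


def chunk_by_heading(markdown: str) -> list[str]:
    """Split markdown so each chunk starts at a heading line."""
    chunks: list[str] = []
    current: list[str] = []
    for line in markdown.splitlines():
        if HEADING.match(line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [chunk for chunk in chunks if chunk]


def plan_sync(indexed_ids: set[str], chunks: list[str]) -> tuple[list[str], set[str]]:
    """Return (chunks to embed, stale IDs to sweep) for one file."""
    by_id = {hashlib.sha256(c.encode("utf-8")).hexdigest(): c for c in chunks}
    # Only chunks whose ID is not yet indexed need an embedding call.
    to_embed = [text for cid, text in by_id.items() if cid not in indexed_ids]
    # IDs that no longer correspond to any current chunk are stale.
    to_sweep = indexed_ids - set(by_id)
    return to_embed, to_sweep


old = "# Login\nUse OAuth.\n\n# Logout\nClear the session."
new = "# Login\nUse SSO.\n\n# Logout\nClear the session."
indexed = {hashlib.sha256(c.encode("utf-8")).hexdigest() for c in chunk_by_heading(old)}
to_embed, to_sweep = plan_sync(indexed, chunk_by_heading(new))
```

Editing one section leaves the untouched "Logout" chunk with its existing ID, so only the rewritten "Login" chunk is embedded and only its old ID is swept.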

Development

git clone https://github.com/NomaDamas/MinSync.git
cd MinSync
make install
uv run pytest
