Python-first tooling for safe embedding-model migration across vector retrieval systems.

vectormigrate

License: MIT. Linted with ruff and type-checked with mypy.


💡 TL;DR

Changing your embedding model usually means downtime, full re-embedding costs, or silent ranking corruption. vectormigrate turns that transition into a structured, testable process. It defines a formal ABI (Application Binary Interface) for embeddings, so you can test, evaluate, and transition between embedding models in live production systems such as OpenSearch, Weaviate, Qdrant, and pgvector.


🎯 Why This Library Exists

When a team upgrades an embedding stack, it often changes more than just a model name. You might change:

  • 📏 Vector Dimension
  • 📐 Similarity Metric
  • ⚖️ Normalization Policy
  • ✂️ Chunking and Preprocessing
  • 🗄️ Backend Index Shape

In practice, teams fall into three failure modes:

  1. 💥 Full re-embed & hard cutover: Expensive, risky, and causes downtime.
  2. 🎭 Mixing old & new vectors: Silently corrupts the ranking math.
  3. 📜 Vendor-specific throwaway scripts: Weak testing and no reusable governance.
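
To make the second failure mode concrete, here is a minimal numpy sketch. It is not part of vectormigrate: the "new model" is simulated by a random rotation of the "old" embeddings, and all names (`old_docs`, `new_docs`, `top1`) are hypothetical. Even in this best case, where the two spaces have identical geometry, scoring a new-model query against an old-model vector gives a near-meaningless similarity.

```python
import numpy as np

rng = np.random.default_rng(0)


def normalize(x):
    """L2-normalize along the last axis."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


# Toy corpus: 200 documents embedded by a hypothetical "old" model.
dim = 32
old_docs = normalize(rng.normal(size=(200, dim)))

# Stand-in for a "new" model: same geometry, but expressed in a different
# coordinate system (a random orthogonal rotation).
rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
new_docs = old_docs @ rotation

# A query that is a near-duplicate of document 7, embedded by each model.
query_old = normalize(old_docs[7] + 0.05 * rng.normal(size=dim))
query_new = query_old @ rotation


def top1(query, index):
    """Return the row of `index` with the highest dot-product score."""
    return int(np.argmax(index @ query))


same_space_old = top1(query_old, old_docs)      # old query, old index
same_space_new = top1(query_new, new_docs)      # new query, new index
matched_score = float(new_docs[7] @ query_new)  # new query vs new vector
mixed_score = float(old_docs[7] @ query_new)    # new query vs OLD vector

print(same_space_old, same_space_new)
print(f"matched={matched_score:.3f} mixed={mixed_score:.3f}")
```

Within either space the query finds document 7, but the mixed score for the very same document collapses, and nothing in the index raises an error when that happens.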

vectormigrate treats every embedding configuration as an explicit compatibility contract and turns migration into a staged, testable workflow.
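
As a rough illustration of what a "compatibility contract" can capture, the sketch below models the properties listed above as a frozen dataclass and reports which ones differ between two configurations. `EmbeddingContract` and `incompatibilities` are hypothetical names for this example only; they are not the library's `EmbeddingABI` or its actual fields.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class EmbeddingContract:
    """Hypothetical stand-in for an embedding compatibility contract."""

    model_id: str
    dimensions: int
    metric: str        # e.g. "cosine", "dot", "l2"
    normalized: bool   # are vectors L2-normalized before indexing?
    chunking: str      # label for the chunking/preprocessing recipe


def incompatibilities(old: EmbeddingContract, new: EmbeddingContract) -> list[str]:
    """Return the name of every field that differs between two contracts."""
    return [
        f.name
        for f in fields(old)
        if getattr(old, f.name) != getattr(new, f.name)
    ]


old = EmbeddingContract("model-a", 1536, "cosine", True, "512-token")
new = EmbeddingContract("model-b", 3072, "cosine", True, "512-token")
diff = incompatibilities(old, new)
print(diff)
```

Making the contract immutable and comparing it field by field means a changed dimension or metric surfaces as an explicit difference rather than as degraded rankings later.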


📦 Dependencies

By design, vectormigrate is lightweight and keeps your production environment lean:

  • Core: numpy >= 1.26 (used for the vector math; keeps the core install small)
  • Integration (Optional): psycopg[binary] >= 3.2.0 (for pgvector target databases)
  • Dev/Test (Optional): pytest, ruff, mypy, build

🚀 Quick Start

Installation

# Core install
pip install vectormigrate

# With live backend integrations (e.g., pgvector)
pip install "vectormigrate[integration]"

1️⃣ Register an Embedding ABI (The Contract)

from vectormigrate import EmbeddingABI, SQLiteRegistry

registry = SQLiteRegistry("/tmp/vectormigrate.sqlite")
abi = EmbeddingABI(
    model_id="text-embedding-3-large",
    provider="openai",
    version="2026.03",
    dimensions=3072,
)
registry.register_abi(abi)
print(f"Registered ABI: {abi.abi_id}")

2️⃣ Create a Migration Plan

from vectormigrate import MigrationPlan

# `registry` is the SQLiteRegistry created in step 1.
plan = MigrationPlan(
    source_abi_id="openai/text-embedding-3-large@2026.03#v1",
    target_abi_id="openai/text-embedding-3-large@2026.04#v1",
    alias_name="retrieval_active",
)
registry.create_plan(plan)
print(f"Active Plan ID: {plan.plan_id}")

3️⃣ Run a Live Demo CLI

Run the bundled demo to watch the orchestrator step through a dual-write and backfill migration locally:

python3 -m vectormigrate.cli demo --db /tmp/vectormigrate-demo.sqlite
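
For readers unfamiliar with the dual-write and backfill pattern the demo exercises, here is a toy in-memory sketch of the idea. It is an illustration under simplifying assumptions, not vectormigrate's orchestrator: `ToyMigration` and its methods are hypothetical, the "indexes" are plain dicts, and the "embedding models" are trivial lambdas.

```python
class ToyMigration:
    """In-memory illustration of dual-write, backfill, alias swap, rollback."""

    def __init__(self, embed_old, embed_new):
        self.embed = {"old": embed_old, "new": embed_new}
        self.indexes = {"old": {}, "new": {}}
        self.alias = "old"        # which index currently serves reads
        self.dual_write = False

    def write(self, doc_id, text):
        # Always write to the old index so rollback stays possible;
        # also write to the new index once dual-write is enabled.
        targets = ["old", "new"] if self.dual_write else ["old"]
        for name in targets:
            self.indexes[name][doc_id] = self.embed[name](text)

    def backfill(self, texts):
        # Re-embed documents that predate dual-write into the new index.
        for doc_id, text in texts.items():
            self.indexes["new"].setdefault(doc_id, self.embed["new"](text))

    def swap(self):
        # Refuse to cut over until the new index covers the old one.
        missing = set(self.indexes["old"]) - set(self.indexes["new"])
        if missing:
            raise RuntimeError(f"backfill incomplete: {sorted(missing)}")
        self.alias = "new"

    def rollback(self):
        self.alias = "old"


texts = {"a": "first doc", "b": "second doc"}
m = ToyMigration(
    embed_old=lambda t: [len(t)],
    embed_new=lambda t: [len(t), t.count(" ")],
)
m.write("a", texts["a"])   # before dual-write: old index only
m.dual_write = True
m.write("b", texts["b"])   # during dual-write: both indexes
m.backfill(texts)          # copies "a" into the new index
m.swap()                   # reads now go to the new index
print(m.alias, sorted(m.indexes["new"]))
```

The guard in `swap` is the important part: the read alias only moves once every document in the old index also exists in the new one, and `rollback` is a single pointer flip because the old index was never abandoned.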

📚 Feature Guide

🔀 Compatibility Adapters

Don't want to re-embed everything right away? The built-in adapters learn a mapping between embedding spaces, so you can query old vectors with new-model embeddings during the transition window:

from vectormigrate import OrthogonalProcrustesAdapter, LowRankAffineAdapter, ResidualMLPAdapter

procrustes = OrthogonalProcrustesAdapter()
affine = LowRankAffineAdapter(rank=4)
mlp = ResidualMLPAdapter(hidden_dim=16, epochs=50, learning_rate=0.01)
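
To show the idea behind the simplest of these adapters, the following numpy sketch solves the classic orthogonal Procrustes problem on synthetic paired embeddings. It is a textbook formulation written for this example, not the implementation of `OrthogonalProcrustesAdapter`; `fit_orthogonal_procrustes` is a hypothetical helper, and the data is simulated.

```python
import numpy as np


def fit_orthogonal_procrustes(source, target):
    """Return the orthogonal W minimizing ||source @ W - target||_F."""
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt


rng = np.random.default_rng(0)
dim = 16

# Paired embeddings of the same 500 texts from a simulated old and new model.
old_vecs = rng.normal(size=(500, dim))
true_rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
new_vecs = old_vecs @ true_rotation + 0.01 * rng.normal(size=(500, dim))

# Learn a map from the new space back into the old one, so that new-model
# query vectors can be scored against an index of old vectors.
w = fit_orthogonal_procrustes(new_vecs, old_vecs)
mapped = new_vecs @ w
relative_error = np.linalg.norm(mapped - old_vecs) / np.linalg.norm(old_vecs)
print(f"relative error after mapping: {relative_error:.4f}")
```

An orthogonal map preserves norms and angles, which is why it is a safe first choice, but it only applies when both spaces have the same dimension and differ mostly by a rotation. More expressive mappings trade that guarantee for flexibility.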

📊 Artifact & Report Export

Prove to your team that the migration was safe with exported dashboards and artifacts:

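from vectormigrate import export_run_artifact_bundle

# `registry` is the SQLiteRegistry from the Quick Start;
# replace "plan-123" with a real plan ID (for example, plan.plan_id).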
manifest = export_run_artifact_bundle(
    registry=registry,
    plan_id="plan-123",
    output_dir="/tmp/vectormigrate-artifacts",
)

🏗️ Architecture & Formal Model

vectormigrate separates the migration problem into four planes:

  1. Control plane: ABI manifests, migration plans, audit events.
  2. Execution plane: Provisioning, dual-write, backfill, alias swap, rollback.
  3. Compatibility plane: Mathematical projections and confidence-gated routing.
  4. Evaluation plane: Offline metrics (Recall@k, nDCG@k), shadow hooks.
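
For reference, the two offline metrics named in the evaluation plane can be sketched in a few lines of plain Python. These are standard definitions (nDCG here uses linear gain with a log2 discount) written for illustration; `recall_at_k` and `ndcg_at_k` are hypothetical helper names, not vectormigrate functions.

```python
import math


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top k."""
    if not relevant_ids:
        return 0.0
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)


def ndcg_at_k(ranked_ids, gains, k):
    """nDCG@k, where `gains` maps doc id -> graded relevance (missing = 0)."""
    dcg = sum(
        gains.get(doc_id, 0) / math.log2(rank + 2)
        for rank, doc_id in enumerate(ranked_ids[:k])
    )
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(rank + 2) for rank, g in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0


# Compare rankings from a source and a target configuration on one query.
gains = {"a": 3, "b": 2, "c": 1}
source_ranking = ["a", "b", "x", "c"]
target_ranking = ["x", "a", "y", "b"]

for name, ranking in [("source", source_ranking), ("target", target_ranking)]:
    r = recall_at_k(ranking, set(gains), k=3)
    n = ndcg_at_k(ranking, gains, k=3)
    print(f"{name}: recall@3={r:.3f} ndcg@3={n:.3f}")
```

Running both metrics on the same labeled queries before and after a migration gives a like-for-like number to gate the cutover on.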

[Figure: Migration Architecture Flow]

Supported Live Backends

The library includes native adapters to safely orchestrate migrations on the following engines:

  • OpenSearch
  • Weaviate
  • Qdrant
  • pgvector
  • In-Memory (for testing)

🤝 Contributing & Security

We welcome contributions! Please see the project's contributing and security guidelines.
