vectormigrate

Python-first tooling for safe embedding-model migration across vector retrieval systems.



💡 TL;DR

Changing your embedding model usually means downtime, full re-embedding costs, or silent ranking corruption. vectormigrate makes the transition safe and structured. It defines a formal ABI (Application Binary Interface) for vectors, so you can test, evaluate, and transition between embedding models in live production systems such as OpenSearch, Weaviate, Qdrant, and pgvector.


🎯 Why This Library Exists

When a team upgrades an embedding stack, it often changes more than just a model name. You might change:

  • 📏 Vector Dimension
  • 📐 Similarity Metric
  • ⚖️ Normalization Policy
  • ✂️ Chunking and Preprocessing
  • 🗄️ Backend Index Shape

In practice, teams fall into three failure modes:

  1. 💥 Full re-embed & hard cutover: Expensive, risky, and causes downtime.
  2. 🎭 Mixing old & new vectors: Silently corrupts the ranking math.
  3. 📜 Vendor-specific throwaway scripts: Weak testing and no reusable governance.
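
The second failure mode is easy to reproduce with plain numpy. The sketch below is an illustration only and does not use vectormigrate: it assumes, for simplicity, that the "new model" space is a random rotation of the "old model" space, then scores a new-model query against both fresh and stale document vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8

# "Old model" embeddings for five documents, unit-normalized.
docs_old = rng.normal(size=(5, dim))
docs_old /= np.linalg.norm(docs_old, axis=1, keepdims=True)

# Assumption for illustration: the new space is a random rotation of the old one.
rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
docs_new = docs_old @ rotation

# A query that is a near-duplicate of document 2, embedded by the new model.
query_new = docs_new[2] + 0.01 * rng.normal(size=dim)
query_new /= np.linalg.norm(query_new)

same_space_scores = docs_new @ query_new   # new query vs. re-embedded vectors
mixed_space_scores = docs_old @ query_new  # new query vs. stale old vectors

best_same = int(np.argmax(same_space_scores))
print("same-space top hit:", best_same)
print("same-space score for doc 2: ", round(float(same_space_scores[2]), 3))
print("mixed-space score for doc 2:", round(float(mixed_space_scores[2]), 3))
```

No error is raised in the mixed case. The scores are still valid floats, which is why this kind of corruption tends to go unnoticed.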

vectormigrate treats every embedding configuration as an explicit compatibility contract and turns migration into a staged, testable workflow.
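
To make the "compatibility contract" idea concrete, here is a minimal sketch using only the standard library. `EmbeddingContract` and `contract_diff` are hypothetical names invented for this example; they are not vectormigrate's `EmbeddingABI`, whose actual fields may differ.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class EmbeddingContract:
    """Hypothetical contract for illustration; not vectormigrate's EmbeddingABI."""

    model_id: str
    dimensions: int
    metric: str       # e.g. "cosine" or "dot"
    normalized: bool  # whether vectors are unit-normalized before indexing
    chunking: str     # free-form label for the preprocessing pipeline


def contract_diff(old: EmbeddingContract, new: EmbeddingContract) -> list[str]:
    """Return the names of the fields whose values differ between two contracts."""
    return [
        f.name
        for f in fields(old)
        if getattr(old, f.name) != getattr(new, f.name)
    ]


old = EmbeddingContract("model-a", 1536, "cosine", True, "512-token windows")
new = EmbeddingContract("model-b", 3072, "cosine", True, "512-token windows")

changed = contract_diff(old, new)
print(changed)  # ['model_id', 'dimensions']
```

Any non-empty diff signals that vectors produced under the two contracts should not share an index without an explicit migration step.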


📦 Dependencies

By design, vectormigrate is lightweight and keeps your production environment lean:

  • Core: numpy >= 1.26 (the only core dependency)
  • Integration (Optional): psycopg[binary] >= 3.2.0 (for pgvector target databases)
  • Dev/Test (Optional): pytest, ruff, mypy, build

🚀 Quick Start

Installation

# Core install
pip install vectormigrate

# With live backend integrations (e.g., pgvector)
pip install "vectormigrate[integration]"

1️⃣ Register an Embedding ABI (The Contract)

from vectormigrate import EmbeddingABI, SQLiteRegistry

registry = SQLiteRegistry("/tmp/vectormigrate.sqlite")
abi = EmbeddingABI(
    model_id="text-embedding-3-large",
    provider="openai",
    version="2026.03",
    dimensions=3072,
)
registry.register_abi(abi)
print(f"Registered ABI: {abi.abi_id}")

2️⃣ Create a Migration Plan

from vectormigrate import MigrationPlan

# Reuses the `registry` created in step 1.
plan = MigrationPlan(
    source_abi_id="openai/text-embedding-3-large@2026.03#v1",
    target_abi_id="openai/text-embedding-3-large@2026.04#v1",
    alias_name="retrieval_active",
)
registry.create_plan(plan)
print(f"Active Plan ID: {plan.plan_id}")

3️⃣ Run a Live Demo CLI

Watch the orchestrator run a dual-write and backfill migration locally:

python3 -m vectormigrate.cli demo --db /tmp/vectormigrate-demo.sqlite
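
For readers new to the pattern, the phases behind a dual-write and backfill migration can be sketched with in-memory dictionaries. This is a toy illustration and not what the demo command executes: `embed_old`, `embed_new`, `indexes`, and `write` are hypothetical stand-ins, and only the alias name `retrieval_active` is taken from the plan above.

```python
def embed_old(text: str) -> list[float]:
    """Stand-in for the old embedding model (2 dimensions)."""
    return [float(len(text)), 0.0]


def embed_new(text: str) -> list[float]:
    """Stand-in for the new embedding model (3 dimensions)."""
    return [float(len(text)), float(text.count(" ")), 1.0]


indexes: dict[str, dict[str, list[float]]] = {"old": {}, "new": {}}
alias = {"retrieval_active": "old"}  # queries always go through the alias


def write(doc_id: str, text: str, dual_write: bool) -> None:
    """Write to the old index, and to the new one while dual-write is on."""
    indexes["old"][doc_id] = embed_old(text)
    if dual_write:
        indexes["new"][doc_id] = embed_new(text)


# Phase 0: documents written before the migration starts.
corpus = {"a": "first doc", "b": "second doc"}
for doc_id, text in corpus.items():
    write(doc_id, text, dual_write=False)

# Phase 1: dual-write keeps the new index current for incoming writes.
corpus["c"] = "written during migration"
write("c", corpus["c"], dual_write=True)

# Phase 2: backfill re-embeds everything the new index is still missing.
for doc_id, text in corpus.items():
    if doc_id not in indexes["new"]:
        indexes["new"][doc_id] = embed_new(text)

# Phase 3: swap the alias only once the new index covers the old one.
previous = alias["retrieval_active"]
if indexes["new"].keys() == indexes["old"].keys():
    alias["retrieval_active"] = "new"

print(alias, "| rollback target:", previous)
```

Keeping `previous` around is what makes rollback cheap: reverting is a single alias update, because the old index was never modified.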

📚 Feature Guide

🔀 Compatibility Adapters

Don't want to re-embed everything right away? Use the built-in adapters, which map between embedding spaces, to query old vectors with a new model during the transition window:

from vectormigrate import OrthogonalProcrustesAdapter, LowRankAffineAdapter, ResidualMLPAdapter

procrustes = OrthogonalProcrustesAdapter()
affine = LowRankAffineAdapter(rank=4)
mlp = ResidualMLPAdapter(hidden_dim=16, epochs=50, learning_rate=0.01)
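
As background on the first adapter's name, the classic orthogonal Procrustes problem has a closed-form solution via SVD. The numpy sketch below shows that textbook technique on synthetic data. It is not the library's implementation, `fit_orthogonal_map` is a hypothetical helper, and the plain orthogonal form assumes both spaces have the same dimension.

```python
import numpy as np


def fit_orthogonal_map(source: np.ndarray, target: np.ndarray) -> np.ndarray:
    """Return the orthogonal W minimizing ||source @ W - target||_F.

    Both inputs are (n_samples, dim) arrays of paired vectors.
    """
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt


rng = np.random.default_rng(0)
dim = 6

# Synthetic paired sample: the "new" space is a noisy rotation of the "old" one.
old_vecs = rng.normal(size=(50, dim))
true_rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
new_vecs = old_vecs @ true_rotation + 0.01 * rng.normal(size=(50, dim))

# Learn a map from the new space back into the old one, so new-model
# queries can be scored against vectors that were never re-embedded.
w = fit_orthogonal_map(new_vecs, old_vecs)
mapped = new_vecs @ w

error = float(np.linalg.norm(mapped - old_vecs) / np.linalg.norm(old_vecs))
print(f"relative alignment error: {error:.4f}")
```

Real model pairs are rarely related by an exact rotation, so the alignment error on held-out pairs is worth checking before routing any traffic through such a map.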

📊 Artifact & Report Export

Prove to your team that the migration was safe with exported dashboards and artifacts:

from vectormigrate import export_run_artifact_bundle

# Reuses the `registry` from the Quick Start; "plan-123" stands in for your plan ID.
manifest = export_run_artifact_bundle(
    registry=registry,
    plan_id="plan-123",
    output_dir="/tmp/vectormigrate-artifacts",
)

🏗️ Architecture & Formal Model

vectormigrate separates the migration problem into four planes:

  1. Control plane: ABI manifests, migration plans, audit events.
  2. Execution plane: Provisioning, dual-write, backfill, alias swap, rollback.
  3. Compatibility plane: Mathematical projections and confidence-gated routing.
  4. Evaluation plane: Offline metrics (Recall@k, nDCG@k), shadow hooks.
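
The two offline metrics named in the evaluation plane can be sketched in a few lines of standard-library Python. This uses the common binary-relevance definitions and is not the library's own evaluation code; `recall_at_k`, `ndcg_at_k`, and the document IDs are hypothetical.

```python
import math


def recall_at_k(ranked_ids: list[str], relevant_ids: set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)


def ndcg_at_k(ranked_ids: list[str], relevant_ids: set[str], k: int) -> float:
    """Binary-relevance nDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, doc_id in enumerate(ranked_ids[:k])
        if doc_id in relevant_ids
    )
    ideal_hits = min(k, len(relevant_ids))
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


relevant = {"d1", "d4"}
old_ranking = ["d1", "d4", "d2", "d3"]  # results from the current index
new_ranking = ["d2", "d1", "d3", "d4"]  # results from the candidate index

for name, ranking in [("old", old_ranking), ("new", new_ranking)]:
    print(
        name,
        f"recall@2={recall_at_k(ranking, relevant, 2):.2f}",
        f"ndcg@2={ndcg_at_k(ranking, relevant, 2):.2f}",
    )
```

Comparing these numbers on the same labeled query set before and after a migration gives a simple go/no-go signal for the cutover.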

[Figure: Migration Architecture Flow]

Supported Live Backends

The library includes native adapters to safely orchestrate migrations on the following engines:

  • OpenSearch
  • Weaviate
  • Qdrant
  • pgvector
  • In-Memory (for testing)

📖 Deep Dive Documentation


🤝 Contributing & Security

We welcome contributions! Please see:
