Embedding pipeline ops + drift detection for production RAG: index manifests, version assertions, neighbor-stability eval, Drift-Adapter for in-place model migrations.

embspec

Embedding pipeline ops + drift detection for production RAG.

The single failure mode this library prevents: a query-encoder upgrade ships before the index is re-encoded, so every health check keeps returning 200 OK while retrieval accuracy silently collapses. The decompressed.io RAG observability post-mortem (2026-03-09) describes this exact bug: $15K of emergency re-encoding plus 2-5 days of engineer time before anyone diagnosed it.
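To make the "silent" part concrete, here is a small synthetic sketch that uses only NumPy and none of embspec's API. It assumes a toy setup that is not from the post-mortem: the "old" encoder is the identity, the "new" encoder is a random rotation standing in for a model upgrade, and each document is used as its own query. Nothing raises, yet top-1 retrieval falls apart.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize(x):
    """L2-normalize rows so that dot products equal cosine similarity."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

dim, n_docs = 32, 200
docs = normalize(rng.normal(size=(n_docs, dim)))  # toy "document content"

# Assumption for illustration: the new encoder is a random rotation of the
# old embedding space, which keeps norms but scrambles directions.
rotation, _ = np.linalg.qr(rng.normal(size=(dim, dim)))

def encode_old(x):
    return x

def encode_new(x):
    return x @ rotation

index = encode_old(docs)  # the index was built with the old encoder

def top1(query_vectors):
    """Return the position of the nearest indexed vector for each query."""
    return np.argmax(query_vectors @ index.T, axis=1)

expected = np.arange(n_docs)  # each document should retrieve itself
matched_hit_rate = float(np.mean(top1(encode_old(docs)) == expected))
mismatched_hit_rate = float(np.mean(top1(encode_new(docs)) == expected))

print(f"matched encoder top-1 hit rate:    {matched_hit_rate:.0%}")
print(f"mismatched encoder top-1 hit rate: {mismatched_hit_rate:.0%}")
```

Both calls return a full set of results with valid shapes, which is why a status-code health check cannot tell them apart.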

pip install embspec

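from embspec import IndexManifest, embed_assert

# Load the manifest that was written when the index was built
manifest = IndexManifest.load("s3://my-rag/index-prod/manifest.json")

# embed_query and opensearch stand in for your own encoder call and search client
@embed_assert(manifest, model_id="amazon.titan-embed-text-v2:0", dimension=1024)
def search(query: str):
    qv = embed_query(query)
    return opensearch.knn_search(qv, ...)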

If the query encoder ever drifts off the manifest, search raises EmbeddingVersionMismatch instead of silently returning bad results.

What's in v0.1

Primitive	What it does	Anchored in
`IndexManifest` + `EmbeddingSpec`	Single-file source of truth for "what does this index contain" — model id, dimension, version, normalization	decompressed.io post-mortem
`assert_compatible()` / `embed_assert()` decorator	Fast-fail on encoder/index drift; raise or warn modes	decompressed.io post-mortem
`DriftAdapter`	Linear adapter from new-model embeddings to old-model space; lets you swap the query encoder without re-encoding the corpus	Drift-Adapter, Vejendla 2025 (arxiv:2509.23471)
`neighbor_stability()`	Compare two retrievers on a frozen probe set; reports overlap, Jaccard, regression list, deploy-safety verdict	RAGOps survey, Xu et al. 2025 (arxiv:2506.03401)

Why not Evidently / Phoenix / WhyLogs?

Evidently — tabular-ML drift heritage; LLM additions are recent and platform-shaped. Not a drop-in primitive.
Phoenix — embedding-drift visualization is a sub-feature of a full observability platform. You adopt the platform.
WhyLogs — generic data-logging primitive; not embedding-aware; last commit 2025-01.
embspec — three small primitives (IndexManifest, DriftAdapter, neighbor_stability) you compose with whatever vector DB and tracer you already have. No platform, no UI, no agent framework.

Usage

Manifest + version assertion

from datetime import datetime, timezone
from embspec import IndexManifest, EmbeddingSpec

# When you build the index, write a manifest alongside it
manifest = IndexManifest(
    index_name="prod-v3",
    embedding=EmbeddingSpec(
        model_id="amazon.titan-embed-text-v2:0",
        dimension=1024,
        normalization="l2",
    ),
    created_at=datetime.now(timezone.utc),
    doc_count=8_000_000,
)
manifest.save("s3://my-rag/index-prod/manifest.json")
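# A local path works too. manifest.save uses pathlib.Path.write_text, and pathlib
# only addresses the local filesystem, so verify how an s3:// URI is handled in your setup.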

# At query time, assert the encoder matches before searching
from embspec import embed_assert

@embed_assert(
    "s3://my-rag/index-prod/manifest.json",  # path or IndexManifest
    model_id="amazon.titan-embed-text-v2:0",
    dimension=1024,
    mode="raise",  # or "log" for canary rollout
)
def search(query: str) -> list[dict]:
    qv = embed_query(query)
    return opensearch.knn_search(qv, ...)
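For readers who want to see the shape of such a guard, the following is a minimal sketch of a version-asserting decorator written from scratch. It is not embspec's implementation: `Spec`, `check_encoder`, `VersionMismatch`, and the model id `some-newer-model` are hypothetical names chosen for illustration, and the `"log"` branch simply assumes that non-raising mode means emitting a warning through the standard `logging` module.

```python
import functools
import logging
from dataclasses import dataclass

logger = logging.getLogger("embed_check")

class VersionMismatch(RuntimeError):
    """Hypothetical stand-in for embspec's EmbeddingVersionMismatch."""

@dataclass(frozen=True)
class Spec:
    """Hypothetical minimal record of what the index was built with."""
    model_id: str
    dimension: int

def check_encoder(index_spec, *, model_id, dimension, mode="raise"):
    """Compare the declared query encoder with the index spec before every call."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            problems = []
            if model_id != index_spec.model_id:
                problems.append(f"model_id {model_id!r} != {index_spec.model_id!r}")
            if dimension != index_spec.dimension:
                problems.append(f"dimension {dimension} != {index_spec.dimension}")
            if problems:
                message = "encoder does not match index: " + "; ".join(problems)
                if mode == "raise":
                    raise VersionMismatch(message)
                logger.warning(message)  # assumed behaviour of a non-raising mode
            return fn(*args, **kwargs)
        return wrapper
    return decorator

index_spec = Spec(model_id="amazon.titan-embed-text-v2:0", dimension=1024)

@check_encoder(index_spec, model_id="amazon.titan-embed-text-v2:0", dimension=1024)
def search_ok(query):
    return [f"result for {query}"]

# "some-newer-model" is a made-up id representing an upgraded encoder.
@check_encoder(index_spec, model_id="some-newer-model", dimension=1536)
def search_drifted(query):
    return [f"result for {query}"]
```

Calling `search_ok("q")` goes through to the wrapped function, while `search_drifted("q")` raises before any search is issued. Running the comparison on every call keeps the check cheap and means a stale deployment fails on its first query.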

Drift-Adapter for in-place model migration

When you want to upgrade the query encoder without re-encoding 8M docs:

from embspec import DriftAdapter
import numpy as np

# Sample, e.g., 50K docs and embed them with both old and new models
old_emb = embed_with_old_model(sample_texts)  # shape (50000, 1024)
new_emb = embed_with_new_model(sample_texts)  # shape (50000, 1536)

adapter = DriftAdapter.fit(
    new_embeddings=new_emb,
    old_embeddings=old_emb,
    regularization=0.01,  # ridge; helps when new_emb is rank-deficient
)
adapter.save("s3://my-rag/adapters/v3-to-v4.npz")

# At query time, embed with the new model then transform into old space
qv_new = embed_with_new_model(query)
qv_compatible = adapter.transform(qv_new)  # shape (1024,)
results = opensearch.knn_search(qv_compatible, ...)

Per Vejendla 2025, this typically recovers 95-99% of the retrieval quality of a full re-encode at roughly 1% of the cost of re-encoding the corpus.
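The idea behind a linear adapter with a ridge term can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not a description of how `DriftAdapter.fit` is implemented: `fit_linear_adapter` and `transform` are hypothetical helpers, the data is synthetic, and the closed-form ridge solution is just one straightforward way to fit a map from new-model space to old-model space.

```python
import numpy as np

def fit_linear_adapter(new_emb, old_emb, regularization=0.01):
    """Closed-form ridge fit: minimize ||new_emb @ W - old_emb||^2 + reg * ||W||^2."""
    d_new = new_emb.shape[1]
    gram = new_emb.T @ new_emb + regularization * np.eye(d_new)
    return np.linalg.solve(gram, new_emb.T @ old_emb)  # shape (d_new, d_old)

def transform(W, new_vectors):
    """Map new-model vectors into old-model space and L2-normalize them."""
    mapped = new_vectors @ W
    return mapped / np.linalg.norm(mapped, axis=-1, keepdims=True)

# Synthetic paired embeddings: the "new" space is a noisy linear image of the
# "old" one. Real model pairs are only approximately linearly related.
rng = np.random.default_rng(0)
n, d_old, d_new = 2000, 16, 24
old_emb = rng.normal(size=(n, d_old))
true_map = rng.normal(size=(d_old, d_new))
new_emb = old_emb @ true_map + 0.01 * rng.normal(size=(n, d_new))

# Fit on the first 1500 pairs, evaluate on the held-out 500.
W = fit_linear_adapter(new_emb[:1500], old_emb[:1500])
adapted = transform(W, new_emb[1500:])
target = old_emb[1500:] / np.linalg.norm(old_emb[1500:], axis=1, keepdims=True)

mean_cosine = float(np.mean(np.sum(adapted * target, axis=1)))
print(f"adapter shape: {W.shape}, held-out mean cosine: {mean_cosine:.3f}")
```

Evaluating on held-out pairs matters here: a high cosine on the training sample alone says little about how the adapter behaves on unseen queries.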

Neighbor stability for safe migrations

from embspec import neighbor_stability

# Run a fixed probe set against both indexes
old_results = {pid: retrieve_from_v3(q) for pid, q in probes.items()}
new_results = {pid: retrieve_from_v4(q) for pid, q in probes.items()}

report = neighbor_stability(old_results, new_results, k=10)
print(f"mean overlap@10: {report.mean_overlap_at_k:.1%}")
print(f"regressions: {report.regression_count}/{report.n_probes}")

if report.is_safe_to_deploy(min_mean_overlap=0.85, max_regression_fraction=0.05):
    deploy_v4()
else:
    investigate(report.regression_probe_ids)
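The metrics behind such a report are simple set operations. The sketch below shows one plausible way to compute overlap@k, Jaccard@k, and a regression list; it assumes each probe maps to an ordered list of document ids, and the helper names and the `regression_threshold` default are illustrative choices rather than embspec's definitions.

```python
from statistics import mean

def overlap_at_k(old_ids, new_ids, k):
    """Fraction of the k slots whose old top-k documents also appear in the new top-k."""
    return len(set(old_ids[:k]) & set(new_ids[:k])) / k

def jaccard_at_k(old_ids, new_ids, k):
    """Intersection over union of the two top-k id sets."""
    old_top, new_top = set(old_ids[:k]), set(new_ids[:k])
    union = old_top | new_top
    return len(old_top & new_top) / len(union) if union else 1.0

def stability_summary(old_results, new_results, k, regression_threshold=0.5):
    """Aggregate per-probe scores; a probe regresses when overlap falls below the threshold."""
    overlaps = {pid: overlap_at_k(old_results[pid], new_results[pid], k) for pid in old_results}
    jaccards = [jaccard_at_k(old_results[pid], new_results[pid], k) for pid in old_results]
    return {
        "mean_overlap": mean(overlaps.values()),
        "mean_jaccard": mean(jaccards),
        "regressions": sorted(pid for pid, score in overlaps.items() if score < regression_threshold),
    }

# Two toy probes: p1 keeps three of four neighbors, p2 keeps only one.
old_results = {"p1": ["a", "b", "c", "d"], "p2": ["e", "f", "g", "h"]}
new_results = {"p1": ["a", "b", "c", "x"], "p2": ["y", "z", "e", "w"]}

summary = stability_summary(old_results, new_results, k=4)
print(summary)
```

Because the comparison only needs the two result lists, it works without relevance labels, at the cost of treating the old index's answers as the reference.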

What it explicitly does NOT do

Not a vector database.
Not a RAG framework. No retriever, no chunker, no generator.
Not a generic ML drift library. Embedding-and-retrieval-shaped only.
Not an eval framework. neighbor_stability is the one judgment you can make without a labeled gold set; for richer evals use ragas, trulens, or a tracer.
Does not host or serve embeddings.

Roadmap

v0.2: dual_write() context manager for blue/green index migrations across OpenSearch / pgvector / Pinecone / Qdrant.
v0.3: ChunkingExperiment A/B harness with optional LLM-judge.
v0.4: integration helpers for AWS Bedrock embedding models, OpenAI, Cohere, Voyage.

License

Apache-2.0. See LICENSE.

This version: 0.1.0, May 9, 2026

Download files

Source Distribution

embspec-0.1.0.tar.gz (13.1 kB), uploaded May 9, 2026, Source

Built Distribution

embspec-0.1.0-py3-none-any.whl (16.3 kB), uploaded May 9, 2026, Python 3

File details

Details for the file embspec-0.1.0.tar.gz.

File metadata

Download URL: embspec-0.1.0.tar.gz
Upload date: May 9, 2026
Size: 13.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for embspec-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`f03d8f05d9cc4e7db508380aa2453cea034a2f6d5970dfccabc343c09fac5ea5`
MD5	`4f76f598d4595bb0f9f1815de330eb24`
BLAKE2b-256	`f9169d42ee7594de28bf4817e1d6b81febd79fbe2d8e3df8abf51282f479a7d4`

File details

Details for the file embspec-0.1.0-py3-none-any.whl.

File metadata

Download URL: embspec-0.1.0-py3-none-any.whl
Upload date: May 9, 2026
Size: 16.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for embspec-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`163b7100c40a971bb989f1675d5332b610e8debd08572316a2a4a1e0ee52cebf`
MD5	`3ccc1ba9c19ffa8fc26a87bf0cdf62d5`
BLAKE2b-256	`fba9b8f4111fa99a68c37bebf07cc5d337595ee58d9d7c9346194156af53b9dc`
