Cross-platform vector database engine with pluggable adapters

Maintainers

twodev

Project description

CrossVector

Cross-platform Vector Database Engine

A flexible, production-ready vector database engine with pluggable adapters for multiple vector databases (AstraDB, ChromaDB, Milvus, PGVector) and embedding providers (OpenAI, Gemini, and more).

Simplify your vector search infrastructure with a single, unified API across all supported vector databases.

Features

Pluggable Architecture: Easy adapter pattern for both databases and embeddings
Multiple Vector Databases: AstraDB, ChromaDB, Milvus, PGVector
Multiple Embedding Providers: OpenAI, Gemini
Install Only What You Need: Optional dependencies per adapter
Type-Safe: Full Pydantic validation
Consistent API: Same interface across all adapters

Supported Vector Databases

Database	Status	Features
AstraDB	✅ Production	Cloud-native Cassandra, lazy initialization
ChromaDB	✅ Production	Cloud/HTTP/Local modes, auto-fallback
Milvus	✅ Production	Auto-indexing, schema validation
PGVector	✅ Production	PostgreSQL extension, JSONB metadata

Supported Embedding Providers

Provider	Status	Models
OpenAI	✅ Production	text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002
Gemini	✅ Production	text-embedding-004, gemini-embedding-001

Installation

Minimal (core only)

pip install crossvector

With specific adapters

# AstraDB + OpenAI
pip install crossvector[astradb,openai]

# ChromaDB + OpenAI
pip install crossvector[chromadb,openai]

# All databases + OpenAI
pip install crossvector[all-dbs,openai]

# Everything
pip install crossvector[all]
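
Because each adapter is an optional extra, it can be useful to check which backends are importable in the current environment. The sketch below does this with the standard library only. The mapping from adapter name to import name (`astrapy`, `chromadb`, `pymilvus`, `psycopg2`, `openai`) is an assumption about what each extra installs, not something this README states, so adjust it to match the package metadata.

```python
from importlib.util import find_spec

# Hypothetical mapping: adapter name -> top-level module its extra is assumed to install.
ASSUMED_MODULES = {
    "astradb": "astrapy",
    "chromadb": "chromadb",
    "milvus": "pymilvus",
    "pgvector": "psycopg2",
    "openai": "openai",
}


def available_adapters(modules=ASSUMED_MODULES):
    """Return {adapter_name: bool}, True when the assumed module can be imported."""
    return {name: find_spec(module) is not None for name, module in modules.items()}


# Print a small report for the current environment.
for name, installed in available_adapters().items():
    print(f"{name:10s} {'installed' if installed else 'missing'}")
```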

Quick Start

from crossvector import VectorEngine, Document, UpsertRequest, SearchRequest
from crossvector.embeddings.openai import OpenAIEmbeddingAdapter
from crossvector.dbs.astradb import AstraDBAdapter

# Initialize engine
engine = VectorEngine(
    embedding_adapter=OpenAIEmbeddingAdapter(model_name="text-embedding-3-small"),
    db_adapter=AstraDBAdapter(),
    collection_name="my_documents"
)

# Upsert documents
docs = [
    Document(id="doc1", text="The quick brown fox", metadata={"category": "animals"}),
    Document(id="doc2", text="Artificial intelligence", metadata={"category": "tech"}),
]
result = engine.upsert(UpsertRequest(documents=docs))
print(f"Inserted {result['count']} documents")

# Search
results = engine.search(SearchRequest(query="AI and ML", limit=5))
for doc in results:
    print(f"Score: {doc.get('$similarity', 'N/A')}, Text: {doc.get('text')}")

# Get document by ID
doc = engine.get("doc1")

# Count documents
count = engine.count()

# Delete documents
engine.delete_one("doc1")
engine.delete_many(["doc2", "doc3"])

Configuration

Environment Variables

Create a .env file:

# OpenAI (for embeddings)
OPENAI_API_KEY=sk-...

# AstraDB
ASTRA_DB_APPLICATION_TOKEN=AstraCS:...
ASTRA_DB_API_ENDPOINT=https://...
ASTRA_DB_COLLECTION_NAME=my_collection

# ChromaDB Cloud
CHROMA_API_KEY=...
CHROMA_CLOUD_TENANT=...
CHROMA_CLOUD_DATABASE=...

# Milvus
MILVUS_API_ENDPOINT=https://...
MILVUS_USER=...
MILVUS_PASSWORD=...

# PGVector
PGVECTOR_HOST=localhost
PGVECTOR_PORT=5432
PGVECTOR_DBNAME=vectordb
PGVECTOR_USER=postgres
PGVECTOR_PASSWORD=...

# Vector metric (cosine, dot_product, euclidean)
VECTOR_METRIC=cosine
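
The variables above can be validated before any adapter is created. The following is a minimal sketch for the PGVector settings, assuming the variables are already present in the process environment (for example, exported by the shell or loaded by a tool such as python-dotenv). `PGVectorSettings` and `load_pgvector_settings` are hypothetical names, and the defaults used here (`localhost`, `5432`, `cosine`) are assumptions taken from the sample values, not documented library defaults.

```python
import os
from dataclasses import dataclass

# The three metric names listed in the .env comment above.
ALLOWED_METRICS = {"cosine", "dot_product", "euclidean"}


@dataclass(frozen=True)
class PGVectorSettings:
    host: str
    port: int
    dbname: str
    user: str
    password: str
    metric: str


def load_pgvector_settings(env=None):
    """Read PGVECTOR_* and VECTOR_METRIC from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    metric = env.get("VECTOR_METRIC", "cosine")  # assumed default
    if metric not in ALLOWED_METRICS:
        raise ValueError(
            f"VECTOR_METRIC must be one of {sorted(ALLOWED_METRICS)}, got {metric!r}"
        )
    return PGVectorSettings(
        host=env.get("PGVECTOR_HOST", "localhost"),  # assumed default
        port=int(env.get("PGVECTOR_PORT", "5432")),  # assumed default
        dbname=env["PGVECTOR_DBNAME"],      # required: KeyError if missing
        user=env["PGVECTOR_USER"],          # required
        password=env["PGVECTOR_PASSWORD"],  # required
        metric=metric,
    )
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to test without touching real credentials.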

Database-Specific Examples

AstraDB

from crossvector.dbs.astradb import AstraDBAdapter

adapter = AstraDBAdapter()
adapter.initialize(
    collection_name="my_collection",
    embedding_dimension=1536,
    metric="cosine"
)

ChromaDB

from crossvector.dbs.chroma import ChromaDBAdapter

# One constructor covers both modes. Cloud mode is auto-detected from the
# CHROMA_API_KEY, CHROMA_CLOUD_TENANT, and CHROMA_CLOUD_DATABASE env vars;
# with no ChromaDB variables set, the adapter runs in local mode.
adapter = ChromaDBAdapter()

adapter.initialize(
    collection_name="my_collection",
    embedding_dimension=1536
)
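
To make the cloud auto-detection above concrete, here is a small pre-flight check that reports which mode a given environment would select. It is a sketch of the idea only, not the adapter's actual logic: `detect_chroma_mode` is a hypothetical helper, HTTP mode is left out because this README does not name its variables, and raising on a partial cloud configuration is a deliberately stricter choice than an automatic fallback.

```python
import os

# Variable names taken from the Environment Variables section.
CLOUD_VARS = ("CHROMA_API_KEY", "CHROMA_CLOUD_TENANT", "CHROMA_CLOUD_DATABASE")


def detect_chroma_mode(env=None):
    """Return "cloud" when all cloud variables are set, "local" when none are."""
    env = os.environ if env is None else env
    present = [name for name in CLOUD_VARS if env.get(name)]
    if len(present) == len(CLOUD_VARS):
        return "cloud"
    if present:
        # Some, but not all, cloud variables are set: flag it instead of guessing.
        missing = sorted(set(CLOUD_VARS) - set(present))
        raise ValueError(f"Incomplete cloud configuration, missing: {missing}")
    return "local"


print(detect_chroma_mode({}))  # local
```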

Milvus

from crossvector.dbs.milvus import MilvusDBAdapter

adapter = MilvusDBAdapter()
adapter.initialize(
    collection_name="my_collection",
    embedding_dimension=1536,
    metric="cosine"
)

PGVector

from crossvector.dbs.pgvector import PGVectorAdapter

adapter = PGVectorAdapter()
adapter.initialize(
    table_name="my_vectors",
    embedding_dimension=1536,
    metric="cosine"
)
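
The `metric` argument in the examples above selects how two vectors are compared. As a reference for the three names listed under `VECTOR_METRIC`, the sketch below implements the textbook definitions in plain Python. How each database turns these values into a similarity score is backend-specific and not covered here.

```python
import math
from typing import Callable, Dict, Sequence

Vector = Sequence[float]


def dot_product(a: Vector, b: Vector) -> float:
    """Sum of element-wise products; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a: Vector, b: Vector) -> float:
    """Dot product of the normalized vectors; 0.0 if either vector is all zeros."""
    norm = math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    return dot_product(a, b) / norm if norm else 0.0


def euclidean(a: Vector, b: Vector) -> float:
    """Straight-line distance; smaller means more similar."""
    return math.dist(a, b)


METRICS: Dict[str, Callable[[Vector, Vector], float]] = {
    "cosine": cosine,
    "dot_product": dot_product,
    "euclidean": euclidean,
}

query, candidate = [1.0, 0.0], [0.0, 1.0]
for name, metric in METRICS.items():
    print(name, metric(query, candidate))
```

Note the direction of each score: for cosine and dot product a higher value means a closer match, while for Euclidean distance a lower value does.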

Custom Adapters

Create Custom Database Adapter

from crossvector.abc import VectorDBAdapter
from typing import Any, Dict, List, Optional, Set

class MyCustomDBAdapter(VectorDBAdapter):
    def initialize(self, collection_name: str, embedding_dimension: int, metric: str = "cosine"):
        # Your implementation
        pass

    def get_collection(self, collection_name: str, embedding_dimension: int, metric: str = "cosine"):
        # Your implementation
        pass

    def upsert(self, documents: List[Dict[str, Any]]):
        # Your implementation
        pass

    def search(self, vector: List[float], limit: int, fields: Set[str]) -> List[Dict[str, Any]]:
        # Your implementation
        pass

    def get(self, id: str) -> Optional[Dict[str, Any]]:
        # Your implementation
        pass

    def count(self) -> int:
        # Your implementation
        pass

    def delete_one(self, id: str) -> int:
        # Your implementation
        pass

    def delete_many(self, ids: List[str]) -> int:
        # Your implementation
        pass
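
As an illustration of what a filled-in adapter might look like, here is an in-memory version with the same method names. It is a sketch under stated assumptions, not part of CrossVector: `InMemoryAdapter` is a hypothetical name, it is a plain class so that it runs without the library (a real adapter would subclass `VectorDBAdapter`), only cosine similarity is implemented, and treating the `int` return of the delete methods as "number of documents removed" is an assumption.

```python
import math
from typing import Any, Dict, List, Optional, Set


def _cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryAdapter:
    """Hypothetical adapter that keeps every collection in a dictionary."""

    def __init__(self) -> None:
        self.collections: Dict[str, Dict[str, Dict[str, Any]]] = {}
        self.docs: Dict[str, Dict[str, Any]] = {}
        self.dimension = 0

    def initialize(self, collection_name: str, embedding_dimension: int, metric: str = "cosine") -> None:
        self.docs = self.get_collection(collection_name, embedding_dimension, metric)

    def get_collection(self, collection_name: str, embedding_dimension: int, metric: str = "cosine"):
        if metric != "cosine":
            raise ValueError("this sketch only implements cosine similarity")
        self.dimension = embedding_dimension
        # Create the collection on first use, reuse it afterwards.
        return self.collections.setdefault(collection_name, {})

    def upsert(self, documents: List[Dict[str, Any]]) -> None:
        for doc in documents:
            if len(doc["$vector"]) != self.dimension:
                raise ValueError(f"expected {self.dimension} dimensions for {doc['_id']!r}")
            self.docs[doc["_id"]] = dict(doc)  # insert or overwrite by ID

    def search(self, vector: List[float], limit: int, fields: Set[str]) -> List[Dict[str, Any]]:
        hits = []
        for doc in self.docs.values():
            # Return only the requested fields, plus the ID and the score.
            hit = {k: v for k, v in doc.items() if k in fields or k == "_id"}
            hit["$similarity"] = _cosine(vector, doc["$vector"])
            hits.append(hit)
        hits.sort(key=lambda h: h["$similarity"], reverse=True)
        return hits[:limit]

    def get(self, id: str) -> Optional[Dict[str, Any]]:
        return self.docs.get(id)

    def count(self) -> int:
        return len(self.docs)

    def delete_one(self, id: str) -> int:
        return 1 if self.docs.pop(id, None) is not None else 0

    def delete_many(self, ids: List[str]) -> int:
        return sum(self.delete_one(i) for i in ids)
```

An adapter like this is mainly useful in unit tests, where it avoids any network or database dependency.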

Create Custom Embedding Adapter

from crossvector.abc import EmbeddingAdapter
from typing import List

class MyCustomEmbeddingAdapter(EmbeddingAdapter):
    def __init__(self, model_name: str):
        super().__init__(model_name)
        # Initialize your client

    @property
    def embedding_dimension(self) -> int:
        return 768  # Your model's dimension

    def get_embeddings(self, texts: List[str]) -> List[List[float]]:
        # Your implementation
        pass
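
For a concrete, dependency-free example of the same interface, the sketch below uses feature hashing: each token is hashed into one of a fixed number of buckets and counted. The vectors carry no semantic meaning, so this is only suitable for tests and offline demos. `HashingEmbeddingAdapter` is a hypothetical name and is written as a plain class so it runs without CrossVector installed; a real adapter would subclass `EmbeddingAdapter` as shown above.

```python
import hashlib
from typing import List


class HashingEmbeddingAdapter:
    """Hypothetical, deterministic stand-in for a real embedding provider."""

    def __init__(self, model_name: str = "hashing-demo", dimension: int = 8):
        self.model_name = model_name
        self._dimension = dimension

    @property
    def embedding_dimension(self) -> int:
        return self._dimension

    def get_embeddings(self, texts: List[str]) -> List[List[float]]:
        vectors = []
        for text in texts:
            vector = [0.0] * self._dimension
            for token in text.lower().split():
                # Hash the token to a stable bucket index and count it there.
                digest = hashlib.sha256(token.encode("utf-8")).digest()
                bucket = int.from_bytes(digest[:4], "big") % self._dimension
                vector[bucket] += 1.0
            vectors.append(vector)
        return vectors


adapter = HashingEmbeddingAdapter(dimension=8)
print(adapter.get_embeddings(["The quick brown fox"])[0])
```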

Document Format

All adapters expect documents in this standard format:

{
    "_id": "unique-doc-id",              # Document ID (string)
    "$vector": [0.1, 0.2, ...],         # Embedding vector (List[float])
    "text": "original text content",     # Original text
    "any_field": "value",                # Additional metadata fields
    "another_field": 123,
}
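
A small helper can build records in this shape and catch key clashes early. This is a sketch, assuming that metadata is flattened into top-level fields as the "Additional metadata fields" comment suggests; `build_record` and `RESERVED_KEYS` are hypothetical names, not part of the library.

```python
from typing import Any, Dict, List

# Keys with a fixed meaning in the standard format above.
RESERVED_KEYS = {"_id", "$vector", "text"}


def build_record(doc_id: str, text: str, vector: List[float], metadata: Dict[str, Any]) -> Dict[str, Any]:
    """Combine ID, vector, text, and metadata into one flat document."""
    clash = RESERVED_KEYS & metadata.keys()
    if clash:
        raise ValueError(f"metadata uses reserved keys: {sorted(clash)}")
    return {
        "_id": doc_id,
        "$vector": [float(x) for x in vector],
        "text": text,
        **metadata,  # metadata fields sit next to the reserved keys
    }


record = build_record("doc1", "The quick brown fox", [0.1, 0.2], {"category": "animals"})
print(sorted(record))
```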

Development

# Clone repository
git clone https://github.com/thewebscraping/crossvector.git
cd crossvector

# Install with dev dependencies
pip install -e ".[all,dev]"

# Run tests
pytest

# Run linting
ruff check .

# Format code
ruff format .

# Setup pre-commit hooks
pre-commit install

Testing

# Run all tests
pytest

# Run with coverage
pytest --cov=. --cov-report=html

# Run specific adapter tests
pytest tests/test_gemini_embeddings.py
pytest tests/test_openai_embeddings.py

License

MIT License - see LICENSE file for details

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Roadmap

Gemini embedding adapter
Qdrant adapter (not supported yet)
Pinecone adapter (not supported yet)
Weaviate adapter (not supported yet)
Async support
Batch operations optimization
Advanced filtering
Hybrid search (vector + keyword)
Rerank support (planned)
Additional embedding providers (e.g., Cohere, Mistral, Ollama)

Support

For issues and questions:

GitHub Issues: https://github.com/thewebscraping/crossvector/issues
Email: thetwofarm@gmail.com

Release history

1.0.1: Dec 7, 2025
1.0.0: Dec 6, 2025
0.1.3: Nov 30, 2025
0.1.2: Nov 23, 2025
0.1.0: Nov 23, 2025 (this version)

Download files

Source Distribution

crossvector-0.1.0.tar.gz (30.9 kB)

Uploaded Nov 23, 2025 Source

Built Distribution

crossvector-0.1.0-py3-none-any.whl (22.8 kB)

Uploaded Nov 23, 2025 Python 3

File details

Details for the file crossvector-0.1.0.tar.gz.

File metadata

Download URL: crossvector-0.1.0.tar.gz
Upload date: Nov 23, 2025
Size: 30.9 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for crossvector-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`ff03a9800a1b005ce1d42c526fc756923678c2402f16dbf3dd0ab15ff47e1e47`
MD5	`3e66e210a8ac6963c304a156bc16a26a`
BLAKE2b-256	`937761e2762720ce98a1debba58d65dbf9c0f627b817085b68985b446618fa67`
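
A downloaded file can be checked against the SHA256 digest listed above with the standard library. The sketch below reads the file in chunks and compares hex digests; `sha256_of` and `verify` are hypothetical helper names, and the local file path is whatever location the archive was saved to.

```python
import hashlib
from pathlib import Path

# SHA256 digest published above for crossvector-0.1.0.tar.gz.
EXPECTED_SHA256 = "ff03a9800a1b005ce1d42c526fc756923678c2402f16dbf3dd0ab15ff47e1e47"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """True when the file's digest matches the expected value (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `verify("crossvector-0.1.0.tar.gz", EXPECTED_SHA256)` should return `True` for an intact download saved under that name in the current directory.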

Provenance

The following attestation bundles were made for crossvector-0.1.0.tar.gz:

Publisher: publish.yml on thewebscraping/crossvector

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: crossvector-0.1.0.tar.gz
- Subject digest: ff03a9800a1b005ce1d42c526fc756923678c2402f16dbf3dd0ab15ff47e1e47
- Sigstore transparency entry: 717275249
- Sigstore integration time: Nov 23, 2025
Source repository:
- Permalink: thewebscraping/crossvector@eb6c14d32303a0fb0eead761ed3024cb1cf3e261
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/thewebscraping
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@eb6c14d32303a0fb0eead761ed3024cb1cf3e261
- Trigger Event: release

File details

Details for the file crossvector-0.1.0-py3-none-any.whl.

File metadata

Download URL: crossvector-0.1.0-py3-none-any.whl
Upload date: Nov 23, 2025
Size: 22.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for crossvector-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`c759e5b78129f367a2dc5d89ae80832cd1a61cdad2a4881eefeaed646d155832`
MD5	`90bce9e1f0199f0a41b8803c9091729f`
BLAKE2b-256	`71c3fed5d727bea27197e14eed96fe5b3d5c5fe09ea3f16462d9af838286657f`

Provenance

The following attestation bundles were made for crossvector-0.1.0-py3-none-any.whl:

Publisher: publish.yml on thewebscraping/crossvector

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: crossvector-0.1.0-py3-none-any.whl
- Subject digest: c759e5b78129f367a2dc5d89ae80832cd1a61cdad2a4881eefeaed646d155832
- Sigstore transparency entry: 717275314
- Sigstore integration time: Nov 23, 2025
Source repository:
- Permalink: thewebscraping/crossvector@eb6c14d32303a0fb0eead761ed3024cb1cf3e261
- Branch / Tag: refs/tags/v0.1.0
- Owner: https://github.com/thewebscraping
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@eb6c14d32303a0fb0eead761ed3024cb1cf3e261
- Trigger Event: release
