MistralVDB

A vector database optimized for Mistral AI embeddings with HNSW indexing. Features include:

  • Efficient similarity search using HNSW indexing
  • Vector compression for reduced memory usage
  • Built-in REST API server
  • Secure authentication
  • Multiple collection support
  • Automatic persistence

Installation

pip install mistralvdb

Quick Start

As a Python Library

from mistralvdb import MistralVDB

# Initialize database
db = MistralVDB(
    api_key="your-mistral-api-key",
    collection_name="my_collection"
)

# Add documents
texts = [
    "The quick brown fox jumps over the lazy dog",
    "Python is a versatile programming language"
]
metadata = [
    {"type": "example"},
    {"type": "programming"}
]
ids = db.add_texts(texts, metadata=metadata)

# Search
results = db.search("programming language", k=4)
for doc_id, score in results:
    print(f"Score: {score}")
    print(f"Text: {db.get_text(doc_id)}")
    print(f"Metadata: {db.get_metadata(doc_id)}\n")

As an API Server

  1. Start the server:
# Set your API key
export MISTRAL_API_KEY=your-mistral-api-key

# Start server
mistralvdb-server --host 127.0.0.1 --port 8000
  2. Use the client:
from mistralvdb.client import MistralVDBClient

# Create client
client = MistralVDBClient()

# Login (default credentials)
client.login(username="admin", password="admin-password")

# Add documents
docs = ["Document 1", "Document 2"]
client.add_documents(docs, collection_name="my_collection")

# Search
results = client.search("my query", collection_name="my_collection")

Features

HNSW Indexing

  • Efficient approximate nearest neighbor search
  • Configurable index parameters (M, ef_construction, ef)
  • Thread-safe operations
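
To give a feel for what M, ef_construction, and ef control, here is a minimal pure-Python sketch of a single-layer proximity graph. It is not MistralVDB's implementation: the class name TinyGraphIndex and its methods are hypothetical, and real HNSW adds a hierarchy of layers and prunes neighbor lists. The roles of the parameters are the same, though: M is the number of links a new node receives, ef_construction is the candidate list size while inserting, and ef is the candidate list size while querying.

```python
import heapq
import math
import random


def cosine_distance(a, b):
    """1 - cosine similarity; smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


class TinyGraphIndex:
    """Hypothetical single-layer graph index (illustration only)."""

    def __init__(self, M=4, ef_construction=16):
        self.M = M                              # links created per new node
        self.ef_construction = ef_construction  # candidate list size on insert
        self.vectors = []
        self.neighbors = []                     # adjacency lists, one per node

    def _search(self, query, ef):
        """Best-first graph search that keeps the ef closest nodes seen."""
        d0 = cosine_distance(query, self.vectors[0])  # node 0 is the entry point
        visited = {0}
        candidates = [(d0, 0)]  # min-heap: closest unexplored node first
        best = [(-d0, 0)]       # max-heap via negation: worst kept node on top
        while candidates:
            dist, node = heapq.heappop(candidates)
            if len(best) >= ef and dist > -best[0][0]:
                break  # the closest candidate is worse than every kept result
            for nb in self.neighbors[node]:
                if nb in visited:
                    continue
                visited.add(nb)
                d = cosine_distance(query, self.vectors[nb])
                if len(best) < ef or d < -best[0][0]:
                    heapq.heappush(candidates, (d, nb))
                    heapq.heappush(best, (-d, nb))
                    if len(best) > ef:
                        heapq.heappop(best)
        return sorted((-neg_d, n) for neg_d, n in best)

    def add(self, vector):
        new_id = len(self.vectors)
        # Search the existing graph first so the new node cannot find itself.
        found = self._search(vector, self.ef_construction) if self.vectors else []
        self.vectors.append(vector)
        self.neighbors.append([])
        for _, nb in found[: self.M]:  # link to the M closest nodes found
            self.neighbors[new_id].append(nb)
            self.neighbors[nb].append(new_id)
        return new_id

    def search(self, query, k=3, ef=16):
        """Return up to k (distance, id) pairs, closest first."""
        return self._search(query, max(ef, k))[:k]


rng = random.Random(0)
data = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(50)]
index = TinyGraphIndex(M=4, ef_construction=16)
for vec in data:
    index.add(vec)

query = [rng.gauss(0, 1) for _ in range(8)]
approx = index.search(query, k=3, ef=8)          # fast, may miss a neighbor
exact = index.search(query, k=3, ef=len(data))   # visits the whole graph
```

In this sketch a larger ef trades speed for recall: once ef reaches the number of stored vectors, the search visits every node and the result matches a brute-force scan.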

Vector Compression

  • Product Quantization for reduced memory usage
  • Configurable compression parameters
  • Minimal accuracy loss
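
As an illustration of the idea behind Product Quantization, the sketch below splits each vector into sub-vectors, learns a small codebook per sub-space with a tiny k-means, and stores only the index of the nearest centroid for each sub-vector. This is a standalone toy, not MistralVDB's code: the ProductQuantizer class, its parameter names, and the sizes used are assumptions chosen for readability.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, rng, iters=10):
    """Tiny Lloyd's k-means; returns a list of k centroids."""
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            buckets[nearest].append(p)
        for c, bucket in enumerate(buckets):
            if bucket:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(bucket) for col in zip(*bucket)]
    return centroids


class ProductQuantizer:
    """Hypothetical PQ encoder (illustration only)."""

    def __init__(self, dim, n_subvectors, n_centroids):
        if dim % n_subvectors != 0:
            raise ValueError("dim must be divisible by n_subvectors")
        self.m = n_subvectors
        self.k = n_centroids
        self.sub_dim = dim // n_subvectors
        self.codebooks = []  # one list of centroids per sub-space

    def _split(self, vector):
        step = self.sub_dim
        return [vector[i * step:(i + 1) * step] for i in range(self.m)]

    def fit(self, vectors):
        rng = random.Random(0)  # fixed seed keeps the sketch reproducible
        self.codebooks = [
            kmeans([self._split(v)[j] for v in vectors], self.k, rng)
            for j in range(self.m)
        ]

    def encode(self, vector):
        """Replace each sub-vector by the index of its nearest centroid."""
        return [
            min(range(self.k), key=lambda c: sq_dist(sub, book[c]))
            for sub, book in zip(self._split(vector), self.codebooks)
        ]

    def decode(self, codes):
        """Rebuild an approximate vector by concatenating centroids."""
        out = []
        for code, book in zip(codes, self.codebooks):
            out.extend(book[code])
        return out


rng = random.Random(42)
data = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(200)]

pq = ProductQuantizer(dim=8, n_subvectors=4, n_centroids=16)
pq.fit(data)

codes = pq.encode(data[0])   # 4 small integers instead of 8 floats
approx = pq.decode(codes)    # lossy reconstruction of data[0]
error = sq_dist(data[0], approx)
```

More sub-vectors or more centroids per codebook reduce the reconstruction error at the cost of larger codes, which is the trade-off the compression parameters expose.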

API Server

  • RESTful API with FastAPI
  • JWT authentication
  • Collection management
  • Swagger UI documentation
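
For readers unfamiliar with JWTs, the following standard-library sketch shows how an HS256-signed token is produced and checked. It only illustrates the general mechanism: the README does not say which signing algorithm, claims, or secret handling the server uses, so sign_token, verify_token, the HS256 choice, and the example claims are all assumptions.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _signature(signing_input: str, secret: bytes) -> str:
    digest = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return _b64url(digest)


def sign_token(claims: dict, secret: bytes) -> str:
    """Build header.payload.signature using HMAC-SHA256 (assumed algorithm)."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    ]
    signing_input = ".".join(parts)
    return signing_input + "." + _signature(signing_input, secret)


def verify_token(token: str, secret: bytes, now=None):
    """Return the claims if the signature and expiry check out, else None."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
    except ValueError:
        return None  # not a three-part token
    expected = _signature(f"{header_b64}.{payload_b64}", secret)
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected.encode("ascii"), sig_b64.encode("utf-8")):
        return None
    claims = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if claims.get("exp", float("inf")) < current:
        return None  # token has expired
    return claims


secret = b"hypothetical-server-secret"
token = sign_token({"sub": "admin", "exp": 2000}, secret)
claims = verify_token(token, secret, now=1000)
```

A client would normally receive such a token from the login call and send it back with each request; the server only needs the secret to check that the token has not been altered.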

Storage Management

  • Automatic persistence
  • Multiple collections
  • Custom storage location
  • Collection metadata

Configuration

Environment Variables

  • MISTRAL_API_KEY: Your Mistral AI API key (required)
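
When using the library from Python rather than the server, one way to avoid hard-coding the key is to read the same variable and fail early if it is missing. The helper name load_api_key is hypothetical and is not part of the package.

```python
import os


def load_api_key(env=os.environ):
    """Return MISTRAL_API_KEY from the environment or raise a clear error."""
    key = env.get("MISTRAL_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "MISTRAL_API_KEY is not set; export it before starting the application"
        )
    return key


# Intended use with the constructor shown in Quick Start:
# db = MistralVDB(api_key=load_api_key(), collection_name="my_collection")
```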

Server Options

mistralvdb-server --help

Development

  1. Clone the repository:
git clone https://github.com/yourusername/mistralvdb.git
cd mistralvdb
  2. Install development dependencies:
pip install -e ".[dev]"
  3. Run tests:
pytest tests/

License

MIT License - see LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
