MistralVDB
A vector database optimized for Mistral AI embeddings with HNSW indexing. Features include:
- Efficient similarity search using HNSW indexing
- Vector compression for reduced memory usage
- Built-in REST API server
- Secure authentication
- Multiple collection support
- Automatic persistence
Installation
pip install mistralvdb
Quick Start
As a Python Library
from mistralvdb import MistralVDB
# Initialize database
db = MistralVDB(
    api_key="your-mistral-api-key",
    collection_name="my_collection"
)
# Add documents
texts = [
    "The quick brown fox jumps over the lazy dog",
    "Python is a versatile programming language"
]
metadata = [
    {"type": "example"},
    {"type": "programming"}
]
ids = db.add_texts(texts, metadata=metadata)
# Search
results = db.search("programming language", k=4)
for doc_id, score in results:
    print(f"Score: {score}")
    print(f"Text: {db.get_text(doc_id)}")
    print(f"Metadata: {db.get_metadata(doc_id)}\n")
As an API Server
- Start the server:
# Set your API key
export MISTRAL_API_KEY=your-mistral-api-key
# Start server
mistralvdb-server --host 127.0.0.1 --port 8000
- Use the client:
from mistralvdb.client import MistralVDBClient
# Create client
client = MistralVDBClient()
# Login (default credentials)
client.login(username="admin", password="admin-password")
# Add documents
docs = ["Document 1", "Document 2"]
client.add_documents(docs, collection_name="my_collection")
# Search
results = client.search("my query", collection_name="my_collection")
Features
HNSW Indexing
- Efficient approximate nearest neighbor search
- Configurable index parameters (M, ef_construction, ef)
- Thread-safe operations
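To make "approximate" concrete, it may help to see the exact search that an HNSW index avoids. The sketch below scores every stored vector against the query and keeps the best k. In HNSW terms, M bounds the number of graph links per node, ef_construction sets the candidate list size while building, and ef sets it at query time, so the index visits only a small part of the data instead of all of it. The function names, the toy vectors, and the choice of cosine similarity are assumptions for illustration, not MistralVDB's internals.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)

def exact_search(query, vectors, k):
    """Score every stored vector and return the k best (doc_id, score) pairs."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy 3-dimensional vectors standing in for real embeddings (hypothetical data).
vectors = {
    0: [1.0, 0.0, 0.0],
    1: [0.9, 0.1, 0.0],
    2: [0.0, 1.0, 0.0],
}
top = exact_search([1.0, 0.0, 0.0], vectors, k=2)
for doc_id, score in top:
    print(doc_id, round(score, 4))
```

The cost of this baseline grows linearly with the number of stored vectors, which is the cost an approximate index is meant to cut. Raising ef generally improves recall at the price of slower queries.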
Vector Compression
- Product Quantization for reduced memory usage
- Configurable compression parameters
- Minimal accuracy loss (the compression is lossy, so the trade-off depends on the parameters chosen)
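As a rough illustration of Product Quantization, the sketch below splits a vector into sub-vectors and stores, for each one, only the index of its nearest centroid. The codebooks here are written by hand; a real implementation would learn them with k-means. All names, the toy data, and the 1024-dimension figure in the ratio example are assumptions, not MistralVDB's actual code or settings.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pq_encode(vector, codebooks):
    """Replace each sub-vector with the index of its nearest centroid."""
    sub_dim = len(vector) // len(codebooks)
    codes = []
    for i, centroids in enumerate(codebooks):
        sub = vector[i * sub_dim:(i + 1) * sub_dim]
        nearest = min(range(len(centroids)),
                      key=lambda j: squared_distance(sub, centroids[j]))
        codes.append(nearest)
    return codes

def pq_decode(codes, codebooks):
    """Rebuild an approximate vector by concatenating the chosen centroids."""
    approx = []
    for code, centroids in zip(codes, codebooks):
        approx.extend(centroids[code])
    return approx

def compression_ratio(dim, num_subspaces, bytes_per_float=4):
    """Raw float bytes per vector divided by PQ code bytes.

    Assumes one byte per sub-space (at most 256 centroids) and ignores
    the fixed cost of storing the codebooks themselves.
    """
    return (dim * bytes_per_float) / num_subspaces

# Hand-written codebooks: 2 sub-spaces with 2 centroids each (hypothetical).
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
vector = [0.9, 1.1, 0.2, 0.8]
codes = pq_encode(vector, codebooks)
approx = pq_decode(codes, codebooks)
print(codes, approx)
print(compression_ratio(1024, 64))
```

The decoded vector differs from the original, which is where the accuracy loss comes from. More sub-spaces or more centroids reduce that error but shrink the memory savings.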
API Server
- RESTful API with FastAPI
- JWT authentication
- Collection management
- Swagger UI documentation
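For readers unfamiliar with JWT authentication, the sketch below shows how an HS256-signed token is built and checked using only the standard library. It is a general illustration of the token format, not the server's implementation: the function names, the payload fields, and the secret are hypothetical, and the sketch skips expiry checks. A production service would normally rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    """URL-safe base64 without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def b64url_decode(text: str) -> bytes:
    """Restore the stripped padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

def sign_token(payload: dict, secret: bytes) -> str:
    """Return header.payload.signature, signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [
        b64url(json.dumps(header, separators=(",", ":")).encode("utf-8")),
        b64url(json.dumps(payload, separators=(",", ":")).encode("utf-8")),
    ]
    signing_input = ".".join(parts)
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)

def verify_token(token: str, secret: bytes) -> dict:
    """Check the signature and return the payload, or raise ValueError."""
    pieces = token.split(".")
    if len(pieces) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = pieces
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode("ascii"),
                        hashlib.sha256).digest()
    # compare_digest avoids leaking how many leading bytes matched
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("invalid signature")
    return json.loads(b64url_decode(payload_b64))

secret = b"change-me"  # hypothetical secret for the example
token = sign_token({"sub": "admin"}, secret)
print(verify_token(token, secret))
```

A client would typically send such a token in an Authorization header on each request after logging in. The server verifies the signature before serving the request.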
Storage Management
- Automatic persistence
- Multiple collections
- Custom storage location
- Collection metadata
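One common way to persist several collections under a custom storage location is one file per collection, written atomically so that a crash cannot leave a half-written file. The sketch below shows that pattern with JSON files. The on-disk format MistralVDB actually uses is not described here, so the file layout and all function names are assumptions.

```python
import json
import os
import tempfile

def save_collection(storage_dir, name, records):
    """Atomically write one collection to <storage_dir>/<name>.json."""
    os.makedirs(storage_dir, exist_ok=True)
    target = os.path.join(storage_dir, f"{name}.json")
    # Write to a temp file in the same directory, then rename over the target.
    fd, tmp_path = tempfile.mkstemp(dir=storage_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(records, handle)
        os.replace(tmp_path, target)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return target

def load_collection(storage_dir, name):
    """Read a collection back from disk."""
    path = os.path.join(storage_dir, f"{name}.json")
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)

def list_collections(storage_dir):
    """Names of all collections found in the storage directory."""
    return sorted(f[:-5] for f in os.listdir(storage_dir) if f.endswith(".json"))

with tempfile.TemporaryDirectory() as storage_dir:
    records = {"0": {"text": "Document 1", "metadata": {"type": "example"}}}
    save_collection(storage_dir, "my_collection", records)
    print(list_collections(storage_dir))
    print(load_collection(storage_dir, "my_collection"))
```

Writing to a temporary file and renaming it means readers see either the old file or the new one, never a partial write.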
Configuration
Environment Variables
- MISTRAL_API_KEY: Your Mistral AI API key (required)
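In your own scripts, it can be useful to check for the variable up front and fail with a clear message. The helper below is a minimal sketch; its name and error text are hypothetical and are not part of the mistralvdb package.

```python
import os

def load_api_key(env=os.environ):
    """Return MISTRAL_API_KEY from the environment, or raise if it is missing."""
    key = env.get("MISTRAL_API_KEY", "").strip()
    if not key:
        raise RuntimeError("MISTRAL_API_KEY is not set; export it before starting")
    return key

# Passing a plain dict makes the helper easy to try without touching the real environment.
print(load_api_key({"MISTRAL_API_KEY": "your-mistral-api-key"}))
```

The returned value can then be passed as the api_key argument shown in the Quick Start.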
Server Options
mistralvdb-server --help
Development
- Clone the repository:
git clone https://github.com/yourusername/mistralvdb.git
cd mistralvdb
- Install development dependencies:
pip install -e ".[dev]"
- Run tests:
pytest tests/
License
MIT License - see LICENSE file for details.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.