A framework for testing Milvus data correctness at scale
Milvus Correctness Testing Framework
A testing framework designed to verify data correctness in the Milvus vector database under large-scale write workloads. It supports processing billions of entries and verifies the data stored in Milvus against an expected state computed from the log of successful operations.
Key Features
- Support for large-scale data processing (1B+ entries)
- Real workload simulation based on Markov chains
- Efficient disk storage backend (RocksDB/LMDB)
- Batch processing optimization
- Configurable sampling verification
- Detailed progress tracking and statistics
System Requirements
- Python >= 3.10
- Milvus >= 2.5.0
- PDM (Python package manager)
- Sufficient disk space for storing request queues and state data
Project Structure
src/milvus_correctness/
├── core/ # Core framework components
│ ├── framework.py # Main framework coordination
│ ├── milvus_client.py # Enhanced Milvus client wrapper
│ └── models.py # Data models and configurations
├── calculators/ # State calculation and verification
├── generators/ # Request generation components
├── processors/ # Request processing components
├── storage/ # Storage backend implementations
└── utils/ # Utility functions
Installation
# Install dependencies using PDM
pdm install
Quick Start
- Ensure the Milvus service is running and the test collection has been created
- Run the example test:
pdm run examples/small_scale_test.py
Framework Components
1. Core Framework (core/)
- CorrectnessFramework: Main coordination class
- MilvusClientWrapper: Enhanced Milvus client with auto-reconnection
- Configuration models and data structures
2. Request Generator (generators/)
- Markov chain-based operation sequence generation
- Probabilistic primary key selection
- Configurable operation distributions
- Efficient PK tracking with Bloom filters
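The Markov chain-based sequence generation listed above can be illustrated with a minimal sketch. It is not the framework's own generator: the function name `generate_ops` and its parameters are hypothetical, and only the transition probabilities are taken from the Configuration Example in this README. Each step draws the next operation from the row of the current operation.

```python
import random

# Transition probabilities copied from the Configuration Example in this README.
TRANSITIONS = {
    "INSERT": {"INSERT": 0.7, "UPSERT": 0.2, "DELETE": 0.1},
    "UPSERT": {"INSERT": 0.3, "UPSERT": 0.6, "DELETE": 0.1},
    "DELETE": {"INSERT": 0.6, "UPSERT": 0.3, "DELETE": 0.1},
}


def generate_ops(transitions, start, count, seed=None):
    """Yield `count` operation names by walking the chain from `start`.

    Hypothetical helper for illustration; not part of milvus_correctness.
    """
    rng = random.Random(seed)
    state = start
    for _ in range(count):
        yield state
        row = transitions[state]
        # Draw the next state with probability given by the current row.
        state = rng.choices(list(row), weights=list(row.values()), k=1)[0]


ops = list(generate_ops(TRANSITIONS, "INSERT", 10, seed=42))
print(ops)
```

Because the generator is lazy, the same approach scales to long sequences without holding them in memory; a fixed seed makes a run reproducible.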
3. Request Processor (processors/)
- Batch processing optimization
- Automatic retry mechanism
- Concurrent request handling
- Progress tracking and statistics
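As a rough illustration of batching combined with automatic retry, the sketch below splits a request stream into fixed-size batches and retries a failed batch a bounded number of times. The helpers `batched` and `send_with_retry`, the backoff policy, and the `flaky_send` stand-in are all assumptions made for this example; the framework's actual processor may behave differently.

```python
import time
from itertools import islice


def batched(iterable, size):
    """Yield lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while batch := list(islice(it, size)):
        yield batch


def send_with_retry(send, batch, max_retries=3, base_delay=0.5):
    """Call send(batch), retrying up to `max_retries` times on any exception.

    Waits base_delay * 2**attempt seconds between attempts (assumed policy).
    """
    for attempt in range(max_retries + 1):
        try:
            return send(batch)
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted, surface the error
            time.sleep(base_delay * (2 ** attempt))


# Stand-in for a Milvus write call: fails once, then succeeds.
calls = []


def flaky_send(batch):
    calls.append(len(batch))
    if len(calls) == 1:
        raise ConnectionError("simulated transient failure")
    return len(batch)


sent = [
    send_with_retry(flaky_send, b, max_retries=3, base_delay=0.0)
    for b in batched(range(2500), 1000)
]
```

Here `batch_size=1000` and `max_retries=3` mirror the values passed to `CorrectnessFramework` in the Usage Example.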
4. State Calculator (calculators/)
- Disk-based state tracking
- Stream processing of success logs
- Sampling-based verification
- Detailed consistency checks
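The idea behind these steps can be sketched as follows: replay the log of successful operations in order to obtain the expected state, then compare a random sample of primary keys against what the database returns. This sketch keeps the state in an in-memory dict for brevity, whereas the framework tracks state on disk; the log record shape `(op, pk, payload)` and the names `expected_state` and `verify_sample` are assumptions for illustration only.

```python
import random


def expected_state(success_log):
    """Replay (op, pk, payload) records in order and return {pk: payload}."""
    state = {}
    for op, pk, payload in success_log:
        if op in ("INSERT", "UPSERT"):
            state[pk] = payload  # last successful write wins
        elif op == "DELETE":
            state.pop(pk, None)
    return state


def verify_sample(expected, fetch, sample_size, seed=None):
    """Compare a random sample of expected rows with fetch(pk) results.

    Returns (is_consistent, mismatched_pks).
    """
    rng = random.Random(seed)
    keys = list(expected)
    sample = rng.sample(keys, min(sample_size, len(keys)))
    mismatches = [pk for pk in sample if fetch(pk) != expected[pk]]
    return not mismatches, mismatches


log = [
    ("INSERT", 1, "a"),
    ("INSERT", 2, "b"),
    ("UPSERT", 1, "c"),
    ("DELETE", 2, None),
]
state = expected_state(log)

# A plain dict stands in for the database in this example.
actual = {1: "c"}
ok, bad = verify_sample(state, actual.get, sample_size=10, seed=0)
```

This sketch only checks keys that are expected to exist. A fuller check would also confirm that deleted keys are absent from the database.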
5. Storage Backend (storage/)
- Persistent queue implementation
- State storage interface
Configuration Example
from milvus_correctness.core.models import (
    GeneratorConfig,
    MilvusConfig,
    StorageConfig
)
# Milvus configuration
milvus_config = MilvusConfig(
    host="localhost",
    port=19530,
    collection_name="test_collection"
)
# Generator configuration
generator_config = GeneratorConfig(
    op_states=["INSERT", "UPSERT", "DELETE"],
    transition_matrix={
        "INSERT": {"INSERT": 0.7, "UPSERT": 0.2, "DELETE": 0.1},
        "UPSERT": {"INSERT": 0.3, "UPSERT": 0.6, "DELETE": 0.1},
        "DELETE": {"INSERT": 0.6, "UPSERT": 0.3, "DELETE": 0.1}
    },
    pk_field_name="id",
    vector_field_name="vector",
    vector_dim=128,
    data_schema={
        "metadata": "str"
    }
)
# Storage configuration
storage_config = StorageConfig(
    request_queue_path="data/request_queue",
    success_log_path="data/success_log",
    storage_type="rocksdb",
    clear_on_start=True
)
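Each row of a Markov transition matrix must sum to 1 and may only reference known states. Whether `GeneratorConfig` checks this itself is not stated here, so the following is a small standalone check that could be run before building the configuration. The function `validate_transition_matrix` is a hypothetical helper, not part of the package.

```python
import math


def validate_transition_matrix(op_states, matrix, tol=1e-9):
    """Return a list of problems found in `matrix`; an empty list means valid."""
    problems = []
    known = set(op_states)
    for state in op_states:
        row = matrix.get(state)
        if row is None:
            problems.append(f"missing row for {state}")
            continue
        unknown = set(row) - known
        if unknown:
            problems.append(f"row {state} references unknown states {sorted(unknown)}")
        if any(p < 0 for p in row.values()):
            problems.append(f"row {state} contains a negative probability")
        total = sum(row.values())
        # Allow for floating-point rounding when comparing with 1.0.
        if not math.isclose(total, 1.0, abs_tol=tol):
            problems.append(f"row {state} sums to {total}, expected 1.0")
    return problems


op_states = ["INSERT", "UPSERT", "DELETE"]
transition_matrix = {
    "INSERT": {"INSERT": 0.7, "UPSERT": 0.2, "DELETE": 0.1},
    "UPSERT": {"INSERT": 0.3, "UPSERT": 0.6, "DELETE": 0.1},
    "DELETE": {"INSERT": 0.6, "UPSERT": 0.3, "DELETE": 0.1},
}
problems = validate_transition_matrix(op_states, transition_matrix)
```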
Usage Example
from milvus_correctness.core.framework import CorrectnessFramework
# Initialize framework
framework = CorrectnessFramework(
    milvus_config=milvus_config,
    generator_config=generator_config,
    storage_config=storage_config,
    batch_size=1000,
    max_retries=3
)
# Generate requests
framework.generate_requests(1_000_000)
# Process requests
framework.process_requests()
# Calculate and verify state
framework.calculate_expected_state()
is_consistent, stats = framework.verify_state(sample_size=10000)
# Get statistics
print(framework.get_statistics())
# Cleanup resources
framework.cleanup()
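When choosing `sample_size`, it can help to estimate how likely a sample is to catch a problem. Assuming entries are sampled uniformly and independently (an assumption about the sampling, not a documented property of `verify_state`), a sample of n entries detects at least one bad entry with probability 1 - (1 - p)^n, where p is the fraction of bad entries. With n = 10000 and p = 0.0001 this is roughly 63%. The helper names below are hypothetical.

```python
import math


def detection_probability(bad_fraction, sample_size):
    """Probability that a uniform sample contains at least one bad entry."""
    return 1.0 - (1.0 - bad_fraction) ** sample_size


def required_sample_size(bad_fraction, confidence):
    """Smallest sample size that reaches `confidence` for a given bad fraction."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - bad_fraction))


p_detect = detection_probability(bad_fraction=1e-4, sample_size=10_000)
n_needed = required_sample_size(bad_fraction=1e-4, confidence=0.99)
print(f"P(detect) = {p_detect:.3f}, n for 99% confidence = {n_needed}")
```

Rare corruption therefore needs a larger sample, or repeated verification runs, to be caught reliably.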
Contributing
Issues and Pull Requests are welcome!
License
MIT License