A super fast filesystem-based key-value store.
Project description
FastKV
A high-performance, file-system based key-value database for Python
FastKV is a pure-Python, high-performance key-value database designed for durability, speed, and simplicity. It implements a log-structured merge-tree (LSM-tree) architecture similar to RocksDB/LevelDB, optimized for SSD storage with asynchronous compaction, bloom filters, and configurable durability modes.
Features
- High Performance: Optimized for write-heavy workloads with sequential I/O patterns
- ACID Compliant: Write-ahead logging (WAL) ensures crash recovery
- Multiple Durability Modes: Choose between speed and safety
- Asynchronous Operations: Built-in async API for non-blocking I/O
- Efficient Storage: SSTable-based storage with Bloom filters and compression support
- Memory Efficient: Configurable memtable sizes and background compaction
- Thread-Safe: Designed for concurrent access
- Zero Dependencies: Pure Python implementation (optional msgpack for better performance)
- Command Line Interface: Built-in CLI for database operations
Installation
pip install fastkv
Quick Start
Synchronous API
from fastkv import FastKV
# Open a database
db = FastKV("./my_database")
# Store data
db.put("user:1001", {"name": "Alice", "age": 30, "email": "alice@example.com"})
db.put("user:1002", {"name": "Bob", "age": 25})
db.put("config:theme", "dark")
db.put("counter:visits", 42)
# Retrieve data
user = db.get("user:1001")
print(f"User: {user}") # {"name": "Alice", "age": 30, "email": "alice@example.com"}
# Scan with prefix
users = db.scan("user:")
for key, value in users:
print(f"{key}: {value}")
# Batch operations
db.batch_put([
("order:001", {"item": "Book", "price": 29.99}),
("order:002", {"item": "Pen", "price": 1.99})
])
# Delete keys
db.delete("config:theme")
# Get statistics
stats = db.stats()
print(f"Total keys: {stats['total_keys']}")
print(f"Memtable size: {stats['memtable_size']} bytes")
# Close the database
db.close()
Asynchronous API
import asyncio
from fastkv import AsyncFastKV
async def main():
async with AsyncFastKV("./my_async_db") as db:
# All operations are asynchronous
await db.put("async_key", "async_value")
value = await db.get("async_key")
print(f"Got: {value}")
# Batch operations
await db.batch_put([("a", 1), ("b", 2), ("c", 3)])
# Scan
results = await db.scan(prefix="", limit=10)
for key, val in results:
print(f"{key}: {val}")
asyncio.run(main())
Command Line Interface (CLI)
FastKV provides a comprehensive command-line interface accessible via python -m fastkv. This allows you to run tests, benchmarks, and use an interactive shell without writing code.
Basic Usage
# Run tests
python -m fastkv test
# Run performance benchmarks
python -m fastkv benchmark
# Start interactive shell
python -m fastkv shell --path ./my_database
# Run benchmark with custom database path
python -m fastkv benchmark --path ./benchmark_data
# Start shell with specific database location
python -m fastkv shell --path /var/lib/fastkv/app_data
Interactive Shell
The interactive shell provides a REPL (Read-Eval-Print Loop) interface for database operations:
python -m fastkv shell --path ./my_database
Shell Commands:
| Command | Syntax | Description | Example |
|---|---|---|---|
| put | put <key> <value> |
Store a key-value pair | put user:1001 '{"name": "Alice", "age": 30}' |
| get | get <key> |
Retrieve value by key | get user:1001 |
| delete | delete <key> |
Remove a key | delete user:1001 |
| scan | scan [prefix] [limit] |
Scan keys with prefix | scan user: 10 |
| stats | stats |
Show database statistics | stats |
| bulk | bulk <filename.json> |
Bulk load from JSON file | bulk data.json |
| exit | exit or quit |
Exit the shell | exit |
Interactive Shell Examples:
# Start the shell
$ python -m fastkv shell --path ./testdb
Database opened at ./testdb
fastkv>
# Store data
fastkv> put config:app_name "MyApp"
OK
fastkv> put user:1001 '{"name": "Alice", "active": true}'
OK
# Retrieve data
fastkv> get user:1001
{
"name": "Alice",
"active": true
}
# Scan with prefix
fastkv> scan user:
user:1001: {"name": "Alice", "active": true}
Total: 1 items
# Scan with limit
fastkv> scan "" 5
config:app_name: "MyApp"
user:1001: {"name": "Alice", "active": true}
Total: 2 items
# View statistics
fastkv> stats
{
"memtable_size": 2048,
"memtable_keys": 2,
"immutable_memtables": 0,
"immutable_keys": 0,
"total_sstables": 0,
"total_keys": 2,
"sstable_stats": {},
"seq_num": 2
}
# Bulk load from JSON file
fastkv> bulk data.json
Loaded 1000 items
# Exit shell
fastkv> exit
Goodbye!
Database closed
JSON Bulk Loading
Create a JSON file for bulk loading:
[
["key1", "value1"],
["key2", {"nested": "data"}],
["key3", [1, 2, 3]],
["user:1001", {"name": "Alice", "age": 30}],
["user:1002", {"name": "Bob", "age": 25}]
]
Then load it:
python -m fastkv shell --path ./mydb
fastkv> bulk data.json
Loaded 5 items
Benchmark Mode
The benchmark mode tests the database performance:
$ python -m fastkv benchmark
Running benchmark...
Test Mode
Run the built-in test suite:
$ python -m fastkv test
Running FastKV tests...
✓ Test 1 passed: Basic operations
✓ Test 2 passed: Crash recovery
✓ Test 3 passed: Async operations
All tests passed! ✅
Configuration
Database Options
from fastkv import FastKV, DurabilityMode
# Custom configuration
db = FastKV(
db_path="./my_data",
durability=DurabilityMode.SYNC, # SYNC, BACKGROUND, or NONE
max_memtable_size=128 * 1024 * 1024 # 128MB memtable
)
Durability Modes
DurabilityMode.NONE: Maximum performance, data may be lost on crashDurabilityMode.BACKGROUND(default): Good balance, async fsyncDurabilityMode.SYNC: Maximum durability, sync before return
Value Encoding
from fastkv import ValueEncoding
# Different serialization formats (default: JSON)
db.put("key", data, encoding=ValueEncoding.JSON)
db.put("key", data, encoding=ValueEncoding.MSGPACK) # Requires msgpack
db.put("key", data, encoding=ValueEncoding.PICKLE)
Advanced Usage
Bulk Loading via Python
For initial data import, use bulk loading for better performance:
# Generate sample data
items = [(f"item:{i}", {"id": i, "data": "x" * 100}) for i in range(100000)]
# Bulk load (bypasses WAL for speed)
db.bulk_load(items)
Manual Compaction
Compaction runs automatically in the background, but you can monitor it:
# The database automatically schedules compaction
# when certain thresholds are reached
stats = db.stats()
print(stats['sstable_stats']) # View SSTable distribution
Custom Serialization
import pickle
class CustomObject:
def __init__(self, data):
self.data = data
obj = CustomObject("test")
# Store custom objects
db.put("custom", obj, encoding=ValueEncoding.PICKLE)
# Retrieve
retrieved = db.get("custom")
print(type(retrieved)) # <class '__main__.CustomObject'>
Architecture
FastKV implements an LSM-tree storage engine with these components:
1. Write-Ahead Log (WAL)
- Ensures durability and crash recovery
- Segmented files with rotation
- Configurable sync modes
2. MemTable
- In-memory sorted key-value store
- Automatically flushed to disk when full
- Thread-safe with bisect-based ordering
3. Sorted String Tables (SSTables)
- Immutable sorted files on disk
- Block-based storage with Bloom filters
- Multi-level compaction strategy
4. Compaction
- Background merging of SSTables
- Level-based compaction policy
- Configurable parallelism
Performance Tips
- Use appropriate durability:
BACKGROUNDmode offers good balance for most use cases - Batch operations: Use
batch_put()for multiple writes - Bulk load initial data: Use
bulk_load()for initial imports - Monitor memory usage: Adjust
max_memtable_sizebased on available RAM - Use msgpack: Install msgpack for faster serialization
- Use CLI for quick operations: The shell is perfect for debugging and administration
API Reference
FastKV Class
class FastKV:
def __init__(self, db_path: Union[str, Path],
durability: DurabilityMode = DurabilityMode.BACKGROUND,
max_memtable_size: int = 64 * 1024 * 1024)
def put(self, key: str, value: Any,
encoding: ValueEncoding = ValueEncoding.JSON) -> None
def get(self, key: str) -> Optional[Any]
def delete(self, key: str) -> None
def batch_put(self, items: List[Tuple[str, Any]]) -> None
def scan(self, prefix: Optional[str] = None,
limit: Optional[int] = None) -> List[Tuple[str, Any]]
def stats(self) -> Dict[str, Any]
def bulk_load(self, items: List[Tuple[str, Any]]) -> None
def close(self) -> None
AsyncFastKV Class
class AsyncFastKV:
async def open(self) -> None
async def put(self, key: str, value: Any) -> None
async def get(self, key: str) -> Optional[Any]
async def delete(self, key: str) -> None
async def batch_put(self, items: List[Tuple[str, Any]]) -> None
async def scan(self, prefix: Optional[str] = None,
limit: Optional[int] = None) -> List[Tuple[str, Any]]
async def stats(self) -> Dict[str, Any]
async def bulk_load(self, items: List[Tuple[str, Any]]) -> None
async def close(self) -> None
Development Setup
# Clone the repository
git clone https://github.com/arifchy369/FastKV.git
cd fastkv
# Install in development mode
pip install -e .
# Run tests
python -m fastkv test
# Run benchmarks
python -m fastkv benchmark
# Start interactive shell
python -m fastkv shell
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (
git checkout -b feature/amazing-feature) - Commit your changes (
git commit -m 'Add some amazing feature') - Push to the branch (
git push origin feature/amazing-feature) - Open a Pull Request
License
MIT License - see LICENSE file for details.
Support
- Issues: GitHub Issues
- Author: Arif Chowdhury (@arifchy369)
Roadmap
- Snapshot and backup functionality
- Transaction support
- Replication and clustering
- More compression algorithms (LZ4, Zstd)
- Query language support
- TTL (time-to-live) for keys
- Windows performance optimizations
FastKV - Fast, durable key-value storage for Python applications.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file fastkv-0.1.0.tar.gz.
File metadata
- Download URL: fastkv-0.1.0.tar.gz
- Upload date:
- Size: 25.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
24e29d97d7b444079d91a5523d0bdc3efab99a339416226115e10e757c65d653
|
|
| MD5 |
f0715d9f8663638b486c0e58b76955e0
|
|
| BLAKE2b-256 |
ad79444073f29a484f64b7f366602cdff65af775d1764071060d7a197b43b131
|
File details
Details for the file fastkv-0.1.0-py3-none-any.whl.
File metadata
- Download URL: fastkv-0.1.0-py3-none-any.whl
- Upload date:
- Size: 22.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.12
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
dc816fc2abca26ebc42957f1a9f7c7486c213ed083442a0b31a5c7cb57964f07
|
|
| MD5 |
1ef87cf94673fa9982e5f4bbb2b77cff
|
|
| BLAKE2b-256 |
6fd4ad209902bd5497a885ed0227b18e5ff95f56af0989868fb1f9e25dd513bf
|