A super fast filesystem-based key-value store.

FastKV

A high-performance, file-system based key-value database for Python

FastKV is a pure-Python, high-performance key-value database designed for durability, speed, and simplicity. It implements a log-structured merge-tree (LSM-tree) architecture similar to RocksDB/LevelDB, optimized for SSD storage with asynchronous compaction, bloom filters, and configurable durability modes.

Features

  • High Performance: Optimized for write-heavy workloads with sequential I/O patterns
  • Crash Recovery: Write-ahead logging (WAL) lets the database replay writes after a crash (transaction support is on the roadmap)
  • Multiple Durability Modes: Choose between speed and safety
  • Asynchronous Operations: Built-in async API for non-blocking I/O
  • Efficient Storage: SSTable-based storage with Bloom filters and compression support
  • Memory Efficient: Configurable memtable sizes and background compaction
  • Thread-Safe: Designed for concurrent access
  • Zero Required Dependencies: Pure Python implementation (msgpack is optional, for faster serialization)
  • Command Line Interface: Built-in CLI for database operations

Installation

pip install fastkv

Quick Start

Synchronous API

from fastkv import FastKV

# Open a database
db = FastKV("./my_database")

# Store data
db.put("user:1001", {"name": "Alice", "age": 30, "email": "alice@example.com"})
db.put("user:1002", {"name": "Bob", "age": 25})
db.put("config:theme", "dark")
db.put("counter:visits", 42)

# Retrieve data
user = db.get("user:1001")
print(f"User: {user}")  # User: {'name': 'Alice', 'age': 30, 'email': 'alice@example.com'}

# Scan with prefix
users = db.scan("user:")
for key, value in users:
    print(f"{key}: {value}")

# Batch operations
db.batch_put([
    ("order:001", {"item": "Book", "price": 29.99}),
    ("order:002", {"item": "Pen", "price": 1.99})
])

# Delete keys
db.delete("config:theme")

# Get statistics
stats = db.stats()
print(f"Total keys: {stats['total_keys']}")
print(f"Memtable size: {stats['memtable_size']} bytes")

# Close the database
db.close()

Asynchronous API

import asyncio
from fastkv import AsyncFastKV

async def main():
    async with AsyncFastKV("./my_async_db") as db:
        # All operations are asynchronous
        await db.put("async_key", "async_value")
        value = await db.get("async_key")
        print(f"Got: {value}")
        
        # Batch operations
        await db.batch_put([("a", 1), ("b", 2), ("c", 3)])
        
        # Scan
        results = await db.scan(prefix="", limit=10)
        for key, val in results:
            print(f"{key}: {val}")

asyncio.run(main())

Command Line Interface (CLI)

FastKV provides a command-line interface accessible via python -m fastkv. It lets you run the test suite, run benchmarks, and open an interactive shell without writing code.

Basic Usage

# Run tests
python -m fastkv test

# Run performance benchmarks
python -m fastkv benchmark

# Start interactive shell
python -m fastkv shell --path ./my_database

# Run benchmark with custom database path
python -m fastkv benchmark --path ./benchmark_data

# Start shell with specific database location
python -m fastkv shell --path /var/lib/fastkv/app_data

Interactive Shell

The interactive shell provides a REPL (Read-Eval-Print Loop) interface for database operations:

python -m fastkv shell --path ./my_database

Shell Commands:

Command  Syntax                 Description               Example
put      put <key> <value>      Store a key-value pair    put user:1001 '{"name": "Alice", "age": 30}'
get      get <key>              Retrieve value by key     get user:1001
delete   delete <key>           Remove a key              delete user:1001
scan     scan [prefix] [limit]  Scan keys with prefix     scan user: 10
stats    stats                  Show database statistics  stats
bulk     bulk <filename.json>   Bulk load from JSON file  bulk data.json
exit     exit or quit           Exit the shell            exit

Interactive Shell Examples:

# Start the shell
$ python -m fastkv shell --path ./testdb
Database opened at ./testdb
fastkv> 

# Store data
fastkv> put config:app_name "MyApp"
OK

fastkv> put user:1001 '{"name": "Alice", "active": true}'
OK

# Retrieve data
fastkv> get user:1001
{
  "name": "Alice",
  "active": true
}

# Scan with prefix
fastkv> scan user:
user:1001: {"name": "Alice", "active": true}
Total: 1 items

# Scan with limit
fastkv> scan "" 5
config:app_name: "MyApp"
user:1001: {"name": "Alice", "active": true}
Total: 2 items

# View statistics
fastkv> stats
{
  "memtable_size": 2048,
  "memtable_keys": 2,
  "immutable_memtables": 0,
  "immutable_keys": 0,
  "total_sstables": 0,
  "total_keys": 2,
  "sstable_stats": {},
  "seq_num": 2
}

# Bulk load from JSON file
fastkv> bulk data.json
Loaded 1000 items

# Exit shell
fastkv> exit
Goodbye!
Database closed

JSON Bulk Loading

Create a JSON file for bulk loading:

[
  ["key1", "value1"],
  ["key2", {"nested": "data"}],
  ["key3", [1, 2, 3]],
  ["user:1001", {"name": "Alice", "age": 30}],
  ["user:1002", {"name": "Bob", "age": 25}]
]

Then load it:

python -m fastkv shell --path ./mydb
fastkv> bulk data.json
Loaded 5 items
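
The bulk file shown above is a JSON array of two-element [key, value] arrays. As a minimal sketch, such a file can be produced from Python with the standard library alone. The helper name write_bulk_file is hypothetical and not part of FastKV; it only mirrors the layout shown above.

```python
import json
import os
import tempfile
from typing import Any, Iterable, Tuple


def write_bulk_file(path: str, items: Iterable[Tuple[str, Any]]) -> int:
    """Write (key, value) pairs as a JSON array of [key, value] arrays.

    Hypothetical helper, not part of FastKV. Returns the number of pairs.
    """
    rows = [[key, value] for key, value in items]
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(rows, handle, indent=2)
    return len(rows)


# Demo: write three sample users into a temporary directory.
users = [(f"user:{1000 + i}", {"name": f"User {i}", "age": 20 + i}) for i in range(1, 4)]
with tempfile.TemporaryDirectory() as tmp:
    count = write_bulk_file(os.path.join(tmp, "data.json"), users)
    print(f"Wrote {count} items")  # Wrote 3 items
```

Pointing the path at a file such as data.json in the working directory should give the shell's bulk command something to load, assuming the loader expects exactly this layout.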

Benchmark Mode

Benchmark mode measures database performance:

$ python -m fastkv benchmark
Running benchmark...

Test Mode

Run the built-in test suite:

$ python -m fastkv test
Running FastKV tests...
✓ Test 1 passed: Basic operations
✓ Test 2 passed: Crash recovery
✓ Test 3 passed: Async operations

All tests passed! 

Configuration

Database Options

from fastkv import FastKV, DurabilityMode

# Custom configuration
db = FastKV(
    db_path="./my_data",
    durability=DurabilityMode.SYNC,  # SYNC, BACKGROUND, or NONE
    max_memtable_size=128 * 1024 * 1024  # 128MB memtable
)

Durability Modes

  • DurabilityMode.NONE: Maximum performance; recent writes may be lost on a crash
  • DurabilityMode.BACKGROUND (default): Balances speed and safety by calling fsync asynchronously
  • DurabilityMode.SYNC: Maximum durability; each write is synced to disk before the call returns
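
To make the trade-off concrete, here is a rough sketch of how three such modes could map onto flush and fsync calls for an append-only log. This is an illustration under assumptions, not FastKV's implementation: the Durability enum and the LogWriter class are hypothetical names, and the background fsync is reduced to a flag that a periodic task would act on.

```python
import enum
import os
import tempfile


class Durability(enum.Enum):
    """Hypothetical stand-in for the three modes; not FastKV's own enum."""
    NONE = "none"
    BACKGROUND = "background"
    SYNC = "sync"


class LogWriter:
    """Append-only log whose flush policy depends on the durability mode."""

    def __init__(self, path: str, mode: Durability) -> None:
        self._file = open(path, "ab")
        self._mode = mode
        self.pending_sync = False  # True while written data awaits an fsync

    def append(self, record: bytes) -> None:
        self._file.write(record + b"\n")
        if self._mode is Durability.NONE:
            return  # bytes may still sit in Python's userspace buffer
        self._file.flush()  # hand the bytes to the operating system
        if self._mode is Durability.SYNC:
            os.fsync(self._file.fileno())  # block until the OS reports them on disk
        else:
            self.pending_sync = True  # a background task would fsync later

    def sync(self) -> None:
        """What a periodic background task would call."""
        self._file.flush()
        os.fsync(self._file.fileno())
        self.pending_sync = False

    def close(self) -> None:
        self._file.close()


with tempfile.TemporaryDirectory() as tmp:
    writer = LogWriter(os.path.join(tmp, "wal.log"), Durability.BACKGROUND)
    writer.append(b"put user:1001")
    print(writer.pending_sync)  # True until sync() runs
    writer.sync()
    writer.close()
```

In this sketch the cost of SYNC is one fsync per write, which is why a background mode is usually much faster while leaving a short window in which acknowledged writes can be lost.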

Value Encoding

from fastkv import FastKV, ValueEncoding

db = FastKV("./my_database")
data = {"name": "Alice", "age": 30}  # sample value for illustration

# Different serialization formats (default: JSON)
db.put("key", data, encoding=ValueEncoding.JSON)
db.put("key", data, encoding=ValueEncoding.MSGPACK)  # Requires msgpack
db.put("key", data, encoding=ValueEncoding.PICKLE)

Advanced Usage

Bulk Loading via Python

For initial data import, use bulk loading for better performance:

from fastkv import FastKV

db = FastKV("./my_database")

# Generate sample data
items = [(f"item:{i}", {"id": i, "data": "x" * 100}) for i in range(100000)]

# Bulk load (bypasses WAL for speed)
db.bulk_load(items)

Monitoring Compaction

Compaction runs automatically in the background. You can monitor its effect through stats():

# The database automatically schedules compaction
# when certain thresholds are reached
stats = db.stats()
print(stats['sstable_stats'])  # View SSTable distribution

Custom Serialization

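from fastkv import FastKV, ValueEncoding

class CustomObject:
    def __init__(self, data):
        self.data = data

db = FastKV("./my_database")
obj = CustomObject("test")

# Store custom objects (JSON cannot encode arbitrary classes, so use pickle)
db.put("custom", obj, encoding=ValueEncoding.PICKLE)

# Retrieve; only unpickle data from sources you trust
retrieved = db.get("custom")
print(type(retrieved))  # <class '__main__.CustomObject'>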

Architecture

FastKV implements an LSM-tree storage engine with these components:

1. Write-Ahead Log (WAL)

  • Ensures durability and crash recovery
  • Segmented files with rotation
  • Configurable sync modes
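
As a rough illustration of the write-ahead idea, the sketch below appends one JSON record per line and rebuilds state by replaying the file in order. The record layout, the function names wal_append and wal_replay, and the file name are assumptions made for the example; FastKV's actual on-disk format is not described here, and segment rotation is left out.

```python
import json
import os
import tempfile
from typing import Any, Dict


def wal_append(path: str, op: str, key: str, value: Any = None) -> None:
    """Append one JSON record per line and force it to disk."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(json.dumps({"op": op, "key": key, "value": value}) + "\n")
        log.flush()
        os.fsync(log.fileno())


def wal_replay(path: str) -> Dict[str, Any]:
    """Rebuild in-memory state by re-applying logged records in order."""
    state: Dict[str, Any] = {}
    if not os.path.exists(path):
        return state
    with open(path, "r", encoding="utf-8") as log:
        for line in log:
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                break  # torn final record from a crash: stop replaying
            if record["op"] == "put":
                state[record["key"]] = record["value"]
            elif record["op"] == "delete":
                state.pop(record["key"], None)
    return state


with tempfile.TemporaryDirectory() as tmp:
    wal = os.path.join(tmp, "wal-000001.log")
    wal_append(wal, "put", "config:theme", "dark")
    wal_append(wal, "put", "counter:visits", 42)
    wal_append(wal, "delete", "config:theme")
    print(wal_replay(wal))  # {'counter:visits': 42}
```

Because every change is logged before it is applied in memory, replaying the log after a crash restores whatever the memtable held, up to the last complete record.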

2. MemTable

  • In-memory sorted key-value store
  • Automatically flushed to disk when full
  • Thread-safe with bisect-based ordering
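
A minimal sketch of a bisect-ordered, lock-protected memtable follows. The class name SortedMemTable and its methods are hypothetical and only illustrate the technique; size tracking and flushing to disk are omitted.

```python
import bisect
import threading
from typing import Any, List, Optional, Tuple


class SortedMemTable:
    """Keeps keys in a sorted list so lookups and prefix scans can bisect."""

    def __init__(self) -> None:
        self._keys: List[str] = []
        self._values: List[Any] = []
        self._lock = threading.Lock()

    def put(self, key: str, value: Any) -> None:
        with self._lock:
            i = bisect.bisect_left(self._keys, key)
            if i < len(self._keys) and self._keys[i] == key:
                self._values[i] = value  # overwrite an existing key
            else:
                self._keys.insert(i, key)  # insert at the sorted position
                self._values.insert(i, value)

    def get(self, key: str) -> Optional[Any]:
        with self._lock:
            i = bisect.bisect_left(self._keys, key)
            if i < len(self._keys) and self._keys[i] == key:
                return self._values[i]
            return None

    def scan(self, prefix: str = "", limit: Optional[int] = None) -> List[Tuple[str, Any]]:
        with self._lock:
            # Keys sharing a prefix are contiguous, starting at the bisect point.
            i = bisect.bisect_left(self._keys, prefix)
            out: List[Tuple[str, Any]] = []
            while i < len(self._keys) and self._keys[i].startswith(prefix):
                if limit is not None and len(out) >= limit:
                    break
                out.append((self._keys[i], self._values[i]))
                i += 1
            return out


table = SortedMemTable()
for key, value in [("user:1002", "Bob"), ("config:theme", "dark"), ("user:1001", "Alice")]:
    table.put(key, value)
print(table.scan("user:"))  # [('user:1001', 'Alice'), ('user:1002', 'Bob')]
```

Keeping the keys sorted at insert time means a flush can write them to disk sequentially without a separate sort step.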

3. Sorted String Tables (SSTables)

  • Immutable sorted files on disk
  • Block-based storage with Bloom filters
  • Multi-level compaction strategy
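
A Bloom filter lets a read skip an SSTable that certainly does not contain a key. The sketch below is a generic Bloom filter, not FastKV's own code; the class name, the default sizes, and the use of SHA-256 with double hashing are assumptions chosen for a short, dependency-free example.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Probabilistic set: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self._bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str) -> Iterator[int]:
        # Derive several bit positions from two halves of one digest
        # (double hashing), rather than computing independent hashes.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self._bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


bloom = BloomFilter()
for key in ("user:1001", "user:1002"):
    bloom.add(key)
print(bloom.might_contain("user:1001"))  # True
```

A reader would consult the filter first and only open the table's data blocks when might_contain returns True.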

4. Compaction

  • Background merging of SSTables
  • Level-based compaction policy
  • Configurable parallelism
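
The core of merging SSTables can be sketched as a k-way merge of sorted runs in which the newest version of each key wins. The entry layout (key, sequence number, value), the TOMBSTONE marker, and the function name compact are assumptions for this example, not FastKV's internals. Dropping tombstones, as done here, is only safe when no older table outside the merge can still hold the key.

```python
import heapq
from typing import Any, List, Optional, Tuple

TOMBSTONE = object()  # hypothetical marker recorded when a key is deleted

Entry = Tuple[str, int, Any]  # (key, sequence number, value)


def compact(runs: List[List[Entry]]) -> List[Entry]:
    """Merge key-sorted runs, keeping only the newest version of each key."""
    # Order by key, then by descending sequence number, so the newest
    # version of a key is seen first and older ones can be skipped.
    merged = heapq.merge(*runs, key=lambda entry: (entry[0], -entry[1]))
    out: List[Entry] = []
    last_key: Optional[str] = None
    for key, seq, value in merged:
        if key == last_key:
            continue  # an older version of a key already handled
        last_key = key
        if value is not TOMBSTONE:
            out.append((key, seq, value))
    return out


older = [("a", 1, "apple"), ("b", 2, "banana"), ("c", 3, "cherry")]
newer = [("b", 4, TOMBSTONE), ("c", 5, "cranberry"), ("d", 6, "date")]
print(compact([older, newer]))
# [('a', 1, 'apple'), ('c', 5, 'cranberry'), ('d', 6, 'date')]
```

Because heapq.merge streams its inputs, the same approach works when the runs are iterators over files rather than in-memory lists.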

Performance Tips

  1. Use appropriate durability: BACKGROUND mode offers good balance for most use cases
  2. Batch operations: Use batch_put() for multiple writes
  3. Bulk load initial data: Use bulk_load() for initial imports
  4. Monitor memory usage: Adjust max_memtable_size based on available RAM
  5. Use msgpack: Install msgpack for faster serialization
  6. Use CLI for quick operations: The shell is perfect for debugging and administration

API Reference

FastKV Class

class FastKV:
    def __init__(self, db_path: Union[str, Path], 
                 durability: DurabilityMode = DurabilityMode.BACKGROUND,
                 max_memtable_size: int = 64 * 1024 * 1024)
    
    def put(self, key: str, value: Any, 
            encoding: ValueEncoding = ValueEncoding.JSON) -> None
    
    def get(self, key: str) -> Optional[Any]
    
    def delete(self, key: str) -> None
    
    def batch_put(self, items: List[Tuple[str, Any]]) -> None
    
    def scan(self, prefix: Optional[str] = None, 
             limit: Optional[int] = None) -> List[Tuple[str, Any]]
    
    def stats(self) -> Dict[str, Any]
    
    def bulk_load(self, items: List[Tuple[str, Any]]) -> None
    
    def close(self) -> None

AsyncFastKV Class

class AsyncFastKV:
    async def open(self) -> None
    async def put(self, key: str, value: Any) -> None
    async def get(self, key: str) -> Optional[Any]
    async def delete(self, key: str) -> None
    async def batch_put(self, items: List[Tuple[str, Any]]) -> None
    async def scan(self, prefix: Optional[str] = None, 
                   limit: Optional[int] = None) -> List[Tuple[str, Any]]
    async def stats(self) -> Dict[str, Any]
    async def bulk_load(self, items: List[Tuple[str, Any]]) -> None
    async def close(self) -> None
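
One common way to offer an async API over blocking file I/O is to run each blocking call in a worker thread. Whether AsyncFastKV works this way is not stated here, so the sketch below is only an illustration of the pattern. BlockingStore and AsyncStore are hypothetical classes, a plain dict stands in for disk I/O, and asyncio.to_thread requires Python 3.9 or later.

```python
import asyncio
from typing import Any, Dict, Optional


class BlockingStore:
    """Stand-in for a synchronous store; a dict replaces real disk I/O."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}
        self.closed = False

    def put(self, key: str, value: Any) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[Any]:
        return self._data.get(key)

    def close(self) -> None:
        self.closed = True


class AsyncStore:
    """Async facade that delegates each call to a worker thread."""

    def __init__(self, store: BlockingStore) -> None:
        self._store = store

    async def __aenter__(self) -> "AsyncStore":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        await asyncio.to_thread(self._store.close)

    async def put(self, key: str, value: Any) -> None:
        await asyncio.to_thread(self._store.put, key, value)

    async def get(self, key: str) -> Optional[Any]:
        return await asyncio.to_thread(self._store.get, key)


async def demo() -> Optional[Any]:
    async with AsyncStore(BlockingStore()) as db:
        await db.put("async_key", "async_value")
        return await db.get("async_key")


print(asyncio.run(demo()))  # async_value
```

The event loop stays free while a worker thread waits on the blocking call, which is the property the "non-blocking I/O" feature refers to.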

Development Setup

# Clone the repository
git clone https://github.com/arifchy369/FastKV.git
cd FastKV

# Install in development mode
pip install -e .

# Run tests
python -m fastkv test

# Run benchmarks
python -m fastkv benchmark

# Start interactive shell
python -m fastkv shell

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

MIT License - see LICENSE file for details.

Roadmap

  • Snapshot and backup functionality
  • Transaction support
  • Replication and clustering
  • More compression algorithms (LZ4, Zstd)
  • Query language support
  • TTL (time-to-live) for keys
  • Windows performance optimizations

FastKV - Fast, durable key-value storage for Python applications.

