
RockStore


A lightweight Python wrapper for RocksDB using CFFI.

Overview

RockStore provides a simple, Pythonic interface to RocksDB, Facebook's persistent key-value store. It uses CFFI for efficient native library bindings and focuses on clean binary data operations.

Features

  • Simple API: Easy-to-use Python interface for RocksDB operations
  • Binary Operations: Works directly with bytes for maximum performance
  • Context Manager: Automatic resource management with with statements
  • Configurable Options: Customize compression, buffer sizes, and more
  • Read-Only Mode: Open databases in read-only mode for safe concurrent access
  • Cross-Platform: Works on macOS, Linux, and Windows

Installation

Prerequisites

First, install RocksDB on your system:

macOS (using Homebrew):

brew install rocksdb

Ubuntu/Debian:

sudo apt-get install librocksdb-dev

CentOS/RHEL/Fedora:

sudo yum install rocksdb-devel
# or for newer versions:
sudo dnf install rocksdb-devel

Windows:

  • Download pre-built RocksDB binaries or build from source
  • Ensure rocksdb.dll is in your PATH

Install RockStore

pip install rockstore

Quick Start

Basic Usage

from rockstore import RockStore

# Open a database
db = RockStore('/path/to/database')

# Store and retrieve binary data
db.put(b'key1', b'value1')
value = db.get(b'key1')
print(value)  # b'value1'

# Store and retrieve string data (encode/decode manually)
db.put('name'.encode(), 'Alice'.encode())
name = db.get('name'.encode()).decode()
print(name)  # 'Alice'

# Delete data
db.delete(b'key1')

# Clean up
db.close()

Using Context Manager (Recommended)

from rockstore import open_database

with open_database('/path/to/database') as db:
    db.put(b'hello', b'world')
    value = db.get(b'hello')
    print(value)  # b'world'
# Database is automatically closed

Getting All Data

with open_database('/path/to/database') as db:
    db.put(b'key1', b'value1')
    db.put(b'key2', b'value2')
    
    # Get all key-value pairs (warning: loads everything into memory)
    all_data = db.get_all()
    for key, value in all_data.items():
        print(f"{key} -> {value}")

Batch Operations

For maximum performance when writing multiple records, use write_batch. It is significantly faster (often 4x or more) than individual put operations and is atomic: either all operations are applied or none are.

with open_database('/path/to/database') as db:
    # Prepare batch data (list of tuples)
    batch_data = [
        (b'key1', b'value1'),
        (b'key2', b'value2'),
        (b'key3', b'value3')
    ]
    
    # Atomic write
    db.write_batch(batch_data)
    
    # Atomic delete
    keys_to_delete = [b'key1', b'key2']
    db.delete_batch(keys_to_delete)
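For very large imports, write_batch calls are often split into fixed-size chunks so that no single batch holds the whole dataset in memory. A hypothetical helper (pure Python; `chunked` is not part of the RockStore API):

```python
def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

records = [(f"k{i}".encode(), f"v{i}".encode()) for i in range(10)]

# Each chunk would be passed to db.write_batch(chunk);
# here we only demonstrate the grouping itself.
chunks = list(chunked(records, 4))
print(len(chunks))  # 3 chunks: sizes 4, 4, 2
```

Note that each chunk is atomic on its own, but the import as a whole is not.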

Range Queries and Pagination

For large databases, use range queries with pagination instead of get_all():

with open_database('/path/to/database') as db:
    # Add sample data
    for i in range(10000):
        key = f"user:{i:06d}".encode()
        value = f"User {i}".encode()
        db.put(key, value)
    
    # Paginated access - get 1000 records at a time
    batch_size = 1000
    start_key = None
    
    while True:
        # Get next batch
        batch = db.get_range(start_key=start_key, limit=batch_size)
        if not batch:
            break
            
        print(f"Processing {len(batch)} records...")
        
        # Process the batch
        for key, value in batch.items():
            process_record(key, value)
        
        # Setup for next batch
        last_key = max(batch.keys())
        start_key = last_key + b'\x00'  # Next key after last_key
    
    # Query specific ranges
    user_data = db.get_range(
        start_key=b'user:', 
        end_key=b'user:\xFF', 
        limit=500
    )
    
    # Memory-efficient iteration (one record at a time)
    for key, value in db.iterate_range(start_key=b'user:', end_key=b'user:\xFF'):
        process_user(key, value)
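The sample keys above zero-pad the numeric ID (`user:{i:06d}`) for a reason: RocksDB orders keys lexicographically as raw bytes, so unpadded numbers sort out of numeric order. A standalone illustration:

```python
# Without padding, byte-wise order disagrees with numeric order:
unpadded = sorted(f"user:{i}".encode() for i in (2, 10))
print(unpadded)  # [b'user:10', b'user:2'] -- 10 sorts before 2

# Zero-padding to a fixed width restores numeric order:
padded = sorted(f"user:{i:06d}".encode() for i in (2, 10))
print(padded)    # [b'user:000002', b'user:000010']
```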

Handling 10M+ Record Databases

For very large databases (10M+ records), the pattern below paginates efficiently in batches of 100,000 records:

def process_large_database_in_batches(db_path, batch_size=100_000):
    """
    Process a large database (10M+ records) in manageable batches.
    This approach uses constant memory regardless of database size.
    """
    with open_database(db_path) as db:
        start_key = None
        total_processed = 0
        batch_count = 0
        
        while True:
            # Get next batch
            batch = db.get_range(start_key=start_key, limit=batch_size)
            if not batch:
                break
            
            batch_count += 1
            total_processed += len(batch)
            
            print(f"Processing batch {batch_count}: {len(batch)} records")
            print(f"Total processed so far: {total_processed}")
            
            # Process each record in the batch
            for key, value in batch.items():
                # Your processing logic here
                process_record(key, value)
            
            # Prepare for next batch
            last_key = max(batch.keys())
            start_key = last_key + b'\x00'
            
            # Optional: Add progress tracking or break conditions
            if total_processed >= 10_000_000:  # Safety limit
                break
        
        print(f"Completed! Processed {total_processed} records in {batch_count} batches")

# Even more memory-efficient approach using iterator
def stream_process_large_database(db_path):
    """
    Stream-process records one at a time for minimal memory use.
    """
    with open_database(db_path) as db:
        processed = 0
        for key, value in db.iterate_range():
            process_record(key, value)
            processed += 1
            
            if processed % 100_000 == 0:
                print(f"Processed {processed} records...")
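Both pagination loops advance the cursor with last_key + b'\x00'. Appending a zero byte produces the smallest byte string that sorts strictly after last_key, so no key between pages is skipped or read twice. A quick check of that property:

```python
last_key = b'user:000999'
next_start = last_key + b'\x00'

# next_start sorts immediately after last_key ...
print(last_key < next_start)  # True

# ... and every key greater than last_key is >= next_start,
# so the next page starts exactly where this one ended.
later_keys = [b'user:000999\x00a', b'user:001000', b'zzz']
print(all(k >= next_start for k in later_keys))  # True
```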

Working with Strings

# Helper functions for string encoding/decoding
def encode_string(s):
    return s.encode('utf-8')

def decode_bytes(b):
    return b.decode('utf-8')

with open_database('/path/to/database') as db:
    # Store string data
    db.put(encode_string('user:123'), encode_string('John Doe'))
    
    # Retrieve and decode
    user_data = db.get(encode_string('user:123'))
    if user_data:
        print(decode_bytes(user_data))  # 'John Doe'
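Since values are raw bytes, structured data can be stored by serializing it first, for example with the standard json module (a sketch; RockStore itself performs no serialization):

```python
import json

def encode_value(obj):
    """Serialize a JSON-compatible object to UTF-8 bytes."""
    return json.dumps(obj).encode('utf-8')

def decode_value(data):
    """Restore an object from UTF-8 JSON bytes."""
    return json.loads(data.decode('utf-8'))

profile = {'name': 'John Doe', 'age': 30}
raw = encode_value(profile)            # bytes, ready for db.put(key, raw)
print(decode_value(raw) == profile)    # True
```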

Configuration Options

from rockstore import RockStore

# Create database with custom options
options = {
    'create_if_missing': True,
    'compression_type': 'lz4_compression',
    'write_buffer_size': 64 * 1024 * 1024,  # 64MB
    'max_open_files': 1000
}

db = RockStore('/path/to/database', options=options)

Available Options

  • create_if_missing (bool): Create database if it doesn't exist (default: True)
  • read_only (bool): Open database in read-only mode (default: False)
  • compression_type (str): Compression algorithm - 'no_compression', 'snappy_compression', 'zlib_compression', 'bz2_compression', 'lz4_compression', 'lz4hc_compression', 'xpress_compression', 'zstd_compression' (default: 'snappy_compression')
  • write_buffer_size (int): Write buffer size in bytes (default: 64MB)
  • max_open_files (int): Maximum number of open files (default: 1000)

Per-Operation Options

# Synchronous write (forces immediate disk write)
db.put(b'key', b'value', sync=True)

# Read without caching
value = db.get(b'key', fill_cache=False)

# Synchronous delete
db.delete(b'key', sync=True)

API Reference

RockStore Class

Constructor

RockStore(path, options=None)

Methods

Binary Operations:

  • put(key: bytes, value: bytes, sync: bool = False) - Store binary data
  • get(key: bytes, fill_cache: bool = True) -> bytes | None - Retrieve binary data
  • delete(key: bytes, sync: bool = False) - Delete binary data

Batch Operations:

  • write_batch(operations: list[tuple[bytes, bytes]], sync: bool = False) - Atomically write multiple key-value pairs
  • delete_batch(keys: list[bytes], sync: bool = False) - Atomically delete multiple keys

Bulk Read Operations:

  • get_all(fill_cache: bool = True) -> dict[bytes, bytes] - Get all key-value pairs (loads into memory)
  • get_range(start_key: bytes = None, end_key: bytes = None, limit: int = None, fill_cache: bool = True) -> dict[bytes, bytes] - Get range of key-value pairs with pagination support
  • iterate_range(start_key: bytes = None, end_key: bytes = None, fill_cache: bool = True) -> Iterator[tuple[bytes, bytes]] - Memory-efficient iterator over key-value pairs

Resource Management:

  • close() - Close the database
  • Context manager support (with statement)

Context Manager

open_database(path, options=None) -> RockStore
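open_database behaves like any Python context manager: it yields an open store and guarantees close() runs on exit, even if the body raises. The pattern can be sketched with contextlib (a hypothetical stand-in, not the library's actual implementation):

```python
from contextlib import contextmanager

class _FakeStore:
    """Stand-in for RockStore, used only to illustrate the pattern."""
    def __init__(self, path):
        self.path = path
        self.closed = False
    def close(self):
        self.closed = True

@contextmanager
def open_database_sketch(path, options=None):
    db = _FakeStore(path)   # real code would construct RockStore(path, options)
    try:
        yield db
    finally:
        db.close()          # runs even if the with-body raises

with open_database_sketch('/tmp/demo') as db:
    store = db
print(store.closed)  # True
```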

Requirements

  • Python 3.8+
  • CFFI >= 1.15.0
  • RocksDB library installed on system

Development

Running Tests

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run with coverage
pytest --cov=rockstore

License

MIT License - see LICENSE file for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.


Developed by Chainscore Labs
