FerrumDB — Zero-setup embedded document database for Python. No server. No config. Just open a file and go.

⚡ FerrumDB

A high-performance, embedded document database written from scratch in Rust.
No server. No config files. No migrations. Open a file and go.

What is FerrumDB?

FerrumDB is an embedded key-value database engine built in Rust, designed for applications that need fast local persistence without the overhead of a server process. It is inspired by Bitcask and implements a custom binary log format, in-memory indexing, AES-256-GCM encryption at rest, atomic transactions, and a live web dashboard — all in ~1,000 lines of safe, async Rust.

It also ships Python bindings via PyO3, making it accessible from Python with a single pip install.

🌟 Features

Feature	Detail
⚡ O(1) reads & writes	Append-only log + in-memory `HashMap` index rebuilt on startup
📄 Native JSON documents	Store any structured data; values are `serde_json::Value`
🔍 Secondary indexing	O(1) field lookups via `create_index()` — maintained live on writes
🔐 AES-256-GCM encryption	Per-block encryption with random nonces; data is protected at rest
⚛️ Atomic transactions	All-or-nothing batches written as a single log entry
⏱️ Configurable fsync policy	`Always` / `Periodic(ms)` / `Never` — tune durability vs. throughput
🖥️ Ferrum Studio	Built-in web dashboard (Axum) at `localhost:7474`
🐍 Python bindings	`pip install ferrumdb`; no Rust toolchain required when a prebuilt wheel matches your platform
🛡️ Crash resilience	Log compaction via atomic `rename()`; incomplete records are skipped
📊 Observability	Lock-free atomic metrics: ops/sec, uptime, GET/SET/DELETE counts
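To make the fsync policy row concrete, here is a minimal Python sketch of the three modes. It is not FerrumDB's implementation: the `LogWriter` class, the policy strings, and the `interval_ms` parameter are hypothetical names chosen for illustration, and the periodic mode here syncs on the next write after the interval elapses rather than from a background timer.

```python
import os
import time


class LogWriter:
    """Appends bytes to a file and fsyncs according to a policy.

    policy is "always", "never", or "periodic". These strings and the
    interval_ms parameter are illustrative, not FerrumDB's API.
    """

    def __init__(self, f, policy, interval_ms=0, clock=time.monotonic):
        self.f = f
        self.policy = policy
        self.interval = interval_ms / 1000.0
        self.clock = clock
        self.last_sync = clock()
        self.syncs = 0  # counts fsync calls, for demonstration only

    def append(self, data: bytes) -> None:
        self.f.write(data)
        self.f.flush()  # hand the bytes to the OS page cache
        if self.policy == "always":
            self._sync()
        elif (self.policy == "periodic"
              and self.clock() - self.last_sync >= self.interval):
            self._sync()
        # "never": rely on the OS to write the data back eventually

    def _sync(self) -> None:
        os.fsync(self.f.fileno())  # force the data to stable storage
        self.last_sync = self.clock()
        self.syncs += 1
```

The trade-off is visible in `append`: "always" pays one `fsync` per write, "periodic" bounds how much recent data a power loss can take, and "never" leaves that entirely to the operating system.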

🏗️ Architecture

FerrumDB was built from the ground up without an existing storage library. Every layer is custom:

┌─────────────────────────────────────────┐
│                FerrumDB API              │  ← High-level Rust & Python interface
├─────────────────────────────────────────┤
│             StorageEngine               │  ← Core engine: index + log management
│  ┌─────────────────┐  ┌──────────────┐  │
│  │  In-Memory Index │  │ Secondary    │  │
│  │  HashMap<K,V>   │  │ Indexes      │  │
│  │  RwLock async   │  │ HashMap<F,V> │  │
│  └────────┬────────┘  └──────────────┘  │
│           │ append / reads              │
│  ┌────────▼────────────────────────┐    │
│  │   Append-Only Log (AOF)         │    │  ← Bitcask-inspired binary format
│  │   [len: u64][JSON bytes]...     │    │     length-prefixed, sequential
│  └────────┬────────────────────────┘    │
├───────────┼─────────────────────────────┤
│  ┌────────▼────────────────────────┐    │
│  │  AsyncFileSystem trait          │    │  ← Pluggable I/O abstraction
│  │  ┌──────────┐  ┌─────────────┐  │    │
│  │  │   Disk   │  │  Encrypted  │  │    │  ← Decorator pattern
│  │  │  (tokio) │  │  (AES-GCM)  │  │    │     random nonce per block
│  │  └──────────┘  └─────────────┘  │    │
│  └─────────────────────────────────┘    │
└─────────────────────────────────────────┘
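The `[len: u64][JSON bytes]` layout in the diagram can be sketched in a few lines of Python. This is an illustration under stated assumptions, not FerrumDB's actual on-disk format: the little-endian byte order and the `{"op", "key", "value"}` record shape are hypothetical.

```python
import json
import struct

# 8-byte unsigned length prefix; little-endian is an assumption.
LEN = struct.Struct("<Q")


def append_record(f, record: dict) -> None:
    """Append one [len: u64][JSON bytes] record to an open binary file."""
    payload = json.dumps(record).encode("utf-8")
    f.write(LEN.pack(len(payload)) + payload)


def replay(path: str) -> dict:
    """Rebuild the in-memory index by scanning the log from the start."""
    index = {}
    with open(path, "rb") as f:
        while True:
            header = f.read(LEN.size)
            if len(header) < LEN.size:
                break  # clean end of file, or a torn length prefix
            (length,) = LEN.unpack(header)
            payload = f.read(length)
            if len(payload) < length:
                break  # incomplete trailing record: skip it
            record = json.loads(payload)
            if record["op"] == "set":
                index[record["key"]] = record["value"]
            else:  # "delete"
                index.pop(record["key"], None)
    return index
```

Because later records override earlier ones during the scan, replaying the same file always produces the same index, and a record cut short by a crash is simply ignored.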

Key design decisions:

Bitcask AOF: Writes are append-only (fast, sequential I/O). The in-memory index is the source of truth for reads. On startup, the engine replays the log to rebuild state — making recovery deterministic and crash-safe.
Pluggable AsyncFileSystem trait: The I/O layer is fully abstracted. DiskFileSystem and EncryptedFileSystem implement the same trait — swapped via the decorator pattern. This makes the storage engine 100% testable without touching disk.
AES-256-GCM per block: Each binary record is individually encrypted with a cryptographically random 12-byte nonce. The nonce is stored alongside the ciphertext. GCM authentication tags detect any file tampering.
Tokio async throughout: Reads take an RwLock read guard (many concurrent readers); writes serialize on the write lock. Metrics use AtomicU64, so counters add no lock contention on the hot path.
Log compaction: compact() rewrites only live (non-expired, non-deleted) records to a temp file, then swaps it in with rename(), which is atomic on POSIX filesystems, so a crash mid-compaction should leave the previous log intact. Compaction holds the write lock while it runs (see Known Limitations).
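The compaction step can be sketched as follows. This is a simplified Python illustration, not the engine's code: the `compact` signature is hypothetical, and it writes JSON lines instead of the binary record format to keep the example short. `os.replace` plays the role of `rename()`.

```python
import json
import os
import tempfile


def compact(path: str, live: dict) -> None:
    """Rewrite only live entries to a temp file, then swap it over the log."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the final rename stays
    # on one filesystem, which is what makes it atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".compact")
    try:
        with os.fdopen(fd, "wb") as tmp:
            for key, value in live.items():
                line = json.dumps({"key": key, "value": value})
                tmp.write(line.encode("utf-8") + b"\n")
            tmp.flush()
            os.fsync(tmp.fileno())  # make the new file durable before the swap
        os.replace(tmp_path, path)  # readers see the old log or the new one
    except BaseException:
        # On failure, remove the partial temp file; the old log is untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

The original log is never modified in place, so any failure before `os.replace` leaves it exactly as it was.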

⚙️ Technical Stack

Component	Technology
Language	Rust (2021 edition)
Async runtime	Tokio
Serialization	serde + serde_json
Encryption	aes-gcm (AES-256-GCM)
Web dashboard	Axum
Python bindings	PyO3 (via maturin)
Benchmarking	Criterion
Testing	tokio::test + tempfile

📊 Performance

Benchmarked with Criterion using FsyncPolicy::Never (maximum throughput, no explicit syncs):

Operation	Performance
Single `SET`	~1–3 µs
Single `GET` (in-memory)	< 1 µs
1,000 sequential `SET`s	~2–5 ms
100 concurrent `SET`s (Tokio tasks)	~3–8 ms
Secondary index query (100 docs)	< 1 µs

Run benchmarks yourself: cargo bench
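As a quick sanity check on the table, the batch figure can be converted into per-operation latency and throughput. The helper below is only arithmetic on the numbers quoted above; `per_op_and_throughput` is a name made up for this example.

```python
def per_op_and_throughput(total_ms: float, ops: int):
    """Convert a batch timing into (microseconds per op, ops per second)."""
    per_op_us = total_ms * 1000 / ops
    ops_per_sec = ops / (total_ms / 1000)
    return per_op_us, ops_per_sec


# The "1,000 sequential SETs" row quotes roughly 2 to 5 ms in total.
for total_ms in (2.0, 5.0):
    per_op_us, ops_per_sec = per_op_and_throughput(total_ms, 1_000)
    print(f"{total_ms} ms -> {per_op_us:.1f} µs per SET, "
          f"{ops_per_sec:,.0f} SETs/s")
```

That works out to about 2 to 5 µs per SET, or 200,000 to 500,000 SETs per second, which is in line with the single-SET row.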

🐍 Python Installation & Usage

FerrumDB is available on PyPI. Install it using pip:

pip install ferrumdb

from ferrumdb import FerrumDB

# Zero-setup: creates myapp.db if it doesn't exist
db = FerrumDB.open("myapp.db")

# Store documents; values are passed as JSON strings
db.set("user:1", '{"name": "alice", "role": "admin", "score": 99}')
db.set("user:2", '{"name": "bob",   "role": "user",  "score": 45}')

# Read back
print(db.get("user:1"))       # {"name": "alice", "role": "admin", "score": 99}
print(db.count())             # 2
print(db.keys())              # ["user:1", "user:2"]

# Secondary indexing — O(1) field lookups
db.create_index("role")
admins = db.find("role", '"admin"')   # => ["user:1"]

# Delete
db.delete("user:2")
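For intuition about what `create_index` and `find` do in the example above, here is a small pure-Python model of a secondary index that is kept up to date on every write. It is a sketch, not FerrumDB's code: the `IndexedStore` class and its internals are hypothetical, and it takes dicts directly rather than JSON strings.

```python
import json


class IndexedStore:
    """Toy document store with exact-match secondary indexes."""

    def __init__(self):
        self.docs = {}     # key -> document (dict)
        self.indexes = {}  # field -> {encoded value -> set of keys}

    @staticmethod
    def _token(value):
        # Encode the value as canonical JSON so lists and dicts can be
        # used as dictionary keys too.
        return json.dumps(value, sort_keys=True)

    def create_index(self, field):
        buckets = {}
        for key, doc in self.docs.items():
            if field in doc:
                buckets.setdefault(self._token(doc[field]), set()).add(key)
        self.indexes[field] = buckets

    def _unindex(self, key):
        old = self.docs.get(key)
        if old is None:
            return
        for field, buckets in self.indexes.items():
            if field in old:
                buckets.get(self._token(old[field]), set()).discard(key)

    def set(self, key, doc):
        self._unindex(key)  # drop stale index entries for the old document
        self.docs[key] = doc
        for field, buckets in self.indexes.items():
            if field in doc:
                buckets.setdefault(self._token(doc[field]), set()).add(key)

    def delete(self, key):
        self._unindex(key)
        self.docs.pop(key, None)

    def find(self, field, value):
        # One dictionary lookup: average-case O(1) in the number of documents.
        return sorted(self.indexes[field].get(self._token(value), ()))
```

Only top-level fields are indexed and only exact matches are found, which mirrors the limitations listed later in this README.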

🦀 Rust Installation & Usage

FerrumDB is available on crates.io. Add it to your project:

cargo add ferrumdb
cargo add tokio -F full
cargo add serde_json

Or manually add to your Cargo.toml:

[dependencies]
ferrumdb = "0.1.0"
tokio = { version = "1", features = ["full"] }
serde_json = "1"

use ferrumdb::{FerrumDB, Config, Transaction, FsyncPolicy};
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Standard open (zero-setup, uses ferrum.db)
    let db = FerrumDB::open_default().await?;

    // Store documents
    db.set("user:1".into(), json!({"name": "alice", "role": "admin"})).await?;

    // Secondary index query
    db.create_index("role").await?;
    let admins = db.find("role", &json!("admin")).await;

    // Atomic transaction
    let tx = Transaction::new()
        .set("k1".into(), json!({"tag": "blue"}))
        .set("k2".into(), json!({"tag": "red"}))
        .delete("k1".into());
    db.commit(tx).await?;

    // Encrypted database (AES-256-GCM, random nonce per block)
    let key: [u8; 32] = *b"my_super_secret_key_32_bytes_!!?";
    let db_enc = FerrumDB::open(
        Config::new()
            .with_encryption(key)
            .with_fsync_policy(FsyncPolicy::Periodic(std::time::Duration::from_millis(100)))
    ).await?;

    Ok(())
}
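The all-or-nothing behavior of `commit` follows from writing the whole batch as one log entry. The Python sketch below shows why: a batch that was only partially written fails the length check on replay, so none of its operations are applied. The batch encoding and the `encode_batch` and `apply_log` helpers are assumptions made for this illustration, not FerrumDB's format.

```python
import json
import struct

LEN = struct.Struct("<Q")  # 8-byte length prefix; byte order is an assumption


def encode_batch(ops: list) -> bytes:
    """Serialize a whole transaction as a single length-prefixed record."""
    payload = json.dumps({"batch": ops}).encode("utf-8")
    return LEN.pack(len(payload)) + payload


def apply_log(data: bytes, state: dict) -> dict:
    """Apply every complete batch in data to state, in order."""
    offset = 0
    while offset + LEN.size <= len(data):
        (length,) = LEN.unpack_from(data, offset)
        start = offset + LEN.size
        end = start + length
        if end > len(data):
            break  # torn batch: none of its operations are applied
        for op in json.loads(data[start:end])["batch"]:
            if op["op"] == "set":
                state[op["key"]] = op["value"]
            else:  # "delete"
                state.pop(op["key"], None)
        offset = end
    return state
```

Applied to the transaction in the Rust example, a complete record leaves only `k2` in the store (`k1` is set and then deleted within the same batch), while a truncated record changes nothing.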

🖥️ Ferrum Studio

When you run the REPL, Ferrum Studio auto-launches — an embedded web dashboard to browse, query, and inspect your live database, including real-time operation metrics.

cargo run --release
# 🔥 Ferrum Studio → http://localhost:7474

🖥️ CLI REPL

cargo run
cargo run -- --fsync=always   # strongest durability

Command	Description
`SET <key> <json>`	Store a document
`GET <key>`	Retrieve and pretty-print
`DELETE <key>`	Remove a key
`KEYS`	List all keys
`COUNT`	Total number of entries
`INDEX <field>`	Create secondary index on JSON field
`FIND <field> <value>`	Query by indexed field
`HELP`	Show commands + live session metrics

⚠️ Known Limitations

FerrumDB optimizes for simplicity and embedded use cases. Understand the trade-offs:

Limitation	Reason	Workaround
Entire index in RAM	O(1) reads require full `HashMap` in memory	Best for databases < 1 GB
Single-writer only	Append-only log has no cross-process lock protocol	One process per DB file
No range queries	Secondary indexes store exact value matches	Use Tantivy for range scans
No nested field indexes	Indexes only top-level JSON keys	Flatten documents before storing
Blocking compaction	Rewrites the entire log while holding the write lock	Schedule during low-traffic periods
No WAL / MVCC	Simpler append-only design	Accept occasional contention
No replication	Single-file, embedded design	Handle replication at app level

Best for: local-first apps, desktop tools, embedded caching, session/config stores, write-heavy workloads.

Not for: large datasets (> 1 GB), complex queries (JOINs, aggregations), multi-writer or distributed scenarios.

Environment Config

REM Windows cmd syntax; in POSIX shells use: export FERRUMDB_FSYNC=always
REM sync every write (safest)
set FERRUMDB_FSYNC=always
REM never sync (fastest)
set FERRUMDB_FSYNC=never
REM sync every 200 ms
set FERRUMDB_FSYNC=periodic:200

let db = FerrumDB::open_from_env().await?;
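The three accepted value shapes (`always`, `never`, `periodic:<ms>`) are easy to validate up front. The parser below is a hypothetical Python sketch of that validation, not the logic inside `open_from_env`; the tuple return type, the case-insensitive matching, and the caller-supplied fallback are all assumptions.

```python
import os


def parse_fsync_policy(raw: str):
    """Parse 'always', 'never', or 'periodic:<ms>' into (mode, interval_ms)."""
    value = raw.strip().lower()
    if value == "always":
        return ("always", None)
    if value == "never":
        return ("never", None)
    if value.startswith("periodic:"):
        ms = int(value.split(":", 1)[1])  # raises ValueError if not a number
        if ms <= 0:
            raise ValueError("periodic interval must be positive")
        return ("periodic", ms)
    raise ValueError(f"unrecognized FERRUMDB_FSYNC value: {raw!r}")


def policy_from_env(default: str):
    """Read FERRUMDB_FSYNC, using the caller's default when it is unset."""
    return parse_fsync_policy(os.environ.get("FERRUMDB_FSYNC", default))
```

Rejecting unknown values loudly is a deliberate choice in this sketch: silently falling back to a weaker durability mode is the kind of misconfiguration that tends to surface only after data has been lost.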

📝 License

MIT — see LICENSE for details.

Built with 🦀 by Muhammad Usman

Release history

Version	Release date
0.1.3	Apr 4, 2026
0.1.2	Apr 4, 2026
0.1.1 (this version)	Mar 30, 2026
0.1.0	Mar 13, 2026

Download files

Source Distribution

ferrumdb-0.1.1.tar.gz (56.9 kB), uploaded Mar 30, 2026

Built Distribution

ferrumdb-0.1.1-cp313-cp313-win_amd64.whl (517.6 kB), uploaded Mar 30, 2026, CPython 3.13, Windows x86-64

File details

Details for the file ferrumdb-0.1.1.tar.gz.

File metadata

Download URL: ferrumdb-0.1.1.tar.gz
Upload date: Mar 30, 2026
Size: 56.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: maturin/1.12.6

File hashes

Hashes for ferrumdb-0.1.1.tar.gz
Algorithm	Hash digest
SHA256	`27be8f18070f8da133bc5dbbd3cb8c1376d163b2a08cdc612dc8de218518096e`
MD5	`da5542b8d971b94e68fb7f4a315496ba`
BLAKE2b-256	`45b00991ed81f49d432f5bccf2c0bd45fda77c9b2cdfa2560ebe6348208b390b`


File details

Details for the file ferrumdb-0.1.1-cp313-cp313-win_amd64.whl.

File metadata

Download URL: ferrumdb-0.1.1-cp313-cp313-win_amd64.whl
Upload date: Mar 30, 2026
Size: 517.6 kB
Tags: CPython 3.13, Windows x86-64
Uploaded using Trusted Publishing? No
Uploaded via: maturin/1.12.6

File hashes

Hashes for ferrumdb-0.1.1-cp313-cp313-win_amd64.whl
Algorithm	Hash digest
SHA256	`3fef37103b0c5c1c220a093a53a5b5d8dea06bfa298f1028d1f8792b55e00f54`
MD5	`96aada4deb2c4cca0f03395f9c45cc06`
BLAKE2b-256	`43acde52727df053cf235a7f8bc2e44e76c021b7f596eb1daf05f137fed9c40d`

