FerrumDB — Zero-setup embedded document database for Python. No server. No config. Just open a file and go.
⚡ FerrumDB
A high-performance, embedded document database written from scratch in Rust.
No server. No config files. No migrations. Open a file and go.
What is FerrumDB?
FerrumDB is an embedded key-value database engine built in Rust, designed for applications that need fast local persistence without the overhead of a server process. It is inspired by Bitcask and implements a custom binary log format, in-memory indexing, AES-256-GCM encryption at rest, atomic transactions, and a live web dashboard — all in ~1,000 lines of safe, async Rust.
It ships Python bindings via PyO3 (pip install ferrumdb) and Node.js bindings via NAPI-RS (npm install ferrumdb).
🌟 Features
| Feature | Detail |
|---|---|
| ⚡ O(1) reads & writes | Append-only log + in-memory HashMap index rebuilt on startup |
| 📄 Native JSON documents | Store any structured data; values are serde_json::Value |
| 🔍 Secondary indexing | O(1) field lookups via create_index() — maintained live on writes |
| 🔐 AES-256-GCM encryption | Per-block encryption with random nonces; data is protected at rest |
| ⚛️ Atomic transactions | All-or-nothing batches written as a single log entry |
| ⏱️ Configurable fsync policy | Always / Periodic(ms) / Never — tune durability vs. throughput |
| 🖥️ Ferrum Studio | Built-in web dashboard (Axum) at localhost:7474 |
| 🐍 Python bindings | pip install ferrumdb — no Rust toolchain required |
| 🛡️ Crash resilience | Log compaction via atomic rename(); incomplete records are skipped |
| 📊 Observability | Lock-free atomic metrics: ops/sec, uptime, GET/SET/DELETE counts |
🏗️ Architecture
FerrumDB was built from the ground up without an existing storage library. Every layer is custom:
┌─────────────────────────────────────────┐
│              FerrumDB API               │  ← High-level Rust & Python interface
├─────────────────────────────────────────┤
│              StorageEngine              │  ← Core engine: index + log management
│  ┌─────────────────┐  ┌──────────────┐  │
│  │ In-Memory Index │  │  Secondary   │  │
│  │  HashMap<K,V>   │  │   Indexes    │  │
│  │  RwLock async   │  │ HashMap<F,V> │  │
│  └────────┬────────┘  └──────────────┘  │
│           │ append / reads              │
│  ┌────────▼────────────────────────┐    │
│  │      Append-Only Log (AOF)      │    │  ← Bitcask-inspired binary format
│  │    [len: u64][JSON bytes]...    │    │    length-prefixed, sequential
│  └────────┬────────────────────────┘    │
├───────────┼─────────────────────────────┤
│  ┌────────▼────────────────────────┐    │
│  │      AsyncFileSystem trait      │    │  ← Pluggable I/O abstraction
│  │  ┌──────────┐  ┌─────────────┐  │    │
│  │  │   Disk   │  │  Encrypted  │  │    │  ← Decorator pattern
│  │  │ (tokio)  │  │  (AES-GCM)  │  │    │    random nonce per block
│  │  └──────────┘  └─────────────┘  │    │
│  └─────────────────────────────────┘    │
└─────────────────────────────────────────┘
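The length-prefixed log layout in the diagram can be illustrated with a small Python sketch. It is not FerrumDB's actual code: the record shape (`op`, `key`, `value`), the little-endian byte order, and the helper names `append_record` and `replay` are assumptions made for illustration. It shows how an index can be rebuilt by replaying the log and how an incomplete trailing record is skipped.

```python
import io
import json
import struct

# 8-byte unsigned length prefix; the byte order is an assumption.
LEN = struct.Struct("<Q")


def append_record(log, record):
    """Append one [len: u64][JSON bytes] record and return its offset."""
    payload = json.dumps(record).encode("utf-8")
    offset = log.seek(0, io.SEEK_END)
    log.write(LEN.pack(len(payload)) + payload)
    return offset


def replay(log):
    """Rebuild the in-memory index by scanning the log from the start."""
    index = {}
    log.seek(0)
    while True:
        header = log.read(LEN.size)
        if len(header) < LEN.size:
            break  # clean end of log, or a torn length prefix
        (length,) = LEN.unpack(header)
        payload = log.read(length)
        if len(payload) < length:
            break  # incomplete trailing record: skip it
        record = json.loads(payload)
        if record["op"] == "set":
            index[record["key"]] = record["value"]
        else:  # "delete"
            index.pop(record["key"], None)
    return index


log = io.BytesIO()
append_record(log, {"op": "set", "key": "user:1", "value": {"name": "alice"}})
append_record(log, {"op": "set", "key": "user:2", "value": {"name": "bob"}})
append_record(log, {"op": "delete", "key": "user:2"})
log.write(LEN.pack(500) + b'{"op": "se')  # simulate a crash mid-write
index = replay(log)
print(index)  # {'user:1': {'name': 'alice'}}
```

Because every record carries its own length, a reader can tell a truncated tail from a valid record without any separate checkpoint file.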
Key design decisions:
- Bitcask AOF: Writes are append-only (fast, sequential I/O). The in-memory index is the source of truth for reads. On startup, the engine replays the log to rebuild state, which makes recovery deterministic and crash-safe.
- Pluggable AsyncFileSystem trait: The I/O layer is fully abstracted. DiskFileSystem and EncryptedFileSystem implement the same trait and are swapped via the decorator pattern. This lets the storage engine be tested without touching disk.
- AES-256-GCM per block: Each binary record is individually encrypted with a cryptographically random 12-byte nonce. The nonce is stored alongside the ciphertext. GCM authentication tags detect any file tampering.
- Tokio async throughout: Reads use RwLock (many concurrent readers); writes serialize via the write lock. Metrics use AtomicU64, so there is no lock contention on the hot path.
- Log compaction: A background compact() rewrites only live (non-expired, non-deleted) records to a temp file, then swaps it in atomically via rename(). The rename is atomic on POSIX, so a crash during compaction leaves the original log intact.
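The compaction step described above follows a common write-then-rename pattern, sketched here in Python. This is an illustration under stated assumptions, not FerrumDB's implementation: it uses JSON lines instead of the binary log format for brevity, and the `compact` signature (taking the live records as a dict) is hypothetical.

```python
import json
import os
import tempfile


def compact(log_path, live_records):
    """Write only live records to a temp file, then swap it over the log."""
    directory = os.path.dirname(os.path.abspath(log_path))
    # The temp file must be on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".compact")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            for key, value in live_records.items():
                tmp.write(json.dumps({"key": key, "value": value}) + "\n")
            tmp.flush()
            os.fsync(tmp.fileno())  # make the new log durable before the swap
        os.replace(tmp_path, log_path)  # atomic rename on POSIX
    except BaseException:
        # On any failure the original log is untouched; drop the temp file.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "demo.log")
    with open(path, "w", encoding="utf-8") as f:
        f.write('{"key": "a", "value": 1}\n')
        f.write('{"key": "a", "value": 2}\n')  # supersedes the first record
        f.write('{"key": "b", "value": 3}\n')
    size_before = os.path.getsize(path)
    compact(path, {"a": 2, "b": 3})
    size_after = os.path.getsize(path)
    with open(path, encoding="utf-8") as f:
        lines_after = f.read().splitlines()
```

At every instant the path refers either to the complete old log or to the complete new one, which is the property the rename-based swap relies on.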
⚙️ Technical Stack
| Component | Technology |
|---|---|
| Language | Rust (2021 edition) |
| Async runtime | Tokio |
| Serialization | serde + serde_json |
| Encryption | aes-gcm (AES-256-GCM) |
| Web dashboard | Axum |
| Python bindings | PyO3 (via maturin) |
| Benchmarking | Criterion |
| Testing | tokio::test + tempfile |
📊 Performance
Benchmarked with Criterion on an append-only log with FsyncPolicy::Never (max throughput):
| Operation | Performance |
|---|---|
| Single SET | ~1–3 µs |
| Single GET (in-memory) | < 1 µs |
| 1,000 sequential SETs | ~2–5 ms |
| 100 concurrent SETs (Tokio tasks) | ~3–8 ms |
| Secondary index query (100 docs) | < 1 µs |
Run benchmarks yourself:
cargo bench
🐍 Python Installation & Usage
FerrumDB is available on PyPI. Install it using pip:
pip install ferrumdb
from ferrumdb import FerrumDB
# Zero-setup: creates myapp.db if it doesn't exist
db = FerrumDB.open("myapp.db")
# Store any JSON-serializable value
db.set("user:1", '{"name": "alice", "role": "admin", "score": 99}')
db.set("user:2", '{"name": "bob", "role": "user", "score": 45}')
# Read back
print(db.get("user:1")) # {"name": "alice", "role": "admin", "score": 99}
print(db.count()) # 2
print(db.keys()) # ["user:1", "user:2"]
# Secondary indexing — O(1) field lookups
db.create_index("role")
admins = db.find("role", '"admin"') # => ["user:1"]
# Delete
db.delete("user:2")
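To make the idea behind create_index() and find() concrete, here is a minimal in-memory sketch of an exact-match secondary index that is kept up to date on every write. The `IndexedStore` class and its internals are hypothetical and exist only for illustration; FerrumDB's real index lives in the Rust engine and may differ.

```python
import json
from collections import defaultdict


class IndexedStore:
    """Toy in-memory store with exact-match secondary indexes."""

    def __init__(self):
        self.docs = {}     # key -> document (a dict)
        self.indexes = {}  # field -> {encoded value -> set of keys}

    @staticmethod
    def _encode(value):
        # JSON-encode so unhashable values (lists, dicts) can be dict keys.
        return json.dumps(value, sort_keys=True)

    def create_index(self, field):
        """Build an index over existing documents for a top-level field."""
        buckets = defaultdict(set)
        for key, doc in self.docs.items():
            if field in doc:
                buckets[self._encode(doc[field])].add(key)
        self.indexes[field] = buckets

    def _unindex(self, key):
        old = self.docs.get(key)
        if old is None:
            return
        for field, buckets in self.indexes.items():
            if field in old:
                buckets[self._encode(old[field])].discard(key)

    def set(self, key, doc):
        self._unindex(key)  # drop stale index entries first
        self.docs[key] = doc
        for field, buckets in self.indexes.items():
            if field in doc:
                buckets[self._encode(doc[field])].add(key)

    def delete(self, key):
        self._unindex(key)
        self.docs.pop(key, None)

    def find(self, field, value):
        """Return the sorted keys whose field equals value exactly."""
        return sorted(self.indexes[field].get(self._encode(value), ()))


store = IndexedStore()
store.set("user:1", {"name": "alice", "role": "admin"})
store.set("user:2", {"name": "bob", "role": "user"})
store.create_index("role")
admins = store.find("role", "admin")
print(admins)  # ['user:1']
```

A lookup is a single hash probe per query, which is why such an index answers exact matches quickly but cannot serve range queries.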
🦀 Rust Installation & Usage
FerrumDB is available on crates.io. Add it to your project:
cargo add ferrumdb
cargo add tokio -F full
cargo add serde_json
Or manually add to your Cargo.toml:
[dependencies]
ferrumdb = "0.1.1"
tokio = { version = "1", features = ["full"] }
serde_json = "1"
use ferrumdb::{FerrumDB, Config, Transaction, FsyncPolicy};
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Standard open (zero-setup, uses ferrum.db)
    let db = FerrumDB::open_default().await?;

    // Store documents
    db.set("user:1".into(), json!({"name": "alice", "role": "admin"})).await?;

    // Secondary index query
    db.create_index("role").await?;
    let admins = db.find("role", &json!("admin")).await;

    // Atomic transaction
    let tx = Transaction::new()
        .set("k1".into(), json!({"tag": "blue"}))
        .set("k2".into(), json!({"tag": "red"}))
        .delete("k1".into());
    db.commit(tx).await?;

    // Encrypted database (AES-256-GCM, random nonce per block)
    let key: [u8; 32] = *b"my_super_secret_key_32_bytes_!!?";
    let db_enc = FerrumDB::open(
        Config::new()
            .with_encryption(key)
            .with_fsync_policy(FsyncPolicy::Periodic(std::time::Duration::from_millis(100)))
    ).await?;

    Ok(())
}
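The all-or-nothing behavior of a transaction written as a single log entry can be sketched in Python as follows. This is an illustration only: the batch record shape and the helper names `encode_batch` and `apply_log` are assumptions, not FerrumDB's on-disk format. The point is that a batch framed as one record is either replayed in full or, if it was cut short, not at all.

```python
import json
import struct

# 8-byte unsigned length prefix; the byte order is an assumption.
LEN = struct.Struct("<Q")


def encode_batch(ops):
    """Frame a whole batch as one length-prefixed record."""
    payload = json.dumps({"batch": ops}).encode("utf-8")
    return LEN.pack(len(payload)) + payload


def apply_log(data):
    """Replay framed records; a torn record contributes nothing."""
    index, pos = {}, 0
    while pos + LEN.size <= len(data):
        (length,) = LEN.unpack_from(data, pos)
        start = pos + LEN.size
        end = start + length
        if end > len(data):
            break  # the batch never fully reached disk: drop all of it
        for op in json.loads(data[start:end])["batch"]:
            if op["op"] == "set":
                index[op["key"]] = op["value"]
            else:  # "delete"
                index.pop(op["key"], None)
        pos = end
    return index


tx = [
    {"op": "set", "key": "k1", "value": {"tag": "blue"}},
    {"op": "set", "key": "k2", "value": {"tag": "red"}},
    {"op": "delete", "key": "k1"},
]
record = encode_batch(tx)
committed = apply_log(record)
torn = apply_log(record[:-5])  # simulate a crash before the write finished
print(committed)  # {'k2': {'tag': 'red'}}
print(torn)       # {}
```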
🖥️ Ferrum Studio
Ferrum Studio is a built-in web dashboard to browse, query, and inspect your database with real-time metrics.
Option 1 — Via the REPL (auto-launches when you cargo run):
cargo run --release
# 🔥 Ferrum Studio → http://localhost:7474
Option 2 — Standalone CLI (works with any .db file, any language):
cargo install ferrumdb-cli
ferrumdb web myapp.db # opens http://localhost:7474
ferrumdb web myapp.db --port 8080 # custom port
ferrumdb info myapp.db # show key count & file size
ferrumdb compact myapp.db # remove deleted/expired entries
The CLI works regardless of whether you use the Rust, Python, or Node.js bindings — just point it at your .db file.
🖥️ CLI REPL
cargo run
cargo run -- --fsync=always # strongest durability
| Command | Description |
|---|---|
| SET <key> <json> | Store a document |
| GET <key> | Retrieve and pretty-print |
| DELETE <key> | Remove a key |
| KEYS | List all keys |
| COUNT | Total number of entries |
| INDEX <field> | Create secondary index on JSON field |
| FIND <field> <value> | Query by indexed field |
| HELP | Show commands + live session metrics |
📂 Examples
Full working examples for each language are in the examples/ directory:
| Example | Language | Description | Run |
|---|---|---|---|
| rust-example | Rust | Task Manager — CRUD, secondary indexes, transactions, TTL | cd examples/rust-example && cargo run |
| python-example | Python | Contact Book — CRUD, secondary indexes, transactions | cd examples/python-example && python main.py |
| node-example | Node.js | Note Taker — CRUD, secondary indexes, transactions | cd examples/node-example && node main.mjs |
Each example is self-contained and demonstrates the core FerrumDB API in its respective language.
⚠️ Known Limitations
FerrumDB optimizes for simplicity and embedded use cases. Understand the trade-offs:
| Limitation | Reason | Workaround |
|---|---|---|
| Entire index in RAM | O(1) reads require full HashMap in memory | Best for databases < 1 GB |
| Single-writer only | Append-only log has no cross-process lock protocol | One process per DB file |
| No range queries | Secondary indexes store exact value matches | Use Tantivy for range scans |
| No nested field indexes | Indexes only top-level JSON keys | Flatten documents before storing |
| Blocking compaction | Rewrites the entire log while holding the write lock | Schedule during low-traffic periods |
| No WAL / MVCC | Simpler append-only design | Accept occasional contention |
| No replication | Single-file, embedded design | Handle replication at app level |
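The "flatten documents before storing" workaround from the table can be sketched in Python. The `flatten` helper and the dotted key naming are assumptions for illustration; whether a given separator suits your field names is up to the application.

```python
import json


def flatten(doc, parent="", sep="."):
    """Lift nested object fields to top-level keys such as "address.city"."""
    flat = {}
    for key, value in doc.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict) and value:
            flat.update(flatten(value, name, sep))  # recurse into sub-objects
        else:
            flat[name] = value  # scalars, lists and empty dicts stay as-is
    return flat


doc = {
    "name": "alice",
    "address": {"city": "Lahore", "geo": {"lat": 31.5}},
    "tags": ["a"],
}
flat = flatten(doc)
payload = json.dumps(flat)  # JSON string suitable for passing to db.set(...)
print(flat)
# {'name': 'alice', 'address.city': 'Lahore', 'address.geo.lat': 31.5, 'tags': ['a']}
```

After flattening, a field such as address.city is a top-level key, so it can be indexed like any other top-level field.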
Best for: local-first apps, desktop tools, embedded caching, session/config stores, write-heavy workloads.
Not for: large datasets (> 1 GB), complex queries (JOINs, aggregations), multi-writer or distributed scenarios.
Environment Config
set FERRUMDB_FSYNC=always # sync every write (safest)
set FERRUMDB_FSYNC=never # never sync (fastest)
set FERRUMDB_FSYNC=periodic:200 # sync every 200ms
let db = FerrumDB::open_from_env().await?;
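The three FERRUMDB_FSYNC value shapes shown above (always, never, periodic:<ms>) can be parsed as in the following Python sketch. It is hypothetical and only illustrates the value format; how open_from_env() actually validates input, and what it does when the variable is unset, is not specified here.

```python
def parse_fsync_policy(raw):
    """Return ("always", None), ("never", None) or ("periodic", millis)."""
    value = raw.strip().lower()
    if value in ("always", "never"):
        return (value, None)
    kind, sep, millis = value.partition(":")
    if kind == "periodic" and sep and millis.isdecimal() and int(millis) > 0:
        return ("periodic", int(millis))
    raise ValueError(f"unrecognized fsync policy: {raw!r}")


parsed = {raw: parse_fsync_policy(raw) for raw in ("always", "never", "periodic:200")}
print(parsed["periodic:200"])  # ('periodic', 200)
```

Rejecting unknown values outright, rather than silently falling back to a default, avoids running with a weaker durability setting than intended.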
📋 Changelog
See CHANGELOG.md for a full list of changes per version.
📝 License
MIT — see LICENSE for details.
Built with 🦀 by Muhammad Usman