Skip to main content

An absurdly fast embedded graph database for Python. Sub-microsecond reads, pure Rust core.

Project description

OverGraph

An absurdly fast embedded graph database.
Pure Rust. Sub-microsecond reads. Native connectors for Node.js and Python.

License CI


OverGraph is a graph database that runs inside your process. No server, no network calls, no Docker containers. You open a directory, and you have a full graph database with temporal edges, weighted relationships, and sub-microsecond lookups.

I built it because I wanted a graph database that was genuinely fast. Not "fast for a database," but fast enough that you forget it's there. Node lookups in 200 nanoseconds. Neighbor traversals in 2 microseconds. Batch writes at 600K+ nodes per second. And I wanted it to work everywhere I work: Rust, Node.js, and Python, all backed by the same engine.

It's written entirely in Rust and it ships native connectors for Node.js (napi-rs) and Python (PyO3) so you can use it from whatever you're building in.

What makes it different

  • Truly embedded. No separate process. No socket. Your database is a folder on disk. Copy it, move it, back it up with cp -r.
  • Rich graph primitives. Weighted nodes and edges, temporal validity windows, exponential decay scoring, automatic retention policies. Model relationships that evolve over time, and let the graph clean up what's no longer relevant.
  • Fast where it matters. Node lookups in ~200ns. Neighbor traversal in ~2μs. Batch writes at 600K+ nodes/sec. The storage engine is a log-structured merge tree with mmap'd immutable segments, so reads never block writes.
  • Three languages, one engine. Rust core with native bindings for Node.js (napi-rs) and Python (PyO3). Not a wrapper around a REST API. Actual FFI into the same Rust engine, with <3x overhead on most operations.
  • No query language. Just functions. upsert_node, neighbors, find_nodes. If you can call a function, you can use OverGraph.

Performance

All numbers from a real benchmark suite running on the Rust core (group-commit durability mode, small profile: 10K nodes / 50K edges). Full methodology and reproducibility guide in docs/04-quality/Benchmark-Methodology.md.

Operation Latency Throughput
get_node 0.2 μs 5.5M ops/s
upsert_node 2.2 μs 807K ops/s
neighbors (1-hop) 2.1 μs 541K ops/s
batch_upsert_nodes (100) 236 μs 625K nodes/s
top_k_neighbors 17.5 μs 81K ops/s
personalized_pagerank 254 μs 4.3K ops/s

Node.js and Python connectors add roughly 1.5-3x overhead depending on the operation. Batch operations are nearly 1:1 because they amortize the FFI boundary cost. Full cross-language comparison in the launch benchmark pack.

Install

Prebuilt binaries are available for macOS (ARM + Intel), Linux (x64), and Windows (x64). No Rust toolchain required.

Rust

cargo add overgraph

Node.js

npm install overgraph

Python

pip install overgraph

Quick start

Rust

use overgraph::{DatabaseEngine, DbOptions, Direction, PropValue};
use std::collections::BTreeMap;
use std::path::Path;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut db = DatabaseEngine::open(Path::new("./my-graph"), &DbOptions::default())?;

    // Create some nodes
    let mut props = BTreeMap::new();
    props.insert("name".into(), PropValue::String("Alice".into()));
    props.insert("role".into(), PropValue::String("engineer".into()));
    let alice = db.upsert_node(1, "user:alice", props, 1.0)?;

    let mut props = BTreeMap::new();
    props.insert("status".into(), PropValue::String("active".into()));
    let project = db.upsert_node(2, "project:overgraph", props, 0.9)?;

    // Connect them
    let _edge = db.upsert_edge(alice, project, 10, Default::default(), 1.0, None, None)?;

    // Traverse
    let neighbors = db.neighbors(alice, Direction::Outgoing, Some(&[10]), 50, None, None)?;
    println!("Alice is connected to {} nodes", neighbors.len());

    db.close()?;
    Ok(())
}

Node.js

import { OverGraph } from 'overgraph';

const db = await OverGraph.openAsync('./my-graph');

// Create nodes
const [alice, project] = await db.batchUpsertNodesAsync([
  { typeId: 1, key: 'user:alice', props: { name: 'Alice', role: 'engineer' }, weight: 1.0 },
  { typeId: 2, key: 'project:overgraph', props: { status: 'active' }, weight: 0.9 },
]);

// Connect them
await db.upsertEdgeAsync(alice, project, 10, { role: 'creator' }, 1.0);

// Traverse
const neighbors = await db.neighborsAsync(alice, 'outgoing', [10], 50);
console.log(`Alice is connected to ${neighbors.length} nodes`);

await db.closeAsync();

Python

from overgraph import OverGraph

with OverGraph.open("./my-graph") as db:
    # Create nodes
    alice = db.upsert_node(1, "user:alice",
        props={"name": "Alice", "role": "engineer"}, weight=1.0)
    project = db.upsert_node(2, "project:overgraph",
        props={"status": "active"}, weight=0.9)

    # Connect them
    db.upsert_edge(alice, project, 10, props={"role": "creator"}, weight=1.0)

    # Traverse
    neighbors = db.neighbors(alice, "outgoing", type_filter=[10], limit=50)
    print(f"Alice is connected to {len(neighbors)} nodes")

Or async:

from overgraph import AsyncOverGraph

async with await AsyncOverGraph.open("./my-graph") as db:
    alice = await db.upsert_node(1, "user:alice", props={"name": "Alice"})
    neighbors = await db.neighbors(alice, "outgoing")

Features

Core graph operations

  • Upsert semantics. Nodes are keyed by (type_id, key). Upsert the same key twice and you get an update, not a duplicate. Edges can optionally enforce uniqueness on (from, to, type_id).
  • Batch operations. batch_upsert_nodes and batch_upsert_edges amortize WAL and memtable overhead. There's also a packed binary format for maximum throughput.
  • Atomic graph patch. graph_patch lets you upsert nodes, upsert edges, delete nodes, delete edges, and invalidate edges in a single atomic operation.

Temporal edges

  • Validity windows. Edges have optional valid_from and valid_to timestamps. Query at any point in time with the at_epoch parameter and only see edges that were valid at that moment.
  • Edge invalidation. Mark an edge as no longer valid without deleting it. The history is preserved.
  • Decay scoring. Pass a decay_lambda to neighbor queries and edge weights are automatically scaled by exp(-lambda * age_hours). Recent connections matter more.

Queries and traversal

  • 1-hop and 2-hop neighbors. With optional edge type filters, direction control, and limit.
  • Constrained 2-hop. Traverse specific edge types in the first hop, then filter target nodes by type in the second hop. Built for "find all X related to Y through Z" patterns.
  • Top-K neighbors. Get the K highest-scoring neighbors by weight, recency, or decay-adjusted score.
  • Personalized PageRank. Run PPR from seed nodes to find the most relevant nodes in the graph. Useful for context retrieval in RAG pipelines.
  • Subgraph extraction. Pull out a connected subgraph up to N hops deep. Good for building local context windows.
  • Property search. Find nodes by type_id + property equality. Hash-indexed for O(1) lookup.
  • Time-range queries. Find nodes created or updated within a time window. Sorted timestamp index for efficient range scans.

Pagination

Every collection-returning API supports keyset pagination with limit and after parameters. Cursors are stable across concurrent writes. No offset-based pagination, no skipping, no missed records.

Retention and pruning

  • Manual prune. Drop nodes older than X, below weight Y, or matching type Z. Incident edges cascade automatically.
  • Named prune policies. Register policies like "short_term_memory" that run automatically during compaction. Nodes matching any policy are invisible to reads immediately (lazy expiration) and cleaned up during the next compaction pass.

Storage engine

  • Write-ahead log. Every mutation hits the WAL before the memtable. Crash recovery replays the WAL on startup.
  • Configurable durability. Immediate mode fsyncs every write for maximum safety. GroupCommit mode (default) batches fsyncs on a 10ms timer for ~20x better write throughput with at most one timer interval of data at risk.
  • Background compaction. Segments are merged automatically when thresholds are met. Compaction runs on a background thread and never blocks reads or writes. Uses metadata sidecars for fast filtered merging without full record decoding.
  • mmap'd reads. Immutable segments are memory-mapped. The OS page cache handles caching. Reads never block writes.
  • Portable databases. Each database is a self-contained directory. cp -r ./my-db /backup/my-db and you're done.

How it works

OverGraph uses a log-structured merge tree (LSM) storage engine, similar in spirit to how LevelDB and RocksDB work, but purpose-built from scratch in pure Rust.

Write path: Mutations are appended to a write-ahead log and applied to an in-memory table (the memtable). When the memtable gets large enough, it's frozen and flushed to disk as an immutable segment with pre-built indexes.

Read path: Queries check the memtable first (freshest data), then scan immutable segments from newest to oldest. A K-way merge combines results across all sources with early termination for paginated queries.

Compaction: A background thread periodically merges older segments together, applying tombstones, prune policies, and deduplication. The compaction path uses metadata sidecars to plan merges and raw-copies winning records without deserializing them, then rebuilds all indexes from metadata. This keeps compaction fast even on large datasets.

On-disk layout:

my-graph/
  manifest.current    # atomic checkpoint (JSON)
  data.wal            # append-only write-ahead log
  segments/
    seg_0001/
      nodes.dat       # node records
      edges.dat       # edge records
      adj_out.idx     # outgoing adjacency index
      adj_in.idx      # incoming adjacency index
      key_index.dat   # (type_id, key) -> node_id
      type_index.dat  # type_id -> [id...]
      tombstones.dat  # deleted IDs
    seg_0002/
      ...

For a deeper dive, see the architecture overview.

API reference

  • Rust: Run cargo doc --open to browse the full rustdoc. All public types and methods are documented.
  • Node.js: See the TypeScript declarations for the complete API surface. Every method has a sync and async variant.
  • Python: See the type stubs for the complete API surface. Both sync (OverGraph) and async (AsyncOverGraph) classes are available.

Running the benchmarks

# Rust
scripts/bench/run-rust.sh --profile small --warmup 20 --iters 80

# Node.js
scripts/bench/run-node.sh --profile small --warmup 20 --iters 80

# Python
scripts/bench/run-python.sh --profile small --warmup 20 --iters 80

Benchmark methodology, FAQ, and reproducibility instructions are in docs/04-quality/.

Building from source

# Rust core
cargo build --release
cargo test

# Node.js connector
cd overgraph-node
npm install
npm run build
npm test

# Python connector
cd overgraph-python
pip install maturin
maturin develop
pytest

Contributing

See CONTRIBUTING.md for build instructions, coding conventions, and how to submit a pull request.

License

Licensed under either of

at your option.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

overgraph-0.1.1.tar.gz (302.7 kB view details)

Uploaded Source

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

overgraph-0.1.1-cp39-abi3-win_amd64.whl (747.7 kB view details)

Uploaded CPython 3.9+Windows x86-64

overgraph-0.1.1-cp39-abi3-macosx_11_0_arm64.whl (696.9 kB view details)

Uploaded CPython 3.9+macOS 11.0+ ARM64

overgraph-0.1.1-cp39-abi3-macosx_10_12_x86_64.whl (782.8 kB view details)

Uploaded CPython 3.9+macOS 10.12+ x86-64

overgraph-0.1.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (795.5 kB view details)

Uploaded CPython 3.8manylinux: glibc 2.17+ x86-64

File details

Details for the file overgraph-0.1.1.tar.gz.

File metadata

  • Download URL: overgraph-0.1.1.tar.gz
  • Upload date:
  • Size: 302.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for overgraph-0.1.1.tar.gz
Algorithm Hash digest
SHA256 dbf0802d8e3c71c17b8718ce5951b7ff80cc69d5146309447fd94a8459cd802e
MD5 84f33e627d57b942ad1b5b616e84a5f0
BLAKE2b-256 2aec1947e6ab8981535c74392cdc48b3b9bcbc745e37369a242ddababb5f2ddf

See more details on using hashes here.

Provenance

The following attestation bundles were made for overgraph-0.1.1.tar.gz:

Publisher: release.yml on Bhensley5/overgraph

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file overgraph-0.1.1-cp39-abi3-win_amd64.whl.

File metadata

  • Download URL: overgraph-0.1.1-cp39-abi3-win_amd64.whl
  • Upload date:
  • Size: 747.7 kB
  • Tags: CPython 3.9+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for overgraph-0.1.1-cp39-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 ea92470d312ac4da53a7cdbc44cdfeacd71b178c7cec4a0672ab437024042bb9
MD5 8fb3ce11e0ed2ce9b9db8eb1826d5ecc
BLAKE2b-256 5f6e6f0ec971ed51254ab9618651c965d483b968db17be2965f29a75e958200c

See more details on using hashes here.

Provenance

The following attestation bundles were made for overgraph-0.1.1-cp39-abi3-win_amd64.whl:

Publisher: release.yml on Bhensley5/overgraph

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file overgraph-0.1.1-cp39-abi3-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for overgraph-0.1.1-cp39-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 86ff7af7de5963abc251f101dba11280da7cae04c93d73e8d3c3b0c00d99369a
MD5 3709823b7b4bdcfe0c6f284079059561
BLAKE2b-256 7bf1326a1f81a1857771d3a9745c4f105360c510d24f161b898025cfe996878d

See more details on using hashes here.

Provenance

The following attestation bundles were made for overgraph-0.1.1-cp39-abi3-macosx_11_0_arm64.whl:

Publisher: release.yml on Bhensley5/overgraph

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file overgraph-0.1.1-cp39-abi3-macosx_10_12_x86_64.whl.

File metadata

File hashes

Hashes for overgraph-0.1.1-cp39-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 36a9b9f05e2b6369f484348b83eb66a7b88f95484c82f2d2ac9cbb6402881dd9
MD5 9df9c9228f8d54e19d5fdee3549f9057
BLAKE2b-256 849c3ab6b9ce0d53292040d0db8b2ce78269d65ad9e2fc593e6c9809fe266308

See more details on using hashes here.

Provenance

The following attestation bundles were made for overgraph-0.1.1-cp39-abi3-macosx_10_12_x86_64.whl:

Publisher: release.yml on Bhensley5/overgraph

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file overgraph-0.1.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for overgraph-0.1.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 8bb8a4a6ebc1f350a18ecbf3956fed64c0b1f8e91d412e1fce5b6bf118bf7f38
MD5 2c8e689646ea1148aa2505f7cc0677a0
BLAKE2b-256 b5e348ec753efafa98eb8e77e3350ee3c1a51d0917a46224f4e59391605b76cb

See more details on using hashes here.

Provenance

The following attestation bundles were made for overgraph-0.1.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: release.yml on Bhensley5/overgraph

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page