High-performance Zero-Copy Graph Engine
GraphZero
High-Performance, Zero-Copy Graph Engine for Massive Datasets on Consumer Hardware.
GraphZero is a C++ graph processing engine with lightweight Python bindings, designed to address the "Memory Wall" in Graph Neural Networks (GNNs). It lets you load and sample graphs with 100 million+ nodes (like ogbn-papers100M) on a standard 16GB RAM laptop, a workload on which the in-memory loading used by standard libraries such as PyTorch Geometric (PyG) or DGL runs out of memory.
⚡ The Problem
GNN datasets can be massive. ogbn-papers100M contains 111 Million nodes and 1.6 Billion edges.
- Standard approach (PyG/NetworkX): Tries to load the entire graph structure into RAM.
- The Result: MemoryError (OOM) on consumer hardware. You need 64GB+ RAM servers just to load the data.
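A back-of-the-envelope estimate shows why the edge list alone is a problem. The sketch below assumes the edges are held as a dense `[2, num_edges]` array of 64-bit node IDs (a common in-memory layout, assumed here rather than taken from any specific library) and uses the rounded edge count quoted above. The helper name `coo_edge_index_bytes` is made up for illustration.

```python
def coo_edge_index_bytes(num_edges: int, bytes_per_id: int = 8) -> int:
    """Bytes needed for a dense [2, num_edges] array of node IDs."""
    return 2 * num_edges * bytes_per_id

num_edges = 1_600_000_000  # rounded figure for ogbn-papers100M quoted above
size_bytes = coo_edge_index_bytes(num_edges)
size_gib = size_bytes / 2**30

print(f"{size_bytes / 1e9:.1f} GB ({size_gib:.1f} GiB)")  # 25.6 GB (23.8 GiB)
```

Under these assumptions the topology alone exceeds the RAM of a 16GB machine before any features or intermediate copies are counted, which is the same order of magnitude as the 24.1 GiB allocation failure reported in the benchmark section.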
🛠️ The Solution:
GraphZero abandons the "Load-to-RAM" model. Instead, it uses a custom Zero-Copy Architecture:
- Memory Mapping (mmap): The graph stays on disk. The OS only loads the specific "hot" pages needed for computation into RAM.
- Compressed CSR: A custom binary format (.gl) that compresses raw edges by ~60% (30GB CSV → 13GB Binary).
- Parallel Sampling: OpenMP-accelerated random walks that saturate NVMe SSD throughput.
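The memory-mapping idea can be sketched in pure Python with `numpy.memmap`. This is an illustration only: the file layout below (an `indptr` array followed by an `indices` array, both uncompressed `uint64`) is a hypothetical toy layout and not the actual `.gl` format, and the `neighbors` helper is likewise made up for this example.

```python
import os
import tempfile

import numpy as np

# Toy directed graph in CSR form: 0->1, 0->2, 1->2, 2->0; node 3 has no out-edges.
indptr = np.array([0, 2, 3, 4, 4], dtype=np.uint64)
indices = np.array([1, 2, 2, 0], dtype=np.uint64)
num_nodes, num_edges = 4, 4

# Write both arrays back to back into one binary file (hypothetical layout).
path = os.path.join(tempfile.mkdtemp(), "toy_csr.bin")
with open(path, "wb") as f:
    f.write(indptr.tobytes())
    f.write(indices.tobytes())

# Map the file instead of reading it: the OS pages data in only when it is touched.
m_indptr = np.memmap(path, dtype=np.uint64, mode="r", shape=(num_nodes + 1,))
m_indices = np.memmap(
    path, dtype=np.uint64, mode="r",
    offset=(num_nodes + 1) * 8,  # skip past the indptr section (8 bytes per entry)
    shape=(num_edges,),
)

def neighbors(node: int) -> np.ndarray:
    """Return the out-neighbors of `node` as a view into the mapped file."""
    start, end = int(m_indptr[node]), int(m_indptr[node + 1])
    return m_indices[start:end]

print(neighbors(0).tolist())  # [1, 2]
```

Opening the mapping costs roughly the same regardless of file size, because no edge data is read until a neighbor list is actually accessed.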
🏆 Benchmarks: GraphZero vs. PyTorch Geometric
Task: Load ogbn-papers100M (56GB Raw) and perform random walks.
Hardware: Windows Laptop (16GB RAM, NVMe SSD).
| Metric | GraphZero (v0.1) | PyTorch Geometric |
|---|---|---|
| Load Time | 0.000000 s ⚡ | FAILED (Crash) ❌ |
| Peak RAM Usage | ~5.1 GB (OS Cache) | >24.1 GB (Required) |
| Throughput | 1,264,000 steps/s | N/A |
| Status | ✅ Success | ❌ OOM Error |
Proof of Performance
Left: GraphZero loading instantly and utilizing OS Page Cache. Right: PyG crashing with "Unable to allocate 24.1 GiB".
📦 Installation
GraphZero is available on PyPI (Pre-Alpha):
pip install graphzero
Requirements: Python 3.8+, C++17 Compiler (MSVC/GCC), OpenMP.
🚀 Quick Start
1. Convert Your Data
GraphZero uses a high-efficiency binary format (.gl). Convert your CSV edge list once, then reuse the resulting file.
import graphzero as gz

# Converts raw CSV (src, dst) to memory-mapped binary
# Handles 100M+ edges easily on minimal RAM
gz.convert_csv_to_gl(
    input_csv="dataset/edges.csv",
    output_bin="graph.gl",
    directed=True
)
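For readers unfamiliar with CSR, the following sketch shows the kind of transformation such a conversion performs on a (src, dst) edge list. It is not GraphZero's converter: it loads the whole toy edge list into memory with NumPy, whereas a converter for 100M+ edges would presumably process the input in chunks. All variable names are illustrative.

```python
import io

import numpy as np

# Toy (src, dst) edge list standing in for a CSV file on disk.
csv_text = "0,1\n0,2\n2,0\n1,2\n"
edges = np.loadtxt(io.StringIO(csv_text), delimiter=",", dtype=np.int64, ndmin=2)
src, dst = edges[:, 0], edges[:, 1]
num_nodes = int(edges.max()) + 1

# Group destinations by source node; a stable sort keeps the input order within a node.
order = np.argsort(src, kind="stable")
indices = dst[order]

# indptr[v] .. indptr[v + 1] delimits the neighbor list of node v inside `indices`.
counts = np.bincount(src, minlength=num_nodes)
indptr = np.zeros(num_nodes + 1, dtype=np.int64)
indptr[1:] = np.cumsum(counts)

print(indptr.tolist())   # [0, 2, 3, 4]
print(indices.tolist())  # [1, 2, 2, 0]
```

CSR stores each source ID implicitly through `indptr` instead of once per edge, which is one reason a binary CSR file is smaller than the equivalent text edge list.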
2. High-Speed Sampling
Once converted, the graph is instantly accessible.
import graphzero as gz
import numpy as np

# 1. Zero-Copy Load (Instant)
g = gz.Graph("graph.gl")

# 2. Define Start Nodes (e.g., 1000 random nodes)
start_nodes = np.random.randint(0, g.num_nodes, 1000).astype(np.uint64)

# 3. Parallel Random Walk (node2vec / DeepWalk style)
# Returns: List of walks (flat or list-of-lists)
walks = g.batch_random_walk_uniform(
    start_nodes=start_nodes,
    walk_length=10
)
print(f"Generated {len(walks)} steps instantly.")
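To make the semantics of a uniform random walk concrete, here is a single-threaded reference sketch over CSR arrays. It is not GraphZero's implementation, and two details are assumptions: `walk_length` is taken to mean the number of steps after the start node, and a walker that reaches a node with no out-edges stays where it is. The function name `uniform_random_walks` is hypothetical.

```python
import numpy as np

def uniform_random_walks(indptr, indices, start_nodes, walk_length, seed=0):
    """Walk `walk_length` steps from each start node, picking neighbors uniformly."""
    rng = np.random.default_rng(seed)
    walks = np.empty((len(start_nodes), walk_length + 1), dtype=np.int64)
    for i, node in enumerate(start_nodes):
        node = int(node)
        walks[i, 0] = node
        for step in range(1, walk_length + 1):
            start, end = int(indptr[node]), int(indptr[node + 1])
            if end > start:
                # Uniform choice among the out-neighbors of the current node.
                node = int(indices[rng.integers(start, end)])
            # Assumption: at a dead end (no out-edges) the walker stays in place.
            walks[i, step] = node
    return walks

# Toy graph: 0->1, 0->2, 1->2, 2->0; node 3 has no out-edges.
indptr = np.array([0, 2, 3, 4, 4], dtype=np.int64)
indices = np.array([1, 2, 2, 0], dtype=np.int64)

walks = uniform_random_walks(indptr, indices, start_nodes=[0, 1, 3], walk_length=5)
print(walks.shape)  # (3, 6)
```

Each step touches only one `indptr` pair and one `indices` entry, so the access pattern is a series of small random reads, which is why page-cache behavior and SSD latency dominate at scale.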
⚙️ Under the Hood
GraphZero is built for Systems & GNN enthusiasts.
- Core: C++20 with nanobind for Python bindings.
- Parallelism: Uses #pragma omp with thread-local RNGs to prevent false sharing and lock contention.
- IO: Direct CreateFileMapping (Windows) and mmap (Linux) calls with alignment optimization (4KB/2MB pages).
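The "one RNG per worker" idea can be illustrated in Python, with the caveat that this only mirrors the lock-free part of the design: false sharing is a cache-line effect that a Python sketch cannot demonstrate. The example assumes NumPy's `SeedSequence.spawn` as the way to derive independent streams; the helper names are made up, and GraphZero's C++ code may seed its generators differently.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

def make_worker_rngs(num_workers: int, seed: int = 1234):
    """Create one statistically independent generator per worker."""
    children = np.random.SeedSequence(seed).spawn(num_workers)
    return [np.random.default_rng(child) for child in children]

def worker_sample(rng: np.random.Generator, count: int, upper: int) -> np.ndarray:
    """Each task draws from its own generator, so no lock is needed."""
    return rng.integers(0, upper, size=count)

def run(num_workers: int = 4):
    rngs = make_worker_rngs(num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Every generator is handed to exactly one task; none is shared.
        return list(pool.map(lambda r: worker_sample(r, 5, 100), rngs))

chunks = run()
print(len(chunks), chunks[0].shape)  # 4 (5,)
```

Because no generator is shared between tasks, the results are reproducible for a given seed and worker count, independent of thread scheduling.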
🗺️ Roadmap
- v0.1 (Current): Topology-only support. Uniform Random Walks.
- v0.2: Columnar Feature Store (mmap support for Node Features).
- v0.3: Weighted Edges & SIMD (AVX2) Neighbor Intersection.
- v0.4: Dynamic Updates (LSM-Tree based mutable graphs).
- v0.5: Pinned Memory Allocator for faster CPU → GPU transfer.
📄 License
MIT License. Created by Krish Singaria (IIT Mandi).