GraphZero

High-Performance, Zero-Copy Graph Engine for Massive Datasets on Consumer Hardware.

GraphZero is a C++ graph processing engine with lightweight Python bindings, designed to address the "Memory Wall" in Graph Neural Networks (GNNs). It lets you load and sample graphs with 100 million+ nodes (such as ogbn-papers100M) on a standard laptop with 16GB of RAM, something that libraries which load the whole graph into RAM, such as PyTorch Geometric (PyG) or DGL, cannot do on the same hardware.

⚡ The Problem

GNN datasets can be massive. ogbn-papers100M contains 111 Million nodes and 1.6 Billion edges.

  • Standard approach (PyG/NetworkX): Tries to load the entire graph structure into RAM.
  • The Result: MemoryError (OOM) on consumer hardware. You need 64GB+ RAM servers just to load the data.
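
A back-of-the-envelope estimate shows why the in-RAM approach fails. The sketch below assumes the edge list is held as a dense 2 × E array of 64-bit integer IDs (the layout PyG's `edge_index` normally uses); the helper name `edge_index_bytes` is made up for illustration.

```python
def edge_index_bytes(num_edges: int, bytes_per_id: int = 8) -> int:
    """Bytes needed for a dense [2, num_edges] array of integer node IDs."""
    return 2 * num_edges * bytes_per_id

# "1.6 Billion edges", rounded as in the text above.
num_edges = 1_600_000_000
total = edge_index_bytes(num_edges)
print(f"{total / 1e9:.1f} GB ({total / 2**30:.1f} GiB) for the edge list alone")
```

Under these assumptions the edge list alone needs roughly 24 GiB before any node features are loaded, which is well beyond a 16GB laptop.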

🛠️ The Solution:

GraphZero Architecture

GraphZero abandons the "Load-to-RAM" model. Instead, it uses a custom Zero-Copy Architecture:

  • Memory Mapping (mmap): The graph stays on disk. The OS only loads the specific "hot" pages needed for computation into RAM.
  • Compressed CSR: A custom binary format (.gl) that compresses raw edges by ~60% (30GB CSV → 13GB binary).
  • Parallel Sampling: OpenMP-accelerated random walks that saturate NVMe SSD throughput.
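
The memory-mapping idea can be illustrated with Python's standard library alone. This is a minimal sketch, not GraphZero's implementation: the file layout (CSR offsets followed directly by neighbor IDs, all 64-bit unsigned integers) is a hypothetical stand-in for the real .gl format, and the names `words`, `edge_base`, and `neighbors` are illustrative.

```python
import mmap
import os
import tempfile
from array import array

# Toy directed graph: 0->1, 0->2, 1->2, 2->0, 2->3 (node 3 has no out-edges).
indptr = array("Q", [0, 2, 3, 5, 5])   # row offsets, length = num_nodes + 1
indices = array("Q", [1, 2, 2, 0, 3])  # neighbor lists stored back to back

# Hypothetical layout (not the .gl format): indptr bytes, then indices bytes.
path = os.path.join(tempfile.mkdtemp(), "toy_csr.bin")
with open(path, "wb") as f:
    f.write(indptr.tobytes())
    f.write(indices.tobytes())

# Map the file read-only; the OS pages data in lazily as it is touched.
fh = open(path, "rb")
mm = mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ)
words = memoryview(mm).cast("Q")       # zero-copy view as 64-bit unsigned ints
num_nodes = len(indptr) - 1
edge_base = num_nodes + 1              # indices start right after indptr

def neighbors(node: int) -> memoryview:
    """Out-neighbors of `node` as a slice of the mapping (no copy is made)."""
    start, end = words[node], words[node + 1]
    return words[edge_base + start : edge_base + end]

print(neighbors(2).tolist())
```

Opening the mapping does no bulk read, which is why load time is close to zero; the cost is paid later, page by page, as neighbor lists are actually accessed.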

🏆 Benchmarks: GraphZero vs. PyTorch Geometric

Task: Load ogbn-papers100M (56GB Raw) and perform random walks.
Hardware: Windows Laptop (16GB RAM, NVMe SSD).

Metric         | GraphZero (v0.1)    | PyTorch Geometric
Load Time      | 0.000000 s          | FAILED (Crash) ❌
Peak RAM Usage | ~5.1 GB (OS Cache)  | >24.1 GiB (Required)
Throughput     | 1,264,000 steps/s   | N/A
Status         | Success             | OOM Error
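
A throughput figure like the one in the table can be measured with a simple timing harness. The sketch below is an assumption about how steps per second might be computed (number of walks × walk length, divided by wall-clock time); `steps_per_second` and the dummy workload are illustrative names, not part of GraphZero.

```python
import time
from typing import Callable

def steps_per_second(run_walks: Callable[[], object],
                     num_walks: int, walk_length: int) -> float:
    """Time one call to `run_walks` and convert it to walk steps per second."""
    start = time.perf_counter()
    run_walks()
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero reading
    return num_walks * walk_length / elapsed

# Dummy workload standing in for a real sampler call.
def fake_sampler():
    time.sleep(0.01)

rate = steps_per_second(fake_sampler, num_walks=1000, walk_length=10)
print(f"{rate:,.0f} steps/s")
```

With a memory-mapped graph the first run also pays for cold page faults, so it may be worth timing a warm-up run separately from the measured run.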

Proof of Performance

Left: GraphZero loading instantly and utilizing OS Page Cache. Right: PyG crashing with Unable to allocate 24.1 GiB.


📦 Installation

GraphZero is available on PyPI (Pre-Alpha):

pip install graphzero

Requirements: Python 3.8+, C++17 Compiler (MSVC/GCC), OpenMP.


🚀 Quick Start

1. Convert Your Data

GraphZero uses a compact binary format (.gl). Convert your CSV edge list once, then reuse the resulting file.

import graphzero as gz

# Converts raw CSV (src, dst) to memory-mapped binary
# Handles 100M+ edges easily on minimal RAM
gz.convert_csv_to_gl(
    input_csv="dataset/edges.csv", 
    output_bin="graph.gl", 
    directed=True
)
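
For intuition about what such a conversion involves, here is a minimal sketch of turning (src, dst) rows into CSR arrays with a two-pass counting scheme. It is not the library's converter: `edges_to_csr` is a hypothetical helper, the CSV is assumed to have no header row, and the edges are held in a Python list for brevity, whereas a converter built for minimal RAM would presumably stream the file from disk twice instead.

```python
import csv
import io
from array import array

def edges_to_csr(rows, num_nodes):
    """Build CSR (indptr, indices) from an iterable of (src, dst) pairs."""
    edges = [(int(s), int(d)) for s, d in rows]
    indptr = array("Q", [0] * (num_nodes + 1))
    for src, _ in edges:            # pass 1: count the out-degree of each node
        indptr[src + 1] += 1
    for i in range(num_nodes):      # prefix sum turns counts into offsets
        indptr[i + 1] += indptr[i]
    indices = array("Q", [0] * len(edges))
    cursor = array("Q", indptr[:num_nodes])  # next free slot for each node
    for src, dst in edges:          # pass 2: scatter destinations into place
        indices[cursor[src]] = dst
        cursor[src] += 1
    return indptr, indices

csv_text = "0,1\n0,2\n1,2\n2,0\n2,3\n"
indptr, indices = edges_to_csr(csv.reader(io.StringIO(csv_text)), num_nodes=4)
print(list(indptr), list(indices))
```

CSR stores each source ID implicitly through the offsets array rather than once per edge, which is one reason a binary CSR file is smaller than the text edge list.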

2. High-Speed Sampling

Once converted, the graph is instantly accessible.

import graphzero as gz
import numpy as np

# 1. Zero-Copy Load (Instant)
g = gz.Graph("graph.gl")

# 2. Define Start Nodes (e.g., 1000 random nodes)
start_nodes = np.random.randint(0, g.num_nodes, 1000).astype(np.uint64)

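# 3. Parallel uniform random walk (DeepWalk style; same as node2vec with p = q = 1)
# Returns the walks either as one flat sequence or as a list of lists
walks = g.batch_random_walk_uniform(
    start_nodes=start_nodes,
    walk_length=10
)

# len(walks) counts walks for a list of lists, or individual steps for a flat sequence
print(f"Generated {len(walks)} entries.")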
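
To make the sampling step concrete, the following pure-Python sketch performs a uniform random walk over CSR arrays. It illustrates the technique only: `uniform_random_walk` is a hypothetical function, and stopping early at a node with no out-neighbors is an assumption, since the text does not say how GraphZero handles dead ends.

```python
import random

def uniform_random_walk(indptr, indices, start, walk_length, rng):
    """Walk up to `walk_length` steps, picking each next node uniformly."""
    walk = [start]
    node = start
    for _ in range(walk_length):
        lo, hi = indptr[node], indptr[node + 1]
        if lo == hi:        # dead end: no out-neighbors, stop early (assumption)
            break
        node = indices[rng.randrange(lo, hi)]
        walk.append(node)
    return walk

# Toy graph in CSR form: 0->1, 0->2, 1->2, 2->0, 2->3 (node 3 is a dead end).
indptr = [0, 2, 3, 5, 5]
indices = [1, 2, 2, 0, 3]
rng = random.Random(0)
walks = [uniform_random_walk(indptr, indices, s, 10, rng) for s in (0, 1, 2)]
for w in walks:
    print(w)
```

Each step touches only one offset pair and one neighbor entry, so the access pattern is a sequence of small random reads, which is the workload a memory-mapped file on an NVMe SSD has to serve.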

⚙️ Under the Hood

GraphZero is built for Systems & GNN enthusiasts.

  • Core: C++20 with nanobind for Python bindings.
  • Parallelism: Uses #pragma omp with thread-local RNGs to prevent false sharing and lock contention.
  • IO: Direct CreateFileMapping (Windows) and mmap (Linux) calls with alignment optimization (4KB/2MB pages).
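
The thread-local RNG pattern can be sketched in Python, with the caveat that this is only an analogue: the engine itself uses OpenMP in C++, and Python threads do not run this loop in parallel because of the GIL. The point is the structure, in which each worker owns its generator so no RNG state is shared or locked. All names here (`walk_chunk`, `parallel_walks`, the seeding formula) are illustrative assumptions.

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Toy graph in CSR form: 0->1, 0->2, 1->2, 2->0, 2->3 (node 3 is a dead end).
INDPTR = [0, 2, 3, 5, 5]
INDICES = [1, 2, 2, 0, 3]

def walk_chunk(worker_id, starts, walk_length, base_seed):
    """Run walks for one worker using an RNG that no other worker touches."""
    rng = random.Random(base_seed * 1_000_003 + worker_id)  # private generator
    out = []
    for node in starts:
        walk = [node]
        for _ in range(walk_length):
            lo, hi = INDPTR[node], INDPTR[node + 1]
            if lo == hi:    # dead end, stop this walk early
                break
            node = INDICES[rng.randrange(lo, hi)]
            walk.append(node)
        out.append(walk)
    return out

def parallel_walks(starts, walk_length, num_workers=4, base_seed=42):
    """Split start nodes across workers; results come back grouped by worker."""
    chunks = [starts[i::num_workers] for i in range(num_workers)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [pool.submit(walk_chunk, i, chunk, walk_length, base_seed)
                   for i, chunk in enumerate(chunks)]
        return [walk for fut in futures for walk in fut.result()]

walks = parallel_walks([0, 1, 2, 3] * 5, walk_length=5)
print(len(walks))
```

Because every worker derives its generator from the base seed and its own ID, the output does not depend on thread scheduling, which also makes runs reproducible.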

🗺️ Roadmap

  • v0.1 (Current): Topology-only support. Uniform Random Walks.
  • v0.2: Columnar Feature Store (mmap support for Node Features).
  • v0.3: Weighted Edges & SIMD (AVX2) Neighbor Intersection.
  • v0.4: Dynamic Updates (LSM-Tree based mutable graphs).
  • v0.5: Pinned Memory Allocator for faster CPU → GPU transfer.

📄 License

MIT License. Created by Krish Singaria (IIT Mandi).
