High-performance Zero-Copy Graph Engine

GraphZero

High-Performance, Zero-Copy Graph Engine for Massive Datasets on Consumer Hardware.

GraphZero is a C++ graph processing engine with lightweight Python bindings, designed to address the "Memory Wall" in Graph Neural Networks (GNNs). It lets you load and sample graphs with 100 million+ nodes (such as ogbn-papers100M) on a standard 16GB RAM laptop, something that libraries which load the full graph into RAM, such as PyTorch Geometric (PyG) or DGL, cannot do on that hardware.

⚡ The Problem

GNN datasets can be massive. ogbn-papers100M contains 111 Million nodes and 1.6 Billion edges.

  • Standard approach (PyG/NetworkX): Tries to load the entire graph structure into RAM.
  • The Result: MemoryError (OOM) on consumer hardware. You need 64GB+ RAM servers just to load the data.
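
As a rough back-of-the-envelope check (a sketch under stated assumptions, not a measurement): if the edges are held in RAM as a dense 2 x E array of 64-bit integers, which is the layout PyG uses for edge_index, the edge list alone needs roughly 24 GiB before any features are loaded. The edge count below is the rounded figure quoted above, and the helper name edge_list_bytes is made up for this illustration.

```python
# Rough memory estimate for holding an edge list fully in RAM.
# Assumption: edges are stored as a dense (2, E) array of int64 node IDs.
NUM_EDGES = 1_600_000_000   # rounded figure quoted in the text
BYTES_PER_ID = 8            # int64


def edge_list_bytes(num_edges: int, bytes_per_id: int = BYTES_PER_ID) -> int:
    """Bytes needed for a (2, num_edges) array of node IDs."""
    return 2 * num_edges * bytes_per_id


total = edge_list_bytes(NUM_EDGES)
print(f"{total / 1e9:.1f} GB ({total / 2**30:.1f} GiB)")
```

With the rounded count this comes to about 25.6 GB (23.8 GiB), which is already more than a 16GB machine can hold.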

🛠️ The Solution: GraphZero Architecture

GraphZero abandons the "Load-to-RAM" model. Instead, it uses a custom Zero-Copy Architecture:

  • Memory Mapping (mmap): The graph stays on disk. The OS only loads the specific "hot" pages needed for computation into RAM.
  • Compressed CSR: A custom binary format (.gl) that compresses raw edges by ~60% (30GB CSV → 13GB binary).
  • Parallel Sampling: OpenMP-accelerated random walks that saturate NVMe SSD throughput.
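
The memory-mapping idea can be sketched with NumPy alone. The example below is an illustration only: it writes a toy CSR graph to a flat binary file and reads neighbor lists back through np.memmap. The file layout (offset array followed by neighbor array, both uint64) is hypothetical and is not the real .gl format, and the neighbors helper is not part of GraphZero's API.

```python
import os
import tempfile

import numpy as np

# Toy directed graph: 0->1, 0->2, 1->2, 2->0, 2->3 (node 3 has no out-edges).
indptr = np.array([0, 2, 3, 5, 5], dtype=np.uint64)   # length = num_nodes + 1
indices = np.array([1, 2, 2, 0, 3], dtype=np.uint64)  # concatenated neighbor lists

# Write both arrays back to back into one flat binary file.
# This layout is hypothetical and is NOT the real .gl format.
path = os.path.join(tempfile.mkdtemp(), "toy_csr.bin")
with open(path, "wb") as f:
    f.write(indptr.tobytes())
    f.write(indices.tobytes())

num_nodes = len(indptr) - 1

# np.memmap maps the file into the address space; the OS reads pages
# from disk only when the corresponding array elements are touched.
mm_indptr = np.memmap(path, dtype=np.uint64, mode="r", shape=(num_nodes + 1,))
mm_indices = np.memmap(path, dtype=np.uint64, mode="r",
                       offset=mm_indptr.nbytes, shape=(len(indices),))


def neighbors(node: int) -> np.ndarray:
    """Return the out-neighbors of `node` as a view into the mapped file."""
    start, end = int(mm_indptr[node]), int(mm_indptr[node + 1])
    return mm_indices[start:end]


print(neighbors(2))
```

Because a neighbor lookup touches only two offsets and one contiguous slice, the resident memory depends on which nodes are visited rather than on the size of the file.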

🏆 Benchmarks: GraphZero vs. PyTorch Geometric

Task: Load ogbn-papers100M (56GB Raw) and perform random walks.
Hardware: Windows Laptop (16GB RAM, NVMe SSD).

Metric         | GraphZero (v0.1)   | PyTorch Geometric
Load Time      | 0.000000 s         | FAILED (Crash) ❌
Peak RAM Usage | ~5.1 GB (OS Cache) | >24.1 GB (Required)
Throughput     | 1,264,000 steps/s  | N/A
Status         | Success            | OOM Error

Proof of Performance

Left: GraphZero loading instantly and utilizing OS Page Cache. Right: PyG crashing with Unable to allocate 24.1 GiB.


📦 Installation

GraphZero is available on PyPI (Pre-Alpha):

pip install graphzero

Requirements: Python 3.8+, C++17 Compiler (MSVC/GCC), OpenMP.


🚀 Quick Start

1. Convert Your Data

GraphZero uses a high-efficiency binary format (.gl). Convert your CSV edge list once, then reuse the resulting file.

import graphzero as gz

# Converts raw CSV (src, dst) to memory-mapped binary
# Handles 100M+ edges easily on minimal RAM
gz.convert_csv_to_gl(
    input_csv="dataset/edges.csv", 
    output_bin="graph.gl", 
    directed=True
)
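
To make the conversion step less of a black box, here is a minimal NumPy sketch of turning a (src, dst) edge list into CSR arrays. It illustrates the general technique only. How convert_csv_to_gl works internally is not described here, and this toy version loads the whole CSV into memory, which a converter for 100M+ edges would have to avoid. The function name edges_to_csr is invented for the example.

```python
import io

import numpy as np


def edges_to_csr(src: np.ndarray, dst: np.ndarray, num_nodes: int):
    """Build CSR arrays (indptr, indices) from a directed edge list."""
    # Out-degree of every node; its prefix sum gives each row's start offset.
    degree = np.bincount(src, minlength=num_nodes)
    indptr = np.zeros(num_nodes + 1, dtype=np.int64)
    np.cumsum(degree, out=indptr[1:])
    # A stable sort by source groups each node's neighbors contiguously
    # while keeping their original relative order.
    order = np.argsort(src, kind="stable")
    indices = dst[order]
    return indptr, indices


# Tiny in-memory stand-in for a "src,dst" CSV file.
csv_text = "0,1\n0,2\n2,0\n1,2\n2,3\n"
edges = np.loadtxt(io.StringIO(csv_text), delimiter=",", dtype=np.int64)
src, dst = edges[:, 0], edges[:, 1]
num_nodes = int(edges.max()) + 1

indptr, indices = edges_to_csr(src, dst, num_nodes)
print(indptr, indices)
```

The neighbors of node v are then indices[indptr[v]:indptr[v + 1]], so a lookup needs two offsets and one contiguous read.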

2. High-Speed Sampling

Once converted, the graph is instantly accessible.

import graphzero as gz
import numpy as np

# 1. Zero-Copy Load (Instant)
g = gz.Graph("graph.gl")

# 2. Define Start Nodes (e.g., 1000 random nodes)
start_nodes = np.random.randint(0, g.num_nodes, 1000).astype(np.uint64)

# 3. Parallel Random Walk (node2vec / DeepWalk style)
# Returns: List of walks (flat or list-of-lists)
walks = g.batch_random_walk_uniform(
    start_nodes=start_nodes, 
    walk_length=10
)

print(f"Generated {len(walks)} steps instantly.")
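
For readers who want to see what a uniform random walk computes, the following single-threaded NumPy reference may help. It is a sketch under assumptions, not GraphZero's implementation: the function uniform_random_walks, its (num_walks, walk_length + 1) return shape, and the stay-in-place rule for nodes without out-edges are all choices made for this example.

```python
import numpy as np

# Toy CSR graph: 0->1, 0->2, 1->2, 2->0, 2->3 (node 3 has no out-edges).
indptr = np.array([0, 2, 3, 5, 5], dtype=np.int64)
indices = np.array([1, 2, 2, 0, 3], dtype=np.int64)


def uniform_random_walks(indptr, indices, start_nodes, walk_length, seed=0):
    """Reference uniform random walks over a CSR graph.

    Returns an int64 array of shape (len(start_nodes), walk_length + 1);
    column 0 holds the start node of each walk.
    """
    rng = np.random.default_rng(seed)
    walks = np.empty((len(start_nodes), walk_length + 1), dtype=np.int64)
    walks[:, 0] = start_nodes
    for i in range(len(start_nodes)):
        for step in range(walk_length):
            cur = walks[i, step]
            begin, end = indptr[cur], indptr[cur + 1]
            if begin == end:
                # Dead end: stay in place (one of several possible conventions).
                walks[i, step + 1] = cur
            else:
                # Pick one out-neighbor uniformly at random.
                walks[i, step + 1] = indices[rng.integers(begin, end)]
    return walks


walks = uniform_random_walks(indptr, indices, np.array([0, 1, 2]), walk_length=4)
print(walks.shape)
```

A pure-Python loop like this is far slower than a compiled, parallel sampler, but it is handy for checking walk outputs on small graphs.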

⚙️ Under the Hood

GraphZero is built for Systems & GNN enthusiasts.

  • Core: C++20 with nanobind for Python bindings.
  • Parallelism: Uses #pragma omp with thread-local RNGs to prevent false sharing and lock contention.
  • IO: Direct CreateFileMapping (Windows) and mmap (Linux) calls with alignment optimization (4KB/2MB pages).
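
The thread-local RNG idea can be illustrated in Python, with the caveat that this is only an analogue of the OpenMP design: each worker gets its own generator, seeded from an independent child of one root seed, so no generator state is shared or locked. The worker count, the root seed, and the draw helper are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

NUM_WORKERS = 4

# One independent child seed per worker, derived from a single root seed.
child_seeds = np.random.SeedSequence(42).spawn(NUM_WORKERS)


def draw(seed_seq: np.random.SeedSequence, count: int) -> np.ndarray:
    """Build a private generator for this worker and draw from it.

    No generator state is shared between workers, so no locking is needed.
    """
    rng = np.random.default_rng(seed_seq)
    return rng.integers(0, 100, size=count)


with ThreadPoolExecutor(max_workers=NUM_WORKERS) as pool:
    results = list(pool.map(draw, child_seeds, [5] * NUM_WORKERS))

for worker_id, values in enumerate(results):
    print(worker_id, values)
```

Seeding every worker from the same root also keeps a run reproducible regardless of how the threads are scheduled.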

🗺️ Roadmap

  • v0.1 (Current): Topology-only support. Uniform Random Walks.
  • v0.2: Columnar Feature Store (mmap support for Node Features).
  • v0.3: Weighted Edges & SIMD (AVX2) Neighbor Intersection.
  • v0.4: Dynamic Updates (LSM-Tree based mutable graphs).
  • v0.5: Pinned Memory Allocator for faster CPU → GPU transfer.

📄 License

MIT License. Created by Krish Singaria (IIT Mandi).

Download files

Source Distribution

  • graphzero-0.1.1.tar.gz (5.6 MB): Source

Built Distributions

  • graphzero-0.1.1-cp312-cp312-win_amd64.whl (75.9 kB): CPython 3.12, Windows x86-64
  • graphzero-0.1.1-cp312-cp312-win32.whl (68.9 kB): CPython 3.12, Windows x86
  • graphzero-0.1.1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (164.7 kB): CPython 3.12, manylinux: glibc 2.17+ x86-64
  • graphzero-0.1.1-cp312-cp312-macosx_11_0_arm64.whl (309.0 kB): CPython 3.12, macOS 11.0+ ARM64
  • graphzero-0.1.1-cp311-cp311-win_amd64.whl (76.7 kB): CPython 3.11, Windows x86-64
  • graphzero-0.1.1-cp311-cp311-win32.whl (69.5 kB): CPython 3.11, Windows x86
  • graphzero-0.1.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (165.8 kB): CPython 3.11, manylinux: glibc 2.17+ x86-64
  • graphzero-0.1.1-cp311-cp311-macosx_11_0_arm64.whl (310.1 kB): CPython 3.11, macOS 11.0+ ARM64
  • graphzero-0.1.1-cp310-cp310-win_amd64.whl (76.8 kB): CPython 3.10, Windows x86-64
  • graphzero-0.1.1-cp310-cp310-win32.whl (69.7 kB): CPython 3.10, Windows x86
  • graphzero-0.1.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (166.0 kB): CPython 3.10, manylinux: glibc 2.17+ x86-64
  • graphzero-0.1.1-cp310-cp310-macosx_11_0_arm64.whl (310.3 kB): CPython 3.10, macOS 11.0+ ARM64
  • graphzero-0.1.1-cp39-cp39-win_amd64.whl (77.2 kB): CPython 3.9, Windows x86-64
  • graphzero-0.1.1-cp39-cp39-win32.whl (70.7 kB): CPython 3.9, Windows x86
  • graphzero-0.1.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (166.1 kB): CPython 3.9, manylinux: glibc 2.17+ x86-64
  • graphzero-0.1.1-cp39-cp39-macosx_11_0_arm64.whl (310.5 kB): CPython 3.9, macOS 11.0+ ARM64
