High-performance Zero-Copy Graph Engine

Reason this release was yanked:

broken, it will not work

Project description

GraphZero

High-Performance, Zero-Copy Graph Engine for Massive Datasets on Consumer Hardware.

GraphZero is a C++ graph processing engine with lightweight Python bindings, designed to address the "Memory Wall" in Graph Neural Networks (GNNs). It lets you load and sample graphs with 100 million+ nodes (such as ogbn-papers100M) on a standard 16GB RAM laptop, where libraries that load the whole graph into memory, such as PyTorch Geometric (PyG) or DGL, run out of RAM.

⚡ The Problem

GNN datasets can be massive. ogbn-papers100M contains 111 Million nodes and 1.6 Billion edges.

  • Standard approach (PyG/NetworkX): Tries to load the entire graph structure into RAM.
  • The Result: MemoryError (OOM) on consumer hardware. You need 64GB+ RAM servers just to load the data.
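
A back-of-the-envelope estimate shows where the memory goes. The sketch below assumes the edge list is held as a dense `(2, E)` array of 64-bit integers (the usual COO layout) and uses the rounded edge count quoted above; the helper name `coo_bytes` is illustrative.

```python
# Rough memory estimate for a COO edge list stored as int64.
# The edge count is the rounded figure quoted above; the layout is an assumption.
NUM_EDGES = 1_600_000_000
BYTES_PER_INDEX = 8  # int64


def coo_bytes(num_edges: int, bytes_per_index: int = BYTES_PER_INDEX) -> int:
    """Bytes needed for a (2, num_edges) integer array."""
    return 2 * num_edges * bytes_per_index


gib = coo_bytes(NUM_EDGES) / 2**30
print(f"Edge list alone: {gib:.1f} GiB")
```

Under these assumptions the edge array alone needs roughly 24 GiB, before any node features or framework overhead, which is well beyond a 16GB laptop.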

🛠️ The Solution: GraphZero Architecture

GraphZero abandons the "Load-to-RAM" model. Instead, it uses a custom Zero-Copy Architecture:

  • Memory Mapping (mmap): The graph stays on disk. The OS only loads the specific "hot" pages needed for computation into RAM.
  • Compressed CSR: A custom binary format (.gl) that compresses raw edges by ~60% (30GB CSV → 13GB binary).
  • Parallel Sampling: OpenMP-accelerated random walks that saturate NVMe SSD throughput.
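
The memory-mapping idea can be illustrated with plain NumPy. The sketch below is not GraphZero's implementation and makes no claim about the .gl layout: it assumes an uncompressed CSR graph stored as two raw `uint64` files (the names `indptr.bin` and `indices.bin` are made up for the example) and reads neighbor lists through `np.memmap`, so only the touched pages are brought into RAM.

```python
import os
import tempfile

import numpy as np

# Tiny directed graph: 0->1, 0->2, 1->2, 3->0 (node 2 has no out-edges).
indptr = np.array([0, 2, 3, 3, 4], dtype=np.uint64)  # row offsets, length N + 1
indices = np.array([1, 2, 2, 0], dtype=np.uint64)    # concatenated neighbor lists

# Write both arrays as raw binary files (hypothetical file names).
tmpdir = tempfile.mkdtemp()
indptr_path = os.path.join(tmpdir, "indptr.bin")
indices_path = os.path.join(tmpdir, "indices.bin")
indptr.tofile(indptr_path)
indices.tofile(indices_path)

# Map the files read-only; the OS loads pages lazily on first access.
indptr_mm = np.memmap(indptr_path, dtype=np.uint64, mode="r")
indices_mm = np.memmap(indices_path, dtype=np.uint64, mode="r")


def neighbors(node: int) -> np.ndarray:
    """Return the out-neighbors of `node` as a view into the mapped file."""
    start, end = int(indptr_mm[node]), int(indptr_mm[node + 1])
    return indices_mm[start:end]


print(neighbors(0))
```

Because `neighbors` returns a slice of the mapped array, looking up one node touches only the pages that hold its offsets and its neighbor list.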

🏆 Benchmarks: GraphZero vs. PyTorch Geometric

Task: Load ogbn-papers100M (56GB Raw) and perform random walks.
Hardware: Windows Laptop (16GB RAM, NVMe SSD).

| Metric | GraphZero (v0.1) | PyTorch Geometric |
|---|---|---|
| Load Time | 0.000000 s | FAILED (Crash) ❌ |
| Peak RAM Usage | ~5.1 GB (OS Cache) | >24.1 GB (Required) |
| Throughput | 1,264,000 steps/s | N/A |
| Status | Success | OOM Error |

Proof of Performance

Left: GraphZero loading instantly and utilizing OS Page Cache. Right: PyG crashing with Unable to allocate 24.1 GiB.


📦 Installation

GraphZero is available on PyPI (Pre-Alpha):

pip install graphzero

Requirements: Python 3.8+, C++17 Compiler (MSVC/GCC), OpenMP.


🚀 Quick Start

1. Convert Your Data

GraphZero uses a high-efficiency binary format (.gl). Convert your CSV edge list once; later runs load the binary file directly.

import graphzero as gz

# Converts raw CSV (src, dst) to memory-mapped binary
# Handles 100M+ edges easily on minimal RAM
gz.convert_csv_to_gl(
    input_csv="dataset/edges.csv", 
    output_bin="graph.gl", 
    directed=True
)
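
For intuition, the core of an edge-list-to-CSR conversion can be sketched in NumPy. This is not GraphZero's converter and says nothing about the actual .gl layout; it builds everything in memory, so it only suits small inputs. The inline CSV text and all variable names are illustrative.

```python
import io

import numpy as np

# Stand-in for an edge-list CSV file with one "src,dst" pair per line.
csv_text = "0,1\n0,2\n1,2\n3,0\n"
edges = np.loadtxt(io.StringIO(csv_text), delimiter=",", dtype=np.int64, ndmin=2)
src, dst = edges[:, 0], edges[:, 1]

num_nodes = int(edges.max()) + 1

# Row offsets: count out-degrees, then take a running sum.
degrees = np.bincount(src, minlength=num_nodes)
indptr = np.zeros(num_nodes + 1, dtype=np.int64)
indptr[1:] = np.cumsum(degrees)

# Neighbor array: destinations grouped by source node.
# A stable sort keeps the input order within each source.
order = np.argsort(src, kind="stable")
indices = dst[order]

print(indptr.tolist(), indices.tolist())
```

The neighbors of node `v` are then `indices[indptr[v]:indptr[v + 1]]`. A converter meant for 100M+ edges would have to stream the CSV in chunks rather than load it whole as this sketch does.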

2. High-Speed Sampling

Once converted, the graph is instantly accessible.

import graphzero as gz
import numpy as np

# 1. Zero-Copy Load (Instant)
g = gz.Graph("graph.gl")

# 2. Define Start Nodes (e.g., 1000 random nodes)
start_nodes = np.random.randint(0, g.num_nodes, 1000).astype(np.uint64)

# 3. Parallel Random Walk (node2vec / DeepWalk style)
# Returns: List of walks (flat or list-of-lists)
walks = g.batch_random_walk_uniform(
    start_nodes=start_nodes, 
    walk_length=10
)

print(f"Sampled {len(start_nodes)} walks of length 10 ({len(walks)} entries returned).")
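
To make clear what a uniform random walk computes, here is a slow reference version in pure NumPy over a CSR graph. It is a conceptual sketch, not GraphZero's sampler: the function name `uniform_random_walks`, the returned shape, and the rule that a walker stays put on a node with no out-edges are all assumptions made for this example.

```python
import numpy as np


def uniform_random_walks(indptr, indices, start_nodes, walk_length, seed=0):
    """Reference (slow) uniform random walks over a CSR graph.

    Returns an array of shape (len(start_nodes), walk_length + 1); column 0
    holds the start nodes. A walker on a node with no out-edges stays put
    (an arbitrary choice for this sketch).
    """
    rng = np.random.default_rng(seed)
    walks = np.empty((len(start_nodes), walk_length + 1), dtype=np.int64)
    walks[:, 0] = start_nodes
    for i in range(len(start_nodes)):
        node = int(walks[i, 0])
        for step in range(1, walk_length + 1):
            begin, end = int(indptr[node]), int(indptr[node + 1])
            if end > begin:
                # Pick one out-neighbor uniformly at random.
                node = int(indices[rng.integers(begin, end)])
            walks[i, step] = node
    return walks


# Toy graph: 0->1, 0->2, 1->2, 2->0
indptr = np.array([0, 2, 3, 4])
indices = np.array([1, 2, 2, 0])
walks = uniform_random_walks(indptr, indices, start_nodes=[0, 1], walk_length=5)
print(walks.shape)  # (2, 6)
```

Each walk is independent of the others, which is why the outer loop parallelizes well across threads.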

⚙️ Under the Hood

GraphZero is built for Systems & GNN enthusiasts.

  • Core: C++20 with nanobind for Python bindings.
  • Parallelism: Uses #pragma omp with thread-local RNGs to prevent false sharing and lock contention.
  • IO: Direct CreateFileMapping (Windows) and mmap (Linux) calls with alignment optimization (4KB/2MB pages).
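
The per-thread RNG pattern has a close analogue in Python. The sketch below is only an illustration of the idea, not GraphZero's C++ code: it uses NumPy's `SeedSequence.spawn` to give each worker its own independent generator, so workers never contend for a shared RNG. The constants and the `worker` function are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

NUM_WORKERS = 4
DRAWS_PER_WORKER = 1000

# Spawn one statistically independent child seed per worker.
child_seeds = np.random.SeedSequence(12345).spawn(NUM_WORKERS)


def worker(seed_seq):
    # Each worker owns its generator, so no lock or shared state is needed.
    rng = np.random.default_rng(seed_seq)
    return rng.integers(0, 100, size=DRAWS_PER_WORKER)


with ThreadPoolExecutor(max_workers=NUM_WORKERS) as pool:
    results = list(pool.map(worker, child_seeds))

print(len(results), results[0].shape)
```

This covers the lock-contention side of the design. False sharing is a cache-line effect that matters for native code and has no direct equivalent in this Python sketch.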

🗺️ Roadmap

  • v0.1 (Current): Topology-only support. Uniform Random Walks.
  • v0.2: Columnar Feature Store (mmap support for Node Features).
  • v0.3: Weighted Edges & SIMD (AVX2) Neighbor Intersection.
  • v0.4: Dynamic Updates (LSM-Tree based mutable graphs).
  • v0.5: Pinned Memory Allocator for faster CPU → GPU transfer.

📄 License

MIT License. Created by Krish Singaria (IIT Mandi).

Download files

Source Distribution

  • graphzero-0.1.0.tar.gz (5.6 MB, Source)

Built Distributions

  • graphzero-0.1.0-cp312-cp312-win_amd64.whl (75.9 kB, CPython 3.12, Windows x86-64)
  • graphzero-0.1.0-cp312-cp312-win32.whl (68.9 kB, CPython 3.12, Windows x86)
  • graphzero-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (164.7 kB, CPython 3.12, manylinux: glibc 2.17+ x86-64)
  • graphzero-0.1.0-cp312-cp312-macosx_11_0_arm64.whl (309.0 kB, CPython 3.12, macOS 11.0+ ARM64)
  • graphzero-0.1.0-cp311-cp311-win_amd64.whl (76.6 kB, CPython 3.11, Windows x86-64)
  • graphzero-0.1.0-cp311-cp311-win32.whl (69.5 kB, CPython 3.11, Windows x86)
  • graphzero-0.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (165.7 kB, CPython 3.11, manylinux: glibc 2.17+ x86-64)
  • graphzero-0.1.0-cp311-cp311-macosx_11_0_arm64.whl (310.1 kB, CPython 3.11, macOS 11.0+ ARM64)
  • graphzero-0.1.0-cp310-cp310-win_amd64.whl (76.8 kB, CPython 3.10, Windows x86-64)
  • graphzero-0.1.0-cp310-cp310-win32.whl (69.6 kB, CPython 3.10, Windows x86)
  • graphzero-0.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (165.9 kB, CPython 3.10, manylinux: glibc 2.17+ x86-64)
  • graphzero-0.1.0-cp310-cp310-macosx_11_0_arm64.whl (310.3 kB, CPython 3.10, macOS 11.0+ ARM64)
  • graphzero-0.1.0-cp39-cp39-win_amd64.whl (77.2 kB, CPython 3.9, Windows x86-64)
  • graphzero-0.1.0-cp39-cp39-win32.whl (70.7 kB, CPython 3.9, Windows x86)
  • graphzero-0.1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (166.1 kB, CPython 3.9, manylinux: glibc 2.17+ x86-64)
  • graphzero-0.1.0-cp39-cp39-macosx_11_0_arm64.whl (310.5 kB, CPython 3.9, macOS 11.0+ ARM64)

File details

Details for the file graphzero-0.1.0.tar.gz.

File metadata

  • Download URL: graphzero-0.1.0.tar.gz
  • Upload date:
  • Size: 5.6 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for graphzero-0.1.0.tar.gz
Algorithm Hash digest
SHA256 f0ba548e8cf6f035cf66b6a3574027e3b5c7778659863e4944d7f8b4edf0882c
MD5 1656259afbe58d01a113b07c2492ff9b
BLAKE2b-256 7d39c8b6896e89c762b68c4e9f9114f0ce31368edad1a557298e483254b7dd96

See more details on using hashes here.
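
A downloaded file can be checked against the SHA256 digests listed here with the standard library. This is a minimal sketch; the helper name `sha256_of_file` is made up for the example, and the demo hashes a throwaway temporary file rather than a real distribution.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo on a throwaway file containing the bytes b"abc".
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
demo_digest = sha256_of_file(tmp.name)
os.remove(tmp.name)
print(demo_digest)
```

To verify a real download, call `sha256_of_file` on the downloaded file and compare the result with the SHA256 value listed for that file name.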

File details

Details for the file graphzero-0.1.0-cp312-cp312-win_amd64.whl.

File metadata

  • Download URL: graphzero-0.1.0-cp312-cp312-win_amd64.whl
  • Upload date:
  • Size: 75.9 kB
  • Tags: CPython 3.12, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for graphzero-0.1.0-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 dcd110c0aaca5c3e031329ccccd0009cecb91064a9fb93770c4fbd65e6ffa2d2
MD5 d13339eaab3e8566a1f3cb27690dc932
BLAKE2b-256 1369cde887cc2233ddbaa7e7c911c8fe84420821405dfa24435c240e82878705

File details

Details for the file graphzero-0.1.0-cp312-cp312-win32.whl.

File metadata

  • Download URL: graphzero-0.1.0-cp312-cp312-win32.whl
  • Upload date:
  • Size: 68.9 kB
  • Tags: CPython 3.12, Windows x86
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for graphzero-0.1.0-cp312-cp312-win32.whl
Algorithm Hash digest
SHA256 c1ce90fff1642c47ec298d5602e8e382d49bb5d73517364f013d7f422e8191cd
MD5 a74a74d17eef612cf86c02f4da3c3d68
BLAKE2b-256 939f69b9bd91912339f6f8ec2621a4083759475192aa3d2c34eb76ad05d2def2

File details

Details for the file graphzero-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for graphzero-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 0c42e26becde3008a0fea5186a41b5f1bf29607343ff6978c680287796d9e8d1
MD5 76832dffd277f6e3ecfe534b32b6e257
BLAKE2b-256 e7fc1b6c0289e5cf91a282ec78bc73c4047fc56e49d2b2de561b0fc19799350a

File details

Details for the file graphzero-0.1.0-cp312-cp312-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for graphzero-0.1.0-cp312-cp312-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 9061155ce3e1ae243360f68da29776327dca97431d6479c220b82e26a362ec87
MD5 ea373bf43d5f5a66ea2e1aeeac9fd81d
BLAKE2b-256 d96d3db5ed9e6bc8d0713978c8dd7642265390820afae5b2dddd7e357f43e381

File details

Details for the file graphzero-0.1.0-cp311-cp311-win_amd64.whl.

File metadata

  • Download URL: graphzero-0.1.0-cp311-cp311-win_amd64.whl
  • Upload date:
  • Size: 76.6 kB
  • Tags: CPython 3.11, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for graphzero-0.1.0-cp311-cp311-win_amd64.whl
Algorithm Hash digest
SHA256 e25ab0f1ec71e9c08aa74bc8394c988b53bc864186e42f3533bb6a0b4789afb0
MD5 8f0b006b70d9b9b8f20b783f39d3f1b0
BLAKE2b-256 c61dc6cf1f669e1e3768c8bf11b381804729f52701301c150f84a193e6b24bea

File details

Details for the file graphzero-0.1.0-cp311-cp311-win32.whl.

File metadata

  • Download URL: graphzero-0.1.0-cp311-cp311-win32.whl
  • Upload date:
  • Size: 69.5 kB
  • Tags: CPython 3.11, Windows x86
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for graphzero-0.1.0-cp311-cp311-win32.whl
Algorithm Hash digest
SHA256 d9d67b64e5f3fc3871f99e45fef4c578330d4c88c8566fe8d287dc7d16d717ff
MD5 1436090a52cab8c27e5f8b62f54ee5bf
BLAKE2b-256 6d4fe86e42a13e666fe1941941b067f18303c69423f1e53cb6dbdbd055cb430f

File details

Details for the file graphzero-0.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for graphzero-0.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 4c6f49fbcf0cc54fd22b072efeefa5001575ab9f0a9e4548b7703528139fc7a9
MD5 df1c03fbceb0368beac445494bfc8a47
BLAKE2b-256 0b3e599c87e23d16a19051e7925339f9f87a5e5a916f95206341fb306a99e3eb

File details

Details for the file graphzero-0.1.0-cp311-cp311-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for graphzero-0.1.0-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 66a2dfe243bc4b3f2bdc28f87ca9e42822a919e6617d509a18e493931c3ff938
MD5 36e7e101d38a6ce26164e36d251b8417
BLAKE2b-256 f2a28a19e31131ef558b3814f97022a81e1e54b5be5dd5f5f09ea0e1f5c7d8f2

File details

Details for the file graphzero-0.1.0-cp310-cp310-win_amd64.whl.

File metadata

  • Download URL: graphzero-0.1.0-cp310-cp310-win_amd64.whl
  • Upload date:
  • Size: 76.8 kB
  • Tags: CPython 3.10, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for graphzero-0.1.0-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 cdaa99ebf409823ee404c0e6ff8874a94b8778591af47f6e0df19b6fb69746b2
MD5 8ebdb55bfb11fabbeddce009698e51d6
BLAKE2b-256 284f5e43eae1eca74bd2c8b37a1969049886c26a64f7ff24fac5623c45cb7d67

File details

Details for the file graphzero-0.1.0-cp310-cp310-win32.whl.

File metadata

  • Download URL: graphzero-0.1.0-cp310-cp310-win32.whl
  • Upload date:
  • Size: 69.6 kB
  • Tags: CPython 3.10, Windows x86
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for graphzero-0.1.0-cp310-cp310-win32.whl
Algorithm Hash digest
SHA256 bcc73072678374cc2ad99e001fd65e019b5703759d23043e72f806709d035c85
MD5 328a2ed1987544d410ea06a035d562f0
BLAKE2b-256 28f3b7e8e7c1920d7f4edea7c5726fae6788c7a0bf97766c53be4a9bb5fe959c

File details

Details for the file graphzero-0.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for graphzero-0.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 0b4bb603328fe4a6a44aa7c57f5fcd1b7881b8b728e2796cfc0ad3d3c51efe04
MD5 5514da76e2ab08c0137d20b6bc9866c9
BLAKE2b-256 ec7c53a92a00696b8be9613eece5615ac10261989a07dc3ab01125325fdb97a9

File details

Details for the file graphzero-0.1.0-cp310-cp310-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for graphzero-0.1.0-cp310-cp310-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 637677630cf07e562a62978bce792ac29e398eb9fab6aa8813f2cfadb83fd3ba
MD5 5327d0775c0f34395fafe4ad7429ee78
BLAKE2b-256 500f82fea3cfe53c252b03caf36ccaa794a143c679aba28fdce8e1460c0d5ef5

File details

Details for the file graphzero-0.1.0-cp39-cp39-win_amd64.whl.

File metadata

  • Download URL: graphzero-0.1.0-cp39-cp39-win_amd64.whl
  • Upload date:
  • Size: 77.2 kB
  • Tags: CPython 3.9, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for graphzero-0.1.0-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 83c3006350ab1001714f89172b9c59982ad2b5deea7246f91c71b1899cb9a72b
MD5 6e725a844cc6701bad7b5350e0cc1c50
BLAKE2b-256 c281af85f7408e3094442db45ca772585ae9dd8bfe13113f32373c3c7f806245

File details

Details for the file graphzero-0.1.0-cp39-cp39-win32.whl.

File metadata

  • Download URL: graphzero-0.1.0-cp39-cp39-win32.whl
  • Upload date:
  • Size: 70.7 kB
  • Tags: CPython 3.9, Windows x86
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for graphzero-0.1.0-cp39-cp39-win32.whl
Algorithm Hash digest
SHA256 e018a98a7cd1863775f26624a8fed1ae4781650483499f1e10a4ee28d1737156
MD5 8716de0f2e6a6010ee97e3766cf1c0cd
BLAKE2b-256 4f5a7ca8e6f30dbfb91d5bc03238212bf11d60a1fcd75e42d275d37019f65c6c

File details

Details for the file graphzero-0.1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

File hashes

Hashes for graphzero-0.1.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 57e9879b80503a53357d162f56d539389ec242cf011ec0f44a0406a7a7bdbc96
MD5 f527256e3dd5767641cf80b59073cc9c
BLAKE2b-256 05df0d016f21ee9fe12dc53ca1f8a965c5ffe609cc977797b7a6e542746eaf2a

File details

Details for the file graphzero-0.1.0-cp39-cp39-macosx_11_0_arm64.whl.

File metadata

File hashes

Hashes for graphzero-0.1.0-cp39-cp39-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 6f1e9cc8511f6018d96baaa151ffbef9561be64ccf92849ff0a1621889d33b1c
MD5 3c6d2f4ef2ee07b73674d4531266eed7
BLAKE2b-256 192e46fb17e810dd79853d0927eed664d34e62a7afcf0d60b4fc6d40cbfefeae
