A graph embedding library with PyTorch and RAPIDS acceleration
Project description
GraphEm Rapids: High-Performance Graph Embedding
High-performance GraphEm implementation using PyTorch and RAPIDS cuVS. Force-directed layout with geometric intersection detection produces embeddings that correlate strongly with centrality measures.
Features
- Unified API: Scipy sparse adjacency matrices, sklearn-style parameters (
n_components,n_neighbors) - Multiple Backends: PyTorch (1K-100K vertices), RAPIDS cuVS (100K+ vertices), automatic selection
- GPU Acceleration: CUDA support, memory-efficient chunking, automatic CPU fallback
- Graph Generators: Erdős-Rényi, scale-free, SBM, bipartite, Delaunay, and more
- Influence Maximization: Fast embedding-based seed selection
Installation
pip install graphem-rapids # PyTorch backend
pip install graphem-rapids[cuda] # + CUDA support
pip install graphem-rapids[rapids] # + RAPIDS cuVS
pip install graphem-rapids[all] # Everything
Quick Start
import graphem_rapids as gr
# Generate graph (returns sparse adjacency matrix)
adjacency = gr.generate_er(n=1000, p=0.01)
# Create embedder (automatic backend selection)
embedder = gr.create_graphem(adjacency, n_components=3)
# Run layout
embedder.run_layout(num_iterations=50)
# Get positions and visualize
positions = embedder.get_positions() # numpy array (n, d)
embedder.display_layout() # 2D or 3D plot
Backend Selection
Automatic (Recommended)
embedder = gr.create_graphem(adjacency, n_components=3)
Explicit PyTorch
embedder = gr.GraphEmbedderPyTorch(
adjacency, n_components=3, device='cuda',
L_min=1.0, k_attr=0.2, k_inter=0.5, n_neighbors=10,
batch_size=None # Automatic (or manual: 1024)
)
Explicit RAPIDS cuVS
embedder = gr.GraphEmbedderCuVS(
adjacency, n_components=3,
index_type='auto', # 'brute_force', 'ivf_flat', 'ivf_pq'
sample_size=1024, batch_size=None
)
Index Types: brute_force (<100K), ivf_flat (100K-1M), ivf_pq (>1M vertices)
Check Backends
info = gr.get_backend_info()
print(f"CUDA: {info['cuda_available']}, Recommended: {info['recommended_backend']}")
Configuration
Environment Variables:
export GRAPHEM_BACKEND=pytorch # Force backend
export GRAPHEM_PREFER_GPU=true # Prefer GPU
export GRAPHEM_MEMORY_LIMIT=8 # GB
export GRAPHEM_VERBOSE=true
Programmatic:
from graphem_rapids.utils.backend_selection import BackendConfig, get_optimal_backend
config = BackendConfig(n_vertices=50000, force_backend='cuvs', memory_limit=16.0)
backend = get_optimal_backend(config)
embedder = gr.create_graphem(adjacency, backend=backend)
Graph Generators
All generators return scipy sparse adjacency matrices:
# Random
gr.generate_er(n=1000, p=0.01, seed=42)
gr.generate_random_regular(n=100, d=3, seed=42)
# Scale-free & small-world
gr.generate_ba(n=300, m=3, seed=42) # Barabási-Albert
gr.generate_ws(n=1000, k=6, p=0.3, seed=42) # Watts-Strogatz
gr.generate_scale_free(n=100, seed=42)
# Community structures
gr.generate_sbm(n_per_block=75, num_blocks=4, p_in=0.15, p_out=0.01, seed=42)
gr.generate_caveman(l=10, k=10)
gr.generate_relaxed_caveman(l=10, k=10, p=0.1, seed=42)
# Bipartite
gr.generate_bipartite_graph(n_top=50, n_bottom=100, p=0.2, seed=42)
gr.generate_complete_bipartite_graph(n_top=50, n_bottom=100)
# Geometric
gr.generate_geometric(n=100, radius=0.2, dim=2, seed=42)
gr.generate_delaunay_triangulation(n=100, seed=42)
gr.generate_road_network(width=30, height=30) # 2D grid
# Trees
gr.generate_balanced_tree(r=2, h=10)
Influence Maximization
adjacency = gr.generate_er(n=1000, p=0.01)
embedder = gr.create_graphem(adjacency, n_components=3)
embedder.run_layout(num_iterations=50)
# Fast: embedding-based selection
seeds = gr.graphem_seed_selection(embedder, k=10)
# Evaluate with Independent Cascade model
import networkx as nx
G = nx.from_scipy_sparse_array(adjacency)
influence, _ = gr.ndlib_estimated_influence(G, seeds, p=0.1, iterations_count=100)
# Compare with greedy (slow, optimal)
greedy_seeds, _ = gr.greedy_seed_selection(G, k=10, p=0.1)
Advanced
Memory Management
from graphem_rapids.utils.memory_management import MemoryManager, get_gpu_memory_info
mem_info = get_gpu_memory_info()
print(f"GPU: {mem_info['free']:.1f}GB free / {mem_info['total']:.1f}GB total")
adjacency = gr.generate_er(n=1000, p=0.01)
with MemoryManager(cleanup_on_exit=True):
embedder = gr.create_graphem(adjacency)
embedder.run_layout(50)
Batch Size Tuning
from graphem_rapids.utils.memory_management import get_optimal_chunk_size
adjacency = gr.generate_er(n=1000, p=0.01)
# Automatic (recommended)
embedder = gr.GraphEmbedderPyTorch(adjacency, batch_size=None)
# Manual
embedder = gr.GraphEmbedderPyTorch(adjacency, batch_size=1024)
# Programmatic
optimal = get_optimal_chunk_size(n_vertices=1000000, n_components=3, backend='pytorch')
embedder = gr.GraphEmbedderPyTorch(adjacency, batch_size=optimal)
Testing & Benchmarking
pytest # Run all tests
pytest tests/test_pytorch_backend.py # Specific backend
python benchmarks/run_benchmarks.py # Performance tests
python benchmarks/compare_backends.py --sizes 1000,10000,100000
Contributing
See CONTRIBUTING.md for development setup, testing, and contribution guidelines.
Citation
@misc{kolpakov-rivin-2025fast,
title={Fast Geometric Embedding for Node Influence Maximization},
author={Kolpakov, Alexander and Rivin, Igor},
year={2025},
eprint={2506.07435},
archivePrefix={arXiv},
primaryClass={cs.SI},
url={https://arxiv.org/abs/2506.07435}
}
License
MIT License - see LICENSE file.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file graphem_rapids-0.2.1.tar.gz.
File metadata
- Download URL: graphem_rapids-0.2.1.tar.gz
- Upload date:
- Size: 74.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.0
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
9352649999b8bb1b5c743111d2924de6ab62d68278f41cae7fff0837a8acef19
|
|
| MD5 |
f9f301ae92fbb86c4a90c503bd07b531
|
|
| BLAKE2b-256 |
9323b38514c164722315cf1a47f7f007f5870ff7d3a618634b5a06f1918fb116
|
File details
Details for the file graphem_rapids-0.2.1-py3-none-any.whl.
File metadata
- Download URL: graphem_rapids-0.2.1-py3-none-any.whl
- Upload date:
- Size: 45.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.0
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
296098d05cca77955273b622a3f6a5cf208a8045307176e1fab9dea2270dfa45
|
|
| MD5 |
c628f16b775d3dc173ac1de90af5863a
|
|
| BLAKE2b-256 |
5b1e5e8cc2403b9d0d2c0c57a195e3079a9a57b94d4775b3336ef6906ec8d8e4
|