A graph embedding library with PyTorch and RAPIDS acceleration

GraphEm Rapids: High-Performance Graph Embedding

GraphEm Rapids is a high-performance implementation of the GraphEm graph embedding library. It uses PyTorch and RAPIDS for scalability and GPU acceleration.

Key Features

Multiple Backends: PyTorch, RAPIDS cuVS, and CPU fallback
Automatic Backend Selection: Optimal backend chosen based on data size and hardware
Large-Scale Support: Handles graphs with millions of vertices using RAPIDS
Memory Efficient: Adaptive chunking and memory management
GPU Accelerated: Full CUDA support with PyTorch and RAPIDS

Installation

Basic Installation (PyTorch backend)

pip install graphem-rapids

With CUDA Support

# Quote the extras so shells such as zsh do not expand the brackets
pip install "graphem-rapids[cuda]"

With Full RAPIDS Support

pip install "graphem-rapids[rapids]"
# or for everything
pip install "graphem-rapids[all]"

Development Installation

git clone https://github.com/sashakolpakov/graphem-rapids.git
cd graphem-rapids
pip install -e .

Quick Start

Automatic Backend Selection

import graphem_rapids as gr

# Generate a graph
edges = gr.erdos_renyi_graph(n=10000, p=0.001)

# Create embedder with automatic backend selection
embedder = gr.create_graphem(edges, n_vertices=10000, dimension=3)

# Run layout
embedder.run_layout(num_iterations=50)

# Display
embedder.display_layout()
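
For readers unfamiliar with the G(n, p) model behind erdos_renyi_graph, the following standalone sketch samples an edge list in plain Python. The function name sample_gnp_edges and the list-of-pairs output are illustrative assumptions; the library's own generator may return a different array type and use a faster sampling scheme.

```python
import itertools
import random


def sample_gnp_edges(n, p, seed=None):
    """Sample an undirected G(n, p) graph as a list of (i, j) pairs with i < j.

    Hypothetical helper for illustration; not part of graphem_rapids.
    """
    rng = random.Random(seed)
    # Each of the n*(n-1)/2 possible edges is kept independently with probability p.
    return [(i, j) for i, j in itertools.combinations(range(n), 2) if rng.random() < p]


edges_small = sample_gnp_edges(n=200, p=0.05, seed=0)
print(len(edges_small), "edges sampled")
```

This loop visits every vertex pair, so it only suits small n; it is meant to show the model, not to replace the library call.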

Explicit Backend Selection

# Force PyTorch backend
embedder = gr.GraphEmbedderPyTorch(
    edges, n_vertices=10000, dimension=3,
    device='cuda'  # or 'cpu'
)

# Force RAPIDS cuVS backend (for large graphs)
embedder = gr.GraphEmbedderCuVS(
    edges, n_vertices=100000, dimension=3,
    index_type='ivf_flat'
)

Backend Information

# Check available backends
info = gr.get_backend_info()
print(f"CUDA available: {info['cuda_available']}")
print(f"Recommended: {info['recommended_backend']}")
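
If GraphEm Rapids itself is not installed yet, a rough check of which optional GPU packages are present can be made with the standard library alone. This is a sketch that is independent of get_backend_info(); the helper name probe_backends and the keys of the returned dictionary are assumptions made for illustration.

```python
import importlib.util


def probe_backends():
    """Report which optional GPU packages are importable, without importing them.

    Hypothetical helper; it only checks that the packages are installed,
    not that a working GPU or CUDA driver is present.
    """
    def installed(name):
        return importlib.util.find_spec(name) is not None

    return {
        "torch": installed("torch"),
        "cupy": installed("cupy"),
        "cuvs": installed("cuvs"),
    }


print(probe_backends())
```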

Architecture

GraphEm Rapids provides multiple computational backends:

PyTorch Backend

Best for: Medium-scale graphs (1K-100K vertices)
Features: CUDA acceleration, memory-efficient chunking
Fallback: Automatic CPU mode when GPU unavailable

RAPIDS cuVS Backend

Best for: Large-scale graphs (100K+ vertices)
Features: Optimized KNN with cuVS indices, CuPy operations
Index Types: Brute force, IVF-Flat, IVF-PQ (automatic selection)

Automatic Selection

The create_graphem() function automatically selects the optimal backend based on:

Dataset size (number of vertices)
Available hardware (CUDA, RAPIDS)
Memory constraints
User preferences
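
The criteria above can be pictured as a small decision function. The sketch below is not the logic inside create_graphem(); the function name choose_backend, the returned (backend, device) tuple, and the use of the 100K-vertex guideline as a hard threshold are all assumptions, and memory constraints are left out for brevity.

```python
def choose_backend(n_vertices, cuda_available, rapids_available, forced=None):
    """Pick a (backend, device) pair from graph size and hardware.

    Illustrative heuristic only; the real selection may weigh more factors.
    """
    device = "cuda" if cuda_available else "cpu"
    if forced is not None:
        return forced, device  # an explicit user preference wins
    if not cuda_available:
        return "pytorch", "cpu"  # PyTorch backend in CPU fallback mode
    # Assumed threshold, taken from the "Best for" guidance in this README.
    if rapids_available and n_vertices >= 100_000:
        return "cuvs", "cuda"
    return "pytorch", "cuda"


print(choose_backend(10_000, cuda_available=True, rapids_available=True))
```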

Configuration

Environment Variables

export GRAPHEM_BACKEND=pytorch     # Force backend
export GRAPHEM_PREFER_GPU=true     # Prefer GPU backends
export GRAPHEM_MEMORY_LIMIT=8      # Memory limit in GB
export GRAPHEM_VERBOSE=true        # Verbose logging
export GRAPHEM_RAPIDS_QUIET=true   # Suppress startup messages
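
To make the meaning of these variables concrete, here is a sketch of how such settings could be read from Python. The variable names come from the list above; the helper read_graphem_env, the accepted boolean spellings, and the output keys are assumptions and may differ from how the library parses them.

```python
import os


def read_graphem_env(environ=os.environ):
    """Collect GRAPHEM_* settings into a dict (parsing rules are assumptions)."""
    def as_bool(value):
        return value.strip().lower() in {"1", "true", "yes", "on"}

    settings = {}
    if "GRAPHEM_BACKEND" in environ:
        settings["backend"] = environ["GRAPHEM_BACKEND"].strip().lower()
    if "GRAPHEM_PREFER_GPU" in environ:
        settings["prefer_gpu"] = as_bool(environ["GRAPHEM_PREFER_GPU"])
    if "GRAPHEM_MEMORY_LIMIT" in environ:
        # Interpreted as gigabytes, following the comment in the export list.
        settings["memory_limit_gb"] = float(environ["GRAPHEM_MEMORY_LIMIT"])
    if "GRAPHEM_VERBOSE" in environ:
        settings["verbose"] = as_bool(environ["GRAPHEM_VERBOSE"])
    return settings


example = {"GRAPHEM_BACKEND": "pytorch", "GRAPHEM_MEMORY_LIMIT": "8", "GRAPHEM_VERBOSE": "true"}
print(read_graphem_env(example))
```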

Programmatic Configuration

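import graphem_rapids as gr
from graphem_rapids.utils import BackendConfig

config = BackendConfig(
    n_vertices=50000,
    dimension=3,
    force_backend='cuvs',
    memory_limit=16.0,  # GB
    prefer_gpu=True
)

# Assuming config stores n_vertices as an attribute, passing it again
# as a separate keyword would raise a TypeError, so it is unpacked only once.
embedder = gr.create_graphem(edges, **vars(config))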

Influence Maximization

GraphEm Rapids remains compatible with GraphEm's influence maximization workflow:

# Select influential nodes using embedding-based method
seeds = gr.graphem_seed_selection(embedder, k=10)

# Compare with traditional methods
import networkx as nx
G = nx.from_edgelist(edges)
influence, _ = gr.ndlib_estimated_influence(G, seeds, p=0.1)
print(f"Estimated influence: {influence} nodes")
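
To illustrate what an estimated influence means, the sketch below runs a Monte Carlo simulation of an independent cascade in plain Python. It assumes that p is a per-edge activation probability; this README does not state which diffusion model ndlib_estimated_influence uses, so the function estimate_influence and its parameters are hypothetical and not the library's implementation.

```python
import random


def estimate_influence(adjacency, seeds, p, n_runs=200, seed=None):
    """Monte Carlo estimate of the expected number of activated nodes.

    adjacency: dict mapping each node to an iterable of its neighbours.
    Hypothetical helper for illustration; not part of graphem_rapids.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(n_runs):
        active = set(seeds)
        frontier = list(active)
        while frontier:
            next_frontier = []
            for u in frontier:
                for v in adjacency.get(u, ()):
                    # A newly active node gets one chance to activate each neighbour.
                    if v not in active and rng.random() < p:
                        active.add(v)
                        next_frontier.append(v)
            frontier = next_frontier
        total += len(active)
    return total / n_runs


path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(estimate_influence(path, seeds=[0], p=0.5, seed=0))
```

With p=1.0 every reachable node is activated, and with p=0.0 only the seeds count, which gives two easy sanity checks for any such estimator.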

Testing

Run the test suite:

pytest tests/ -v

Test specific backends:

pytest tests/test_pytorch_backend.py
pytest tests/test_cuvs_backend.py

Benchmarking

Run performance benchmarks:

python benchmarks/run_benchmarks.py

Compare backends:

python benchmarks/compare_backends.py --sizes 1000,10000,100000

Advanced Usage

Custom Memory Management

from graphem_rapids.utils import MemoryManager

with MemoryManager(cleanup_on_exit=True):
    embedder = gr.create_graphem(edges, n_vertices=50000)
    embedder.run_layout(num_iterations=50)
    # Automatic cleanup on exit
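
The cleanup-on-exit pattern can be sketched with a standard-library context manager. This is not the MemoryManager implementation; cleanup_scope and its extra_cleanup callback are hypothetical names, and the real class may release GPU memory in other ways.

```python
import gc
from contextlib import contextmanager


@contextmanager
def cleanup_scope(extra_cleanup=None):
    """Run garbage collection (and an optional callback) when the block exits."""
    try:
        yield
    finally:
        gc.collect()
        if extra_cleanup is not None:
            # For example torch.cuda.empty_cache when PyTorch with CUDA is in use.
            extra_cleanup()


calls = []
with cleanup_scope(extra_cleanup=lambda: calls.append("cleaned")):
    scratch = [0.0] * 1_000_000  # stand-in for large intermediate buffers
    del scratch
print(calls)
```

Because the cleanup sits in a finally clause, it also runs when the body raises an exception.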

Chunked Processing for Large Graphs

from graphem_rapids.utils import get_optimal_chunk_size

chunk_size = get_optimal_chunk_size(n_vertices=1000000, dimension=3)
embedder = gr.GraphEmbedderPyTorch(
    edges, n_vertices=1000000,
    batch_size=chunk_size,
    memory_efficient=True
)
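
A chunk size of this kind is typically derived from a memory budget. The sketch below assumes that the dominant cost is a float32 buffer of shape (chunk, n_vertices, dimension) holding pairwise differences; that assumption, the helper names, and the default budget are illustrative and may not match what get_optimal_chunk_size computes.

```python
def estimate_chunk_size(n_vertices, dimension, memory_limit_gb=4.0, bytes_per_value=4):
    """Rows per chunk so a (chunk, n_vertices, dimension) buffer fits the budget."""
    budget = int(memory_limit_gb * 1024**3)
    bytes_per_row = n_vertices * dimension * bytes_per_value
    # Never return less than one row or more rows than there are vertices.
    return max(1, min(n_vertices, budget // bytes_per_row))


def chunk_ranges(n_vertices, chunk_size):
    """Yield (start, stop) index pairs that cover range(n_vertices)."""
    for start in range(0, n_vertices, chunk_size):
        yield start, min(start + chunk_size, n_vertices)


size = estimate_chunk_size(n_vertices=1_000_000, dimension=3, memory_limit_gb=8.0)
print(size, "rows per chunk")
```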

cuVS Index Configuration

embedder = gr.GraphEmbedderCuVS(
    edges, n_vertices=500000,
    index_type='ivf_pq',  # Options: 'brute_force', 'ivf_flat', 'ivf_pq'
    sample_size=2048,     # Larger samples for better accuracy
    batch_size=8192       # Larger batches for better throughput
)
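
As a point of reference for the index options, 'brute_force' corresponds to exact nearest-neighbour search, while IVF-Flat and IVF-PQ are approximate indices that trade some accuracy for speed. The plain-Python sketch below shows the exact search that the approximate indices stand in for; brute_force_knn is a hypothetical helper and does not use cuVS.

```python
import heapq
import math


def brute_force_knn(points, query, k):
    """Return indices of the k points closest to query (exact, linear scan)."""
    distances = ((math.dist(p, query), i) for i, p in enumerate(points))
    return [i for _, i in heapq.nsmallest(k, distances)]


pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
print(brute_force_knn(pts, query=(0.1, 0.0), k=2))  # -> [0, 1]
```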

Documentation

License

MIT License - see LICENSE file.

Citation

If you use GraphEm Rapids in your research, please cite:

@misc{kolpakov-rivin-2025fast,
  title={Fast Geometric Embedding for Node Influence Maximization},
  author={Kolpakov, Alexander and Rivin, Igor},
  year={2025},
  eprint={2506.07435},
  archivePrefix={arXiv},
  primaryClass={cs.SI},
  url={https://arxiv.org/abs/2506.07435}
}

Release history

0.2.1 (Nov 9, 2025)
0.1.0 (Sep 27, 2025, this version)

Download files

Source Distribution

graphem_rapids-0.1.0.tar.gz (48.4 kB)

Uploaded Sep 27, 2025 Source

Built Distribution

graphem_rapids-0.1.0-py3-none-any.whl (40.8 kB)

Uploaded Sep 27, 2025 Python 3

File details

Details for the file graphem_rapids-0.1.0.tar.gz.

File metadata

Download URL: graphem_rapids-0.1.0.tar.gz
Upload date: Sep 27, 2025
Size: 48.4 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.8.10

File hashes

Hashes for graphem_rapids-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`6e3d1dab2a567c487b81eb7a5288fc8941dc8da5213829d2199994bcd0e894f4`
MD5	`5b8105273f37d3c0fa9cdcfb1661d3b6`
BLAKE2b-256	`72f5e987df4adf5c8099452ad23b30d33f2037a76544adbeed9dac39d971bfff`

File details

Details for the file graphem_rapids-0.1.0-py3-none-any.whl.

File metadata

Download URL: graphem_rapids-0.1.0-py3-none-any.whl
Upload date: Sep 27, 2025
Size: 40.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.8.10

File hashes

Hashes for graphem_rapids-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`d0144983f64d7dddee50babb3af8c6182bc56e70da52513ed87f729f2e9b7715`
MD5	`9da54a80a80ac52b8fc70f7648a2281d`
BLAKE2b-256	`02f51a173ee3eaa94eec253be776f66f23366a5779992ba61769abe85bd117c1`
