Scalable Agent-based GPU-Enabled Simulator.

SAGESim

SAGESim - Scalable Agent-based GPU-Enabled Simulator

SAGESim is the first scalable, pure-Python, general-purpose agent-based modeling framework that supports both distributed computing and GPU acceleration. Designed for high-performance computing (HPC) environments, SAGESim enables simulations with millions of agents by combining MPI-level parallelism across multiple GPUs with GPU-level parallelism using thousands of threads per device.

Key Features

  • Dual-Level Parallelism: MPI distribution across multiple GPUs + GPU thread parallelism for individual agents
  • Pure Python: Write agent behaviors in Python using CuPy's JIT-compiled GPU kernels
  • Scalable: From laptop GPUs to HPC clusters with thousands of GPUs
  • Network-Based Models: Built-in support for agent networks with automatic neighbor data synchronization
  • Double Buffering: Race condition prevention for concurrent agent interactions
  • Graph Partitioning: Load pre-computed partitions to minimize cross-worker communication
  • Flexible Properties: Support for scalar and nested list properties with automatic padding
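
To illustrate the double-buffering idea listed above, here is a minimal CPU-only sketch. It uses NumPy rather than SAGESim or CuPy, and the function names `step_in_place` and `step_double_buffered` are hypothetical; SAGESim's internal buffering may work differently. Each agent copies its left neighbor's value. With a single buffer the result depends on processing order; with separate read and write buffers it does not.

```python
import numpy as np

def step_in_place(values):
    """Naive update: each agent copies its left neighbor's value in place.
    Later agents read values that earlier agents have already overwritten."""
    for i in range(1, len(values)):
        values[i] = values[i - 1]
    return values

def step_double_buffered(read_buf):
    """Double-buffered update: read only from read_buf, write to a new buffer."""
    write_buf = read_buf.copy()
    for i in range(1, len(read_buf)):
        write_buf[i] = read_buf[i - 1]
    return write_buf

state = np.array([1, 2, 3, 4])
in_place_result = step_in_place(state.copy())
buffered_result = step_double_buffered(state)
print(in_place_result)  # [1 1 1 1]
print(buffered_result)  # [1 1 2 3]
```

On a GPU the processing order of threads is not fixed, so the single-buffer variant could produce different results from run to run, which is the race condition that double buffering avoids.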

Requirements

  • Python 3.11+
  • NVIDIA GPU with CUDA drivers or AMD GPU with ROCm 5.7.1+
  • MPI implementation (OpenMPI, MPICH, etc.)

Installation

Depending on your hardware, mpi4py and cupy may need system-specific installation steps. If so, install them first by following your system's recommended instructions.
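
Before installing SAGESim, it can help to check whether these two packages are already importable. The following is a small sketch using only the Python standard library; the helper name `missing_modules` is hypothetical and not part of SAGESim. It only checks that the modules can be located, not that a GPU or MPI runtime actually works.

```python
import importlib.util

def missing_modules(names):
    """Return the module names from `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# The two packages the note above singles out as hardware-dependent.
missing = missing_modules(["cupy", "mpi4py"])
if missing:
    print("Install these first:", ", ".join(missing))
else:
    print("cupy and mpi4py are importable.")
```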

# Install SAGESim
pip install sagesim

# Or install from source
git clone https://github.com/ORNL/sagesim.git
cd sagesim
pip install -e .

Dependencies

  • cupy - GPU array computing
  • mpi4py - MPI bindings for Python
  • networkx - Graph/network handling
  • numpy - CPU array operations
  • awkward - Ragged array support
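
Nested list properties are ragged: each agent may hold a different number of entries, while GPU arrays are rectangular. The sketch below shows one way such data could be padded, using only NumPy. It is not SAGESim's internal code; the helper name `pad_nested` and the choice of NaN as the fill value are assumptions made for illustration.

```python
import numpy as np

def pad_nested(rows, fill_value=np.nan):
    """Pad a list of variable-length lists into a rectangular 2-D float array."""
    width = max((len(row) for row in rows), default=0)
    padded = np.full((len(rows), width), fill_value, dtype=float)
    for i, row in enumerate(rows):
        padded[i, :len(row)] = row  # copy real entries; the rest stay as fill
    return padded

# Three agents with 2, 1, and 3 entries respectively.
neighbors = [[1, 2], [0], [0, 1, 3]]
padded = pad_nested(neighbors)
print(padded.shape)  # (3, 3)
```

With a NaN fill value, code that walks a padded row can detect the padding with the `x != x` check.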

Quick Start

1. Define a Breed (Agent Type)

from cupyx import jit
from sagesim.breed import Breed

@jit.rawkernel(device="cuda")
def my_step_func(tick, agent_index, globals, agent_ids, breeds, locations, health):
    """Agent behavior: heal by 1 each tick"""
    health[agent_index] = health[agent_index] + 1

class MyBreed(Breed):
    def __init__(self):
        super().__init__("MyBreed")
        self.register_property("health", 100)  # Initial value
        self.register_step_func(my_step_func, __file__, priority=0)

2. Define a Model

from sagesim.model import Model
from sagesim.space import NetworkSpace

class MyModel(Model):
    def __init__(self):
        super().__init__(NetworkSpace())
        self._breed = MyBreed()
        self.register_breed(self._breed)

    def create_agent(self, health):
        return self.create_agent_of_breed(self._breed, health=health)

    def connect_agents(self, agent_a, agent_b):
        self.get_space().connect_agents(agent_a, agent_b)

3. Run the Simulation

# Create model and agents
model = MyModel()
for i in range(1000):
    model.create_agent(health=100)

# Connect agents in a network
for i in range(999):
    model.connect_agents(i, i + 1)

# Setup and run
model.setup(use_gpu=True)
model.simulate(ticks=100, sync_workers_every_n_ticks=1)

4. Run with MPI (Multiple GPUs)

mpirun -n 4 python my_simulation.py
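
When a model runs on several ranks, every edge whose endpoints live on different workers requires cross-worker communication. The plain-Python sketch below counts such edges for the 1000-agent chain from step 3 under two hypothetical assignment schemes. It does not use SAGESim, and how SAGESim assigns agents to workers by default is not stated here; the names `count_cut_edges`, `round_robin`, and `blocks` are illustrative.

```python
def count_cut_edges(edges, owner):
    """Count edges whose two endpoints are owned by different workers."""
    return sum(1 for a, b in edges if owner[a] != owner[b])

num_agents = 1000
num_workers = 4
edges = [(i, i + 1) for i in range(num_agents - 1)]  # chain network

# Hypothetical scheme 1: round-robin by agent index.
round_robin = [i % num_workers for i in range(num_agents)]
# Hypothetical scheme 2: contiguous blocks of 250 agents per worker.
blocks = [i * num_workers // num_agents for i in range(num_agents)]

round_robin_cut = count_cut_edges(edges, round_robin)
block_cut = count_cut_edges(edges, blocks)
print(round_robin_cut, block_cut)  # 999 3
```

For this topology the block assignment cuts 3 edges instead of 999, which is the kind of saving that loading a pre-computed partition is meant to provide.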

Run Example: SIR Epidemic Model

git clone https://github.com/ORNL/sagesim.git
cd sagesim/examples/sir
mpirun -n 4 python run.py --num_agents 10000 --percent_init_connections 0.1 --num_nodes 1

Documentation

Comprehensive documentation is available in the docs/ directory:

Document                Description
----------------------  --------------------------------------------------
Architecture Overview   System design, MPI distribution, GPU threading
Getting Started         Step-by-step guide to building models
Double Buffering        Race condition prevention mechanisms
Network Partitioning    Loading pre-computed partitions for load balancing
Runtime Optimizations   Performance tuning techniques
Selective Sync          Reducing MPI overhead
Property History        Tracking property changes over time
Ordered Neighbors       Ordered neighbor storage for agent networks
GPU-CPU Data Flow       Data flow between CPU and GPU

HPC Deployment

SAGESim is designed for HPC clusters. Example SLURM script for ORNL Frontier:

#!/bin/bash
#SBATCH -N 10
#SBATCH -t 00:30:00

num_nodes=10
num_mpi_ranks=$((8 * num_nodes))  # 8 GPUs per node

srun -N${num_nodes} -n${num_mpi_ranks} -c7 \
     --ntasks-per-gpu=1 --gpu-bind=closest \
     python3 -u ./run.py

CuPy JIT Kernel Limitations

When writing step functions, be aware of these cupyx.jit.rawkernel constraints:

  • NaN checks: Use x != x (inequality to self)
  • No dicts/objects: Only primitive types and arrays
  • No *args/**kwargs: Fixed argument lists only
  • No nested functions: Define helpers at module level
  • Use CuPy, not NumPy: Use cupy data types and routines in kernels
  • for loops: Must use range() iterator only
  • No return: Side effects via array writes only
  • No break/continue: Use boolean flags instead
  • No variable reassignment in scopes: Declare at top level
  • No -1 indexing: Use len(array) - 1 instead

See CuPy documentation for supported operations.
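
As an illustration of the constraints listed above, the following helper is written in that restricted style but runs as ordinary Python, which makes the logic easy to test on the CPU first. The function `last_valid_index` is hypothetical and has not been verified under cupyx.jit; it is a sketch of the coding style only.

```python
def last_valid_index(values, result):
    """Write the index of the last non-NaN entry of `values` into result[0].

    Writes -1 if every entry is NaN or `values` is empty.
    """
    # Variables are declared at the top level of the function.
    found = -1
    idx = 0
    x = 0.0
    n = len(values)
    for i in range(n):           # range() loop only
        idx = n - 1 - i          # explicit index instead of negative indexing
        x = values[idx]
        if found < 0:            # guard on a flag value instead of using break
            if x == x:           # false only for NaN (same idea as x != x)
                found = idx
    result[0] = found            # output through an array write, no return

nan = float("nan")
out = [0]
last_valid_index([1.0, nan, 3.0, nan], out)
print(out[0])  # 2
```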

Project Structure

sagesim/
├── sagesim/           # Core library
│   ├── model.py       # Model class, simulation loop, GPU kernel generation
│   ├── agent.py       # Agent factory, MPI data synchronization
│   ├── breed.py       # Breed definition, property registration
│   ├── space.py       # NetworkSpace for agent topology
│   └── internal_utils.py  # Array conversion utilities
├── examples/          # Example models (SIR epidemic model)
├── docs/              # Comprehensive documentation
└── tests/             # Test suite

Contributing

Contributions are welcome! Please see the GitHub repository for issues and pull requests.

License

MIT License - Oak Ridge National Laboratory
