
Chronos

Fair GPU Time-Sharing for Everyone


Time-based GPU partitioning with automatic expiration. Simple. Fair. Just works.™

from chronos import Partitioner

# Get 50% of GPU 0 for 1 hour - guaranteed
with Partitioner().create(device=0, memory=0.5, duration=3600) as partition:
    train_model()  # Your code here
    # Auto-cleanup when done

The Problem

You have one expensive GPU and multiple users who need it.

Without Chronos:

❌ Resource conflicts and crashes
❌ No fair allocation
❌ Manual coordination required
❌ Wasted compute time
❌ Politics and frustration

With Chronos:

✅ Everyone gets guaranteed time
✅ Automatic resource cleanup
✅ Zero conflicts
✅ < 1% performance overhead
✅ No manual coordination

Quick Start

Install

# PyPI (recommended)
pip install chronos-gpu

# Or quick script
curl -sSL https://raw.githubusercontent.com/oabraham1/chronos/main/install.sh | sudo bash

# Or from source
git clone https://github.com/oabraham1/chronos
cd chronos && ./install-quick.sh

Use (CLI)

# Check your GPUs
chronos stats

# Allocate 50% of GPU 0 for 1 hour
chronos create 0 0.5 3600

# List active partitions
chronos list

# It auto-expires - no cleanup needed!

Use (Python)

from chronos import Partitioner

p = Partitioner()

# Simple usage
with p.create(device=0, memory=0.5, duration=3600) as partition:
    import torch
    model = torch.nn.Sequential(...).cuda()
    train_model(model)  # run your training loop here (torch modules have no .fit())
    # Automatic cleanup

Why Chronos?

🎯 Fair Allocation

Time-based partitions mean no resource hogging. Everyone gets their fair share.

⚡ Ultra-Fast

  • 3.2ms partition creation
  • < 1% GPU overhead
  • Sub-second expiration accuracy

🔒 Isolated & Safe

  • Per-user partitions
  • Memory enforcement
  • Automatic expiration
  • No manual cleanup

🌍 Universal

  • Any GPU: NVIDIA, AMD, Intel, Apple Silicon
  • Any OS: Linux, macOS, Windows
  • Any Framework: PyTorch, TensorFlow, JAX, etc.

🎓 Perfect For

  • Research labs with shared GPUs
  • Small teams with limited hardware
  • Universities with many students
  • Development environments

Features

Compared with NVIDIA MIG, MPS, and driver-level time-slicing, Chronos combines:

  • Time-based allocation
  • Auto-expiration
  • Multi-vendor GPU support
  • User isolation
  • Zero setup
  • < 1% overhead

Execution Modes

Chronos automatically selects the best execution backend for your system:

Concurrent Mode (NVIDIA MPS)

  • True parallel execution - Multiple partitions run simultaneously
  • Hardware-enforced limits - GPU resources physically partitioned
  • Requirements: NVIDIA GPU with Compute Capability 3.5+, CUDA driver

Time-Sliced Mode (OpenCL)

  • Context switching - Partitions take turns on the GPU
  • Cross-vendor support - Works with NVIDIA, AMD, Intel, Apple Silicon
  • Requirements: OpenCL 1.2+ runtime
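
If you are unsure whether an OpenCL runtime is present at all, you can probe for the shared library using only the Python standard library. This check is independent of Chronos and only tells you the runtime library is on the loader path, not that a usable device exists:

```python
from ctypes.util import find_library

# find_library returns the library name/path if the loader can see it, else None.
opencl = find_library("OpenCL")
if opencl:
    print(f"OpenCL runtime found: {opencl}")
else:
    print("No OpenCL runtime detected; time-sliced mode will be unavailable")
```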

Check Your Mode

from chronos import Partitioner, check_concurrent_support

# Check available backends
print(check_concurrent_support())
# {'nvidia_mps': True, 'rocm': False, 'opencl': True,
#  'concurrent_possible': True, 'recommended': 'nvidia_mps'}

# Check active mode
p = Partitioner()
print(f"Backend: {p.get_backend_name()}")  # "NVIDIA MPS" or "OpenCL"
print(f"Mode: {p.get_execution_mode()}")    # "concurrent" or "time_sliced"

Force a Specific Backend

# Force OpenCL even on NVIDIA systems
export CHRONOS_BACKEND=opencl

# Force MPS (will fail if not available)
export CHRONOS_BACKEND=mps
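
The same variable can also be set from within Python, provided it happens before the library initializes (this assumes, as with the shell export, that the backend is chosen at startup):

```python
import os

# Must be set before chronos is imported / the Partitioner is created;
# this affects only the current process, unlike a shell export.
os.environ["CHRONOS_BACKEND"] = "opencl"

# from chronos import Partitioner  # import only after setting the variable
```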

Examples

Research Lab Setup

#!/bin/bash
# Allocate GPU for the team every morning

chronos create 0 0.30 28800 --user alice   # 30%, 8 hours
chronos create 0 0.20 28800 --user bob     # 20%, 8 hours
chronos create 0 0.15 28800 --user carol   # 15%, 8 hours
# 35% left for ad-hoc use
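
To apply this allocation every morning automatically, a cron entry along these lines could work (the script path and schedule here are illustrative, not part of Chronos):

```shell
# Run the team allocation script at 08:00, Monday through Friday.
0 8 * * 1-5 /opt/scripts/allocate-gpu.sh
```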

ML Training with Auto-Save

from chronos import Partitioner
import torch

with Partitioner().create(device=0, memory=0.5, duration=14400) as p:
    model = MyModel().cuda()

    for epoch in range(1000):
        train_epoch(model)

        # Auto-save when time is running out
        if p.time_remaining < 600:  # 10 minutes left
            torch.save(model.state_dict(), 'checkpoint.pt')
            print("Checkpoint saved!")
            break

Jupyter Notebook

from chronos import Partitioner

# At the start of your notebook
p = Partitioner()
partition = p.create(device=0, memory=0.5, duration=7200)  # 2 hours

# Your analysis here
import tensorflow as tf
model = build_model()
model.fit(data)

# Check remaining time
print(f"Time left: {partition.time_remaining}s")

# Release when done (or it auto-expires)
partition.release()

Performance

Benchmarked on Ubuntu 22.04 with NVIDIA RTX 3080:

Operation          Latency          Overhead
Create partition   3.2 ms ± 0.5 ms  -
Release partition  1.8 ms ± 0.3 ms  -
GPU compute        -                0.8%
Memory tracking    0.1 ms           -

24-hour stress test: 1.2M operations, zero failures, zero memory leaks.

Full benchmarks →



Installation Methods

PyPI (Recommended)

pip install chronos-gpu

Quick Install Script

# Linux/macOS
curl -sSL https://raw.githubusercontent.com/oabraham1/chronos/main/install.sh | sudo bash

# Or user install (no sudo)
curl -sSL https://raw.githubusercontent.com/oabraham1/chronos/main/install-user.sh | bash

Docker

docker pull ghcr.io/oabraham1/chronos:latest
docker run --gpus all ghcr.io/oabraham1/chronos:latest chronos stats

From Source

git clone https://github.com/oabraham1/chronos
cd chronos
mkdir build && cd build
cmake .. && make
sudo make install

Full installation guide →


Architecture

┌─────────────────────────────────────────┐
│           User Applications             │
│    (PyTorch, TensorFlow, JAX, etc.)     │
└──────────────┬──────────────────────────┘
               │
┌──────────────▼──────────────────────────┐
│         Chronos Partitioner             │
│  ┌──────────────────────────────────┐   │
│  │  Time-Based Allocation Engine    │   │
│  └──────────────────────────────────┘   │
│  ┌──────────────────────────────────┐   │
│  │    Memory Enforcement Layer      │   │
│  └──────────────────────────────────┘   │
│  ┌──────────────────────────────────┐   │
│  │   Auto-Expiration Monitor        │   │
│  └──────────────────────────────────┘   │
└──────────────┬──────────────────────────┘
               │
┌──────────────▼──────────────────────────┐
│          Backend Selector               │
│   Auto-detects best execution mode      │
└──────────────┬──────────────────────────┘
         ┌─────┼────────┬─────────┐
         ▼     ▼        ▼         ▼
┌────────────┐ ┌──────┐ ┌──────┐ ┌──────┐
│ NVIDIA MPS │ │ ROCm │ │OpenCL│ │ Stub │
│(Concurrent)│ │(AMD) │ │(All) │ │(Test)│
└─────┬──────┘ └──┬───┘ └──┬───┘ └──────┘
      │           │        │
      ▼           ▼        ▼
┌─────────────────────────────────────────┐
│      GPU Hardware (Any Vendor)          │
└─────────────────────────────────────────┘

Key Components:

  • C++ Core: High-performance partition management
  • Backend Selector: Auto-detects optimal execution mode
  • Multiple Backends: NVIDIA MPS, AMD ROCm, OpenCL, Stub
  • Python Bindings: Easy-to-use API
  • CLI Tool: Command-line interface
  • Monitor Thread: Automatic expiration handling
  • Lock Files: Inter-process coordination

Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines.

Good first issues:

  • Add more examples
  • Improve error messages
  • Write tests
  • Update documentation
  • Fix bugs

Citation

If you use Chronos in research, please cite:

@software{chronos2025,
  title={Chronos: Time-Based GPU Partitioning for Fair Resource Sharing},
  author={Abraham, Ojima},
  year={2025},
  url={https://github.com/oabraham1/chronos},
  version={1.1.0}
}

License

Apache License 2.0 - Use it anywhere, for anything.

See LICENSE for full terms.



Acknowledgments

Thanks to all contributors and early adopters who helped shape Chronos!

Special thanks to the open-source community for inspiration and support.


Made with ❤️ by researchers, for researchers

⭐ Star us on GitHub · 📦 Install from PyPI · 📚 Read the docs
