Rebellions Extension for PyTorch
PyTorch for Rebellions' NPU
This package provides PyTorch integration for Rebellions' NPU.
Getting Started (torch: Python package, rebel-compiler: Python package)
Prerequisites
- Python 3.9 or later
- Git
- CMake 3.18 or later
- Ninja build system
- LDAP credentials for Rebellions' package repository
Update Git submodules
Clone the submodules recursively. This downloads required third-party libraries,
such as the Rebel Compiler headers, into third_party/rebel_compiler.
git submodule update --init ./
Create Python Virtual Environment
Create a Python virtual environment. This creates a directory named .venv in
the current directory.
python3 -m venv ./.venv && source ./.venv/bin/activate
Install Dependencies
Install the Python package manager Poetry, which handles dependency management,
building, packaging, and installation.
pip3 install poetry==2.0.1
Save your credentials for https://gate-keeper.rebellions.in; authorization for
https://pypi.rbln.in is also required. Note that this stores credentials in
plain text, which is not secure. See ~/.config/pypoetry/auth.toml.
export LDAP_USERNAME=daekyeong.kim # Put your username
export LDAP_PASSWORD=mysecretpassword # Put your password
poetry config keyring.enabled false # Optional: use if the build freezes during authentication
poetry config http-basic.rbln-internal $LDAP_USERNAME $LDAP_PASSWORD
NOTE: During development, use rbln-internal instead of rbln. If you want to download rebel-compiler from rbln (Rebellions' external PyPI server), run the following instead.
poetry config http-basic.rbln <rbln username> <rbln password>
Install the dependencies listed in poetry.lock using Poetry, excluding the root
package torch-rbln. Be careful: the command below uninstalls any packages that
are not in poetry.lock.
poetry sync --no-root
Choose Build Type (Optional)
Choose the build type as shown below. The default is Release.
export RBLN_BUILD_TYPE=Debug
Install Editable Package
Build the C++ project and install torch-rbln as an editable package.
poetry install --only-root
Logging
torch-rbln provides structured logging via spdlog to help diagnose runtime behavior, including CPU fallback operations and device execution traces.
Environment Variables
| Variable | Description | Default |
|---|---|---|
| TORCH_RBLN_LOG_LEVEL | Controls log verbosity | WARNING |
| TORCH_RBLN_LOG_PATH | Log file path (debug builds only) | ./torch_rbln.log |
export TORCH_RBLN_LOG_LEVEL=INFO
export TORCH_RBLN_LOG_PATH=./torch_rbln.log
A log file is always created in debug builds. Its path can be configured via the TORCH_RBLN_LOG_PATH environment variable.
Log Levels
| Level | Description | Use Case |
|---|---|---|
| DEBUG | Detailed internal states, function entry/exit, parameter values | Deep debugging during development (debug builds only) |
| INFO | Runtime information, CPU fallback notifications | General development and troubleshooting |
| WARNING (default) | Important warnings that may affect execution | Production monitoring |
| ERROR | Errors and critical failures | Error tracking and alerting |
Debug vs Release Builds
| Feature | Debug Build | Release Build |
|---|---|---|
| Minimum log level | DEBUG | INFO |
| Log file | ✅ Written to TORCH_RBLN_LOG_PATH | ❌ Not available |
| Source location | ✅ Included | ❌ Omitted |
| Thread ID | ✅ Included | ❌ Omitted |
Performance Optimization Flag (Optional)
To reduce runtime overhead (e.g., skipping unnecessary NaN/Inf checks), set the following environment variable:
export TORCH_RBLN_DEPLOY=ON
This enables lightweight execution for deployment scenarios.
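To illustrate the idea, here is a minimal sketch of how a deploy flag can gate validation work. The helper `check_finite` is hypothetical, not part of the torch-rbln API; it only shows the pattern of skipping NaN/Inf checks when TORCH_RBLN_DEPLOY=ON.

```python
import math
import os

def check_finite(values, deploy=None):
    # Hypothetical helper (not the real torch-rbln API): illustrates how a
    # deploy flag can skip NaN/Inf validation to reduce runtime overhead.
    if deploy is None:
        deploy = os.environ.get("TORCH_RBLN_DEPLOY", "OFF") == "ON"
    if deploy:
        return True  # lightweight path: validation skipped entirely
    return all(math.isfinite(v) for v in values)
```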
Device Mapping Configuration
By default, each physical NPU device is mapped to a logical device with a 1:1 relationship (equivalent to RBLN_NPUS_PER_DEVICE=1). This is called Direct Mapping and provides the standard PyTorch device usage experience.
You can configure device mapping using the following environment variables to enable Aggregated Mapping, which groups multiple physical NPUs into a single logical device for RSD (Rebellions Scalable Design) functionality.
RBLN_NPUS_PER_DEVICE
Groups physical NPUs together to create logical devices. Each logical device will contain the specified number of physical NPUs. This is designed for Normal Users who want simple configuration.
Constraint: Must be one of the supported sizes: 1, 2, 4, 8, 16, or 32. These values match the base_sizes defined in rebel/core/compilation/_impl.py for production environments.
export RBLN_NPUS_PER_DEVICE=2
Examples:
With 4 physical devices (RBLN_DEVICES=0,1,2,3 or default):
- RBLN_NPUS_PER_DEVICE=2 → rbln:0 maps to NPUs [0, 1], rbln:1 maps to NPUs [2, 3]
- RBLN_NPUS_PER_DEVICE=4 → rbln:0 maps to NPUs [0, 1, 2, 3] (full aggregation)
With 6 physical devices and RBLN_NPUS_PER_DEVICE=4:
- rbln:0 maps to NPUs [0, 1, 2, 3]
- NPUs [4, 5] remain unused (a warning will be displayed)
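The grouping rule above can be sketched in a few lines. This is a hypothetical illustration of the behavior, not the actual torch-rbln implementation:

```python
def group_devices(physical_ids, npus_per_device):
    # Hypothetical sketch of RBLN_NPUS_PER_DEVICE grouping; not the actual
    # torch-rbln implementation.
    assert npus_per_device in (1, 2, 4, 8, 16, 32), "unsupported group size"
    # Only complete groups become logical devices; the remainder is unused.
    usable = len(physical_ids) // npus_per_device * npus_per_device
    groups = [physical_ids[i:i + npus_per_device]
              for i in range(0, usable, npus_per_device)]
    unused = physical_ids[usable:]
    if unused:
        print(f"warning: NPUs {unused} remain unused")
    return groups
```

For example, `group_devices([0, 1, 2, 3, 4, 5], 4)` yields one logical device `[0, 1, 2, 3]` and warns that NPUs `[4, 5]` are unused.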
RBLN_DEVICE_MAP
Provides explicit mapping between logical devices and physical NPU IDs. This is designed for Advanced Users who need fine-grained control over device topology.
Constraint: Each device group must contain one of the supported sizes: 1, 2, 4, 8, 16, or 32 devices.
export RBLN_DEVICE_MAP="[0,1],[2,3,4,5]"
Format: Comma-separated groups of NPU IDs, each group enclosed in square brackets.
Example: With 6 physical devices:
- RBLN_DEVICE_MAP="[0,1],[2,3,4,5]" → rbln:0 maps to NPUs [0, 1], rbln:1 maps to NPUs [2, 3, 4, 5]
Configuration Priority and Conflict Resolution
Priority order: RBLN_DEVICE_MAP > RBLN_NPUS_PER_DEVICE > default (1:1 mapping)
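A compact sketch of this priority order, assuming a parser for the bracketed RBLN_DEVICE_MAP format; the real parsing lives inside torch-rbln and may differ in detail:

```python
def resolve_mapping(physical_ids, env):
    # Hypothetical sketch of the priority order:
    # RBLN_DEVICE_MAP > RBLN_NPUS_PER_DEVICE > default 1:1 mapping.
    device_map = env.get("RBLN_DEVICE_MAP")
    if device_map:
        # Highest priority: explicit map such as "[0,1],[2,3,4,5]"
        return [[int(x) for x in group.split(",")]
                for group in device_map.strip("[]").split("],[")]
    per_device = int(env.get("RBLN_NPUS_PER_DEVICE", "1"))
    # Default per_device == 1 reproduces Direct Mapping.
    return [physical_ids[i:i + per_device]
            for i in range(0, len(physical_ids), per_device)]
```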
Viewing Device Topology
You can view the current device topology using torch.rbln.device_summary():
import torch
import torch_rbln

torch.rbln.device_summary()
Example output:
[RBLN] Device Topology Initialized:
+-------------------+-------------------+----------------------+
| Logical Device | Physical NPU IDs | Status |
+-------------------+-------------------+----------------------+
| rbln:0 | [ 0, 1 ] | Active (Aggregated) |
| rbln:1 | [ 2, 3 ] | Active (Aggregated) |
+-------------------+-------------------+----------------------+
Tensor Parallel Configuration
The following environment variables control tensor parallel behavior for torch.compile operations and eager mode ops.
TORCH_RBLN_USE_TP_FAILOVER
Enables automatic tensor parallel failover. When a RuntimeError occurs during execution with tensor_parallel_size > 1, the system automatically retries with tp_size=1 on the root NPU of the device group.
This is useful for models that don't support tensor parallelism, allowing them to run on a single NPU within an aggregated device group without manual intervention.
export TORCH_RBLN_USE_TP_FAILOVER=ON # enable
export TORCH_RBLN_USE_TP_FAILOVER=OFF # disable (default: OFF)
Behavior:
- When set to ON and a RuntimeError occurs with tp > 1:
  - The system logs a warning message indicating the failover attempt
  - The model is recompiled with tensor_parallel_size=1
  - Execution continues on the root NPU of the device group
- When set to OFF or unset (default), RuntimeErrors are propagated as-is
Example scenario:
With RBLN_NPUS_PER_DEVICE=4 (4 NPUs per logical device):
- Initial compilation attempts tp=4
- If the model doesn't support TP, a RuntimeError occurs
- With failover enabled, the system retries with tp=1 on NPU 0
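The retry logic can be sketched as follows. `run_fn` is a hypothetical stand-in for compiling and executing the model at a given tensor parallel size; the real failover lives inside torch-rbln:

```python
def run_with_tp_failover(run_fn, tp_size, use_failover):
    # Hypothetical sketch of TORCH_RBLN_USE_TP_FAILOVER behavior.
    try:
        return run_fn(tp_size)
    except RuntimeError:
        if use_failover and tp_size > 1:
            print(f"warning: tp={tp_size} failed; retrying with tp=1 on the root NPU")
            return run_fn(1)  # recompile with tensor_parallel_size=1
        raise  # failover disabled: propagate the error as-is
```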
TORCH_RBLN_USE_DEVICE_TP
Controls whether eager mode operations use the device group's tensor parallel size instead of tp_size=1.
By default, eager mode ops (operations outside of torch.compile) use tp_size=1. When this environment variable is set to ON, eager mode ops will follow the logical device size defined by RBLN_NPUS_PER_DEVICE or RBLN_DEVICE_MAP, matching the behavior of torch.compile operations.
export TORCH_RBLN_USE_DEVICE_TP=ON # use device group tp size
export TORCH_RBLN_USE_DEVICE_TP=OFF # use tp_size=1 for eager ops (default: OFF)
Behavior:
- When set to ON: Eager mode ops use the device group's tensor parallel size (e.g., tp=4 with RBLN_NPUS_PER_DEVICE=4)
- When set to OFF or unset (default): Eager mode ops use tp_size=1
Use case: This is useful when you want consistent tensor parallel behavior across both eager and compiled operations, particularly in mixed execution scenarios.
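The decision rule reduces to a few lines. This is a hypothetical sketch of the behavior described above, not the torch-rbln source:

```python
def effective_tp(group_size, compiled, use_device_tp):
    # Hypothetical sketch: compiled ops always follow the logical device's
    # group size; eager ops use tp=1 unless TORCH_RBLN_USE_DEVICE_TP=ON.
    if compiled or use_device_tp:
        return group_size
    return 1
```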
Install Wheel Package (Optional)
If you want to build a *.whl file and install it, run the commands below.
poetry build
pip install ./dist/torch_rbln*.whl
When you change C++ or Python source code, simply run
Install Editable Package or Install Wheel Package again.
Apply Custom rebel-compiler
You have two choices:
- Use the built-in one
- Use an external one
Use torch-rbln built-in rebel-compiler (torch: Python package, rebel-compiler: third_party/rebel_compiler)
This approach is strongly recommended. The following steps are the same as in Getting Started.
git submodule update --init ./
python3 -m venv ./.venv && source ./.venv/bin/activate
pip3 install poetry==2.0.1
export LDAP_USERNAME=daekyeong.kim # Put your username
export LDAP_PASSWORD=mysecretpassword # Put your password
poetry config http-basic.rbln $LDAP_USERNAME $LDAP_PASSWORD
Without running poetry sync, check out the rebel-compiler submodule at
./third_party/rebel_compiler to your custom branch.
pushd ./third_party/rebel_compiler
git checkout my_custom_branch
popd
The following script builds the package and installs it into your environment, syncing dependencies.
./tools/apply-custom-rebel.sh
The script above edits the pyproject.toml and poetry.lock files. If you only
want to apply a custom rebel-compiler temporarily, keep an eye on those files.
(Optional) You can choose the build type as shown below.
RBLN_BUILD_TYPE=Debug ./tools/apply-custom-rebel.sh
Then you can build or install the torch-rbln package against the custom
rebel-compiler package.
poetry install --only-root
Use external rebel-compiler (for rebel-compiler developers)
Prerequisites
- You've already built rebel-compiler.
- ${REBEL_HOME} points to the rebel-compiler repo root.
Method 1: Automated Script (Recommended)
⚠️ Warning: Do not use build-with-external-rebel.sh together with apply-custom-rebel.sh. Both scripts modify pyproject.toml and may cause environment conflicts. Use only one method at a time.
Use the build-with-external-rebel.sh script for automated build:
gcc-13 mode (default): Uses PyTorch from PyPI
cd /path/to/torch-rbln
export REBEL_HOME=/path/to/rebel_compiler
./tools/build-with-external-rebel.sh --clean
gcc-12 mode: Requires pre-built torch wheel
cd /path/to/torch-rbln
export REBEL_HOME=/path/to/rebel_compiler
export RBLN_GCC_VERSION=12
export TORCH_WHEEL_PATH=/path/to/torch-2.8.0-cp310-cp310-linux_x86_64.whl
./tools/build-with-external-rebel.sh --clean
Options:
- --clean: Clean build artifacts before building
- --clean-only: Only clean build artifacts, do not build
Environment Variables:
- REBEL_HOME: Path to rebel-compiler (REQUIRED)
- RBLN_GCC_VERSION: GCC version to use (12 or 13, default: 13)
- TORCH_WHEEL_PATH: Path to pre-built torch wheel (REQUIRED for gcc-12, ignored for gcc-13)
- RBLN_BUILD_TYPE: Build type (Release or Debug, default: Release)
- RBLN_VENV_PATH: Virtual environment path (default: .venv-rebel)
The script will:
- Check Python version compatibility with rebel-compiler
- Create virtual environment
- Install dependencies
- Configure pyproject.toml for external rebel-compiler
- Build and install torch-rbln
- Verify installation with import tests
After build:
source .venv-rebel/bin/activate
# activate_rebel is auto-sourced, setting REBEL_HOME, PYTHONPATH, LD_LIBRARY_PATH
python -c "import torch; import rebel; import torch_rbln; print('OK')"
Method 2: Manual Setup
1) Create and activate a virtualenv
python3 -m venv .venv
source .venv/bin/activate
2) Add your local rebel-compiler in editable mode
poetry add --editable "${REBEL_HOME}/python"
3) Install this project, using the external compiler
RBLN_USE_EXTERNAL_REBEL_COMPILER=1 poetry install --only-root
Create Git Commit
A Git pre-commit hook is active, so linting is triggered whenever you create a
commit. To prepare for linting, you MUST initialize lintrunner first.
source ./.venv/bin/activate
lintrunner init
Once lintrunner is initialized, there is no need to initialize it again. You can commit now.
git commit
Some failures can be fixed automatically. Run the command below for auto-fixing.
lintrunner -m main -a
Run Tests
The following assumes you are in the Python virtual environment and have
installed the torch-rbln package successfully.
C++ Tests
Packaging runs in a new, isolated environment; even though poetry install
--only-root builds the C++ project, CTest cannot find that build directory.
So for CTest you MUST build the C++ project manually.
./tools/build-libtorch-rbln.sh
ctest --test-dir ./build
Python Tests
pytest ./test