Remote Executable eXecution — inference with remotely-stored model weights

Rex Framework Package Overview

Rex Framework enables inference with remotely stored model weights without downloading full model checkpoints to local storage. Only the chunks needed for a given inference pass are fetched; the full model never resides in local memory or on disk.

Package intent:

Primary: enable end users to run Rex for conversion, serving, and inference workflows.
Secondary: support validation-oriented usage in CI and application test environments.

This package is intended for:

Cloud-first inference where model chunks are fetched on demand.
Memory-bounded environments where full checkpoint residency is undesirable.
Notebook workflows, including Kaggle and Google Colab.

What You Get In This Package

Python API for loading Rex manifests and running inference.
CLI tools for conversion, validation, inspection, serving, benchmarking, and demo runs.
Optional extras for PyTorch, cloud storage integrations, and benchmark tooling.
WebSocket streaming is built into rex-serve and the runtime; it is not a separate install-time extra.
The package is designed for runtime use and ships the support files needed for validation-oriented installs.

Install

Minimal package (no PyTorch, useful for manifest validation and storage testing):

pip install rex-framework

Recommended for real inference workloads:

pip install "rex-framework[pytorch]"

With all optional features (cloud storage backends, benchmarking tools):

pip install "rex-framework[all]"

Available extras:

Extra	What it adds
`pytorch`	`torch>=2.0.0` for inference
`google-drive`	Google Drive storage backend
`onedrive`	OneDrive storage backend
`bench`	Benchmarking and profiling tools
`all`	All of the above

Python compatibility notes:

numpy is auto-installed with the package (no separate install needed).
The pytorch extra supports Python 3.10 to 3.13.
PyTorch wheels are not published on PyPI for Python 3.14.
On some platforms (for example, macOS x86_64), Python 3.13 may still lack compatible torch wheels. Use Python 3.11 in that case.

Verify your install (this check assumes the pytorch extra; with the minimal package, import rex alone):

python -c "import torch, rex; print(torch.__version__); print(rex.__version__)"

How Rex Finds Your Model: Manifest and Chunk Paths

Rex does not load a model from a single checkpoint file. Instead, it reads a manifest (a JSON file describing chunk locations, hashes, and metadata) and fetches individual chunks on demand from a base URL. Rex can only fetch chunks if that base URL points at your chunk host, so the steps below cover both the manifest and the hosting side.

What a Rex manifest is

A manifest is a JSON file (manifest.json) generated by rex-convert. It contains:

Model metadata (architecture, dtype, total size).
A list of chunks: each chunk has a resource_id, a byte_range expressed as ByteRange(start, end), and a chunk_id.
The expected base URL where chunks are hosted, usually in provenance.base_url.
An optional storage_backend hint such as http_range or streaming_ws.

Chunks are served separately (e.g., as files in a directory) and fetched via HTTP Range requests. You do not need a special streaming server for the default path, but you do need an HTTP endpoint that supports Range headers (nginx, S3, Cloudflare R2, or rex-serve). If storage_backend is streaming_ws, Rex uses the WebSocket live streaming backend instead of HTTP Range.
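To make the HTTP Range mechanism concrete, the sketch below builds the request a client would send for one chunk. It is not Rex's own fetch code: range_request is a hypothetical helper, and it assumes the manifest's ByteRange(start, end) is half-open (end exclusive), which this page does not specify. If your manifest uses an inclusive end, drop the "- 1".

```python
import urllib.request


def range_request(url: str, start: int, end: int) -> urllib.request.Request:
    """Build a GET request for bytes [start, end) of `url`.

    Assumption: `end` is exclusive. HTTP Range headers use an inclusive
    last-byte position, hence the `end - 1`.
    """
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    req = urllib.request.Request(url)
    req.add_header("Range", f"bytes={start}-{end - 1}")
    return req


# Hypothetical chunk URL; nothing is sent until urllib.request.urlopen(req) is called.
req = range_request("http://localhost:8080/chunk_000.bin", 0, 1024)
print(req.get_header("Range"))
```

A server that honours the header replies with status 206 (Partial Content). A 200 response carrying the full body usually means the host ignores Range requests.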

Step 1 — Convert your model to Rex format

rex-convert /path/to/model.pt \
  --output ./rex_output \
  --framework pytorch \
  --model-id my-model

This produces:

rex_output/
  manifest.json        ← the manifest you will point load_model at
  weights/
    chunk_000.bin
    chunk_001.bin
    ...
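Before uploading, it can help to confirm that the output directory has the layout shown above. The following is a minimal sketch that only checks the file layout; summarize_rex_output is a hypothetical helper, and it assumes nothing about the manifest schema beyond it being valid JSON.

```python
import json
from pathlib import Path


def summarize_rex_output(output_dir: str) -> dict:
    """Report whether manifest.json parses and how many chunk files exist."""
    root = Path(output_dir)
    manifest_path = root / "manifest.json"
    chunks = sorted((root / "weights").glob("chunk_*.bin"))

    # Parsing only proves the manifest is valid JSON; no schema is assumed.
    manifest_ok = False
    if manifest_path.is_file():
        try:
            json.loads(manifest_path.read_text(encoding="utf-8"))
            manifest_ok = True
        except json.JSONDecodeError:
            manifest_ok = False

    return {
        "manifest_ok": manifest_ok,
        "chunk_count": len(chunks),
        "total_chunk_bytes": sum(p.stat().st_size for p in chunks),
    }


print(summarize_rex_output("./rex_output"))
```

For a schema-level check, rex-validate from the CLI table is the tool this package provides.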

Step 2 — Host the chunk files

Option A — local HTTP server (for testing):

rex-serve --dir ./rex_output/weights --port 8080

Chunks are now reachable at http://localhost:8080/chunk_000.bin, etc.

Option B — any static HTTP host:

Upload rex_output/weights/ to any static host (nginx, S3, Cloudflare R2, GitHub Releases, Google Drive folder with public sharing). Note the base URL.

Step 3 — Point `load_model` at the manifest

import rex
from rex.api.config import RexConfig

config = RexConfig()
config.storage.base_url = "http://localhost:8080"   # chunk host

runtime = rex.load_model("./rex_output/manifest.json", config=config)

base_url tells Rex where to fetch chunks from: every chunk path in manifest.json is appended to base_url when fetching. If the manifest includes provenance.base_url, Rex can use that value as well. If the manifest advertises storage_backend: streaming_ws, Rex opens the live WebSocket stream during model load.
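The "appended to base_url" rule can be sketched as plain string concatenation. How Rex itself normalises slashes is not documented on this page, so chunk_url below is a hypothetical helper, not the library's code. The comparison with urllib.parse.urljoin shows why a base URL without a trailing slash is a common source of wrong chunk URLs in hand-written clients.

```python
from urllib.parse import urljoin


def chunk_url(base_url: str, chunk_path: str) -> str:
    """Append a chunk path to a base URL with exactly one slash between them."""
    return base_url.rstrip("/") + "/" + chunk_path.lstrip("/")


base = "https://my-host.example.com/weights"

# Plain concatenation keeps the "weights" segment.
print(chunk_url(base, "chunk_000.bin"))

# urljoin treats "weights" as a file name and replaces it,
# because the base has no trailing slash.
print(urljoin(base, "chunk_000.bin"))
```

The first call yields https://my-host.example.com/weights/chunk_000.bin, while urljoin yields https://my-host.example.com/chunk_000.bin.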

Remote manifest (manifest itself is also hosted):

config.storage.base_url = "https://my-host.example.com/weights"
runtime = rex.load_model("https://my-host.example.com/weights/manifest.json", config=config)

Environment variable alternative:

export REX_STORAGE_URL=https://my-host.example.com/weights
python your_script.py
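In your own scripts you may want one place that decides which chunk host to use. The sketch below is an assumption-laden illustration: resolve_base_url is a hypothetical helper, and the precedence it applies (explicit value, then REX_STORAGE_URL, then the manifest's provenance.base_url) is a choice made for this example, not Rex's documented resolution order.

```python
import os
from typing import Mapping, Optional


def resolve_base_url(
    explicit: Optional[str] = None,
    manifest: Optional[Mapping] = None,
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Pick a chunk base URL; the precedence here is an assumption."""
    if explicit:
        return explicit
    if env.get("REX_STORAGE_URL"):
        return env["REX_STORAGE_URL"]
    # Fall back to the value recorded at conversion time, if any.
    provenance = (manifest or {}).get("provenance") or {}
    return provenance.get("base_url")


manifest = {"provenance": {"base_url": "https://my-host.example.com/weights"}}
print(resolve_base_url(manifest=manifest, env={}))
```

The result can then be assigned to config.storage.base_url as in Step 3.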

Step 4 — Run inference

import rex
import numpy as np
from rex.api.config import RexConfig

input_data = np.random.randn(1, 768).astype(np.float32)
config = RexConfig()
config.storage.base_url = "http://localhost:8080"

output, metrics = rex.run_inference_sync(
    "./rex_output/manifest.json",
    input_data,
    config=config,
)
print(f"Inference time: {metrics.total_time_ms:.1f} ms")

Storage Backends

Rex supports multiple storage backends. Set config.storage.base_url to a URL with the scheme of the backend you want:

Backend	URL format	Extra required
Local HTTP / `rex-serve`	`http://localhost:8080`	none
Remote HTTP/HTTPS	`https://example.com/weights`	none
WebSocket streaming (`STREAMING_WS`)	`wss://example.com` or `http://localhost:8080`	none (set `backend_type=STREAMING_WS`)
Google Drive	`gdrive://folder-id`	`google-drive`
OneDrive	`onedrive://drive-id/path`	`onedrive`
iCloud	`icloud://path/to/weights`	none

For constructor-based overrides, RexConfig(storage__backend_type=StorageBackendType.STREAMING_WS, storage__base_url="wss://...") also works. If the manifest includes storage_backend="streaming_ws", Rex will prefer the WebSocket backend automatically.
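The table above can be turned into a small pre-flight check that tells you which pip extra a given base URL needs. This is a sketch built only from the table rows; extra_for_url and EXTRA_FOR_SCHEME are hypothetical names, and Rex may accept schemes that are not listed here.

```python
from typing import Optional
from urllib.parse import urlparse

# URL scheme -> pip extra required, taken from the table above (None = no extra).
EXTRA_FOR_SCHEME = {
    "http": None,
    "https": None,
    "wss": None,
    "gdrive": "google-drive",
    "onedrive": "onedrive",
    "icloud": None,
}


def extra_for_url(base_url: str) -> Optional[str]:
    """Return the extra needed for `base_url`, or None if the base install suffices."""
    scheme = urlparse(base_url).scheme.lower()
    if scheme not in EXTRA_FOR_SCHEME:
        raise ValueError(f"scheme {scheme!r} is not in the backend table")
    return EXTRA_FOR_SCHEME[scheme]


for url in ("https://example.com/weights", "gdrive://folder-id"):
    extra = extra_for_url(url)
    hint = f'pip install "rex-framework[{extra}]"' if extra else "no extra needed"
    print(url, "->", hint)
```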

Authenticated endpoints (e.g., private S3 or token-gated APIs):

config.storage.auth_token = "Bearer YOUR_TOKEN"

Or via environment variable:

export REX_AUTH_TOKEN="Bearer YOUR_TOKEN"
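Tokens arrive from secrets stores in different shapes, sometimes with the "Bearer " prefix and sometimes without. The helper below is a hypothetical convenience, assuming auth_token expects the full "Bearer ..." string as in the example above; it avoids producing a doubled prefix.

```python
def as_bearer(token: str) -> str:
    """Return `token` as a 'Bearer <value>' string without doubling the prefix."""
    token = token.strip()
    if not token:
        raise ValueError("empty token")
    if token.lower().startswith("bearer "):
        # Drop the existing prefix (7 characters) and re-add it in canonical case.
        token = token[7:].strip()
    return "Bearer " + token


print(as_bearer("YOUR_TOKEN"))
print(as_bearer("Bearer YOUR_TOKEN"))
```

Both calls print the same value, Bearer YOUR_TOKEN, which can be assigned to config.storage.auth_token.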

Notebook Usage — Kaggle

Kaggle notebooks run on isolated kernels with internet access. The recommended pattern is to convert your model beforehand, host the chunks somewhere reachable (HTTPS URL, Google Drive public folder, or a Kaggle Dataset), then install Rex and load from that URL.

Install in a Kaggle cell

# Cell 1 — install
!pip install "rex-framework[pytorch]" -q
import rex, torch
print(rex.__version__, torch.__version__)

Load from an HTTPS host

# Cell 2 — configure and load
import rex
from rex.api.config import RexConfig

MANIFEST_URL = "https://your-static-host.com/rex_output/manifest.json"
CHUNKS_BASE_URL = "https://your-static-host.com/rex_output/weights"

config = RexConfig()
config.storage.base_url = CHUNKS_BASE_URL
config.cache.max_memory_cache_bytes = 512 * 1024 * 1024  # 512 MB limit

runtime = rex.load_model(MANIFEST_URL, config=config)

Load from a Kaggle Dataset

Upload your rex_output/ directory as a Kaggle Dataset. Kaggle mounts datasets at /kaggle/input/<dataset-name>/.

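# Cell 2 — load from Kaggle Dataset mount
# Start rex-serve in the background. A trailing "&" after a "!" command
# is rejected by the standard Jupyter kernel, so use subprocess instead.
import subprocess
server = subprocess.Popen(
    ["rex-serve", "--dir", "/kaggle/input/my-rex-model/weights", "--port", "8080"]
)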

import rex
from rex.api.config import RexConfig

MANIFEST_PATH = "/kaggle/input/my-rex-model/manifest.json"
CHUNKS_BASE_URL = "http://127.0.0.1:8080"

config = RexConfig()
config.storage.base_url = CHUNKS_BASE_URL

runtime = rex.load_model(MANIFEST_PATH, config=config)
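A chunk server started in the background may need a moment before it accepts connections, so calling load_model immediately can fail with a connection error. The sketch below polls the port with the standard library; wait_for_port is a hypothetical helper, not part of Rex, and the timeout values are arbitrary.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout_s: float = 10.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=0.5):
                return True
        except OSError:
            # Not listening yet (or refused); retry shortly.
            time.sleep(0.1)
    return False


# Usage in a notebook cell, before rex.load_model(...):
# if not wait_for_port("127.0.0.1", 8080):
#     raise RuntimeError("chunk server did not start listening on port 8080")
```

The check only proves that something is listening on the port, not that it serves the expected chunk files.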

Add Kaggle Secrets for authenticated endpoints

from kaggle_secrets import UserSecretsClient

secrets = UserSecretsClient()
token = secrets.get_secret("REX_AUTH_TOKEN")

config.storage.auth_token = f"Bearer {token}"

Run inference on Kaggle

# Cell 3 — inference
import numpy as np

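input_data = np.random.randn(1, 768).astype(np.float32)
# MANIFEST_PATH is defined in the Kaggle Dataset cell; pass MANIFEST_URL instead if you loaded over HTTPS.
output, metrics = rex.run_inference_sync(MANIFEST_PATH, input_data, config=config)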
print(f"Output shape: {output.shape}")
print(f"Inference time: {metrics.total_time_ms:.1f} ms")

Notebook Usage — Google Colab

Google Colab provides a transient VM with internet access. The same manifest/chunk remote loading pattern applies. Colab T4 or A100 GPUs can be used if your Rex model targets CUDA.

Install in Colab

# Cell 1 — install
!pip install "rex-framework[pytorch]" -q
import rex, torch
print(rex.__version__, torch.__version__)

Load from an HTTPS host

# Cell 2 — configure and load
import rex
from rex.api.config import RexConfig

MANIFEST_URL = "https://your-static-host.com/rex_output/manifest.json"
CHUNKS_BASE_URL = "https://your-static-host.com/rex_output/weights"

config = RexConfig()
config.storage.base_url = CHUNKS_BASE_URL
config.cache.max_memory_cache_bytes = 1 * 1024 * 1024 * 1024  # 1 GB (Colab has more RAM)
config.scheduler.enable_prefetch = True
config.scheduler.prefetch_window = 4

runtime = rex.load_model(MANIFEST_URL, config=config)

Load from Google Drive in Colab

If you uploaded your rex_output/ to your Google Drive, mount it and serve the chunk directory locally:

# Cell 2a — mount Google Drive
from google.colab import drive
drive.mount("/content/drive")

# Cell 2b — serve the mounted directory locally
!rex-serve --dir /content/drive/MyDrive/rex_output/weights --port 8080 &

import rex
from rex.api.config import RexConfig

MANIFEST_PATH = "/content/drive/MyDrive/rex_output/manifest.json"
CHUNKS_BASE_URL = "http://127.0.0.1:8080"

config = RexConfig()
config.storage.base_url = CHUNKS_BASE_URL

runtime = rex.load_model(MANIFEST_PATH, config=config)

Use Colab Secrets for tokens

from google.colab import userdata

config.storage.auth_token = f"Bearer {userdata.get('REX_AUTH_TOKEN')}"

GPU inference in Colab

Rex will use the available CUDA device automatically when PyTorch detects a GPU. Confirm your runtime type is set to T4 GPU or A100 in Colab's Runtime menu.

import torch
print("CUDA available:", torch.cuda.is_available())
print("Device:", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "CPU")

Core Principle

Rex executes with bounded local residency by streaming model chunks from remote storage through HTTP range fetches and cache-aware scheduling. At no point does the full model need to exist locally.
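Bounded residency can be illustrated with a byte-budgeted cache that evicts the least recently used chunk whenever a new one would exceed the budget. This is a generic LRU sketch and not Rex's cache or scheduler; BoundedChunkCache and the fake fetch function are hypothetical, standing in for a remote range fetch.

```python
from collections import OrderedDict
from typing import Callable, List


class BoundedChunkCache:
    """Keep at most `max_bytes` of chunk data resident, evicting LRU first."""

    def __init__(self, max_bytes: int, fetch: Callable[[str], bytes]):
        self.max_bytes = max_bytes
        self.fetch = fetch
        self.resident_bytes = 0
        self._chunks: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, chunk_id: str) -> bytes:
        if chunk_id in self._chunks:
            self._chunks.move_to_end(chunk_id)  # mark as most recently used
            return self._chunks[chunk_id]
        data = self.fetch(chunk_id)
        # Evict least recently used chunks until the new one fits.
        while self._chunks and self.resident_bytes + len(data) > self.max_bytes:
            _, evicted = self._chunks.popitem(last=False)
            self.resident_bytes -= len(evicted)
        if len(data) <= self.max_bytes:  # oversized chunks are returned uncached
            self._chunks[chunk_id] = data
            self.resident_bytes += len(data)
        return data


fetched: List[str] = []


def fake_fetch(chunk_id: str) -> bytes:
    """Stand-in for a remote fetch: every chunk is 4 bytes."""
    fetched.append(chunk_id)
    return b"\x00" * 4


cache = BoundedChunkCache(max_bytes=8, fetch=fake_fetch)
for cid in ["a", "b", "a", "c", "b"]:
    cache.get(cid)
print(fetched, cache.resident_bytes)
```

With an 8-byte budget and 4-byte chunks, at most two chunks are resident at any time, so the access pattern above triggers four fetches for five reads.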

Feature Controls (Quick Reference)

Control Rex behaviour through RexConfig:

from rex.api.config import RexConfig

config = RexConfig()

# How much local memory the cache can use
config.cache.max_memory_cache_bytes = 512 * 1024 * 1024

# Fraction of the full model allowed locally at any time (Rex invariant)
config.cache.max_local_fraction_of_model = 0.4

# Cache eviction policy: lru | lfu | weighted_utility
config.cache.policy = "weighted_utility"

# Prefetch ahead of current execution
config.scheduler.enable_prefetch = True
config.scheduler.prefetch_window = 4

# Execution planning mode: graph | sequential
config.scheduler.scheduler_mode = "graph"

# Storage concurrency
config.storage.max_concurrent_fetches = 4
config.storage.adaptive_concurrency = True

# Logging
config.observability.log_level = "INFO"   # DEBUG | INFO | WARNING | ERROR
config.observability.log_format = "console"  # console | json | quiet
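The two cache limits above are expressed in different units, one in bytes and one as a fraction of the model. The arithmetic below shows what each works out to for a given model size. How Rex combines them is not stated on this page; treating the smaller value as the effective cap is an assumption of this sketch, and local_budgets is a hypothetical helper.

```python
def local_budgets(model_bytes: int, max_memory_cache_bytes: int,
                  max_local_fraction: float) -> dict:
    """Compute both caps in bytes and (by assumption) the tighter of the two."""
    fraction_cap = int(model_bytes * max_local_fraction)
    return {
        "cache_cap_bytes": max_memory_cache_bytes,
        "fraction_cap_bytes": fraction_cap,
        "tighter_cap_bytes": min(max_memory_cache_bytes, fraction_cap),
    }


# Example: a 2 GiB model with the settings shown above.
budgets = local_budgets(
    model_bytes=2 * 1024**3,
    max_memory_cache_bytes=512 * 1024 * 1024,
    max_local_fraction=0.4,
)
print(budgets)
```

For this example the 512 MiB cache limit (536870912 bytes) is tighter than 40% of the model (858993459 bytes).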

For all available config fields and preset profiles (debug, throughput-oriented), see:

https://rotsl.github.io/rex-framework/

CLI Quick Reference

Command	Purpose
`rex-convert`	Convert a PyTorch checkpoint to Rex format
`rex-serve`	Serve chunk files with HTTP Range support
`rex-validate`	Validate a manifest file
`rex-inspect`	Inspect a manifest (verbose chunk listing)
`rex-benchmark`	Run latency/throughput benchmark
`rex-run-demo`	End-to-end demo run

Package Guide

For full package documentation, including notebook workflows and live streaming guidance, see:

https://rotsl.github.io/rex-framework/

Project details

Maintainers

rotsl

Release history

This version

0.1.8

Apr 26, 2026

0.1.7

Apr 26, 2026

0.1.6

Apr 23, 2026

0.1.5

Apr 23, 2026

0.1.4

Apr 16, 2026

Download files

Source Distribution

rex_framework-0.1.8.tar.gz (264.9 kB)

Uploaded Apr 26, 2026 Source

Built Distribution

rex_framework-0.1.8-py3-none-any.whl (186.8 kB)

Uploaded Apr 26, 2026 Python 3

File details

Details for the file rex_framework-0.1.8.tar.gz.

File metadata

Download URL: rex_framework-0.1.8.tar.gz
Upload date: Apr 26, 2026
Size: 264.9 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for rex_framework-0.1.8.tar.gz
Algorithm	Hash digest
SHA256	`d9ae3d7a54c93833a2a0f9a571018b7f06a11c9eca7fea26250990bbf094bf32`
MD5	`de3fa8cb07d99023bc56900f1bfc54d5`
BLAKE2b-256	`aec2bb57b8011b962f0648d30a4305e1d8e3eae7b1fa71442a0fb4ef9fcb28dc`

Provenance

The following attestation bundles were made for rex_framework-0.1.8.tar.gz:

Publisher: pypi-publish.yml on rotsl/rex-framework

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: rex_framework-0.1.8.tar.gz
- Subject digest: d9ae3d7a54c93833a2a0f9a571018b7f06a11c9eca7fea26250990bbf094bf32
- Sigstore transparency entry: 1391648336
- Sigstore integration time: Apr 26, 2026
Source repository:
- Permalink: rotsl/rex-framework@67bfadb075a164eed3caf7cfd6dc8c901af58bc5
- Branch / Tag: refs/tags/v0.1.8
- Owner: https://github.com/rotsl
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-publish.yml@67bfadb075a164eed3caf7cfd6dc8c901af58bc5
- Trigger Event: release

File details

Details for the file rex_framework-0.1.8-py3-none-any.whl.

File metadata

Download URL: rex_framework-0.1.8-py3-none-any.whl
Upload date: Apr 26, 2026
Size: 186.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for rex_framework-0.1.8-py3-none-any.whl
Algorithm	Hash digest
SHA256	`8d2a4439209f0a6b6743198b9402f11ca80f050d84d35a74cdb293347cdedbdf`
MD5	`be6ba0040baae810d2ff4d1486679eb9`
BLAKE2b-256	`c32a6a5b2451c576d5c5f2db6b22b9d193768f5daeca4a6a4948300ef6911426`

Provenance

The following attestation bundles were made for rex_framework-0.1.8-py3-none-any.whl:

Publisher: pypi-publish.yml on rotsl/rex-framework

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: rex_framework-0.1.8-py3-none-any.whl
- Subject digest: 8d2a4439209f0a6b6743198b9402f11ca80f050d84d35a74cdb293347cdedbdf
- Sigstore transparency entry: 1391648339
- Sigstore integration time: Apr 26, 2026
Source repository:
- Permalink: rotsl/rex-framework@67bfadb075a164eed3caf7cfd6dc8c901af58bc5
- Branch / Tag: refs/tags/v0.1.8
- Owner: https://github.com/rotsl
- Access: private
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: pypi-publish.yml@67bfadb075a164eed3caf7cfd6dc8c901af58bc5
- Trigger Event: release
