Minimal RL in Rust

TwisteRL

A minimalistic, high-performance Reinforcement Learning framework implemented in Rust.

The current version is a proof of concept; stay tuned for future releases!

Install

pip install .

Use

Training

python -m twisterl.train --config examples/ppo_puzzle8_v1.json

This example trains a model to play the popular "8 puzzle":

|8|7|5|
|3|2| |
|4|6|1|

where numbers have to be shifted around through the empty slot until they are in order.

This model can be trained on a single CPU in under 1 minute (no GPU required!). A larger version (4x4) is available: examples/ppo_puzzle15_v1.json.
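
To make the puzzle mechanics concrete, here is a minimal Python sketch of the board and its moves. It is an illustration only, not TwisteRL's implementation: the goal layout `SOLVED`, the move names, and the choice to treat illegal moves as no-ops are all assumptions.

```python
# Illustrative 8-puzzle model; not TwisteRL's implementation.
SOLVED = (1, 2, 3, 4, 5, 6, 7, 8, 0)  # 0 marks the empty slot (assumed goal layout)

# Each move shifts the empty slot by (row, col).
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(board, move):
    """Return the board after moving the empty slot; illegal moves are no-ops."""
    idx = board.index(0)
    row, col = divmod(idx, 3)
    d_row, d_col = MOVES[move]
    new_row, new_col = row + d_row, col + d_col
    if not (0 <= new_row < 3 and 0 <= new_col < 3):
        return board  # the empty slot would leave the 3x3 grid
    target = new_row * 3 + new_col
    cells = list(board)
    cells[idx], cells[target] = cells[target], cells[idx]
    return tuple(cells)


def is_solved(board):
    """True once the tiles are in order with the empty slot last."""
    return board == SOLVED


start = (8, 7, 5, 3, 2, 0, 4, 6, 1)  # the position shown above
print(step(start, "left"))  # (8, 7, 5, 3, 0, 2, 4, 6, 1)
```

A policy for this puzzle picks one of the four moves at each step until `is_solved` returns true.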

Inference

See the notebook example.

Creating your own environment

The examples/grid_world example shows how to implement a custom environment in Rust and expose it to Python with PyO3. You can use it as a template:

  1. Create a new crate

    cargo new --lib examples/my_env
    
  2. Add dependencies in examples/my_env/Cargo.toml:

    [package]
    name = "my_env"
    version = "0.1.0"
    edition = "2021"
    
    [lib]
    name = "my_env"
    crate-type = ["cdylib"]
    
    [dependencies]
    pyo3 = { version = "0.20", features = ["extension-module"] }
    twisterl = { path = "path/to/twisterl/rust", features = ["python_bindings"] }
    # Or using the official crate:
    # twisterl = { version = "a.b.c", features = ["python_bindings"] }
    
  3. Implement the environment by defining a struct and implementing twisterl::rl::env::Env for it. Provide logic for reset, step, observe, reward, etc.

During inference, TwisteRL algorithms track the actions applied to the environment externally. If you need the environment itself to track them, implement the track_solution and solution methods of the Env trait.

  4. Expose it to Python using PyBaseEnv:

    use pyo3::prelude::*;
    use twisterl::python_interface::env::PyBaseEnv;
    
    #[pyclass(name = "MyEnv", extends = PyBaseEnv)]
    struct PyMyEnv;
    
    #[pymethods]
    impl PyMyEnv {
        #[new]
        fn new(...) -> (Self, PyBaseEnv) {
            let env = MyEnv::new(...);
            (PyMyEnv, PyBaseEnv { env: Box::new(env) })
        }
    }
    
  5. Add a pyproject.toml describing the Python package so maturin can build a wheel.

  6. Build and install the module:

    pip install .
    
  7. Use it from Python:

    import my_env
    env = my_env.MyEnv(...)
    obs = env.reset()
    

Refer to grid_world for a complete working example.
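
The Env trait itself is Rust, but the division of labor between reset, step, observe, and reward may be easier to see in a small pure-Python stand-in. Everything in this sketch is hypothetical: the `LineWorld` class, its constructor arguments, the `is_final` method, and the reward values are illustrative assumptions, not TwisteRL's API.

```python
class LineWorld:
    """Hypothetical 1-D environment: walk from cell 0 to the last cell."""

    def __init__(self, size=5, max_steps=20):
        self.size = size
        self.max_steps = max_steps
        self.reset()

    def reset(self):
        # Return to the start state and forget any recorded actions.
        self.pos = 0
        self.steps = 0
        self.actions = []

    def step(self, action):
        # Action 0 moves left, action 1 moves right; walls clamp the position.
        delta = 1 if action == 1 else -1
        self.pos = min(max(self.pos + delta, 0), self.size - 1)
        self.steps += 1
        self.actions.append(action)  # in-environment tracking of applied actions

    def observe(self):
        # Discrete observation: the index of the occupied cell.
        return [self.pos]

    def is_final(self):
        # Episode ends at the goal cell or when the step budget runs out.
        return self.pos == self.size - 1 or self.steps >= self.max_steps

    def reward(self):
        return 1.0 if self.pos == self.size - 1 else 0.0

    def solution(self):
        # The actions applied since the last reset.
        return list(self.actions)


env = LineWorld(size=4)
while not env.is_final():
    env.step(1)
print(env.observe(), env.reward(), env.solution())  # [3] 1.0 [1, 1, 1]
```

Recording actions inside `step` mirrors the idea behind track_solution and solution: the environment, rather than the caller, keeps the action history.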

Checkpoint Format

TwisteRL uses safetensors as the default checkpoint format for model weights. Safetensors provides:

  • Security: No arbitrary code execution (unlike pickle-based .pt files)
  • Speed: Zero-copy loading for faster model initialization
  • HuggingFace compatibility: Standard format for Hub models

Legacy .pt checkpoints are still supported for backward compatibility, but loading one logs a warning. To convert existing checkpoints:

from twisterl.utils import convert_pt_to_safetensors

convert_pt_to_safetensors("model.pt")  # Creates model.safetensors
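
The security property follows from the file layout: a safetensors file is an 8-byte little-endian header length, a JSON header describing each tensor, and then the raw tensor bytes, so reading it never unpickles Python objects. The sketch below writes and parses that layout using only the standard library. The tensor name and values are made up for illustration, and the helper functions are hypothetical; real code should use the safetensors library rather than these helpers.

```python
import json
import struct


def write_safetensors(tensors):
    """Serialize {name: (dtype, shape, raw_bytes)} into the safetensors layout."""
    header, payload = {}, b""
    for name, (dtype, shape, raw) in tensors.items():
        header[name] = {
            "dtype": dtype,
            "shape": shape,
            # Offsets are relative to the start of the byte buffer.
            "data_offsets": [len(payload), len(payload) + len(raw)],
        }
        payload += raw
    header_bytes = json.dumps(header).encode("utf-8")
    return struct.pack("<Q", len(header_bytes)) + header_bytes + payload


def read_header(blob):
    """Parse only the JSON header; nothing from the file is executed."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    return json.loads(blob[8 : 8 + header_len])


def read_tensor_bytes(blob, name):
    """Return the raw bytes of one tensor by slicing the byte buffer."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    begin, end = read_header(blob)[name]["data_offsets"]
    base = 8 + header_len
    return blob[base + begin : base + end]


weights = struct.pack("<4f", 0.1, 0.2, 0.3, 0.4)  # four float32 values
blob = write_safetensors({"policy.weight": ("F32", [2, 2], weights)})
print(read_header(blob)["policy.weight"])
```

Because the tensor data is located by plain byte offsets, a loader can map the file into memory and slice it, which is what makes zero-copy loading possible.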

Documentation

🚀 Key Features

  • High-Performance Core: RL episode loop implemented in Rust for faster training and inference
  • Inference-Ready: Easy compilation and bundling of models with environments into portable binaries for inference
  • Modular Design: Support for multiple algorithms (PPO, AlphaZero) with interchangeable training and inference
  • Language Interoperability: Core in Rust with Python interface
  • Symmetry-Aware Training via Twists: Environments can expose observation/action permutations (“twists”) so policies automatically exploit device or puzzle symmetries for faster learning.
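
The twist idea can be illustrated independently of TwisteRL: a symmetry is a pair of permutations, one over observation indices and one over actions, such that twisting and then stepping gives the same result as stepping and then twisting. The sketch below checks this for a left-right mirror of a hypothetical 3x3 grid world. All names here are illustrative assumptions, not TwisteRL's API.

```python
N = 3  # grid side length
UP, DOWN, LEFT, RIGHT = range(4)
DELTAS = {UP: (-1, 0), DOWN: (1, 0), LEFT: (0, -1), RIGHT: (0, 1)}


def step(cell, action):
    """Move the agent; cell is the flat index row * N + col, walls clamp."""
    row, col = divmod(cell, N)
    d_row, d_col = DELTAS[action]
    row = min(max(row + d_row, 0), N - 1)
    col = min(max(col + d_col, 0), N - 1)
    return row * N + col


# Left-right mirror: observation index i maps to OBS_PERM[i],
# and action a maps to ACT_PERM[a] (LEFT and RIGHT swap).
OBS_PERM = [row * N + (N - 1 - col) for row in range(N) for col in range(N)]
ACT_PERM = [UP, DOWN, RIGHT, LEFT]


def is_symmetry(obs_perm, act_perm):
    """Check that stepping commutes with the twist for every state and action."""
    return all(
        obs_perm[step(cell, action)] == step(obs_perm[cell], act_perm[action])
        for cell in range(N * N)
        for action in DELTAS
    )


print(is_symmetry(OBS_PERM, ACT_PERM))  # True
```

When such a pair is valid, every collected transition also yields a mirrored transition for free, which is one way a symmetry can speed up learning.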

🏗️ Current State (PoC)

  • Hybrid Rust-Python implementation:
    • Data collection and inference in Rust
    • Training in Python (PyTorch)
  • Supported algorithms:
    • PPO (Proximal Policy Optimization)
    • AlphaZero
  • Focus on discrete observation and action spaces
  • Support for native Rust environments and for Python environments through a wrapper

🚧 Roadmap

Upcoming Features (Alpha Version)

  • Full training in Rust
  • Extended support for:
    • Continuous observation spaces
    • Continuous action spaces
    • Custom policy architectures
  • Native WebAssembly environment support
  • Streamlined policy+environment bundle export to WebAssembly
  • Comprehensive Python interface
  • Enhanced documentation and test coverage

💎 Future Possibilities

  • WebAssembly environment repository
  • Browser-based environment and agent visualization
  • Interactive web demonstrations
  • Serverless distributed training

🎮 Use Cases

Currently used in:

Perfect for:

  • Puzzle-like optimization problems
  • Any scenario requiring fast, production-grade RL inference

🔧 Current Limitations

  • Limited to discrete observation and action spaces
  • Python environments may create performance bottlenecks
  • Documentation and test coverage are currently minimal
  • WebAssembly support is in development

🤝 Contributing

We're in early development stages and welcome contributions! Stay tuned for more detailed contribution guidelines.

📄 Note

This project is currently in PoC stage. While functional, it's under active development and the API may change significantly.

📜 License

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0
