Minimal RL in Rust

TwisteRL

A minimalistic, high-performance Reinforcement Learning framework implemented in Rust.

The current version is a proof of concept; stay tuned for future releases!

Install

pip install .

Use

Training

python -m twisterl.train --config examples/ppo_puzzle8_v1.json

This example trains a model to play the popular "8 puzzle":

|8|7|5|
|3|2| |
|4|6|1|

where numbers have to be shifted around through the empty slot until they are in order.
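
To make the puzzle mechanics concrete, here is a minimal stand-alone sketch in plain Python. It is not TwisteRL's implementation: the names `slide`, `is_solved`, `MOVES`, and `SOLVED` are hypothetical, and it assumes that "in order" means tiles 1 to 8 followed by the empty slot (written as 0).

```python
# Hypothetical stand-alone sketch of 8-puzzle moves; not TwisteRL code.
SOLVED = (1, 2, 3, 4, 5, 6, 7, 8, 0)  # assumption: 0 marks the empty slot, placed last

# Each action moves the empty slot by a (row, column) offset.
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def slide(board, action, size=3):
    """Return the board after moving the empty slot; illegal moves leave it unchanged."""
    idx = board.index(0)
    row, col = divmod(idx, size)
    d_row, d_col = MOVES[action]
    new_row, new_col = row + d_row, col + d_col
    if not (0 <= new_row < size and 0 <= new_col < size):
        return board  # the empty slot would leave the grid
    target = new_row * size + new_col
    cells = list(board)
    cells[idx], cells[target] = cells[target], cells[idx]
    return tuple(cells)


def is_solved(board):
    """True when the tiles are in order with the empty slot last."""
    return board == SOLVED


start = (8, 7, 5, 3, 2, 0, 4, 6, 1)  # the board shown above, read row by row
after = slide(start, "left")         # the empty slot swaps with tile 2
```

An agent for this puzzle picks one of the four moves at every step, which is why the problem fits a discrete action space.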

This model can be trained on a single CPU in under 1 minute (no GPU required!). A larger version (4x4) is available: examples/ppo_puzzle15_v1.json.

Inference

Check the notebook example here!

Creating your own environment

The custom environment example in examples/grid_world shows how to implement an environment in Rust and expose it to Python with PyO3. You can use it as a template:

  1. Create a new crate

    cargo new --lib examples/my_env
    
  2. Add dependencies in examples/my_env/Cargo.toml:

    [package]
    name = "my_env"
    version = "0.1.0"
    edition = "2021"
    
    [lib]
    name = "my_env"
    crate-type = ["cdylib"]
    
    [dependencies]
    pyo3 = { version = "0.20", features = ["extension-module"] }
    twisterl = { path = "path/to/twisterl/rust", features = ["python_bindings"] }
    # Or using the official crate:
    # twisterl = { version = "a.b.c", features = ["python_bindings"] }
    
  3. Implement the environment by defining a struct and implementing twisterl::rl::env::Env for it. Provide logic for reset, step, observe, reward, etc.

  4. Expose it to Python using PyBaseEnv:

    use pyo3::prelude::*;
    use twisterl::python_interface::env::PyBaseEnv;
    
    #[pyclass(name = "MyEnv", extends = PyBaseEnv)]
    struct PyMyEnv;
    
    #[pymethods]
    impl PyMyEnv {
        #[new]
        fn new(...) -> (Self, PyBaseEnv) {
            let env = MyEnv::new(...);
            (PyMyEnv, PyBaseEnv { env: Box::new(env) })
        }
    }
    
  5. Add a pyproject.toml describing the Python package so maturin can build a wheel.

  6. Build and install the module:

    pip install .
    
  7. Use it from Python:

    import my_env
    env = my_env.MyEnv(...)
    obs = env.reset()
    

Refer to grid_world for a complete working example.
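
For readers who want a feel for the reset, step, observe, and reward split from step 3 before reading the Rust code, the following plain-Python sketch may help. It is only an illustration: `LineWorld` is a hypothetical class, `is_final` is a helper added for the example, and the method signatures are assumptions rather than the actual twisterl::rl::env::Env trait.

```python
class LineWorld:
    """Hypothetical 1-D world: the agent walks along `length` cells to reach the last one.

    Method names mirror those listed in step 3 (reset, step, observe, reward);
    the signatures are assumptions, not the twisterl::rl::env::Env trait.
    """

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        """Put the agent back on the first cell and return the observation."""
        self.position = 0
        return self.observe()

    def step(self, action):
        """Apply a discrete action: 0 moves left, 1 moves right (clamped to the line)."""
        delta = 1 if action == 1 else -1
        self.position = min(max(self.position + delta, 0), self.length - 1)

    def observe(self):
        """Discrete observation: one-hot encoding of the agent's cell."""
        return [1 if i == self.position else 0 for i in range(self.length)]

    def is_final(self):
        """Hypothetical helper: the episode ends on the last cell."""
        return self.position == self.length - 1

    def reward(self):
        """Reward 1.0 on the goal cell, 0.0 elsewhere."""
        return 1.0 if self.is_final() else 0.0


env = LineWorld(length=3)
obs = env.reset()
env.step(1)
env.step(1)
```

Keeping the state update (`step`) separate from the read-only queries (`observe`, `reward`) is one reasonable way to structure such an environment; the grid_world example remains the reference for the real trait.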

Documentation

🚀 Key Features

  • High-Performance Core: RL episode loop implemented in Rust for faster training and inference
  • Inference-Ready: Easy compilation and bundling of models with environments into portable binaries for inference
  • Modular Design: Support for multiple algorithms (PPO, AlphaZero) with interchangeable training and inference
  • Language Interoperability: Core in Rust with Python interface
  • Symmetry-Aware Training via Twists: Environments can expose observation/action permutations (“twists”) so policies automatically exploit device or puzzle symmetries for faster learning.
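
As a rough illustration of what an observation/action permutation pair can look like, the sketch below uses a toy mirror-symmetric line world in plain Python. Everything in it (`OBS_PERM`, `ACT_PERM`, `twist_obs`, `check_symmetry`) is hypothetical and is not TwisteRL's API; how the framework consumes twists internally is not described here.

```python
# Hypothetical toy example; the names below are illustrative, not TwisteRL's API.
LENGTH = 5
GOAL = 2  # middle cell, so the world looks the same when mirrored

# A "twist" here is a pair of permutations given as index lists:
# new_obs[i] = obs[OBS_PERM[i]], and action a is replaced by ACT_PERM[a].
OBS_PERM = [4, 3, 2, 1, 0]  # mirror the line
ACT_PERM = [1, 0]           # swap 0 (left) and 1 (right)


def observe(position):
    """One-hot encoding of the agent's cell."""
    return [1 if i == position else 0 for i in range(LENGTH)]


def step(position, action):
    """Move left (0) or right (1), clamped to the line."""
    delta = 1 if action == 1 else -1
    return min(max(position + delta, 0), LENGTH - 1)


def reward(position):
    return 1.0 if position == GOAL else 0.0


def twist_obs(obs):
    """Apply the observation permutation."""
    return [obs[i] for i in OBS_PERM]


def check_symmetry():
    """Twisting after a step must match stepping the mirrored state with the permuted action."""
    for position in range(LENGTH):
        mirrored = LENGTH - 1 - position
        if reward(position) != reward(mirrored):
            return False
        for action in (0, 1):
            lhs = twist_obs(observe(step(position, action)))
            rhs = observe(step(mirrored, ACT_PERM[action]))
            if lhs != rhs:
                return False
    return True
```

When a check like this holds, every transition seen from one side of the world is also a valid transition from the other side, which is the kind of redundancy a symmetry-aware learner can exploit.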

🏗️ Current State (PoC)

  • Hybrid Rust-Python implementation:
    • Data collection and inference in Rust
    • Training in Python (PyTorch)
  • Supported algorithms:
    • PPO (Proximal Policy Optimization)
    • AlphaZero
  • Focus on discrete observation and action spaces
  • Support for native Rust environments and for Python environments through a wrapper

🚧 Roadmap

Upcoming Features (Alpha Version)

  • Full training in Rust
  • Extended support for:
    • Continuous observation spaces
    • Continuous action spaces
    • Custom policy architectures
  • Native WebAssembly environment support
  • Streamlined policy+environment bundle export to WebAssembly
  • Comprehensive Python interface
  • Enhanced documentation and test coverage

💎 Future Possibilities

  • WebAssembly environment repository
  • Browser-based environment and agent visualization
  • Interactive web demonstrations
  • Serverless distributed training

🎮 Use Cases

Currently used in:

Perfect for:

  • Puzzle-like optimization problems
  • Any scenario requiring fast, production-grade RL inference

🔧 Current Limitations

  • Limited to discrete observation and action spaces
  • Python environments may create performance bottlenecks
  • Documentation and test coverage are currently minimal
  • WebAssembly support is in development

🤝 Contributing

We're in early development stages and welcome contributions! Stay tuned for more detailed contribution guidelines.

📄 Note

This project is currently in PoC stage. While functional, it's under active development and the API may change significantly.

📜 License

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0
