Fast reinforcement learning 💨
flashrl
flashrl does RL with millions of steps/second 💨 while being tiny: ~200 lines of code
🛠️ pip install flashrl or clone the repo & pip install -r requirements.txt
- If cloned (or if envs changed), compile:
python setup.py build_ext --inplace
💡 flashrl will always be tiny: Read the code (+paste into LLM) to understand it!
Quick Start 🚀
flashrl uses a Learner that holds an env and a model (default: Policy with LSTM)
import flashrl as frl
learn = frl.Learner(frl.envs.Pong(n_agents=2**14))
curves = learn.fit(40, steps=16, desc='done')
frl.print_curve(curves['loss'], label='loss')
frl.play(learn.env, learn.model, fps=8)
learn.env.close()
.fit does RL with ~10 million steps: 40 iterations × 16 steps × 2**14 agents!
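The step count above can be checked with a few lines of arithmetic. This is only a sanity check of the numbers from the Quick Start call; the variable names are illustrative and not part of flashrl.

```python
# Values taken from the Quick Start: learn.fit(40, steps=16) with n_agents=2**14
iters = 40          # number of iterations passed to .fit
steps = 16          # rollout steps per iteration
n_agents = 2**14    # parallel agents in the Pong env (16384)

# Every iteration steps all agents `steps` times
total_steps = iters * steps * n_agents
print(f"{total_steps:,}")  # prints 10,485,760
```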
Run it yourself via python train.py and play against the AI 🪄
A tiny doc 📑
Learner takes the arguments
- env: RL environment
- model: A Policy model
- device: Per default picks mps or cuda if available else cpu
- dtype: Per default torch.bfloat16 if device is cuda else torch.float32
- compile_no_lstm: Speedup via torch.compile if model has no lstm
- **kwargs: Passed to the Policy, e.g. hidden_size or lstm
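The device and dtype defaults can be sketched as a small pure-Python function. This is a hedged illustration, not flashrl's actual code: pick_defaults is a hypothetical helper, and the priority of cuda over mps when both are available is an assumption, since the doc does not state the order. In real code the two flags would typically come from torch.cuda.is_available() and torch.backends.mps.is_available().

```python
def pick_defaults(cuda_available: bool, mps_available: bool) -> tuple[str, str]:
    """Hypothetical helper: return (device, dtype) names following the documented defaults."""
    # Assumption: cuda wins over mps if both are available (order not stated in the doc)
    if cuda_available:
        device = "cuda"
    elif mps_available:
        device = "mps"
    else:
        device = "cpu"
    # From the doc: torch.bfloat16 only if the device is cuda, else torch.float32
    dtype = "torch.bfloat16" if device == "cuda" else "torch.float32"
    return device, dtype


print(pick_defaults(cuda_available=True, mps_available=False))
print(pick_defaults(cuda_available=False, mps_available=True))
print(pick_defaults(cuda_available=False, mps_available=False))
```

Passing device or dtype explicitly to Learner overrides these defaults, which is why they are described as "per default".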
Learner.fit takes the arguments
- iters: Number of iterations
- steps: Number of steps in rollout
- desc: Progress bar description (e.g. 'reward')
- log: If True, tensorboard logging is enabled
  - run tensorboard --logdir=runs and visit http://localhost:6006 in the browser!
- stop_func: Function that stops training if it returns True e.g.
...
def stop(kl, **kwargs):
    return kl > .1
curves = learn.fit(40, steps=16, stop_func=stop)
...
- lr, anneal_lr & args of ppo after bs: Hyperparameters
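To make the stop_func contract concrete, here is a minimal sketch of a loop that calls such a function after each iteration. It is not flashrl's training loop: run_iterations and fake_metrics are hypothetical, and passing the metrics as keyword arguments is an assumption suggested by the stop(kl, **kwargs) signature above.

```python
def run_iterations(iters, compute_metrics, stop_func=None):
    """Hypothetical loop: run up to `iters` iterations, stop early if stop_func returns True."""
    history = []
    for i in range(iters):
        metrics = compute_metrics(i)  # dict of metrics, e.g. {'loss': ..., 'kl': ...}
        history.append(metrics)
        # Assumption: metrics are passed as keyword arguments, so **kwargs
        # in stop_func absorbs every metric it does not name explicitly
        if stop_func is not None and stop_func(**metrics):
            break
    return history


def stop(kl, **kwargs):
    return kl > .1


def fake_metrics(i):
    # Made-up numbers for illustration only: kl grows by 0.03 per iteration
    return {'loss': 1.0 / (i + 1), 'kl': 0.03 * i}


history = run_iterations(40, fake_metrics, stop_func=stop)
print(len(history))  # prints 5: kl first exceeds 0.1 in the fifth iteration
```

The **kwargs in stop is what lets the function name only the metric it cares about and ignore the rest.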
The most important functions in flashrl/utils.py are
- print_curve: Visualizes the loss across the iters
- play: Plays the environment in the terminal and takes
  - model: A Policy model
  - playable: If True, allows you to act (or decide to let the model act)
  - steps: Number of steps
  - fps: Frames per second
  - obs: Argument of the env that should be rendered as observations
  - dump: If True, no frame refresh -> Frames accumulate in the terminal
  - idx: Agent index between 0 and n_agents (default: 0)
Environments 🕹️
Each env is one Cython (= .pyx) file in flashrl/envs. That's it!
To add custom envs, use grid.pyx, pong.pyx or multigrid.pyx as a template:
- grid.pyx for single-agent envs (~110 LOC)
- pong.pyx for 1 vs 1 agent envs (~150 LOC)
- multigrid.pyx for multi-agent envs (~190 LOC)
| Grid | Pong | MultiGrid |
|---|---|---|
| Agent must reach goal | Agent must score | Agent must reach goal first |
Acknowledgements 🙌
I want to thank
- Joseph Suarez for open sourcing RL envs in C(ython)! Star PufferLib ⭐
- Costa Huang for open sourcing high-quality single-file RL code! Star cleanrl ⭐
and last but not least...