Fast reinforcement learning 💨

flashrl

RL in seconds 💨 with ~200 lines of code (+ ~150 per env) 🤓

🛠️ pip install flashrl. If you want to modify envs, clone the repo instead and run pip install -r requirements.txt

Quick Start 🚀

  1. If installed via clone, compile envs: python setup.py build_ext --inplace
  2. Train: python train.py
  3. See the magic unfold in the terminal 🪄

Usage 💡

flashrl will always be short: Read the code (+paste into ChatGPT) to understand it!

flashrl uses a Learner that holds an env and a model (default: LSTMPolicy). Here's a minimal example to get you going:

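import flashrl as frl

learn = frl.Learner(env=frl.envs.Pong(n_agents=2**14))  # Pong env with 16384 agents, default model
curves = learn.fit(40, steps=16, pbar_desc='done')  # 40 iterations of 16 rollout steps each
frl.print_ascii_curve(curves['loss'], label='loss')  # plot the loss across iterations
frl.render_ascii(learn, fps=10)  # replay the last rollout in the terminal
learn.env.close()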

.fit triggers RL with

  • 40 iterations...
  • ...16 steps per iteration...
  • ...in Pong holding 2**14=16384 agents

resulting in training on 40 * 16 * 16384 = 10,485,760 (~10 million) steps!
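
The step count is simply the product of the three numbers above. A quick check in plain Python (the variable names are illustrative and not part of flashrl):

```python
# Total environment steps collected by the example above
iters = 40          # iterations passed to .fit
steps = 16          # rollout steps per iteration
n_agents = 2**14    # parallel agents held by the Pong env

total_steps = iters * steps * n_agents
print(f"{total_steps:,}")  # 10,485,760
```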

.fit takes the arguments

  • iters: Number of iterations
  • steps: Number of steps in rollout
  • pbar_desc: Progress bar description (default: 'reward')
  • log: If True, tensorboard logging is enabled
    • run tensorboard --logdir=runs and visit http://localhost:6006 in the browser!
  • lr, anneal_lr, target_fl + all args of ppo: Hyperparameters
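
The list above does not say how anneal_lr schedules the learning rate. Many PPO implementations decay it linearly over the iterations; the sketch below assumes that convention and is not taken from flashrl's source. The function name linear_lr and its signature are hypothetical.

```python
def linear_lr(base_lr: float, it: int, iters: int) -> float:
    """Learning rate for iteration `it` (0-based) under linear decay.

    Assumption: this mirrors a common PPO convention, not necessarily
    what flashrl does when anneal_lr is enabled.
    """
    if not 0 <= it < iters:
        raise ValueError("it must satisfy 0 <= it < iters")
    # Full rate at the first iteration, shrinking by base_lr / iters each iteration
    return base_lr * (1.0 - it / iters)


# Schedule for a short run of 4 iterations starting at lr=0.01
schedule = [linear_lr(0.01, it, iters=4) for it in range(4)]
print(schedule)
```

Under this assumption the rate never quite reaches zero inside the run, so every iteration still performs an update.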

Take a look at train.py to see how to use the utility functions:

  • print_ascii_curve: Visualizes the loss across the iters
  • render_ascii: Shows data of the last rollout in the terminal
  • render_gif: Shows the same, saved as a GIF
  • print_table: Shows a table of values, acts, logprobs, reward and dones of the last rollout
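
To give a feel for what a terminal plot such as print_ascii_curve might do, here is a minimal stand-in written from scratch. It is not flashrl's implementation; the name ascii_curve, its parameters, and the output layout are all assumptions made for illustration.

```python
def ascii_curve(values, height=5, label=''):
    """Return a tiny ASCII plot with one column per value (hypothetical helper)."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat curve
    # Map each value to a row index in [0, height - 1]; 0 is the bottom row
    rows = [round((v - lo) / span * (height - 1)) for v in values]
    lines = []
    for r in range(height - 1, -1, -1):  # draw from the top row down
        lines.append(''.join('*' if row == r else ' ' for row in rows))
    if label:
        lines.append(label)
    return '\n'.join(lines)


# A loss that falls steadily shows up as a descending diagonal
curve = ascii_curve([3, 2, 1], height=3)
print(curve)
```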

Environments 🕹️

Each env is a single Cython (.pyx) file in flashrl/envs. That's it!

To add custom envs, use grid.pyx, pong.pyx or multigrid.pyx as a template:

  • grid.pyx for single-agent envs
  • pong.pyx for 1 vs. 1 agent envs (AlphaZero-style)
  • multigrid.pyx for multi-agent envs

The bundled envs:

  • Grid: Agent must reach goal
  • Pong: Good old pong (1 vs. 1)
  • MultiGrid: Agent must reach goal first
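
The real envs are Cython files whose interface is not described here. Purely as an illustration of the idea of one env object holding many agents that must reach a goal, here is a pure-Python stand-in. The class TinyGridEnv, its reset/step methods, the 1-D layout, and the reward scheme are all hypothetical and not flashrl's API.

```python
import random


class TinyGridEnv:
    """Hypothetical batched single-agent env on a 1-D strip of cells.

    Each of the n_agents independent copies holds one agent; the goal is
    the right-most cell. This is not flashrl's .pyx interface.
    """

    def __init__(self, n_agents: int, size: int = 5, seed: int = 0):
        self.n_agents, self.size = n_agents, size
        self.rng = random.Random(seed)
        self.pos = [0] * n_agents

    def reset(self):
        # Start every agent anywhere except the goal cell
        self.pos = [self.rng.randrange(self.size - 1) for _ in range(self.n_agents)]
        return list(self.pos)

    def step(self, acts):
        """acts[i] is 0 (move left) or 1 (move right) for agent i."""
        rewards, dones = [], []
        for i, act in enumerate(acts):
            delta = 1 if act == 1 else -1
            # Clamp the move so the agent stays on the strip
            self.pos[i] = min(max(self.pos[i] + delta, 0), self.size - 1)
            done = self.pos[i] == self.size - 1
            rewards.append(1.0 if done else 0.0)
            dones.append(done)
            if done:
                # Respawn finished agents so the batch keeps running
                self.pos[i] = self.rng.randrange(self.size - 1)
        return list(self.pos), rewards, dones


env = TinyGridEnv(n_agents=4, size=3)
obs = env.reset()
obs, rewards, dones = env.step([1, 1, 1, 1])  # everyone moves right
```

Stepping all agents in one call is what lets a single env object stand in for thousands of parallel games; a compiled version would run the same loop in Cython rather than in Python.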

Acknowledgements 🙌

I want to thank

and last but not least...
