Fast reinforcement learning 💨

Project description

flashrl

RL in seconds 💨 with ~200 lines of code (+ ~150 per env) 🤓

🛠️ Install with pip install flashrl or, if you want to modify envs, clone the repo and run pip install -r requirements.txt

Quick Start 🚀

  1. If installed via clone, compile envs: python setup.py build_ext --inplace
  2. Train: python train.py
  3. See the magic unfold in the terminal 🪄

Usage 💡

flashrl will always be short: read the code (+ paste it into ChatGPT) to understand it!

Here's a minimal example to get you going:

flashrl uses a Learner that holds an env and a model (default: LSTMPolicy)

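import flashrl as frl

# 2**14 = 16384 Pong agents are stepped in parallel
learn = frl.Learner(env=frl.envs.Pong(n_agents=2**14))
# 40 iterations with 16 rollout steps each; returns the training curves
curves = learn.fit(40, steps=16, pbar_desc='done')
frl.print_ascii_curve(curves['loss'], label='loss')  # loss per iteration
frl.render_ascii(learn, fps=10)  # replay the last rollout in the terminal
learn.env.close()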

.fit triggers RL with

  • 40 iterations...
  • ...16 steps per iteration...
  • ...in Pong holding 2**14=16384 agents

resulting in training on 40 * 16 * 16384 = 10,485,760 (~10 million) steps!
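
The step count follows directly from the three numbers above. A quick check in plain Python (the variable names are illustrative only, not flashrl identifiers):

```python
# Illustrative names only; these are not flashrl identifiers.
iters = 40         # iterations passed to .fit
steps = 16         # rollout steps per iteration
n_agents = 2**14   # agents held by the Pong env

total_steps = iters * steps * n_agents
print(f"{total_steps:,}")  # 10,485,760
```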

.fit takes the arguments

  • iters: Number of iterations
  • steps: Number of steps in rollout
  • pbar_desc: Progress bar description (default: 'reward')
  • log: If True, tensorboard logging is enabled
    • run tensorboard --logdir=runs and visit http://localhost:6006 in the browser!
  • lr, anneal_lr, target_kl + all args of ppo: Hyperparameters
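
To give a feel for what a flag like anneal_lr typically does, here is a minimal sketch of linear learning-rate annealing. This is an assumption: the text above does not say which schedule flashrl uses, and annealed_lr is a hypothetical helper, not part of flashrl.

```python
def annealed_lr(base_lr: float, iteration: int, total_iters: int) -> float:
    """Linearly decay base_lr towards 0 over total_iters (assumed schedule)."""
    if total_iters <= 0:
        raise ValueError("total_iters must be positive")
    # Fraction of training that is still ahead, clipped at 0.
    frac = 1.0 - iteration / total_iters
    return base_lr * max(frac, 0.0)


# Learning rate at a few points of a 40-iteration run.
for it in (0, 10, 20, 30):
    print(it, annealed_lr(0.01, it, 40))
```

With a schedule like this, early iterations take large updates and later ones progressively smaller ones.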

Take a look at train.py to see how to use the utility functions:

  • print_ascii_curve: Visualizes the loss across the iters
  • render_ascii: Shows data of the last rollout in the terminal
  • render_gif: Shows the same, saved as a GIF
  • print_table: Shows a table of values, acts, logprobs, reward and dones of the last rollout
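
For intuition about what an ASCII curve looks like, the sketch below plots a list of values as text rows. It is a hypothetical stand-in written for illustration; it is not flashrl's print_ascii_curve and makes no claim about its output format.

```python
def ascii_curve(values, height=5, label=""):
    """Return text rows that plot values, highest row first (hypothetical helper)."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for flat curves
    # Map each value to a row index in [0, height - 1]; 0 is the bottom row.
    rows_idx = [round((v - lo) / span * (height - 1)) for v in values]
    lines = []
    for row in range(height - 1, -1, -1):
        lines.append("".join("*" if r == row else " " for r in rows_idx))
    if label:
        lines.append(label)
    return lines


losses = [1.0, 0.8, 0.5, 0.3, 0.2, 0.1]
print("\n".join(ascii_curve(losses, label="loss")))
```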

Environments 🕹️

Each env is one Cython (.pyx) file in flashrl/envs. That's it!

To add custom envs, use grid.pyx, pong.pyx or multigrid.pyx as a template:

  • grid.pyx for single-agent envs
  • pong.pyx for 1 vs. 1 agent envs (AlphaZero-style)
  • multigrid.pyx for multi-agent envs

Env        Task
Grid       Agent must reach goal
Pong       Good old pong (1 vs. 1)
MultiGrid  Agent must reach goal first
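
To make the "agent must reach goal" idea concrete, here is a minimal pure-Python sketch of a batched single-agent grid env. Everything in it (the TinyGrid class, its reset/step methods, the 1-D layout, the reward of 1.0) is a hypothetical illustration. It does not mirror the interface of grid.pyx, which is written in Cython.

```python
class TinyGrid:
    """Batched 1-D grid: each agent walks left/right until it reaches the goal.

    Hypothetical illustration; not the interface of flashrl's grid.pyx.
    """

    def __init__(self, n_agents=4, size=5):
        self.n_agents, self.size = n_agents, size
        self.reset()

    def reset(self):
        self.pos = [0] * self.n_agents   # every agent starts on the left
        self.goal = self.size - 1        # the goal is the right-most cell
        return list(self.pos)

    def step(self, actions):
        """actions: one int per agent, 0 = left, 1 = right."""
        rewards, dones = [], []
        for i, a in enumerate(actions):
            move = 1 if a == 1 else -1
            # Clamp the new position to the grid bounds.
            self.pos[i] = min(max(self.pos[i] + move, 0), self.size - 1)
            done = self.pos[i] == self.goal
            rewards.append(1.0 if done else 0.0)
            dones.append(done)
            if done:
                self.pos[i] = 0  # auto-reset so the batch never stalls
        return list(self.pos), rewards, dones


env = TinyGrid(n_agents=2, size=3)
print(env.step([1, 0]))  # first agent moves right, second stays at the wall
print(env.step([1, 1]))  # first agent reaches the goal and is reset
```

Stepping all agents in one call is what lets a single env object hold thousands of agents.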

Acknowledgements 🙌

I want to thank

and last but not least...

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

flashrl-0.0.3-cp310-cp310-musllinux_1_2_x86_64.whl (1.6 MB)

Uploaded: CPython 3.10, musllinux: musl 1.2+ x86-64

flashrl-0.0.3-cp310-cp310-musllinux_1_2_i686.whl (1.5 MB)

Uploaded: CPython 3.10, musllinux: musl 1.2+ i686

flashrl-0.0.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.6 MB)

Uploaded: CPython 3.10, manylinux: glibc 2.17+ x86-64

flashrl-0.0.3-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl (1.5 MB)

Uploaded: CPython 3.10, manylinux: glibc 2.17+ i686, manylinux: glibc 2.5+ i686

File details

Details for the file flashrl-0.0.3-cp310-cp310-musllinux_1_2_x86_64.whl.

File metadata

  • Download URL: flashrl-0.0.3-cp310-cp310-musllinux_1_2_x86_64.whl
  • Upload date:
  • Size: 1.6 MB
  • Tags: CPython 3.10, musllinux: musl 1.2+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 colorama/0.4.4 importlib-metadata/4.6.4 keyring/23.5.0 pkginfo/1.8.2 readme-renderer/34.0 requests-toolbelt/0.9.1 requests/2.25.1 rfc3986/1.5.0 tqdm/4.57.0 urllib3/1.26.5 CPython/3.10.12

File hashes

Hashes for flashrl-0.0.3-cp310-cp310-musllinux_1_2_x86_64.whl
Algorithm Hash digest
SHA256 23678a90557059edbd7497a340f1f2629092df289734c9b9df9e0f85a088c8e0
MD5 4daa2793011961f976b5b46f932947c1
BLAKE2b-256 6961136ec4d3c981b9644daa00c77fb6c510786324094fa8abff69a9243f2864

See more details on using hashes here.
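
One way to use these digests is to compare a downloaded wheel against the listed SHA256 value. The sketch below uses only Python's standard hashlib. The helper name sha256_matches and the local file path in the usage note are assumptions for illustration.

```python
import hashlib
from pathlib import Path


def sha256_matches(path, expected_hex, chunk_size=1 << 20):
    """Return True if the file's SHA256 digest equals expected_hex."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()
```

Call it as sha256_matches("path/to/downloaded.whl", "<SHA256 from the table>") with the path where the wheel was saved. A False result means the file differs from the one that was uploaded.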

File details

Details for the file flashrl-0.0.3-cp310-cp310-musllinux_1_2_i686.whl.

File metadata

  • Download URL: flashrl-0.0.3-cp310-cp310-musllinux_1_2_i686.whl
  • Upload date:
  • Size: 1.5 MB
  • Tags: CPython 3.10, musllinux: musl 1.2+ i686
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 colorama/0.4.4 importlib-metadata/4.6.4 keyring/23.5.0 pkginfo/1.8.2 readme-renderer/34.0 requests-toolbelt/0.9.1 requests/2.25.1 rfc3986/1.5.0 tqdm/4.57.0 urllib3/1.26.5 CPython/3.10.12

File hashes

Hashes for flashrl-0.0.3-cp310-cp310-musllinux_1_2_i686.whl
Algorithm Hash digest
SHA256 37e418d5c40c6de78bfc6ae673200ad1518fb58941a78494cb81e9e43f29257f
MD5 e18409ef0f593d097d9f35909ec31f97
BLAKE2b-256 19dd4194cc7b367391bc17a90409b641e79fc754c7514a45c7245274c9d24aa3

File details

Details for the file flashrl-0.0.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File metadata

  • Download URL: flashrl-0.0.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
  • Upload date:
  • Size: 1.6 MB
  • Tags: CPython 3.10, manylinux: glibc 2.17+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 colorama/0.4.4 importlib-metadata/4.6.4 keyring/23.5.0 pkginfo/1.8.2 readme-renderer/34.0 requests-toolbelt/0.9.1 requests/2.25.1 rfc3986/1.5.0 tqdm/4.57.0 urllib3/1.26.5 CPython/3.10.12

File hashes

Hashes for flashrl-0.0.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 01f6d43434c820ef451fc54694075d7576e6543928cf3dcd345f1e60c90b9563
MD5 61cfce9c0d6b7f8c256318bdccdd518e
BLAKE2b-256 6d29c12d41e697d76348de3bb61d29ecc63e7ae2a0ac4ba526b9708f2e32913f

File details

Details for the file flashrl-0.0.3-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl.

File metadata

File hashes

Hashes for flashrl-0.0.3-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl
Algorithm Hash digest
SHA256 7fde147327573917e610487f1097332dd86f47e364cc114fe3cf630b67e13864
MD5 5fbf535302e18a9bf877296fffdedbcf
BLAKE2b-256 afdd2fde9e9bb1ba2a30781db3cd1fed60e2d50074eb942e0bd23e091c6fc411
