Fast reinforcement learning 💨

flashrl

RL in seconds 💨 with ~200 lines of code (+ ~150 per env) 🤓

🛠️ pip install flashrl, or if you want to modify envs, clone the repo and pip install -r requirements.txt

Quick Start 🚀

  1. If installed via clone, compile envs: python setup.py build_ext --inplace
  2. Train: python train.py
  3. See the magic unfold in the terminal 🪄

Usage 💡

flashrl will always be short: Read the code (+paste into ChatGPT) to understand it!

Here's a minimal example to get you going:

flashrl uses a Learner that holds an env and a model (default: LSTMPolicy)

import flashrl as frl

learn = frl.Learner(env=frl.envs.Pong(n_agents=2**14))
curves = learn.fit(40, steps=16, pbar_desc='done')
frl.print_ascii_curve(curves['loss'], label='loss')
frl.render_ascii(learn, fps=10)
learn.env.close()

.fit triggers RL with

  • 40 iterations...
  • ...16 steps per iteration...
  • ...in Pong holding 2**14=16384 agents

resulting in training with 40 * 16 * 16384 = 10,485,760 (~10 million) steps!
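The step count follows directly from the three numbers above. A quick check in plain Python (the variable names are illustrative only, not part of flashrl):

```python
# Values taken from the minimal example above
iters = 40          # number of iterations passed to .fit
steps = 16          # rollout steps per iteration
n_agents = 2**14    # agents held by the Pong env

total_steps = iters * steps * n_agents
print(f"{total_steps:,}")  # 10,485,760
```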

Tiny doc 📑

.fit takes the arguments

  • iters: Number of iterations
  • steps: Number of steps in rollout
  • pbar_desc: Progress bar description (default: 'reward')
  • log: If True, tensorboard logging is enabled
    • run tensorboard --logdir=runs and visit http://localhost:6006 in the browser!
  • lr, anneal_lr, target_kl + all args of ppo: Hyperparameters
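The list above names anneal_lr but doesn't say how the learning rate changes. A common choice in PPO implementations is linear decay from lr towards zero across the iterations. The sketch below shows that schedule purely as an assumption, not as what flashrl actually does. linear_lr is a hypothetical helper, not a flashrl function:

```python
def linear_lr(base_lr, it, iters):
    """Learning rate for iteration `it` (0-based) under linear decay towards zero.

    Assumption: this mirrors a typical PPO annealing schedule; check the
    flashrl source for the schedule it really uses.
    """
    if not 0 <= it < iters:
        raise ValueError("it must satisfy 0 <= it < iters")
    return base_lr * (1 - it / iters)


# Schedule for a short 4-iteration run starting at 0.01:
# roughly 0.01, 0.0075, 0.005, 0.0025
schedule = [linear_lr(0.01, it, 4) for it in range(4)]
print(schedule)
```

With a schedule like this, early iterations take large steps and later ones fine-tune, which is why annealing is often switched on by default.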

Take a look at train.py to see how to use the utility functions:

  • print_ascii_curve: Visualizes the loss across the iters
  • render_ascii: Shows data of the last rollout in the terminal
  • render_gif: Shows the same, saved as a GIF
  • print_table: Shows a table of values, acts, logprobs, rewards and dones of the last rollout
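To give a feel for what an ASCII curve in the terminal looks like, here is a tiny stand-in. It is not flashrl's print_ascii_curve: ascii_curve, its parameters and its output format are all made up for illustration.

```python
def ascii_curve(values, height=5, label=''):
    """Return a multi-line string that plots `values` as '*' marks.

    Hypothetical helper for illustration; flashrl's own plotting may differ.
    """
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat curve
    # Map each value to a row index: 0 = bottom row, height - 1 = top row
    rows = [round((v - lo) / span * (height - 1)) for v in values]
    lines = []
    for r in range(height - 1, -1, -1):  # emit the top row first
        lines.append(''.join('*' if row == r else ' ' for row in rows))
    if label:
        lines.append(label)
    return '\n'.join(lines)


# A loss that falls steadily draws a diagonal from top-left to bottom-right
print(ascii_curve([3, 2, 1], height=3, label='loss'))
```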

Environments 🕹️

Each env is one Cython (.pyx) file in flashrl/envs. That's it!

To add custom envs, use grid.pyx, pong.pyx or multigrid.pyx as a template:

  • grid.pyx for single-agent envs
  • pong.pyx for 1 vs. 1 agent envs (AlphaZero-style)
  • multigrid.pyx for multi-agent envs

Grid                   Pong                     MultiGrid
Agent must reach goal  Good old pong (1 vs. 1)  Agent must reach goal first
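To make the "agent must reach goal" idea concrete before opening grid.pyx, here is a toy batched env in plain Python. Everything in it is an assumption for illustration: the TinyGrid name, the 1-D layout, the step signature and the reward scheme are not taken from flashrl, whose real envs are written in Cython.

```python
class TinyGrid:
    """Hypothetical batched 1-D grid: each agent walks left/right to a goal cell."""

    def __init__(self, n_agents=4, size=5):
        self.n_agents, self.size = n_agents, size
        self.goal = size - 1           # goal is the right-most cell
        self.pos = [0] * n_agents      # every agent starts at cell 0

    def step(self, acts):
        """Apply one action per agent (0 = left, 1 = right).

        Returns (rewards, dones), one entry per agent.
        """
        rewards, dones = [], []
        for i, act in enumerate(acts):
            move = 1 if act == 1 else -1
            # Clamp to the grid so agents cannot walk off either end
            self.pos[i] = min(max(self.pos[i] + move, 0), self.size - 1)
            done = self.pos[i] == self.goal
            rewards.append(1.0 if done else 0.0)
            dones.append(done)
            if done:
                self.pos[i] = 0        # reset only the agent that finished
        return rewards, dones


env = TinyGrid(n_agents=3, size=3)
env.step([1, 1, 0])                    # agents 0 and 1 move right, agent 2 stays at 0
rewards, dones = env.step([1, 0, 1])   # agent 0 reaches the goal and is reset
print(rewards, dones)
```

The point of the sketch is the shape of the data: one action, reward and done flag per agent per step, so thousands of agents can be stepped in a single call.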

Acknowledgements 🙌

I want to thank

and last but not least...

Download files

Source Distributions

No source distribution files available for this release.

Built Distributions

  • flashrl-0.0.4-cp310-cp310-musllinux_1_2_x86_64.whl (1.6 MB): CPython 3.10, musllinux: musl 1.2+ x86-64
  • flashrl-0.0.4-cp310-cp310-musllinux_1_2_i686.whl (1.5 MB): CPython 3.10, musllinux: musl 1.2+ i686
  • flashrl-0.0.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.6 MB): CPython 3.10, manylinux: glibc 2.17+ x86-64
  • flashrl-0.0.4-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl (1.5 MB): CPython 3.10, manylinux: glibc 2.17+ i686, manylinux: glibc 2.5+ i686
