Cardio RL. In development...

Project description

GymCardio

Cardio replaces much of the boilerplate code in deep reinforcement learning algorithms, namely the data gathering and environment interaction, and also provides pre-written policies and replay buffers. It aims to do all of this in a modular fashion, letting users implement their own algorithms or improvements to existing ones.

The purpose of this library was initially to speed up my own self-implemented versions of algorithms and to reduce the code overhead, as well as the actual number of lines written per script for each algorithm. You'll often find that the algorithm details correspond to very little code, while handling the environment and buffer makes even simple implementations feel somewhat bloated. Cardio therefore aims to allow simple, easy-to-read, one-file implementations. To showcase the library, I have included a number of 'stubs' that are just that.

This is still very much a work in progress, and many of the (poorly organised) stubs do not work with the current version of the runner. Going forward, I will be chipping away at streamlining, organising and documenting this repository whenever I get the chance!

Cardio basics

This section will be overhauled at a later date, but for now the gist of Cardio is the Runner class: a simple interface that wraps the environment and, with one method call, steps through the environment, collects transitions, and processes them in your defined way. The runner supports n-step transitions, custom policies and custom processing. Currently the runner class is biased towards off-policy algorithms (although it fully supports on-policy approaches too), but going forward it will be better balanced between the two.
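To make the idea concrete, here is a minimal sketch of the runner pattern described above. Note this is not Cardio's actual API: the `Runner`, `DummyEnv`, `policy` and `process` names below are all hypothetical stand-ins, and a toy environment is used so the example is self-contained.

```python
import random


class DummyEnv:
    """Toy episodic environment with a Gym-like step/reset interface
    (hypothetical; stands in for a real gym environment)."""

    def reset(self):
        self.t = 0
        return 0.0  # initial observation

    def step(self, action):
        self.t += 1
        obs = float(self.t)
        reward = 1.0
        done = self.t >= 5  # fixed-length episodes of 5 steps
        return obs, reward, done


class Runner:
    """Wraps an environment; one call steps through it, collects
    transitions, and hands them to a user-defined processing function."""

    def __init__(self, env, policy, n_step=1):
        self.env = env
        self.policy = policy  # maps observation -> action
        self.n_step = n_step  # transitions gathered per call
        self.obs = env.reset()

    def step(self, process):
        transitions = []
        for _ in range(self.n_step):
            action = self.policy(self.obs)
            next_obs, reward, done = self.env.step(action)
            transitions.append((self.obs, action, reward, next_obs, done))
            # reset at episode boundaries so the runner can keep stepping
            self.obs = self.env.reset() if done else next_obs
        return process(transitions)


runner = Runner(DummyEnv(), policy=lambda obs: random.choice([0, 1]), n_step=3)
batch = runner.step(process=lambda ts: ts)  # identity processing for the demo
print(len(batch))  # 3 transitions gathered in one call
```

The point is the division of labour: the environment loop, episode resets and transition bookkeeping live in the runner, while the user supplies only a policy and a processing function, which is where the actual algorithm code goes.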

Immediate to do list

  • Create package!

    • basic implementation done, need to do PyPi and look into further improvements
  • Implement replay buffer class and move IET work into Cardio ecosystem

  • Remove warmup method in gatherer and make it a special call of the rollout method

Vectorised env work

  • Align VectorCollector with Collector

    • Revisit A2C implementation
    • Make Logger compatible with VectorCollector
    • Learning doesn't seem to line up with Stable Baselines3; need to debug all aspects (collector, value estimation and logger)
  • Make sure the vector collector works as intended for off-policy methods and the n-step collector works as intended for on-policy methods, etc.

Lower priority

  • Offline gatherer

    • on pause until mujoco sorted
  • Minor refactor to gatherer and runner, add default arg values, careful consideration needed

    • mostly done, just some final decisions to make
    • rename Collector to Gatherer; the two names are needlessly different
  • Move agent stubs outside of src, and use the package instead of src

  • Organise policies better, e.g. into discrete and continuous

  • Benchmark each implementation against SB3 (change logging to timestep-based first, though)

Completed

  • Implement multibatch sampling for off-policy runner

  • Add episode length to logger and use the same names as SB3 for easy integration!

  • Parallel gatherer

  • Change logging from episodic to timestep based

    • include window and log_interval arguments to gatherer
  • Implement 'step_for' function in Collector!

  • Create dummy env for testing logging, collection and sampling!

  • Implement 'reduce' argument for n-step learning that returns unsqueezed vectors (for DRQN)

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

cardio_rl-0.0.5.tar.gz (12.7 kB)

Uploaded Source

Built Distribution

cardio_rl-0.0.5-py3-none-any.whl (13.2 kB)

Uploaded Python 3

File details

Details for the file cardio_rl-0.0.5.tar.gz.

File metadata

  • Download URL: cardio_rl-0.0.5.tar.gz
  • Upload date:
  • Size: 12.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.14

File hashes

Hashes for cardio_rl-0.0.5.tar.gz
Algorithm Hash digest
SHA256 b434757b6abb5f955476a3ad55c2d076b8143a0ad215c93bdd1a29dbe590ffe0
MD5 1ea2505c87fa105fca774833e1a009bc
BLAKE2b-256 c3ca9234f63cd028bdf8040115e461fed57ecda1b912639a9b0aea8e378790ac

See the PyPI documentation for more details on using hashes.

File details

Details for the file cardio_rl-0.0.5-py3-none-any.whl.

File metadata

  • Download URL: cardio_rl-0.0.5-py3-none-any.whl
  • Upload date:
  • Size: 13.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.14

File hashes

Hashes for cardio_rl-0.0.5-py3-none-any.whl
Algorithm Hash digest
SHA256 9de7af74ba7287740c1b290af79d4e0f626616dd44c87e67887a452b12289f8e
MD5 e13ae0b35c6a50fa02102e45f06dca2e
BLAKE2b-256 da1f654b73664c8160cb1d62e021a412f8901b3253d21caf77b5188ee950a162

See the PyPI documentation for more details on using hashes.
