Cardio RL. In development...
GymCardio
Cardio aims to replace much of the boilerplate code in deep reinforcement learning algorithms, namely data gathering and environment interaction, while also providing pre-written policies and replay buffers. It aims to do so in a modular fashion so that users can implement their own algorithms or improve existing ones.
The purpose of this library was initially to speed up my own implementations of algorithms by reducing code overhead and the number of lines written per script for each algorithm. You'll often find that the actual algorithm details correspond to very little code, while handling the environment and buffer makes even simple implementations feel somewhat bloated. Cardio therefore hopes to allow for simple, easy-to-read, one-file implementations. To showcase the library I have included a number of 'stubs' that are just that.
This is still very much a work in progress, and many of the (poorly organised) stubs do not work with the current version of the runner. Going forward I will be chipping away at streamlining, organising and documenting this repository when I get the chance!
Cardio basics
This section will be overhauled at a later date, but for now the gist of Cardio is the Runner class, which gives you a simple interface that wraps the environment and, with one method, steps through the environment, collects transitions and processes them in the way you define. The runner supports n-step transitions, custom policies and custom processing. Currently the Runner class is designed with off-policy algorithms in mind (although it fully supports on-policy approaches too), but going forward it will be better balanced between the two.
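To make the idea concrete, here is a minimal, dependency-free sketch of what a runner of this kind does: wrap an environment, step it with a policy, and hand back n-step transitions. It is not Cardio's actual API. `CountingEnv`, `SketchRunner`, the `collect` method and the transition tuple layout are all hypothetical names chosen for illustration.

```python
from collections import deque


class CountingEnv:
    """Hypothetical toy env: the state is a step counter and every reward is 1.0."""

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t >= self.episode_length
        return self.t, 1.0, done


class SketchRunner:
    """Illustrative stand-in for a runner; NOT Cardio's real Runner class."""

    def __init__(self, env, policy, n_step=1, gamma=0.99):
        self.env = env
        self.policy = policy
        self.n_step = n_step
        self.gamma = gamma
        self.state = env.reset()
        self.window = deque()  # most recent 1-step transitions

    def _merge(self):
        # Collapse the current window into a single n-step transition:
        # first state/action, discounted reward sum, last next_state/done.
        s, a = self.window[0][0], self.window[0][1]
        ret = sum(self.gamma ** i * tr[2] for i, tr in enumerate(self.window))
        s_next, done = self.window[-1][3], self.window[-1][4]
        return (s, a, ret, s_next, done)

    def collect(self, num_steps):
        """Step the env num_steps times and return the finished transitions."""
        out = []
        for _ in range(num_steps):
            action = self.policy(self.state)
            next_state, reward, done = self.env.step(action)
            self.window.append((self.state, action, reward, next_state, done))
            if len(self.window) == self.n_step:
                out.append(self._merge())
                self.window.popleft()
            if done:
                # At episode end, flush the remaining (shorter) windows.
                while self.window:
                    out.append(self._merge())
                    self.window.popleft()
                self.state = self.env.reset()
            else:
                self.state = next_state
        return out


# Usage: a constant policy, 3-step transitions, one full episode.
runner = SketchRunner(CountingEnv(episode_length=5), lambda s: 0, n_step=3, gamma=0.5)
batch = runner.collect(5)
print([tr[2] for tr in batch])  # [1.75, 1.75, 1.75, 1.5, 1.0]
```

In this sketch the last transitions of an episode cover fewer than `n_step` steps, which is one common way to treat episode boundaries; a real implementation may handle this differently, and would typically push the result into a replay buffer or a user-defined processing function rather than returning a list.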
Immediate to do list
- Create package!
  - basic implementation done, need to do PyPI and look into further improvements
- Implement replay buffer class and move IET work into Cardio ecosystem
- Remove warmup method in gatherer and make it a special call of the rollout method
Vectorised env work
- Align VectorCollector with Collector
  - Revisit A2C implementation
  - Make Logger compatible with VectorCollector
  - Learning doesn't seem to line up with Stable Baselines3; need to debug all aspects (collector, value estimation and logger)
- Make sure the vector collector works as intended for off-policy methods and the n-step collector works as intended for on-policy methods, etc.
Lower priority
- Offline gatherer
  - on pause until MuJoCo is sorted
- Minor refactor to gatherer and runner, add default arg values; careful consideration needed
  - mostly done, just some final decisions to make
  - change collector name to gatherer, idk why it's different
- Move agent stubs outside of src and use the package instead of src
- Sort policies better, i.e. discrete, continuous
- Benchmark each implementation w.r.t. SB3 (change logging to timestep-based first though)
Completed
- Implement multibatch sampling for off-policy runner
- Add episode length to logger and use the same names as SB3 for easy integration!
- Parallel gatherer
- Change logging from episodic to timestep based
  - include window and log_interval arguments to gatherer
- Implement 'step_for' function in Collector!
- Create dummy env for testing logging, collection and sampling!
- Implement 'reduce' argument for n-step learning that returns unsqueezed vectors (for DRQN)
File details
Details for the file cardio_rl-0.0.5.tar.gz.
File metadata
- Download URL: cardio_rl-0.0.5.tar.gz
- Upload date:
- Size: 12.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.10.14
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | b434757b6abb5f955476a3ad55c2d076b8143a0ad215c93bdd1a29dbe590ffe0 |
MD5 | 1ea2505c87fa105fca774833e1a009bc |
BLAKE2b-256 | c3ca9234f63cd028bdf8040115e461fed57ecda1b912639a9b0aea8e378790ac |
File details
Details for the file cardio_rl-0.0.5-py3-none-any.whl.
File metadata
- Download URL: cardio_rl-0.0.5-py3-none-any.whl
- Upload date:
- Size: 13.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.10.14
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 9de7af74ba7287740c1b290af79d4e0f626616dd44c87e67887a452b12289f8e |
MD5 | e13ae0b35c6a50fa02102e45f06dca2e |
BLAKE2b-256 | da1f654b73664c8160cb1d62e021a412f8901b3253d21caf77b5188ee950a162 |