A Library for Deep Reinforcement Learning
JoyRL
JoyRL is a parallel reinforcement learning library based on PyTorch and Ray. Unlike existing RL libraries, JoyRL frees users from the burden of implementing algorithms with tricky details and unfriendly APIs. JoyRL is designed so that users can train and test RL algorithms with nothing more than a hyperparameter configuration, which is much easier for beginners to learn and use. JoyRL also supports plenty of state-of-the-art RL algorithms, including RLHF (a core technique behind ChatGPT); see the algorithms table below. In addition, JoyRL provides a modularized framework so users can customize their own algorithms and environments.
Install
# you need to install Anaconda first
conda create -n joyrl python=3.8
conda activate joyrl
pip install -U joyrl
Torch install:
# CPU only
conda install pytorch==1.10.0 torchvision==0.11.0 torchaudio==0.10.0 cpuonly -c pytorch
# GPU
conda install pytorch==1.10.0 torchvision==0.11.0 torchaudio==0.10.0 cudatoolkit=11.3 -c pytorch -c conda-forge
# GPU with mirrors (use if the conda install fails with network errors)
pip install torch==1.10.0+cu113 torchvision==0.11.0+cu113 torchaudio==0.10.0 --extra-index-url https://download.pytorch.org/whl/cu113
Usage
The following demo shows how to use joyrl; you do not need to worry about complicated implementation details. All you need to do is set the hyperparameters in GeneralConfig()
and AlgoConfig()
, as also shown in the examples folder. Well-trained results are available in the benchmarks folder.
import joyrl

class GeneralConfig():
    def __init__(self) -> None:
        self.env_name = "CartPole-v1" # name of environment
        self.algo_name = "DQN" # name of algorithm
        self.mode = "train" # train or test
        self.seed = 0 # random seed
        self.device = "cpu" # device to use
        self.train_eps = 100 # number of episodes for training
        self.test_eps = 20 # number of episodes for testing
        self.eval_eps = 10 # number of episodes for evaluation
        self.eval_per_episode = 5 # evaluate every `eval_per_episode` episodes
        self.max_steps = 200 # max steps for each episode
        self.load_checkpoint = False
        self.load_path = "tasks" # path to load model
        self.show_fig = False # show figure or not
        self.save_fig = True # save figure or not

class AlgoConfig():
    def __init__(self) -> None:
        # setting epsilon_start = epsilon_end yields a fixed epsilon = epsilon_end
        self.epsilon_start = 0.95 # epsilon start value
        self.epsilon_end = 0.01 # epsilon end value
        self.epsilon_decay = 500 # epsilon decay rate
        self.gamma = 0.95 # discount factor
        self.lr = 0.0001 # learning rate
        self.buffer_size = 100000 # size of replay buffer
        self.batch_size = 64 # batch size
        self.target_update = 4 # target network update frequency
        self.value_layers = [
            {'layer_type': 'linear', 'layer_dim': ['n_states', 256],
             'activation': 'relu'},
            {'layer_type': 'linear', 'layer_dim': [256, 256],
             'activation': 'relu'},
            {'layer_type': 'linear', 'layer_dim': [256, 'n_actions'],
             'activation': 'none'}]

if __name__ == "__main__":
    general_cfg = GeneralConfig()
    algo_cfg = AlgoConfig()
    joyrl.run(general_cfg, algo_cfg)
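To see how the three epsilon hyperparameters interact, here is a sketch of the exponential annealing schedule commonly paired with `epsilon_start`, `epsilon_end`, and `epsilon_decay` in DQN implementations. This is an illustrative formula, not necessarily JoyRL's internal one:

```python
import math

def epsilon(step, start=0.95, end=0.01, decay=500):
    """Exponentially anneal epsilon from `start` toward `end`.

    A hypothetical schedule: at step 0 it equals `start`, and it
    decays toward `end` with time constant `decay` (in steps).
    """
    return end + (start - end) * math.exp(-step / decay)

print(epsilon(0))                        # equals epsilon_start at step 0
print(epsilon(5000))                     # close to epsilon_end after many steps
print(epsilon(100, start=0.1, end=0.1))  # start == end gives a fixed epsilon
```

This also explains the comment in `AlgoConfig`: with `start == end` the exponential term is multiplied by zero, so exploration stays constant throughout training.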
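The `value_layers` entry describes the Q-network as a list of layer specs, with the placeholders `'n_states'` and `'n_actions'` resolved from the environment at build time. As a rough illustration of how such a spec could be interpreted, the hypothetical helper below resolves the placeholders and counts the parameters of each linear layer; JoyRL's actual network builder is internal and may differ:

```python
def count_linear_params(layers, n_states, n_actions):
    """Count weights + biases for a list of linear-layer specs.

    Illustrative helper (not part of the JoyRL API): resolves the
    'n_states'/'n_actions' placeholders, then sums in*out + out
    parameters per layer.
    """
    names = {'n_states': n_states, 'n_actions': n_actions}
    total = 0
    for spec in layers:
        in_dim, out_dim = (names.get(d, d) for d in spec['layer_dim'])
        total += in_dim * out_dim + out_dim  # weight matrix + bias vector
    return total

# The same spec as in AlgoConfig above
value_layers = [
    {'layer_type': 'linear', 'layer_dim': ['n_states', 256], 'activation': 'relu'},
    {'layer_type': 'linear', 'layer_dim': [256, 256], 'activation': 'relu'},
    {'layer_type': 'linear', 'layer_dim': [256, 'n_actions'], 'activation': 'none'},
]

# CartPole-v1 has a 4-dimensional state and 2 discrete actions
print(count_linear_params(value_layers, n_states=4, n_actions=2))
```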
Documentation
More tutorials and API documentation are hosted on https://datawhalechina.github.io/joyrl/
Algorithms
Name | Reference | Author | Notes
---|---|---|---
DQN | DQN Paper | johnjim0816 | 