Sweet-RL

The sweetest Reinforcement Learning framework

Why Sweet-RL

Dozens of Reinforcement Learning frameworks and algorithm implementations already exist, yet most of them suffer from poor modularity and are hard to understand. That is why I started to implement my own: Sweet-RL. It's so sweet that you can switch from TensorFlow 2.1 to PyTorch 1.4 with a single configuration line.
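For instance, with the command-line entry point described below, only the --ml flag changes between backends:

python -m sweet.run --env=CartPole-v0 --algo=dqn --ml=tf
python -m sweet.run --env=CartPole-v0 --algo=dqn --ml=torch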

Getting started

Install sweet-rl

First, create a virtualenv:

python3.x -m venv ~/.virtualenvs/sweet/ 
# or: virtualenv ~/.virtualenvs/sweet/ -p python3
source ~/.virtualenvs/sweet/bin/activate

Then, install the project dependencies:

make install # or pip install -e .
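As a quick sanity check, the help of the run module (whose options are listed in the next section) should print without errors:

python -m sweet.run --help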

First execution

Run a DQN training:

python -m sweet.run --env=CartPole-v0 --algo=dqn --ml=tf

# Parameters:
#   -h, --help            show this help message and exit
#   --env ENV             Environment to play with
#   --algo ALGO           RL agent
#   --ml ML               ML platform (tf or torch)
#   --model MODEL         Model (dense, pi_actor_critic)
#   --timesteps TIMESTEPS
#                         Number of training steps
#   --output OUTPUT       Output directory (eg. './target/')
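For example, a longer DQN run on the PyTorch backend, writing outputs to a custom directory (the values are purely illustrative):

python -m sweet.run --env=CartPole-v0 --algo=dqn --ml=torch --timesteps=50000 --output=./target/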

Custom neural network

If you want to specify your own model instead of the default ones, take a look at sweet.agents.dqn.experiments.train_custom_model; a minimal sketch follows.
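The sketch below assumes the TensorFlow backend and only builds the network with Keras; wiring it into the agent is exactly what train_custom_model demonstrates, so nothing Sweet-RL-specific is assumed here.

import tensorflow as tf

def custom_model(input_shape, n_actions):
    # Small MLP meant to replace the default 'dense' model (illustrative only)
    inputs = tf.keras.Input(shape=input_shape)
    x = tf.keras.layers.Dense(128, activation='relu')(inputs)
    x = tf.keras.layers.Dense(128, activation='relu')(x)
    outputs = tf.keras.layers.Dense(n_actions, activation='linear')(x)
    return tf.keras.Model(inputs=inputs, outputs=outputs)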

Features, algorithms implemented

Algorithms

Algorithm   Implementation status   ML platform
DQN         ✔️                      TF2, Torch
A2C         ✔️                      TF2, Torch
PPO         Soon                    -
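The agent is selected with the --algo flag of the run command; assuming the same lowercase naming as dqn, an A2C training on CartPole would look like this:

python -m sweet.run --env=CartPole-v0 --algo=a2c --ml=tf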

IO: Logs, model, tensorboard events

Outputs are configurable in the training function:

from pathlib import Path

targets: dict = {
    'output_dir': Path('./target/'),      # Main directory to store your outputs
    'models_dir': 'models_checkpoints',   # Saved models (depending on model_checkpoint_freq)
    'logs_dir': 'logs',                   # Logs (info, debug, errors)
    'tb_dir': 'tb_events'                 # TensorBoard events
}

Models are saved at the frequency given by the model_checkpoint_freq parameter of the train function.

Benchmark

To reproduce the benchmark, execute:

python -m sweet.benchmark.benchmark_runner

Here is an example of a benchmark between TF 2.0 and Torch 1.4 on the CartPole-v0 environment:

[Benchmark figure: TF 2.0 vs Torch 1.4 on CartPole-v0]

Troubleshooting

History/Author

I started this open-source RL framework in January 2020, initially to benefit from TensorFlow 2.x readability without sacrificing performance. Besides this open-source project, I work for both Airbus and IRT Saint-Exupéry on Earth Observation satellites. Our team focuses on mission planning for satellites, and Reinforcement Learning is one approach to solving it. Feel free to reach out: Adrien HADJ-SALAH @linkedin

You are welcome to contribute to this project.

RL related topics

  • What is Reinforcement Learning? This project assumes some knowledge of RL; if you are new to it, take a look at Spinning Up from OpenAI.
