Cleanest Deep Reinforcement Learning Implementation Based on Web MVC

mvc-drl

Clean deep reinforcement learning code based on the Web MVC architecture, with complete unit tests.

motivation

Implementations of deep reinforcement learning algorithms easily become messy, because the interaction loop between an environment and an agent creates many dependencies among classes. Even plain deep learning requires special skill to keep the code clean.

To think outside the box: Web engineers have spent years refining the MVC (model-view-controller) architecture to build tidy systems that handle interaction between the Web and its users. I found that MVC is a very useful insight for deep reinforcement learning implementations as well. It points toward an architecture with fewer dependencies, which in turn makes unit testing easier.
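To make the idea concrete, here is a minimal sketch of how an MVC-style split could look for an agent-environment loop. All class names (`Network`, `Controller`, `View`, `CountdownEnv`) are hypothetical and chosen for illustration; they are not taken from the mvc-drl source, and the model is a random stub rather than a neural network.

```python
import random


class Network:
    """Model: holds parameters and answers inference requests (a stub here)."""

    def infer(self, obs):
        # A real model would run a neural network; this stub picks randomly.
        return random.choice([0, 1])


class Controller:
    """Controller: turns observations into actions and records transitions."""

    def __init__(self, network):
        self.network = network
        self.transitions = []

    def step(self, obs, reward, done):
        action = self.network.infer(obs)
        self.transitions.append((obs, action, reward, done))
        return action


class View:
    """View: the only component that talks to the environment."""

    def __init__(self, controller):
        self.controller = controller

    def run_episode(self, env):
        obs, reward, done = env.reset(), 0.0, False
        total = 0.0
        while not done:
            action = self.controller.step(obs, reward, done)
            obs, reward, done = env.step(action)
            total += reward
        return total


class CountdownEnv:
    """Hypothetical toy environment that ends after a fixed number of steps."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= self.horizon


controller = Controller(Network())
episode_reward = View(controller).run_episode(CountdownEnv(horizon=5))
```

Because each layer only depends on the one below it, the controller in this sketch can be unit tested with a fake network and without any environment at all.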

installation

nvidia-docker

You can use Docker to set up and run experiments.

$ ./scripts/build.sh

Once you have built the image, you can start a container with the NVIDIA runtime via ./scripts/up.sh.

$ ./scripts/up.sh
root@a84ab59aa668:/home/app#  ls
Dockerfile  README.md    example.confing.json  graphs            mvc      scripts  tests
LICENSE     examples     logs                  requirements.txt  test.sh  tools
root@a84ab59aa668:/home/app#

manual

You need to install the packages listed in requirements.txt, plus TensorFlow.

$ pip install -r requirements.txt
$ pip install tensorflow-gpu tensorflow-probability-gpu
# if you run example scripts
$ pip install pybullet roboschool

If you have trouble installing TensorFlow Probability, check that it is compatible with your TensorFlow version.

install as a library

This repository is also available on PyPI, so you can build additional algorithms on top of mvc-drl.

$ pip install mvc

Warning: This repository is under development, so interfaces may change frequently.

algorithms

For academic use, we provide baseline implementations that you may need for comparison.

Proximal Policy Optimization
Deep Deterministic Policy Gradients
Soft Actor-Critic

Ant performance

Each point represents the average evaluation reward over 10 episodes. Performance is roughly the same as reported in the Soft Actor-Critic paper.
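As an illustration of how one such point could be computed, the sketch below averages the total reward of 10 evaluation episodes. The `evaluate` function, the `FixedLengthEnv` stand-in, and the `(obs, reward, done)` step signature are assumptions made for this example, not the evaluation code used in this repository.

```python
def evaluate(policy, env, episodes=10):
    """Return the mean undiscounted episode reward over `episodes` rollouts."""
    returns = []
    for _ in range(episodes):
        obs, done, total = env.reset(), False, 0.0
        while not done:
            obs, reward, done = env.step(policy(obs))
            total += reward
        returns.append(total)
    return sum(returns) / len(returns)


class FixedLengthEnv:
    """Hypothetical stand-in environment: 3 steps, reward equals the action."""

    def reset(self):
        self.t = 0
        return 0.0

    def step(self, action):
        self.t += 1
        return float(self.t), float(action), self.t >= 3


# A constant policy that always outputs 2 earns 2 per step for 3 steps.
mean_reward = evaluate(lambda obs: 2, FixedLengthEnv(), episodes=10)
```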

PPO

$ python -m examples.ppo --env Ant-v2

[Figure: PPO evaluation reward on Ant-v2]

DDPG

$ python -m examples.ddpg --env Ant-v2

[Figure: DDPG evaluation reward on Ant-v2]

SAC

$ python -m examples.sac --env Ant-v2 --reward-scale 5

[Figure: SAC evaluation reward on Ant-v2]

comparison

log visualization

All logging data is saved under the logs directory, both as CSV files and as data for the visualization tool. Use the --log-adapter option in the example scripts to choose between tensorboard and visdom for visualization (default: tensorboard).

tensorboard

$ tensorboard --logdir logs

visdom

To use visdom, you need to fill in the host information of a visdom server.

$ mv example.config.json config.json
$ vim config.json # fill visdom section

Before running experiments, start the visdom server.

$ visdom

matplotlib

You can plot results with tools/plot_csv.py by pointing it directly at the CSV files.

$ python tools/plot_csv.py <path to csv> <path to csv> ...

By default, legend entries are set to the file paths. To set them manually, use the --label option once per file.

$ python tools/plot_csv.py --label=experiment1 --label=experiment2 <path to csv> <path to csv>
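The label behavior described above can be sketched with `argparse`. This is not the actual implementation of tools/plot_csv.py; the `parse_args` helper and the length check are assumptions that only mirror the documented command line (repeated `--label` options, falling back to file paths).

```python
import argparse


def parse_args(argv):
    parser = argparse.ArgumentParser(description="Plot CSV logs (sketch).")
    parser.add_argument("paths", nargs="+", help="CSV files to plot")
    # action="append" collects every --label occurrence into a list.
    parser.add_argument("--label", action="append", default=None)
    args = parser.parse_args(argv)
    # Fall back to the file paths when no labels are given.
    labels = args.label if args.label else list(args.paths)
    if len(labels) != len(args.paths):
        parser.error("number of --label options must match number of paths")
    return list(zip(labels, args.paths))


pairs = parse_args(["--label=experiment1", "--label=experiment2", "a.csv", "b.csv"])
defaults = parse_args(["a.csv", "b.csv"])
```

Each returned `(label, path)` pair could then be passed to a plotting call, with the label used as the legend entry.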

unit testing

To guarantee code quality, all functions and classes, including neural networks, must have unit tests.

The following command runs all unit tests under the tests directory.

$ ./test.sh
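For readers new to this style, a unit test for a small reinforcement learning helper might look like the sketch below. The `discounted_returns` function and the test case are hypothetical examples written for this README, not code from the tests directory.

```python
import unittest


def discounted_returns(rewards, gamma):
    """Compute G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns, running = [], 0.0
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    return returns[::-1]


class DiscountedReturnsTest(unittest.TestCase):
    def test_known_values(self):
        # Hand-computed: 1 + 0.5 * 1.5 = 1.75, 1 + 0.5 * 1 = 1.5, 1.
        self.assertEqual(discounted_returns([1.0, 1.0, 1.0], 0.5), [1.75, 1.5, 1.0])

    def test_empty_episode(self):
        self.assertEqual(discounted_returns([], 0.99), [])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(DiscountedReturnsTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Small, pure functions like this one are the easiest to cover, which is one reason an architecture with few dependencies between classes helps testing.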

Release history

1.1.2 (this version): Apr 25, 2019
1.1.1: Mar 8, 2019
1.1.0: Feb 14, 2019
1.0.0: Feb 13, 2019

Download files

Source Distributions

No source distribution files available for this release.

Built Distribution

mvc-1.1.2-py3-none-any.whl (41.3 kB)

Uploaded Apr 25, 2019 Python 3

File details

Details for the file mvc-1.1.2-py3-none-any.whl.

File metadata

Download URL: mvc-1.1.2-py3-none-any.whl
Upload date: Apr 25, 2019
Size: 41.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/1.12.1 pkginfo/1.5.0.1 requests/2.21.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.30.0 CPython/3.6.1

File hashes

Hashes for mvc-1.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`78d8adcb16926761ff30fd7c02b83ac7fc48d04b584ccff271724e7af91fdc83`
MD5	`eceb66b2c7043587aeae6a6602d817ab`
BLAKE2b-256	`1f42082756920c67c20d248c72f655b7f30c02f03f2bc36c053929be35d577d2`
