A concurrent wrapper for the OpenAI Gym library that runs multiple environments concurrently.
For reinforcement learning and concurrency lovers out there ...
- Mostly the same API as gym, except now multiple environments are run.
- Environments run concurrently, which means speedup for time-consuming operations such as backprop, render, etc.
In short: faster training* without consuming more CPU power.
What exactly is concurrency?
Maybe you have heard of parallel computing? When we execute things in parallel, we run the program on multiple processors, which offers a significant speedup. Concurrent computing has a broader meaning, though. A concurrent program is one designed so that it does not have to execute sequentially, and so that it may one day be executed in parallel**. A concurrent program can run on a single processor or on multiple processors. Its tasks may communicate with each other, but each keeps a private state hidden from the others.
Why do we need concurrency on a single processor?
Some tasks, by nature, take a long time to complete. Downloading a file, for example. Without concurrency, the processor would have to wait for the task to complete before starting the next one. With concurrency, we can temporarily suspend the current task and come back to it when it finishes, all without using extra computing power.
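For the flavor of it, here is a minimal sketch of this idea using Python's asyncio; the `download` coroutine and its timings are invented for illustration:

```python
import asyncio

async def download(name, seconds):
    # Pretend to download a file; await suspends this task so the
    # event loop can run other tasks in the meantime.
    print(f"start {name}")
    await asyncio.sleep(seconds)
    print(f"finished {name}")

async def main():
    # Both "downloads" run concurrently on a single processor:
    # total time is roughly 2 seconds, not 3.
    await asyncio.gather(download("a.zip", 2), download("b.zip", 1))

asyncio.run(main())
```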
So much for introducing concurrency... now, what is gym?
OpenAI gym is a Python library for reinforcement learning research. Reinforcement learning is a branch of control theory, focusing mainly on agents interacting with environments. OpenAI gym provides numerous environments for people to benchmark their beloved reinforcement learning algorithms. For your agents to train in a gym, they say.
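For reference, a minimal sketch of the standard single-environment gym loop looks like this (random actions on the classic CartPole task):

```python
import gym

env = gym.make("CartPole-v0")
for _ in range(10):
    observation = env.reset()
    done = False
    while not done:
        action = env.action_space.sample()  # random policy
        observation, reward, done, info = env.step(action)
env.close()
```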
Um, so why do we need agymc, you say?
Despite its merits, OpenAI gym has one major drawback: it is designed to run only one environment on a processor at a time. What if you want to run multiple environments on the same processor? Well, they will run, sequentially. Which means slow, if you want to train a robot in batches.
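To see why, here is a sketch of what a naive multi-environment setup in plain gym looks like; every call blocks, so stepping and rendering happen strictly one environment after another:

```python
import gym

num_envs = 8
envs = [gym.make("CartPole-v0") for _ in range(num_envs)]
observations = [env.reset() for env in envs]

# Each call blocks until it finishes, so the total time is the
# sum of the individual render/step times -- purely sequential.
for env in envs:
    env.render()
actions = [env.action_space.sample() for env in envs]
results = [env.step(a) for env, a in zip(envs, actions)]
```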
With env.render as our bottlenecking operation and 200 environments running, our agymc version completes 50 episodes in 4 minutes, while the naive gym version takes around twice as long. This is what the madness looks like:
Wow, how to use agymc?
agymc, which combines the power of the Python async API and OpenAI gym (hence the name), is designed so that users need only minimal changes to their existing gym code. Except now the returns are in batches (lists). And except several environments are run asynchronously.
Example Usage Code Snippet
```python
import argparse

import agymc

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--num-envs", type=int)
    parser.add_argument("--episodes", type=int)
    parser.add_argument("--render", action="store_true")
    parser.add_argument("--verbose", action="store_true")
    flags = parser.parse_args()
    num_envs = flags.num_envs
    num_episodes = flags.episodes
    render = flags.render
    verbose = flags.verbose
    envs = agymc.make("CartPole-v0", num_envs)
    if verbose:
        import tqdm

        iterable = tqdm.tqdm(range(num_episodes))
    else:
        iterable = range(num_episodes)
    for _ in iterable:
        done = list(False for _ in range(num_envs))
        envs.reset()
        while not all(done):
            if render:
                envs.render()
            action = envs.action_space.sample()
            (_, _, done, _) = envs.step(action)
    envs.close()
```
* When doing pure gym operations such as sampling and stepping, this library runs slower, since it is a wrapper around gym. However, for actions that take a while to execute, such as backprop and update, sending data back and forth, or even rendering, concurrency makes the operations execute much faster than a naive gym implementation.
** If you would like to learn more about concurrency patterns, this video is really informative.