ivy-gym

Fully differentiable reinforcement learning environments, written in Ivy.

https://github.com/ivy-dl/ivy-dl.github.io/blob/master/img/externally_linked/logos/supported/frameworks.png?raw=true

Overview
Run Through
Optimization Demos
Get Involved

Overview

What is Ivy Gym?

Ivy Gym opens the door for intersectional research between supervised learning (SL), reinforcement learning (RL), and trajectory optimization (TO), by implementing RL environments in a fully differentiable manner.

Specifically, Ivy Gym provides differentiable implementations of the classic control tasks from OpenAI Gym, as well as a new Swimmer task, which illustrates the simplicity of creating new tasks using Ivy. Because the environments are differentiable, the cumulative reward can be optimized directly with gradients in a supervised manner, without the reinforcement learning methods that are the de facto approach to optimizing cumulative rewards. Check out the docs for more info!

The library is built on top of the Ivy machine learning framework. This means all environments simultaneously support JAX, TensorFlow, PyTorch, MXNet, and NumPy.

Ivy Libraries

There are a host of derived libraries written in Ivy, in the areas of mechanics, 3D vision, robotics, gym environments, neural memory, pre-trained models + implementations, and builder tools with trainers, data loaders and more.

Quick Start

Ivy Gym can be installed with pip:

pip install ivy-gym

To quickly see the different environments provided, check out the demos. We suggest you start by running the script run_through.py and reading the “Run Through” section below, which explains this script.

For demos which optimize performance on the different tasks, we suggest you run either optimize_trajectory.py or optimize_policy.py in the optimization demos folder.

Run Through

The different environments can be visualized via a simple script, which executes random motion for 250 steps in one of the environments. The script is available in the demos folder, as the file run_through.py. First, we select a random backend framework to use for the examples, from the options ivy.jax, ivy.tensorflow, ivy.torch, ivy.mxnet or ivy.numpy, and use this to set the Ivy backend framework.

import ivy
from ivy_demo_utils.framework_utils import choose_random_framework
ivy.set_framework(choose_random_framework())

We then select an environment to use and execute 250 random actions, while rendering the environment after each step.

By default, the demos all use the CartPole environment, but this can be changed using the --env argument, choosing from the options CartPole, Pendulum, MountainCar, Reacher or Swimmer.

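import ivy_gym

# Environment name; the demo script takes this from the --env argument
env_str = 'CartPole'
env = getattr(ivy_gym, env_str)()

# Reset, then apply 250 uniformly random actions, rendering after each step
env.reset()
ac_dim = env.action_space.shape[0]
for _ in range(250):
    ac = ivy.random_uniform(-1, 1, (ac_dim,))
    env.step(ac)
    env.render()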
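The --env option mentioned above could be parsed along the following lines. This is only a sketch using the standard argparse module: the helper name parse_env and the list ENV_NAMES are hypothetical, and the actual run_through.py may structure its argument handling differently.

```python
import argparse

# The five environment names listed in the text above
ENV_NAMES = ['CartPole', 'Pendulum', 'MountainCar', 'Reacher', 'Swimmer']


def parse_env(argv=None):
    """Return the environment name chosen via --env (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description='Run random actions in one environment.')
    parser.add_argument('--env', default='CartPole', choices=ENV_NAMES,
                        help='name of the environment class to instantiate')
    return parser.parse_args(argv).env


# An explicit argument list is passed here so the sketch does not read sys.argv
default_env = parse_env([])
chosen_env = parse_env(['--env', 'Swimmer'])
```

The returned string can then be used as env_str in the getattr(ivy_gym, env_str)() call shown above.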

Here, we briefly discuss each of the five environments, before showing example episodes from a learnt policy network. We use a learnt policy in these visualizations rather than random actions as used in the script, because we find this to be more descriptive for visually explaining each task. We also plot the instantaneous reward corresponding to each frame.

CartPole

For this task, a pole is attached by an un-actuated joint to a cart, which moves along a frictionless track. The system is controlled by applying a force to the cart. The reward depends on how far the pole's angle is from upright. Example trajectories are given below.

MountainCar

For this task, a car is on a one-dimensional track, positioned between two “mountains”. The goal is to drive up the mountain on the right. However, the car’s engine is not strong enough to scale the mountain in a single pass. Therefore, the only way to succeed is to drive back and forth to build up momentum. Here, the reward is greater if you spend less energy to reach the goal. Example trajectories are given below.

Pendulum

For this task, an inverted pendulum starts in a random position, and the goal is to swing it up so it stays upright. Again, the reward depends on how far the pendulum's angle is from upright. Example trajectories are given below.

Reacher

For this task, a 2-link robot arm must reach a target position. Reward is given based on the distance of the end effector to the target. Example trajectories are given below.
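As an illustration of a distance-based reward for a 2-link arm, the sketch below computes the end-effector position with planar forward kinematics and scores it by negative Euclidean distance to the target. The link lengths, the function names, and the exact reward form are assumptions made for this example; the reward implemented in Ivy Gym's Reacher may differ.

```python
import math


def end_effector(theta1, theta2, len1=1.0, len2=1.0):
    """Planar forward kinematics for a 2-link arm (angles in radians).

    theta2 is measured relative to the first link. Link lengths are
    assumed values, not taken from the library.
    """
    x = len1 * math.cos(theta1) + len2 * math.cos(theta1 + theta2)
    y = len1 * math.sin(theta1) + len2 * math.sin(theta1 + theta2)
    return x, y


def distance_reward(theta1, theta2, target):
    """Hypothetical reward: negative distance from end effector to target."""
    x, y = end_effector(theta1, theta2)
    return -math.hypot(x - target[0], y - target[1])


# A fully extended arm along the x-axis, scored against two targets
reward_on_target = distance_reward(0.0, 0.0, target=(2.0, 0.0))
reward_off_target = distance_reward(0.0, 0.0, target=(0.0, 2.0))
```

Every operation here is smooth in the joint angles except exactly at the target, which is what lets a reward of this kind be optimized with gradients.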

Swimmer

We implemented this task ourselves, in order to highlight the simplicity of creating new custom environments. For this task, a fish must swim to reach a target 2D position whilst avoiding sharp obstacles. Reward is given for being close to the target, and negative reward is given for colliding with the sharp objects. Example trajectories are given below.

Optimization Demos

We provide two demo scripts which optimize performance on these tasks in a supervised manner, either via trajectory optimization or policy optimization.

In the case of trajectory optimization, we optimize for a specific starting state of the environment, whereas for policy optimization we train a policy network which is conditioned on the environment state, and the starting state is then randomized between training steps.

Rather than presenting the code here, we show visualizations of the demos. The scripts for these demos can be found in the optimization demos folder.

Trajectory Optimization

In this demo, we show trajectories on each of the five Ivy Gym environments during the course of trajectory optimization. The optimization iteration is shown in the bottom right, along with the step in the environment.

https://github.com/ivy-dl/ivy-dl.github.io/blob/master/img/externally_linked/ivy_gym/demo_a.png?raw=true
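To make the idea of trajectory optimization concrete without depending on a particular Ivy version, here is a minimal framework-free sketch. It is not the optimize_trajectory.py demo: the one-dimensional toy environment, its reward, and the names rollout and reward_gradient are all assumptions made for illustration. The action sequence for one fixed starting state is improved by gradient ascent on the cumulative reward, and the gradient is obtained by differentiating backwards through the rollout.

```python
def rollout(actions, x0=0.0, target=1.0, dt=0.1, action_cost=0.01):
    """Toy differentiable environment (assumed, not from Ivy Gym).

    Returns the visited states and the cumulative reward.
    """
    x, states, total = x0, [], 0.0
    for a in actions:
        x = x + dt * a  # differentiable dynamics
        total += -(x - target) ** 2 - action_cost * a ** 2
        states.append(x)
    return states, total


def reward_gradient(actions, states, target=1.0, dt=0.1, action_cost=0.01):
    """Gradient of the cumulative reward with respect to each action."""
    grads = [0.0] * len(actions)
    adjoint = 0.0  # sum of d(reward)/d(state) over this and all later steps
    for t in reversed(range(len(actions))):
        adjoint += -2.0 * (states[t] - target)
        grads[t] = adjoint * dt - 2.0 * action_cost * actions[t]
    return grads


# Optimize a 20-step action sequence for one fixed starting state
actions = [0.0] * 20
_, initial_reward = rollout(actions)
for _ in range(200):
    states, _ = rollout(actions)
    grads = reward_gradient(actions, states)
    actions = [a + 0.2 * g for a, g in zip(actions, grads)]  # gradient ascent
_, final_reward = rollout(actions)
```

With a differentiable environment built on an automatic differentiation backend, the hand-written backward pass would be replaced by the framework's own gradient computation.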

Policy Optimization

In this demo, we show trajectories on each of the five Ivy Gym environments during the course of policy optimization. The optimization iteration is shown in the bottom right, along with the step in the environment.

https://github.com/ivy-dl/ivy-dl.github.io/blob/master/img/externally_linked/ivy_gym/demo_b.png?raw=true
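The policy-optimization setting can be sketched in the same framework-free way. Again, this is not the optimize_policy.py demo: the toy environment, the one-parameter proportional policy, and the name policy_rollout are assumptions for illustration. The policy is conditioned on the state, its single parameter is trained by gradient ascent on the cumulative reward, and the starting state is re-sampled before every training step.

```python
import random


def policy_rollout(gain, x0, target=1.0, dt=0.1, steps=20):
    """Run the policy a = gain * (target - x) in a toy environment.

    Returns the cumulative reward and its derivative with respect to gain.
    """
    x, dx_dgain = x0, 0.0  # state and its sensitivity to the parameter
    total, dtotal = 0.0, 0.0
    for _ in range(steps):
        error = target - x
        # Differentiate x_next = x + dt * gain * error with respect to gain
        dx_dgain = dx_dgain + dt * (error - gain * dx_dgain)
        x = x + dt * gain * error
        total += -(x - target) ** 2
        dtotal += -2.0 * (x - target) * dx_dgain
    return total, dtotal


rng = random.Random(0)  # fixed seed so the sketch is reproducible
gain = 0.0
for _ in range(300):
    x0 = rng.uniform(-1.0, 1.0)  # randomize the starting state each step
    _, grad = policy_rollout(gain, x0)
    gain += 0.02 * grad  # gradient ascent on the cumulative reward

# Compare the untrained and trained policy from the same starting state
untrained_reward, _ = policy_rollout(0.0, x0=0.0)
trained_reward, _ = policy_rollout(gain, x0=0.0)
```

Unlike the trajectory case, the learned parameter is reused across starting states, which is why the start is randomized during training.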

Get Involved

We hope the differentiable environments in this library are useful to a wide range of machine learning developers. However, there are many more tasks which could be implemented.

If there are any particular tasks you feel are missing, or you would like to implement your own custom task, then we are very happy to accept pull requests!

We look forward to working with the community on expanding and improving the Ivy Gym library.

Citation

@article{lenton2021ivy,
  title={Ivy: Unified Machine Learning for Inter-Framework Portability},
  author={Lenton, Daniel and Pardo, Fabio and Falck, Fabian and James, Stephen and Clark, Ronald},
  journal={arXiv preprint arXiv:2102.02886},
  year={2021}
}

Release history

Version	Release date
1.1.9 (this version)	Dec 1, 2021
1.1.8	Dec 1, 2021
1.1.7	Nov 30, 2021
1.1.6	Nov 29, 2021
1.1.5	Jul 26, 2021
1.1.4	Apr 12, 2021
1.1.3	Mar 19, 2021
1.1.2	Mar 3, 2021
1.1.1	Feb 6, 2021

Download files

Source Distribution

ivy-gym-1.1.9.tar.gz (18.6 kB)

Uploaded Dec 1, 2021 Source

Built Distribution

ivy_gym-1.1.9-py3-none-any.whl (19.6 kB)

Uploaded Dec 1, 2021 Python 3

File details

Details for the file ivy-gym-1.1.9.tar.gz.

File metadata

Download URL: ivy-gym-1.1.9.tar.gz
Upload date: Dec 1, 2021
Size: 18.6 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.7.0 importlib_metadata/4.8.2 pkginfo/1.8.1 requests/2.22.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.10

File hashes

Hashes for ivy-gym-1.1.9.tar.gz
Algorithm	Hash digest
SHA256	`2b4b07eecc3b8077124529c92bbd86711ece70674d815ea703308a278f98de4d`
MD5	`6e49b67a04289a41fc3c3d801f52ec36`
BLAKE2b-256	`e3bac5036a34227bc15d99affeca533d4f9e79afbc195519c4e27b85b3fc1ae8`

File details

Details for the file ivy_gym-1.1.9-py3-none-any.whl.

File metadata

Download URL: ivy_gym-1.1.9-py3-none-any.whl
Upload date: Dec 1, 2021
Size: 19.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.7.0 importlib_metadata/4.8.2 pkginfo/1.8.1 requests/2.22.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.10

File hashes

Hashes for ivy_gym-1.1.9-py3-none-any.whl
Algorithm	Hash digest
SHA256	`851670a81a271464390b236e9ab2abf9a520d065a77363bf96e60e91a2e02e85`
MD5	`08dae7ee60f9ceaf5fc3ad2f06a35a8b`
BLAKE2b-256	`1074c4920ac3a8826e998a39e6f194e0e797f79f4aeb8f1eb4ad9a9fc72006b2`
