Implementation of modern IRL and imitation learning algorithms.

Project description

Imitation Learning Baseline Implementations

This project aims to provide clean implementations of imitation learning algorithms. Currently we have implementations of Behavioral Cloning, DAgger (with synthetic examples), Adversarial Inverse Reinforcement Learning, and Generative Adversarial Imitation Learning.

Installation:

Installing the PyPI release

pip install imitation

Installing the latest commit

git clone http://github.com/HumanCompatibleAI/imitation
cd imitation
pip install -e .
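
After either install route, it can help to confirm which release Python actually sees. The following is a minimal sketch using only the standard library (Python 3.8+); the helper name installed_version is hypothetical and not part of imitation.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "imitation") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print(version if version is not None else "imitation is not installed")
```

For an editable install from a clone, the reported version comes from the checkout's own package metadata, so it may differ from the latest PyPI release.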

Optional Mujoco Dependency:

MuJoCo environments require mujoco_py v1.5. Follow the mujoco_py installation instructions to set it up.

CLI Quickstart:

We provide several CLI scripts as a front-end to the algorithms implemented in imitation. These use Sacred for configuration and replicability.

From examples/quickstart.sh:

# Train PPO agent on cartpole and collect expert demonstrations. Tensorboard logs saved in `quickstart/rl/`
python -m imitation.scripts.expert_demos with fast cartpole log_dir=quickstart/rl/

# Train GAIL from demonstrations. Tensorboard logs saved in output/ (default log directory).
python -m imitation.scripts.train_adversarial with fast gail cartpole rollout_path=quickstart/rl/rollouts/final.pkl

# Train AIRL from demonstrations. Tensorboard logs saved in output/ (default log directory).
python -m imitation.scripts.train_adversarial with fast airl cartpole rollout_path=quickstart/rl/rollouts/final.pkl

Tips:

Remove the "fast" option from the commands above to let training run to completion.
Running `python -m imitation.scripts.expert_demos print_config` lists the Sacred options for that script. These configuration options are documented in each script's docstrings.

For more information on how to configure Sacred CLI options, see the Sacred docs.
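
To make the shape of the commands above more concrete: after the word "with", tokens without an equals sign (fast, cartpole, gail) select named configurations, and key=value tokens override individual settings. The sketch below only illustrates that split. It is not Sacred's parser, the function name split_sacred_args is hypothetical, and values are left as plain strings.

```python
from typing import Dict, List, Tuple


def split_sacred_args(argv: List[str]) -> Tuple[List[str], Dict[str, str]]:
    """Split the tokens after `with` into named configs and key=value updates.

    Simplified illustration only: Sacred's real parser also handles commands,
    flags, and typed values.
    """
    if "with" not in argv:
        return [], {}
    tokens = argv[argv.index("with") + 1:]
    named_configs: List[str] = []
    updates: Dict[str, str] = {}
    for token in tokens:
        if "=" in token:
            # Split on the first "=" only, so values may contain "=" themselves.
            key, value = token.split("=", 1)
            updates[key] = value
        else:
            named_configs.append(token)
    return named_configs, updates


if __name__ == "__main__":
    command = "with fast cartpole log_dir=quickstart/rl/".split()
    print(split_sacred_args(command))
    # (['fast', 'cartpole'], {'log_dir': 'quickstart/rl/'})
```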

Python Interface Quickstart:

See examples/quickstart.py for an example script that loads CartPole-v1 demonstrations and trains BC, GAIL, and AIRL models on that data.

BC, GAIL, and AIRL also accept as expert_data any PyTorch-style DataLoader that iterates over dictionaries containing observations, actions, and next_observations.
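
As a rough picture of what "iterates over dictionaries" means, here is a dependency-free sketch of a loader-like object that yields one dictionary per batch. The class name DictBatchLoader and the key names "obs", "acts", and "next_obs" are assumptions made for illustration; check the library source for the exact keys it expects. A real pipeline would more likely use torch.utils.data.DataLoader and yield tensors or arrays rather than Python lists.

```python
from typing import Any, Dict, Iterator, List, Sequence

Batch = Dict[str, List[Any]]


class DictBatchLoader:
    """Duck-typed stand-in for a DataLoader: iterating yields dict batches."""

    def __init__(self, transitions: Sequence[Dict[str, Any]], batch_size: int) -> None:
        if batch_size <= 0:
            raise ValueError("batch_size must be positive")
        self.transitions = transitions
        self.batch_size = batch_size

    def __iter__(self) -> Iterator[Batch]:
        for start in range(0, len(self.transitions), self.batch_size):
            chunk = self.transitions[start:start + self.batch_size]
            # Collate: turn a list of per-step dicts into one dict of lists.
            yield {key: [step[key] for step in chunk] for key in chunk[0]}


if __name__ == "__main__":
    # Three made-up CartPole-like transitions (4-dimensional observations).
    demo = [
        {"obs": [0.0, 0.1, 0.0, -0.1], "acts": 1, "next_obs": [0.0, 0.3, 0.0, -0.4]},
        {"obs": [0.0, 0.3, 0.0, -0.4], "acts": 0, "next_obs": [0.0, 0.1, 0.0, -0.1]},
        {"obs": [0.0, 0.1, 0.0, -0.1], "acts": 1, "next_obs": [0.0, 0.3, 0.0, -0.4]},
    ]
    for batch in DictBatchLoader(demo, batch_size=2):
        print(sorted(batch), len(batch["acts"]))
```

The final batch is smaller than batch_size when the data does not divide evenly; whether the training code tolerates that is not stated in the text above.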

Density reward baseline

We also implement a density-based reward baseline. You can find an example notebook here.

Contributing

See CONTRIBUTING.md.

Project details

Release history

1.0.0 (Oct 31, 2023)
0.4.0 (Jul 17, 2023)
0.3.2 (Nov 27, 2022)
0.3.1 (Jul 29, 2022)
0.2.0 (Oct 23, 2020), this version
0.1.1 (Sep 1, 2020)
0.1.1a0 (Sep 1, 2020), pre-release
0.1.0 (May 9, 2020)

Download files

Source Distribution

imitation-0.2.0.tar.gz (89.3 kB), source distribution uploaded Oct 23, 2020

Hashes for imitation-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`9f25d7851be0337b3484aa6f2d4c945b533e36b6c920cab323e647402bd1a379`
MD5	`fa2c6e1940bbcb9bbcfc8c0f037c1e88`
BLAKE2b-256	`1efa4a49fae59a1e190736ca13ba949b7560bba3fd62c176166916f1ca9f4439`
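
The digests above can be used to check a downloaded archive. The following is a small sketch using Python's hashlib; the helper name sha256_of and the assumption that the archive sits in the current directory are illustrative only.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for imitation-0.2.0.tar.gz.
EXPECTED_SHA256 = "9f25d7851be0337b3484aa6f2d4c945b533e36b6c920cab323e647402bd1a379"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    archive = Path("imitation-0.2.0.tar.gz")  # assumed download location
    if not archive.exists():
        print(f"{archive} not found; download it first")
    elif sha256_of(archive) == EXPECTED_SHA256:
        print("SHA256 matches")
    else:
        print("SHA256 MISMATCH")
```

pip can perform an equivalent check itself when a requirements file pins the package with a --hash option.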