YAAF: Yet Another Agents Framework

Yet Another Agents Framework

An RL research-oriented framework for agent prototyping and evaluation

Introduction
Installation
Examples
Markov Sub-module
Citing
Roadmap
Contributing
Acknowledgments

Introduction

YAAF is a reinforcement learning, research-oriented framework designed for quick agent prototyping and evaluation.

At its core, YAAF follows the assumptions that:

Agents execute actions upon the environment (which yields observations and rewards in return)
Environments follow the interface from OpenAI Gym
Agents follow a clear agent interface
Any deep learning framework can be used for deep RL agents (although YAAF comes packaged with PyTorch tools)
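
The assumptions above can be pictured as a plain interaction loop. The sketch below is self-contained and uses hypothetical stand-ins (`CoinFlipEnv`, `RandomPolicyAgent`, `run_episode`) rather than YAAF's or Gym's actual classes. It only assumes the classic Gym convention that `reset()` returns an observation and `step(action)` returns `(observation, reward, done, info)`; the agent method names are illustrative.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment using the classic Gym reset/step convention."""

    def __init__(self, horizon=10, seed=0):
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.t = 0

    def reset(self):
        self.t = 0
        return 0  # dummy initial observation

    def step(self, action):
        self.t += 1
        coin = self.rng.randint(0, 1)
        reward = 1.0 if action == coin else 0.0  # reward for guessing the coin
        done = self.t >= self.horizon
        return coin, reward, done, {}


class RandomPolicyAgent:
    """Hypothetical agent: picks uniformly among num_actions and ignores feedback."""

    def __init__(self, num_actions, seed=0):
        self.num_actions = num_actions
        self.rng = random.Random(seed)

    def action(self, observation):
        return self.rng.randrange(self.num_actions)

    def reinforcement(self, observation, action, reward, next_observation, done):
        pass  # a learning agent would update itself here


def run_episode(agent, env):
    """Run one episode and return the undiscounted sum of rewards."""
    observation = env.reset()
    episode_return, done = 0.0, False
    while not done:
        action = agent.action(observation)
        next_observation, reward, done, _ = env.step(action)
        agent.reinforcement(observation, action, reward, next_observation, done)
        episode_return += reward
        observation = next_observation
    return episode_return
```

Keeping the loop outside both the agent and the environment is what lets the same agent be evaluated on different environments without changes.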

As a simple example, suppose you want to evaluate an agent following a random policy on the Space Invaders environment:

import gym

from yaaf.agents import RandomAgent
from yaaf.evaluation import AverageEpisodeReturnMetric
from yaaf.execution import EpisodeRunner
from yaaf.visualization import LinePlot

env = gym.make("SpaceInvaders-v0")
agent = RandomAgent(num_actions=env.action_space.n)
metric = AverageEpisodeReturnMetric()
runner = EpisodeRunner(5, agent, env, [metric], render=True).run()

plot = LinePlot("Space Invaders Random Policy", x_label="Episode", y_label="Average Episode Return", num_measurements=5)
plot.add_run("random policy", metric.result())
plot.show()
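
For reference, an "average episode return" can be computed as a running mean over the returns of the episodes seen so far. The function below is a standalone sketch of that idea. Whether `AverageEpisodeReturnMetric` computes exactly this is an assumption, and `running_average_returns` is a hypothetical name.

```python
def running_average_returns(episode_rewards):
    """Given one reward list per episode, return the running mean of episode returns."""
    averages, total = [], 0.0
    for i, rewards in enumerate(episode_rewards, start=1):
        total += sum(rewards)       # undiscounted return of episode i
        averages.append(total / i)  # mean return over the first i episodes
    return averages


print(running_average_returns([[1, 1], [0], [4]]))  # [2.0, 1.0, 2.0]
```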

Quick Disclaimer:

YAAF is not yet another deep reinforcement learning framework.

If you are looking for high-quality implementations of state-of-the-art algorithms, then I suggest the following libraries:

Installation

For a first installation, I suggest setting up a new Python 3.7 virtual environment:

$ python -m venv yaaf_test_environment
$ source yaaf_test_environment/bin/activate
$ pip install --upgrade pip setuptools
$ pip install yaaf  
$ pip install "gym[atari]"  # Optional: Atari 2600 environments
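
To confirm that the packages are visible to the interpreter inside the virtual environment, a small standard-library check can help. This is a sketch; `is_installed` is a hypothetical helper, not part of YAAF.

```python
import importlib.util
import sys


def is_installed(package_name):
    """Return True if a top-level package can be found by the current interpreter."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    print("Python", ".".join(map(str, sys.version_info[:3])))
    for name in ("yaaf", "gym"):
        print(name, "found" if is_installed(name) else "missing")
```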

Examples

1 - Space Invaders DQN

import gym
from yaaf.environments.wrappers import DeepMindAtari2600Wrapper
from yaaf.agents.dqn import DeepMindAtariDQNAgent
from yaaf.execution import TimestepRunner
from yaaf.evaluation import AverageEpisodeReturnMetric, TotalTimestepsMetric

env = DeepMindAtari2600Wrapper(gym.make("SpaceInvaders-v0"))
agent = DeepMindAtariDQNAgent(num_actions=env.action_space.n)

metrics = [AverageEpisodeReturnMetric(), TotalTimestepsMetric()]
runner = TimestepRunner(1e9, agent, env, metrics, render=True).run()

2 - CartPole DQN

import gym
from yaaf.agents.dqn import MLPDQNAgent
from yaaf.execution import EpisodeRunner
from yaaf.evaluation import AverageEpisodeReturnMetric, TotalTimestepsMetric

env = gym.make("CartPole-v0")
layers = [(64, "relu"), (64, "relu")]
agent = MLPDQNAgent(num_features=env.observation_space.shape[0], num_actions=env.action_space.n, layers=layers)

metrics = [AverageEpisodeReturnMetric(), TotalTimestepsMetric()]
runner = EpisodeRunner(100, agent, env, metrics, render=True).run()
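
The `layers` list above describes two hidden layers of 64 ReLU units. Assuming a plain fully connected network with one bias per unit and a final linear layer mapping to `num_actions` (an assumption about the architecture, not something stated here), the layer shapes and parameter count can be worked out as follows. CartPole has 4 observation features and 2 actions; `mlp_shapes` and `count_parameters` are hypothetical helpers.

```python
def mlp_shapes(num_features, layers, num_actions):
    """Return (inputs, outputs) for each linear layer, including the output layer."""
    sizes = [num_features] + [units for units, _activation in layers] + [num_actions]
    return list(zip(sizes[:-1], sizes[1:]))


def count_parameters(shapes):
    """Weights plus biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in shapes)


shapes = mlp_shapes(4, [(64, "relu"), (64, "relu")], 2)
print(shapes)                    # [(4, 64), (64, 64), (64, 2)]
print(count_parameters(shapes))  # 4610
```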

3 - Asynchronous Advantage Actor-Critic on GPU

(my multi-task implementation, requires tensorflow-gpu)

https://research.nvidia.com/publication/reinforcement-learning-through-asynchronous-advantage-actor-critic-gpu

import time

from yaaf.environments.wrappers import NvidiaAtari2600Wrapper
from yaaf.agents.hga3c import HybridGA3CAgent
from yaaf.execution import AsynchronousParallelRunner

num_processes = 8

# One environment instance per worker
envs = [
    NvidiaAtari2600Wrapper("SpaceInvadersDeterministic-v4")
    for _ in range(num_processes)
]

hga3c = HybridGA3CAgent(
    environment_names=[env.spec.id for env in envs],
    environment_actions=[env.action_space.n for env in envs],
    observation_space=envs[0].observation_space.shape
)

hga3c.start_threads()
trainer = AsynchronousParallelRunner(
    agents=hga3c.workers,
    environments=envs,
    max_episodes=150000,
    render_ids=[0, 1, 2]
)

trainer.start()
# Wait for training to finish without spinning the CPU
while trainer.running:
    time.sleep(1)

hga3c.save("hga3c_space_invaders")
hga3c.stop_threads()
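
Actor-critic methods such as A3C weight policy updates by an advantage estimate, commonly the bootstrapped n-step return minus the critic's value prediction. The following is a generic sketch of that computation in plain Python. It is not taken from the agent above, and the function name and the default discount factor of 0.99 are illustrative assumptions.

```python
def n_step_advantages(rewards, values, bootstrap_value, discount_factor=0.99):
    """Compute discounted n-step returns and advantages for one rollout segment.

    rewards[i] and values[i] belong to step i; bootstrap_value is the critic's
    estimate for the state after the last step (0.0 if the episode ended).
    """
    returns = []
    running = bootstrap_value
    for reward in reversed(rewards):
        running = reward + discount_factor * running  # R_t = r_t + gamma * R_{t+1}
        returns.append(running)
    returns.reverse()
    advantages = [ret - value for ret, value in zip(returns, values)]
    return returns, advantages


returns, advantages = n_step_advantages([1.0, 1.0], [0.5, 0.5], 0.0, discount_factor=0.5)
print(returns)     # [1.5, 1.0]
print(advantages)  # [1.0, 0.5]
```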

4 - CartPole DQN from scratch

import gym
from yaaf.agents.dqn import DQNAgent
from yaaf.agents.dqn.networks import DeepQNetwork

from yaaf.execution import EpisodeRunner
from yaaf.evaluation import AverageEpisodeReturnMetric, TotalTimestepsMetric
from yaaf.models.feature_extraction import MLPFeatureExtractor

# Setup env
env = gym.make("CartPole-v0")
num_features = env.observation_space.shape[0]
num_actions = env.action_space.n

# Setup model
mlp_feature_extractor = MLPFeatureExtractor(num_inputs=num_features, layers=[(64, "relu"), (64, "relu")])
network = DeepQNetwork(feature_extractors=[mlp_feature_extractor],
                       num_actions=num_actions, learning_rate=0.001, optimizer="adam", cuda=True)
# Setup agent
agent = DQNAgent(network, num_actions, 
                 discount_factor=0.95, initial_exploration_steps=1000, final_exploration_rate=0.001)

# Run
metrics = [AverageEpisodeReturnMetric(), TotalTimestepsMetric()]
runner = EpisodeRunner(100, agent, env, metrics, render=True).run()
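
The agent's hyperparameters map onto two standard DQN ingredients: a discounted one-step target and an exploration rate that shrinks toward `final_exploration_rate`. The sketch below shows both in plain Python. The linear annealing shape, the `initial_exploration_rate` of 1.0, and reading `initial_exploration_steps` as a fully random warm-up phase are assumptions for illustration (YAAF may schedule exploration differently), and `final_exploration_step` is a hypothetical parameter.

```python
def td_target(reward, next_q_values, terminal, discount_factor=0.95):
    """One-step Q-learning target: r + gamma * max_a Q(s', a), or r at episode end."""
    if terminal:
        return reward
    return reward + discount_factor * max(next_q_values)


def exploration_rate(step,
                     initial_exploration_steps=1000,
                     final_exploration_step=5000,   # hypothetical parameter
                     initial_exploration_rate=1.0,  # assumed starting value
                     final_exploration_rate=0.001):
    """Epsilon for epsilon-greedy: constant warm-up, then linear decay (assumed shape)."""
    if step < initial_exploration_steps:
        return initial_exploration_rate
    if step >= final_exploration_step:
        return final_exploration_rate
    progress = (step - initial_exploration_steps) / (
        final_exploration_step - initial_exploration_steps)
    return initial_exploration_rate + progress * (
        final_exploration_rate - initial_exploration_rate)
```

With these defaults the agent acts fully at random for the first 1000 steps, then explores less and less until step 5000, after which it takes a random action only 0.1% of the time.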

Markov Sub-module

TODO

Citing the Project

When using YAAF in your projects, cite using:

@misc{yaaf,
  author = {João Ribeiro},
  title = {YAAF - Yet Another Agents Framework},
  year = {2020},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/jmribeiro/yaaf}},
}

Roadmap

Documentation
Code cleanup
More algorithms

Contributing

If you want to contribute to this project, feel free to contact me by e-mail or open an issue.

Acknowledgments

YAAF was developed as a side project to my research work. Its creation was motivated by work done in the project Ad Hoc Teams With Humans And Robots, funded by the Air Force Office of Scientific Research, in collaboration with PUC-Rio.

Release history

1.1.0 (this version): Oct 20, 2020
1.0.5: Jul 23, 2020
1.0.3: Jul 1, 2020
1.0.1: Jun 10, 2020
1.0.0: Jun 10, 2020
0.0.2: Jun 1, 2019
0.0.1: Jun 1, 2019

Download files

Source Distribution: yaaf-1.1.0.tar.gz (51.8 kB), uploaded Oct 20, 2020
Built Distribution: yaaf-1.1.0-py3-none-any.whl (84.4 kB), uploaded Oct 20, 2020, Python 3

File details

Details for the file yaaf-1.1.0.tar.gz.

File metadata

Download URL: yaaf-1.1.0.tar.gz
Upload date: Oct 20, 2020
Size: 51.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.2.0 pkginfo/1.6.0 requests/2.24.0 setuptools/47.1.1.post20200604 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.7.7

File hashes

Hashes for yaaf-1.1.0.tar.gz
Algorithm	Hash digest
SHA256	`36e29d09384999be2727b80b19e1a3acaeaa8d92918f982721e4c67432f101c0`
MD5	`8e4c74ec97287e6614e1d9e8d49c99f7`
BLAKE2b-256	`c962d6f1aa4e860d7843bdf41f2c6c6e367fce495c38dcc24963550411e7f1e0`

File details

Details for the file yaaf-1.1.0-py3-none-any.whl.

File metadata

Download URL: yaaf-1.1.0-py3-none-any.whl
Upload date: Oct 20, 2020
Size: 84.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/3.2.0 pkginfo/1.6.0 requests/2.24.0 setuptools/47.1.1.post20200604 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.7.7

File hashes

Hashes for yaaf-1.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`1d5acf677821d130a6b51032651ed456079890982611c00a6e196488edc5a7f6`
MD5	`cc804e3105bd7af2cce5aabd0146efe1`
BLAKE2b-256	`0fc38fb6cd18bef136460a162ac338927c91bc8ce3c678260a43a495211f83cc`
