Performances of Reinforcement Learning Agents
Project description
rldb
Database of RL algorithms
*(Score plots: Atari Space Invaders Scores | MuJoCo Walker2d Scores)*
Examples
You can use `rldb.find_all({})` to retrieve all existing entries in rldb:

```python
import rldb

all_entries = rldb.find_all({})
```
You can also filter entries by specifying key-value pairs that the entry must match:

```python
import rldb

dqn_entries = rldb.find_all({'algo-nickname': 'DQN'})
breakout_noop_entries = rldb.find_all({
    'env-title': 'atari-breakout',
    'env-variant': 'No-op start',
})
```
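Conceptually, this filtering keeps every entry whose key-value pairs are a superset of the filter. The sketch below illustrates that idea over a plain list of dicts; it is an assumption about the matching semantics, not rldb's actual implementation, and the toy entries are for illustration only.

```python
def find_all(entries, filter_dict):
    """Return every entry whose key-value pairs include all of filter_dict.

    A rough sketch of dict-based filtering, not rldb's actual implementation.
    """
    return [
        entry for entry in entries
        if all(entry.get(key) == value for key, value in filter_dict.items())
    ]


# Toy entries for illustration only; real rldb entries have many more fields.
entries = [
    {'algo-nickname': 'DQN', 'env-title': 'atari-breakout', 'score': 401.2},
    {'algo-nickname': 'DQN', 'env-title': 'atari-pong', 'score': 18.9},
    {'algo-nickname': 'A3C', 'env-title': 'atari-breakout', 'score': 681.9},
]

dqn_entries = find_all(entries, {'algo-nickname': 'DQN'})
print(len(dqn_entries))  # 2
```

An empty filter dict matches every entry, which is why `rldb.find_all({})` returns the whole database.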
You can also use `rldb.find_one(filter_dict)` to find one entry that matches the key-value pairs specified in `filter_dict`:

```python
import rldb
import pprint

entry = rldb.find_one({
    'env-title': 'atari-pong',
    'algo-title': 'Human',
})
pprint.pprint(entry)
```
Output:

```python
{'algo-nickname': 'Human',
 'algo-title': 'Human',
 'env-title': 'atari-pong',
 'env-variant': 'No-op start',
 'score': 14.6,
 'source-arxiv-id': '1511.06581',
 'source-arxiv-version': 3,
 'source-authors': ['Ziyu Wang',
                    'Tom Schaul',
                    'Matteo Hessel',
                    'Hado van Hasselt',
                    'Marc Lanctot',
                    'Nando de Freitas'],
 'source-bibtex': '@article{DBLP:journals/corr/WangFL15,\n'
                  ' author = {Ziyu Wang and\n'
                  ' Nando de Freitas and\n'
                  ' Marc Lanctot},\n'
                  ' title = {Dueling Network Architectures for Deep '
                  'Reinforcement Learning},\n'
                  ' journal = {CoRR},\n'
                  ' volume = {abs/1511.06581},\n'
                  ' year = {2015},\n'
                  ' url = {http://arxiv.org/abs/1511.06581},\n'
                  ' archivePrefix = {arXiv},\n'
                  ' eprint = {1511.06581},\n'
                  ' timestamp = {Mon, 13 Aug 2018 16:48:17 +0200},\n'
                  ' biburl = '
                  '{https://dblp.org/rec/bib/journals/corr/WangFL15},\n'
                  ' bibsource = {dblp computer science bibliography, '
                  'https://dblp.org}\n'
                  '}',
 'source-nickname': 'DuDQN',
 'source-title': 'Dueling Network Architectures for Deep Reinforcement '
                 'Learning'}
```
Entry Structure
Here is the format of every entry:

```python
{
    # BASICS
    "source-title": "",
    "source-nickname": "",
    "source-authors": [],

    # MISC.
    "source-bibtex": "",

    # ALGORITHM
    "algo-title": "",
    "algo-nickname": "",
    "algo-source-title": "",

    # SCORE
    "env-title": "",
    "score": 0,
}
```
- `source-title` is the full title of the source of the score: it can be the title of a paper or of a GitHub repository.
- `source-nickname` is a popular nickname or acronym for that source if one exists; otherwise it is the same as `source-title`.
- `source-authors` is a list of authors or contributors.
- `source-bibtex` is a BibTeX-format citation.
- `algo-title` is the full title of the algorithm used.
- `algo-nickname` is the nickname or acronym for that algorithm if one exists; otherwise it is the same as `algo-title`.
- `algo-source-title` is the title of the source of the algorithm. It can be, and often is, different from `source-title`.
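The nickname fields follow the same fallback rule: use the nickname when a distinct one exists, otherwise fall back to the full title. A minimal sketch of that rule, using a hypothetical helper that is not part of rldb:

```python
def display_name(entry, prefix):
    """Return the nickname for `prefix` ('source' or 'algo'),
    falling back to the full title when no distinct nickname exists.

    Hypothetical helper for illustration; not part of rldb's API.
    """
    nickname = entry.get(prefix + '-nickname')
    return nickname if nickname else entry[prefix + '-title']


entry = {
    'source-title': 'Noisy Networks for Exploration',
    'source-nickname': 'NoisyNet',
    'algo-title': 'Asynchronous Advantage Actor Critic',
    'algo-nickname': 'A3C',
}
print(display_name(entry, 'source'))  # NoisyNet
print(display_name(entry, 'algo'))    # A3C
```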
For example, the Space Invaders score of the Asynchronous Advantage Actor Critic (A3C) algorithm reported in the Noisy Networks for Exploration (NoisyNet) paper is represented by the following entry:
```python
{
    # BASICS
    "source-title": "Noisy Networks for Exploration",
    "source-nickname": "NoisyNet",
    "source-authors": [
        "Meire Fortunato",
        "Mohammad Gheshlaghi Azar",
        "Bilal Piot",
        "Jacob Menick",
        "Ian Osband",
        "Alex Graves",
        "Vlad Mnih",
        "Remi Munos",
        "Demis Hassabis",
        "Olivier Pietquin",
        "Charles Blundell",
        "Shane Legg",
    ],

    # ARXIV
    "source-arxiv-id": "1706.10295",
    "source-arxiv-version": 2,

    # MISC.
    "source-bibtex": """
@article{DBLP:journals/corr/FortunatoAPMOGM17,
  author    = {Meire Fortunato and
               Mohammad Gheshlaghi Azar and
               Bilal Piot and
               Jacob Menick and
               Ian Osband and
               Alex Graves and
               Vlad Mnih and
               R{\'{e}}mi Munos and
               Demis Hassabis and
               Olivier Pietquin and
               Charles Blundell and
               Shane Legg},
  title     = {Noisy Networks for Exploration},
  journal   = {CoRR},
  volume    = {abs/1706.10295},
  year      = {2017},
  url       = {http://arxiv.org/abs/1706.10295},
  archivePrefix = {arXiv},
  eprint    = {1706.10295},
  timestamp = {Mon, 13 Aug 2018 16:46:11 +0200},
  biburl    = {https://dblp.org/rec/bib/journals/corr/FortunatoAPMOGM17},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}""",

    # ALGORITHM
    "algo-title": "Asynchronous Advantage Actor Critic",
    "algo-nickname": "A3C",
    "algo-source-title": "Asynchronous Methods for Deep Reinforcement Learning",

    # HYPERPARAMETERS
    "algo-frames": 320 * 1000 * 1000,  # Number of frames

    # SCORE
    "env-title": "atari-space-invaders",
    "env-variant": "No-op start",
    "score": 1034,
    "stddev": 49,
}
```
Note that, as shown here, an entry can contain additional information beyond the required fields.
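Because every entry carries at least `env-title` and `score`, simple aggregations fall out naturally. Here is a hedged sketch of finding the best reported score per environment; the entries are toy data, not real rldb records:

```python
# Toy entries for illustration; real rldb entries carry the full field set
# described above (source, bibtex, hyperparameters, and so on).
entries = [
    {'env-title': 'atari-space-invaders', 'algo-nickname': 'A3C', 'score': 1034},
    {'env-title': 'atari-space-invaders', 'algo-nickname': 'DQN', 'score': 1976},
    {'env-title': 'atari-pong', 'algo-nickname': 'Human', 'score': 14.6},
]

# Keep, for each environment, the entry with the highest score seen so far.
best = {}
for entry in entries:
    env = entry['env-title']
    if env not in best or entry['score'] > best[env]['score']:
        best[env] = entry

print(best['atari-space-invaders']['algo-nickname'])  # DQN
```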
Sources
Papers
Deep Q-Networks
- Playing Atari with Deep Reinforcement Learning (Mnih et al., 2013)
- Human-level control through deep reinforcement learning (Mnih et al., 2015)
- Deep Recurrent Q-Learning for Partially Observable MDPs (Hausknecht and Stone, 2015)
- Massively Parallel Methods for Deep Reinforcement Learning (Nair et al., 2015)
- Deep Reinforcement Learning with Double Q-learning (van Hasselt et al., 2015)
- Prioritized Experience Replay (Schaul et al., 2015)
- Dueling Network Architectures for Deep Reinforcement Learning (Wang et al., 2015)
- Noisy Networks for Exploration (Fortunato et al., 2017)
- A Distributional Perspective on Reinforcement Learning (Bellemare et al., 2017)
- Rainbow: Combining Improvements in Deep Reinforcement Learning (Hessel et al., 2017)
- Distributional Reinforcement Learning with Quantile Regression (Dabney et al., 2017)
- Implicit Quantile Networks for Distributional Reinforcement Learning (Dabney et al., 2018)
Policy Gradients
- Asynchronous Methods for Deep Reinforcement Learning (Mnih et al., 2016)
- Trust Region Policy Optimization (Schulman et al., 2015)
- Proximal Policy Optimization Algorithms (Schulman et al., 2017)
- Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (Wu et al., 2017)
- Addressing Function Approximation Error in Actor-Critic Methods (Fujimoto et al., 2018)
- IMPALA: Importance Weighted Actor-Learner Architectures (Espeholt et al., 2018)
- The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning (Gruslys et al., 2017)
Exploration
Misc.
Repositories
Project details
Release history
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file rldb-0.0.0.tar.gz
.
File metadata
- Download URL: rldb-0.0.0.tar.gz
- Upload date:
- Size: 59.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.13.0 pkginfo/1.4.2 requests/2.18.4 setuptools/39.2.0 requests-toolbelt/0.8.0 tqdm/4.23.4 CPython/3.6.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | 39a7499a04882800c2421a30af2efebb2854c69b75629ff5058084c4766d0817
MD5 | f662d95c7f10bf62184972bd8acfbbb7
BLAKE2b-256 | 6d59ece9626bc9e433562490fefe956c09f097bd4b142b37475c21e7af3dd85e
File details
Details for the file rldb-0.0.0-py3-none-any.whl
.
File metadata
- Download URL: rldb-0.0.0-py3-none-any.whl
- Upload date:
- Size: 146.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.13.0 pkginfo/1.4.2 requests/2.18.4 setuptools/39.2.0 requests-toolbelt/0.8.0 tqdm/4.23.4 CPython/3.6.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | ec2ebf98140759edf515c7f2274b802ea4c12483dd109f4a814ba04aa9a51cfb
MD5 | 3c33243225adc13b22171d58b7a51343
BLAKE2b-256 | c7ab95db1d80c06b53b232569953cf12baa4ba3f3f60666124ae98583b713be2