
All Forms of Reinforcement Learning

Project description

AFRL - All Forms of Reinforcement Learning

The main goal of this project is to provide a framework for reinforcement learning research. The framework is designed to be modular and easy to extend; it is written in Python and built on top of PyTorch, and it is still in its early stages and under active development.

Usage

The framework is built around three main components (a rough sketch of how they fit together follows this list):

  • Environments: the tasks that the agent is trying to solve.
  • Agents: the algorithms (policies) that try to solve the environments.
  • Trainers: the training procedures used to optimize the agents.
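
A rough sketch of how these components might fit together is shown below. The class and method names are illustrative only and do not correspond to the actual AFRL API:

```python
# Illustrative sketch only: the class and method names below are hypothetical
# and do not reflect the actual AFRL API.
import torch


class Environment:
    """The task to solve: exposes reset() and step(action)."""
    def reset(self): ...
    def step(self, action): ...  # -> (observation, reward, done, info)


class Agent:
    """The decision maker: maps observations to actions with a PyTorch model."""
    def __init__(self, model: torch.nn.Module):
        self.model = model

    def act(self, observation: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            return self.model(observation)


class Trainer:
    """The learning algorithm: collects experience and updates the agent."""
    def __init__(self, env: Environment, agent: Agent):
        self.env, self.agent = env, agent

    def train(self, episodes: int) -> None:
        for _ in range(episodes):
            obs, done = self.env.reset(), False
            while not done:
                action = self.agent.act(obs)
                obs, reward, done, info = self.env.step(action)
                # algorithm-specific update of self.agent.model goes here
```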

To make it easier to choose the right trainer for a given environment, the table below lists the implemented trainers and the environment characteristics each one supports:

| Method | Discrete Action Space | Continuous Action Space | Single-Agent | Multi-Agent | Low-Dimensional Obs | High-Dimensional Obs |
| --- | --- | --- | --- | --- | --- | --- |
| DQN | ✔️ | | ✔️ | | ✔️ | |
| DDQN | ✔️ | | ✔️ | | ✔️ | |
| Dueling DQN | ✔️ | | ✔️ | | ✔️ | |
| Double Dueling DQN | ✔️ | | ✔️ | | ✔️ | |
| DDPG | | ✔️ | ✔️ | | ✔️ | ✔️ |
| SDDPG | | ✔️ | ✔️ | | ✔️ | ✔️ |
| Prioritized Experience Replay | ✔️ | ✔️ | ✔️ | | ✔️ | |
| TRPO | ✔️ | ✔️ | ✔️ | | ✔️ | ✔️ |
| PPO | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| TD-Lambda | ✔️ | ✔️ | ✔️ | | ✔️ | |
| SARSA | ✔️ | | ✔️ | | ✔️ | |
| REINFORCE | ✔️ | ✔️ | ✔️ | | ✔️ | ✔️ |
| Actor-Critic | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| A2C | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| A3C | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| SAC | | ✔️ | ✔️ | | ✔️ | ✔️ |

Characteristics of Different Environments for RL Algorithms

Below are some common characteristics of RL environments, along with example environments that exhibit each characteristic and some general rules of thumb for choosing a suitable method:

1. Discrete vs Continuous Action Space

  • Examples:
    • Discrete: Tic-Tac-Toe, Grid Worlds
    • Continuous: Robotic arm control, Portfolio management
  • Rules of Thumb:
    • For discrete action spaces, DQN variants and PPO are commonly used.
    • For continuous action spaces, DDPG and SAC generally perform better; a quick way to check which case an environment falls into is sketched below.
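
For example, the action-space type can be inspected programmatically before picking an algorithm family. The sketch below uses Gymnasium (not part of AFRL) and standard Gymnasium environment IDs:

```python
# Not part of AFRL: a quick Gymnasium-based check of the action-space type
# before picking an algorithm family.
import gymnasium as gym
from gymnasium.spaces import Box, Discrete


def suggest_family(env_id: str) -> str:
    env = gym.make(env_id)
    space = env.action_space
    env.close()
    if isinstance(space, Discrete):
        return f"{env_id}: {space.n} discrete actions -> try DQN variants or PPO"
    if isinstance(space, Box):
        return f"{env_id}: continuous actions with shape {space.shape} -> try DDPG or SAC"
    return f"{env_id}: {type(space).__name__} action space -> consider PPO with a custom policy"


print(suggest_family("CartPole-v1"))  # discrete (2 actions)
print(suggest_family("Pendulum-v1"))  # continuous (shape (1,))
```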

2. Single-Agent vs Multi-Agent

  • Examples:
    • Single-Agent: Mountain Car, Cartpole
    • Multi-Agent: Poker, Multi-robot coordination
  • Rules of Thumb:
    • Single-agent tasks are commonly handled with DQN, PPO, or DDPG.
    • Multi-agent tasks typically benefit from specialized algorithms such as MADDPG, or from generic methods like PPO adapted to the multi-agent setting; a minimal independent-learners baseline is sketched after this list.
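
As a minimal illustration of the "generic method adapted to multi-agent" idea, the sketch below shows an independent-learners loop in which each agent keeps its own policy. The dict-based environment interface is a hypothetical stand-in, not a specific library API:

```python
# Independent-learners baseline: one policy per agent, each treating the other
# agents as part of the environment. The dict-based environment interface used
# here is hypothetical and not tied to any specific multi-agent library.
import torch
import torch.nn as nn


def make_policy(obs_dim: int, n_actions: int) -> nn.Module:
    return nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, n_actions))


def run_episode(env, policies: dict) -> None:
    observations = env.reset()                 # {agent_id: observation array}
    done = False
    while not done:
        actions = {}
        for agent_id, obs in observations.items():
            logits = policies[agent_id](torch.as_tensor(obs, dtype=torch.float32))
            actions[agent_id] = torch.distributions.Categorical(logits=logits).sample().item()
        observations, rewards, done, _ = env.step(actions)
        # each policy would be updated here with its own PPO/DQN-style loss
```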

3. Low-Dimensional vs High-Dimensional Observation Space

  • Examples:
    • Low-Dimensional: Frozen Lake, Taxi-v3
    • High-Dimensional: Atari games, Visual navigation
  • Rules of Thumb:
    • Low-dimensional problems can be tackled with simpler algorithms like SARSA or TD-Lambda.
    • High-dimensional problems usually require more powerful function approximation, such as convolutional neural networks (CNNs) in DQN for image-based tasks; see the encoder sketch below.
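
A common pattern, sketched below in PyTorch, is to pick the network body from the observation shape: a small MLP for low-dimensional vectors and a CNN for image-like observations. The layer sizes are illustrative defaults, not tuned values:

```python
# Choose the network body from the observation shape: an MLP for flat,
# low-dimensional vectors, a CNN for image-like (C, H, W) observations.
# Layer sizes are illustrative defaults, not tuned values.
import torch.nn as nn


def make_encoder(obs_shape: tuple, out_dim: int = 256) -> nn.Module:
    if len(obs_shape) == 1:                      # e.g. CartPole: (4,)
        return nn.Sequential(
            nn.Linear(obs_shape[0], 128), nn.ReLU(),
            nn.Linear(128, out_dim), nn.ReLU(),
        )
    if len(obs_shape) == 3:                      # e.g. stacked Atari frames: (4, 84, 84)
        return nn.Sequential(
            nn.Conv2d(obs_shape[0], 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
            nn.LazyLinear(out_dim), nn.ReLU(),   # infers the flattened size at first use
        )
    raise ValueError(f"Unsupported observation shape: {obs_shape}")
```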

4. Partially Observable vs Fully Observable

  • Examples:
    • Partially Observable: Poker, Hide and Seek
    • Fully Observable: Chess, Go
  • Rules of Thumb:
    • Fully observable environments can use simpler methods such as DQN or PPO.
    • Partially observable environments may require methods with memory, such as an LSTM or GRU incorporated into the network (e.g., DRQN); a small recurrent Q-network is sketched below.
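
As a sketch of the DRQN idea, the network below replaces the feed-forward body of a Q-network with an LSTM so that the Q-values can depend on the observation history. Dimensions are illustrative:

```python
# DRQN-style Q-network: an LSTM summarizes the observation history, so the
# Q-values can depend on more than the current (partial) observation.
import torch
import torch.nn as nn


class RecurrentQNetwork(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.q_head = nn.Linear(hidden, n_actions)

    def forward(self, obs_seq: torch.Tensor, hidden_state=None):
        # obs_seq: (batch, time, obs_dim); hidden_state carries memory across calls
        x = self.encoder(obs_seq)
        x, hidden_state = self.lstm(x, hidden_state)
        return self.q_head(x), hidden_state    # Q-values: (batch, time, n_actions)


q_net = RecurrentQNetwork(obs_dim=8, n_actions=4)
q_values, h = q_net(torch.randn(2, 5, 8))      # two trajectories of five steps each
```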

5. Sparse vs Dense Reward

  • Examples:
    • Sparse Reward: Maze navigation, robotic grasping
    • Dense Reward: Cartpole, Lunar Lander
  • Rules of Thumb:
    • Dense reward problems can be handled effectively by most algorithms.
    • Sparse reward problems often benefit from algorithms designed for exploration, such as curiosity-driven (intrinsic reward) mechanisms or hierarchical methods; a minimal curiosity-style bonus is sketched after this list.
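
One simple curiosity-style mechanism, sketched below in the spirit of random network distillation (not an AFRL feature), adds a prediction-error bonus to the sparse environment reward:

```python
# A minimal curiosity-style bonus in the spirit of random network distillation:
# a predictor network tries to match a fixed random target network, and its
# prediction error (large for unfamiliar states) is added to the sparse reward.
import torch
import torch.nn as nn


class CuriosityBonus(nn.Module):
    def __init__(self, obs_dim: int, feat_dim: int = 64, scale: float = 0.1):
        super().__init__()
        self.target = nn.Linear(obs_dim, feat_dim)      # fixed, randomly initialized
        self.predictor = nn.Linear(obs_dim, feat_dim)   # trained to match the target
        for p in self.target.parameters():
            p.requires_grad_(False)
        self.scale = scale

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        error = (self.predictor(obs) - self.target(obs)).pow(2).mean(dim=-1)
        return self.scale * error   # intrinsic reward; also usable as the predictor's loss


bonus = CuriosityBonus(obs_dim=8)
intrinsic = bonus(torch.randn(32, 8))   # one bonus value per state in the batch
# total_reward = extrinsic_reward + intrinsic.detach()
```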

By considering these environment characteristics and rules of thumb, one can make a more informed decision when selecting an appropriate reinforcement learning algorithm for a specific task.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

afrl-0.0.1.4.tar.gz (4.0 kB)

Uploaded Source

Built Distribution

afrl-0.0.1.4-py3-none-any.whl (4.0 kB)

Uploaded Python 3

File details

Details for the file afrl-0.0.1.4.tar.gz.

File metadata

  • Download URL: afrl-0.0.1.4.tar.gz
  • Size: 4.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 colorama/0.4.4 importlib-metadata/4.6.4 keyring/23.5.0 pkginfo/1.8.2 readme-renderer/34.0 requests-toolbelt/0.9.1 requests/2.25.1 rfc3986/1.5.0 tqdm/4.57.0 urllib3/1.26.5 CPython/3.10.12

File hashes

Hashes for afrl-0.0.1.4.tar.gz
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 9e9a30e0230129818ec14fecef74bb7b9634ee6dab5e7be4af790e0f2ded8279 |
| MD5 | 471fd6f275ee6ac582be1a01ba1f6396 |
| BLAKE2b-256 | 9482cba4cd9385363eadc3d34f246b566a4ac96b228ab194b9b5d0bd36c4d604 |

See more details on using hashes here.

File details

Details for the file afrl-0.0.1.4-py3-none-any.whl.

File metadata

  • Download URL: afrl-0.0.1.4-py3-none-any.whl
  • Size: 4.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 colorama/0.4.4 importlib-metadata/4.6.4 keyring/23.5.0 pkginfo/1.8.2 readme-renderer/34.0 requests-toolbelt/0.9.1 requests/2.25.1 rfc3986/1.5.0 tqdm/4.57.0 urllib3/1.26.5 CPython/3.10.12

File hashes

Hashes for afrl-0.0.1.4-py3-none-any.whl
| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 4f8431831b098c4784e2a9b518710c3120eeb04089d322e3fadf91a041813190 |
| MD5 | 15bd07758c7c6e51967faf6c33ba6cdc |
| BLAKE2b-256 | 046115c0cf74fa800ad90713e6456e8974a7f383935db812a7a7a61457ee28ff |

See more details on using hashes here.
