A reinforcement learning module
Project description
Reinforcement
The Reinforcement module aims to provide simple implementations of various reinforcement learning algorithms. It tries to stay agnostic about use cases and offers several options for policy selection and for value- and q-function approximation, as well as agents that implement the learning algorithms themselves.
The project is at an early stage and currently provides only an n-step temporal difference learning agent. Its main purpose is to facilitate my own understanding of reinforcement learning, with no particular application in mind.
Module structure
The module is organised in three main parts: policies, reward functions and agents. Each part provides components needed to construct a reinforcement learning agent. Components should have few dependencies on each other and share a simple common interface, so that agents can be assembled in a modular way.
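To make the idea of loosely coupled components concrete, here is a minimal sketch of how a policy and a value estimator could be combined behind small interfaces. All names (`Policy`, `ValueFunction`, `Agent`, `GreedyPolicy`, `ConstantValues`) are hypothetical and chosen for illustration only; they are not taken from the module's actual API.

```python
from dataclasses import dataclass
from typing import Hashable, Protocol, Sequence


class Policy(Protocol):
    """Hypothetical interface: choose an action index from action values."""

    def select(self, action_values: Sequence[float]) -> int: ...


class ValueFunction(Protocol):
    """Hypothetical interface: estimate one value per action for a state."""

    def action_values(self, state: Hashable) -> Sequence[float]: ...


@dataclass
class Agent:
    """Composes a policy and a value estimator without knowing their types."""

    policy: Policy
    values: ValueFunction

    def act(self, state: Hashable) -> int:
        return self.policy.select(self.values.action_values(state))


class GreedyPolicy:
    """Always picks the first action with the highest estimated value."""

    def select(self, action_values):
        return max(range(len(action_values)), key=action_values.__getitem__)


class ConstantValues:
    """Toy estimator that returns the same values for every state."""

    def action_values(self, state):
        return [0.0, 1.0, 0.5]


agent = Agent(GreedyPolicy(), ConstantValues())
chosen = agent.act("any-state")
```

Because `Agent` depends only on the two small interfaces, either component can be swapped without touching the other.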
Agents
This module contains the actual agents, which implement a reinforcement learning algorithm using a policy component and a reward function component. Currently only an n-step temporal difference agent is implemented.
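As a rough illustration of what an n-step temporal difference update computes, the following sketch uses a plain dictionary as a value table. It is a generic textbook-style formulation, not the module's implementation; the function names and parameters (`n_step_return`, `n_step_td_update`, `alpha`, `gamma`) are assumptions made for this example.

```python
def n_step_return(rewards, bootstrap_value, gamma):
    """Discounted sum of n rewards plus the discounted bootstrap value.

    G = r_1 + gamma * r_2 + ... + gamma^(n-1) * r_n + gamma^n * V(s_n)
    """
    g = bootstrap_value
    # Fold from the last reward backwards so each step applies one discount.
    for reward in reversed(rewards):
        g = reward + gamma * g
    return g


def n_step_td_update(values, state, rewards, last_state, alpha, gamma):
    """Move V(state) towards the n-step return by step size alpha.

    `values` is a dict mapping states to estimates; unseen states count as 0.
    """
    target = n_step_return(rewards, values.get(last_state, 0.0), gamma)
    old = values.get(state, 0.0)
    values[state] = old + alpha * (target - old)
    return values[state]


values = {}
# One reward of 1.0 observed between state "a" and state "b".
updated = n_step_td_update(values, "a", [1.0], "b", alpha=0.5, gamma=0.9)
```

With a single reward in `rewards` this reduces to the familiar one-step TD(0) update; longer reward lists look further ahead before bootstrapping.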
Policies
This module contains action selection policies used by reinforcement learning agents. Available policies: epsilon greedy; normalized epsilon greedy.
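For readers unfamiliar with epsilon greedy selection, a minimal sketch follows. The function name and signature are hypothetical and do not reflect the module's interface. Only the plain variant is shown, since the source does not define the normalized variant.

```python
import random


def epsilon_greedy(action_values, epsilon, rng=random):
    """Return a random action index with probability epsilon, else a greedy one."""
    if not action_values:
        raise ValueError("action_values must not be empty")
    if rng.random() < epsilon:
        # Explore: every action is equally likely.
        return rng.randrange(len(action_values))
    # Exploit: pick among the best actions, breaking ties at random.
    best = max(action_values)
    candidates = [i for i, v in enumerate(action_values) if v == best]
    return rng.choice(candidates)


greedy_choice = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0)
```

Setting `epsilon` to 0 gives purely greedy behaviour, while 1 gives uniformly random actions; values in between trade off exploitation against exploration.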
Reward Functions
This module contains implementations of reward functions, that is, the value- and q-function approximations used by reinforcement learning agents. Available reward functions: value table; q table; q neural network.
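The simplest of these, a q table, can be sketched as a mapping from (state, action) pairs to estimates. The class below is an illustrative assumption: `QTable`, `value`, `update` and the `alpha` step size are hypothetical names, not the module's API.

```python
from collections import defaultdict


class QTable:
    """Hypothetical tabular action-value estimator."""

    def __init__(self, alpha=0.1, default=0.0):
        self.alpha = alpha
        # Unseen (state, action) pairs start at the default estimate.
        self._values = defaultdict(lambda: default)

    def value(self, state, action):
        return self._values[(state, action)]

    def update(self, state, action, target):
        """Move the stored estimate a fraction alpha towards the target."""
        key = (state, action)
        self._values[key] += self.alpha * (target - self._values[key])
        return self._values[key]


q = QTable(alpha=0.5)
first = q.update("s0", "left", target=1.0)
second = q.update("s0", "left", target=1.0)
```

A value table works the same way with states alone as keys, and a neural network replaces the lookup with a learned function when the state space is too large to enumerate.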
Models
Reinforcement also contains neural network implementations which can be used as non-linear reward function approximations. Currently two regression models are implemented, one using Keras and one using pure TensorFlow.
Architecture
This software is crafted using Test Driven Development and tries to adhere to the SOLID principles as far as the author's abilities allow.
Hashes for reinforcement-1.0.6-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 67dec6b6a212dcfc6d502c386263e88d8a4861de86fd21983303368c93e664fe
MD5 | dc45dbc4e94bdc74ad9723f22cdc3be0
BLAKE2b-256 | efaeaa0d951325e0b04082042307b4dd1bc509350af11b7ff497bb6229627350