ModularRL
ModularRL is a Python library for creating and training reinforcement learning agents using the Proximal Policy Optimization (PPO) algorithm. The library is designed to be easily customizable and modular, allowing users to quickly set up and train PPO agents for various environments.
Installation
pip install modular_rl
Features
- Implementation of the PPO algorithm for reinforcement learning
- Customizable agent settings and network architectures
- Modular structure for easy adaptation and extension
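For readers new to PPO, the clipped surrogate objective at the core of the algorithm can be sketched in plain Python. This is a generic illustration of the technique, not ModularRL's own code: the function name `ppo_clipped_objective` and the `clip_epsilon` default of 0.2 are assumptions chosen for the example.

```python
import math


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, clip_epsilon=0.2):
    """Mean PPO clipped surrogate objective over a batch (a value to maximize).

    Hypothetical helper for illustration only; not part of ModularRL.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the data-collecting policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped_ratio = max(1.0 - clip_epsilon, min(1.0 + clip_epsilon, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)
```

With a positive advantage, the objective stops rewarding the policy once the ratio exceeds `1 + clip_epsilon`, which is what keeps each update close to the policy that collected the data.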
Example Usage
You can use the tester.py script provided in the library to create and train an instance of the AgentPPO class with default or modified settings:
import modular_rl.tester as tester
tester.init()
# or
tester.init_modular()
Alternatively, you can create and train an instance of the AgentPPO class directly in your code:
from modular_rl.agents.agent_ppo import AgentPPO
from modular_rl.settings import AgentSettings
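def init():
    # Create the agent with the default settings (no environment is passed explicitly).
    agent = AgentPPO(env=None, setting=AgentSettings.default)
    # Run the agent's built-in training loop.
    agent.learn()

init()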
To create and train an instance of the AgentPPO class with modified settings, use the following code:
from modular_rl.agents.agent_ppo import AgentPPO
from modular_rl.settings import AgentSettings
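def init_modular():
    # Create the agent with the modular settings (no environment is passed explicitly).
    agent = AgentPPO(env=None, setting=AgentSettings.default_modular)

    # Step-wise training: reset, take one step with learn_next(), then check and update.
    agent.reset()
    agent.learn_reset()
    action, reward, is_done = agent.learn_next()
    agent.learn_check()
    agent.update()

    # Reset again before taking manual control.
    agent.reset()
    agent.learn_reset()

    # Proceed with the learning manually.
    initial_state = agent.learn_reset()
    action, dist = agent.select_action(initial_state)
    agent.update_step(initial_state, dist, action, -1)
    agent.learn_check()
    agent.update()

    agent.learn_close()

init_modular()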
Key Classes
- AgentPPO: The main agent class implementing the PPO algorithm.
- PolicyNetwork: A customizable neural network for the agent's policy.
- ValueNetwork: A customizable neural network for the agent's value function.
- AgentSettings: A configuration class for setting up the PPO agent.
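To show how a policy network and a value network typically work together in PPO, here is a minimal sketch of turning one episode's rewards and value estimates into advantages. It is a generic illustration, not ModularRL's implementation: the helper names `discounted_returns` and `advantages`, the `gamma` default, and the assumption that the episode terminates at the last step are all hypothetical.

```python
def discounted_returns(rewards, gamma=0.99):
    """Discounted return for every step of one finished episode.

    Hypothetical helper for illustration only; not part of ModularRL.
    """
    returns = []
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(returns, values):
    """Advantage estimate: observed return minus the value network's prediction."""
    return [ret - val for ret, val in zip(returns, values)]
```

A positive advantage means the step turned out better than the value estimate predicted, so a PPO update raises the probability of that action; a negative advantage lowers it.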
License
MIT License
Hashes for modular_rl-0.1.2.dev0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | b5df1c34caaf1fc4d436e221a267fd9773f2d4ad344c6efb13c03f155a8376cc
MD5 | 3c398dd943c4c4eb3f3c9b494548cb49
BLAKE2b-256 | 23ae43ba5e773c06e59bb29acb7fa5501dd31a4e8e56cd885f4824324ebaa3f5