Skip to main content

ModularRL is a Python library for creating and training reinforcement learning agents using the Proximal Policy Optimization (PPO) algorithm. The library is designed to be easily customizable and modular, allowing users to quickly set up and train PPO agents for various environments.

Project description

ModularRL

ModularRL is a Python library for creating and training reinforcement learning agents using the Proximal Policy Optimization (PPO) algorithm. The library is designed to be easily customizable and modular, allowing users to quickly set up and train PPO agents for various environments.

Installation

pip install modular_rl

Features

  • Implementation of the PPO algorithm for reinforcement learning
  • Customizable agent settings and network architectures
  • Modular structure for easy adaptation and extension

Example Usage

You can use the tester.py script provided in the library to create and train an instance of the AgentPPO class with default or modified settings:

import modular_rl.tester as tester

tester.init()
# or
tester.init_modular()

Alternatively, you can create and train an instance of the AgentPPO class directly in your code:

from modular_rl.agents.agent_ppo import AgentPPO
from modular_rl.settings import AgentSettings

def init():
    env = AgentPPO(env=None, setting=AgentSettings.default)
    env.learn()

init()

To create and train an instance of the AgentPPO class with modified settings, use the following code:

from modular_rl.agents.agent_ppo import AgentPPO
from modular_rl.settings import AgentSettings

def init_modular():
    env = AgentPPO(env=None, setting=AgentSettings.default_modular)
    env.reset()
    env.learn_reset()
    action, reward, is_done = env.learn_next()
    env.learn_check()
    env.update()

    env.reset()
    env.learn_reset()

    # Proceed with the learning manually.
    '''
    Note: 
    After the action when calling update_step, 
    enter the changed state in the initial_state location
    example, 
    env.update_step(after_action_state, None, action, -1)
    '''

    initial_state = env.learn_reset()
    action, _ = env.select_action(initial_state)
    next_state = env.learn_reset()
    env.update_step(next_state, None, action, -1)

    env.learn_check()
    env.update()

    env.learn_close()


init_modular()

Key Classes

  • AgentPPO: The main agent class implementing the PPO algorithm.
  • PolicyNetwork: A customizable neural network for the agent's policy.
  • ValueNetwork: A customizable neural network for the agent's value function.
  • AgentSettings: A configuration class for setting up the PPO agent.

License

MIT License

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

modular_rl-0.1.3a0.tar.gz (10.4 kB view hashes)

Uploaded Source

Built Distribution

modular_rl-0.1.3a0-py3-none-any.whl (10.5 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page