Parallel Reinforcement Learning library

PRLearn

PRLearn is a Python library for Parallel Reinforcement Learning. It uses multiprocessing to spread experience collection and agent training across several worker processes, which shortens the wall-clock time of RL experiments.

Key Features

  • Flexible architecture: Easily extendable with custom agents, environments, and combiners.
  • Minimal dependencies: Requires only Python 3.11+; the multiprocess package is optional.
  • Parallel data collection and training: Reduce training time via multiprocessing.
  • Agent combination: Multiple strategies for aggregating agents (statistical, random, fixed, etc.).
  • Flexible scheduling: Control training stages via ProcessActionScheduler.

Installation

pip install prlearn

Or with multiprocess support:

pip install "prlearn[multiprocess]"

Quick Start

Define Your Agent

from prlearn import Agent, Experience
from typing import Any, Dict, Tuple

class MyAgent(Agent):
    def action(self, state: Tuple[Any, Dict[str, Any]]) -> Any:
        # state is an (observation, info) pair
        observation, info = state
        # Action selection logic goes here
        pass

    def train(self, experience: Experience):
        # Unpack the collected batch of experience
        obs, actions, rewards, terminated, truncated, info = experience.get()
        # Training logic goes here
        pass
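As a minimal sketch of what the two method bodies might contain, the class below implements an epsilon-greedy policy over a running-average reward estimate. It is a plain stand-in class so that it runs without PRLearn installed; in real use it would subclass Agent and unpack experience.get() inside train. The class name, constructor parameters, and the simplified train signature are all assumptions made for illustration, and the sketch assumes hashable observations and integer actions.

```python
import random
from collections import defaultdict
from typing import Any, Dict, Hashable, Sequence, Tuple


class EpsilonGreedyAgent:
    """Hypothetical stand-in for an Agent subclass (not PRLearn code)."""

    def __init__(self, n_actions: int, epsilon: float = 0.1,
                 lr: float = 0.5, seed: int = 0):
        self.n_actions = n_actions
        self.epsilon = epsilon
        self.lr = lr
        self.rng = random.Random(seed)
        # Estimated reward per (observation, action) pair, defaulting to 0.
        self.values: Dict[Tuple[Hashable, int], float] = defaultdict(float)

    def action(self, state: Tuple[Hashable, Dict[str, Any]]) -> int:
        observation, _info = state
        # Explore with probability epsilon, otherwise take the best-known action.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        return max(range(self.n_actions),
                   key=lambda a: self.values[(observation, a)])

    def train(self, obs: Sequence[Hashable], actions: Sequence[int],
              rewards: Sequence[float]) -> None:
        # Move each estimate a fraction `lr` of the way toward the observed reward.
        for o, a, r in zip(obs, actions, rewards):
            key = (o, a)
            self.values[key] += self.lr * (r - self.values[key])


# Usage: with exploration disabled, the agent picks the action that was rewarded.
agent = EpsilonGreedyAgent(n_actions=2, epsilon=0.0)
agent.train(obs=["s0", "s0"], actions=[0, 1], rewards=[0.0, 1.0])
best = agent.action(("s0", {}))
```

The running-average update needs only observations, actions, and rewards, which keeps the sketch independent of how the experience batch orders its transitions.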

Use Trainer for Parallel Training

import gymnasium as gym
from prlearn import Trainer
from prlearn.collection.agent_combiners import FixedStatAgentCombiner

env = gym.make("LunarLander-v2")  # on Gymnasium 1.0+, use "LunarLander-v3"
agent = MyAgent()

trainer = Trainer(
    agent=agent,
    env=env,
    n_workers=4,
    schedule=[
        ("finish", 1000, "episodes"),
        ("train_agent", 10, "episodes"),
    ],
    mode="parallel_learning",  # optional
    sync_mode="sync",          # optional
    combiner=FixedStatAgentCombiner("mean_reward"),  # optional
)

agent, result = trainer.run()

Custom Environment

from prlearn import Environment
from typing import Any, Dict, Tuple

class MyEnv(Environment):
    def reset(self) -> Tuple[Any, Dict[str, Any]]:
        # Reset logic; returns (observation, info)
        return [[1, 2], [3, 4]], {"info": "description"}

    def step(self, action: Any) -> Tuple[Any, Any, bool, bool, Dict[str, Any]]:
        # Step logic; returns (observation, reward, terminated, truncated, info)
        return [[1, 2], [3, 4]], 1, False, False, {"info": "description"}
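To show how the reset and step return values fit together, here is a sketch of a single-episode rollout loop. CountdownEnv and run_episode are hypothetical names introduced for this example; the environment is a plain class with the same method shapes rather than an Environment subclass, so the snippet runs on its own. It is not how PRLearn's workers are implemented.

```python
from typing import Any, Callable, Dict, Tuple


class CountdownEnv:
    """Toy environment with the same reset/step shape (hypothetical example)."""

    def __init__(self, start: int = 3):
        self.start = start
        self.remaining = start

    def reset(self) -> Tuple[int, Dict[str, Any]]:
        self.remaining = self.start
        return self.remaining, {}

    def step(self, action: int) -> Tuple[int, float, bool, bool, Dict[str, Any]]:
        # Action 1 decrements the counter and earns a reward; others do nothing.
        reward = 0.0
        if action == 1:
            self.remaining -= 1
            reward = 1.0
        terminated = self.remaining == 0
        return self.remaining, reward, terminated, False, {}


def run_episode(env: Any, policy: Callable[[Any], Any],
                max_steps: int = 100) -> Tuple[float, int]:
    """Run one episode and return (total_reward, steps_taken)."""
    observation, _info = env.reset()
    total_reward, steps = 0.0, 0
    for _ in range(max_steps):
        observation, reward, terminated, truncated, _info = env.step(policy(observation))
        total_reward += reward
        steps += 1
        # Stop on either end condition, as in the five-value step tuple above.
        if terminated or truncated:
            break
    return total_reward, steps


total, steps = run_episode(CountdownEnv(start=3), policy=lambda obs: 1)
idle_total, idle_steps = run_episode(CountdownEnv(start=3), policy=lambda obs: 0, max_steps=5)
```

The max_steps cap guards against environments that never set terminated or truncated, as the second call illustrates.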

See docs/examples.md for more usage examples.

Extending

  • Custom agent: Inherit from Agent, implement action and train methods.
  • Custom environment: Inherit from Environment, implement reset and step methods.
  • Custom combiner: Inherit from AgentCombiner, implement the combine method.
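The combine method's signature is not shown here, so the following is only a standalone sketch of one selection rule a statistical combiner might apply: keep the agent whose worker achieved the highest mean episode reward. The function name and its arguments are assumptions for illustration and do not mirror the AgentCombiner interface.

```python
from statistics import mean
from typing import Any, Sequence


def pick_best_by_mean_reward(agents: Sequence[Any],
                             episode_rewards: Sequence[Sequence[float]]) -> Any:
    """Return the agent with the highest mean episode reward (hypothetical helper)."""
    if not agents or len(agents) != len(episode_rewards):
        raise ValueError("need one non-empty reward history slot per agent")
    # Agents with no finished episodes rank last.
    scores = [mean(r) if r else float("-inf") for r in episode_rewards]
    best_index = max(range(len(agents)), key=scores.__getitem__)
    return agents[best_index]


# Usage: three workers report their per-episode rewards.
best = pick_best_by_mean_reward(
    ["agent_a", "agent_b", "agent_c"],
    [[1.0, 2.0], [5.0, 3.0], []],
)
```

Other strategies mentioned in the feature list (random or fixed selection) would replace only the scoring step.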

Testing

To run tests:

pytest tests/

License

MIT License. See LICENSE.
