Reinforcement Learning Environments

This package simplifies RL experiments by providing easily generated RL environments for testing RL algorithms.

This is still a work in progress. The aim is to support exact RL experiments in a reproducible, lightweight, and scientific manner.


Getting Started

Installation

Installing from PyPI

pip3 install rlenvs

Installing from source

git clone https://github.com/ai-nikolai/rl-environments
cd rl-environments
pip3 install -e .

Examples:

Bandit

from rlenvs.bandits import MultiarmBernoulliBandit

env = MultiarmBernoulliBandit(arms=5)

reward, observation, is_finished, internal_state = env.step(0) #picks arm 0
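
To illustrate how the four values returned by step might be consumed, here is a minimal sketch of an epsilon-greedy agent. So that it runs without the package installed, it uses a hypothetical stand-in class, ToyBernoulliBandit, that only mimics the return order shown above (reward, observation, is_finished, internal_state). The probs parameter, the arm probabilities, and the stand-in's internals are assumptions, not rlenvs code.

```python
import random


class ToyBernoulliBandit:
    """Hypothetical stand-in, NOT the rlenvs implementation.

    Mimics the return order of the example above:
    reward, observation, is_finished, internal_state.
    """

    def __init__(self, probs, seed=0):
        self.probs = list(probs)        # success probability per arm (assumed)
        self.rng = random.Random(seed)  # private RNG for reproducibility

    def step(self, action):
        reward = 1.0 if self.rng.random() < self.probs[action] else 0.0
        # In this sketch a bandit has no observation and never terminates.
        return reward, None, False, None


def epsilon_greedy(env, n_arms, steps=2000, epsilon=0.1, seed=1):
    """Run epsilon-greedy; return (pull counts, estimated mean rewards)."""
    rng = random.Random(seed)
    counts = [0] * n_arms
    values = [0.0] * n_arms
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                        # explore
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit
        reward, _obs, _done, _state = env.step(arm)
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]    # incremental mean
    return counts, values


env = ToyBernoulliBandit(probs=[0.1, 0.2, 0.9, 0.3, 0.5])
counts, values = epsilon_greedy(env, n_arms=5)
print(counts, [round(v, 2) for v in values])
```

With the real package, the same loop should work by replacing the stand-in with MultiarmBernoulliBandit(arms=5), assuming its step method behaves as in the example above.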

Tree MDP

from rlenvs.mdps import BalancedDenseTreeDeterministicMDP

env = BalancedDenseTreeDeterministicMDP(branching=3, depth=5)  # creates a tree with 3 choices per turn and 5 turns in total

reward, observation, is_finished, internal_state = env.step(0)  # picks branch 0

This is what such an environment looks like: BalancedTree
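
For a sense of scale, the size of such a tree follows directly from its branching factor and depth. The sketch below assumes that depth counts the number of decisions from root to leaf and that every internal node has exactly branching children. The helper balanced_tree_size is hypothetical and not part of rlenvs.

```python
def balanced_tree_size(branching, depth):
    """Return (total_nodes, leaves) for a balanced tree.

    Level k (root = level 0) holds branching**k nodes, so the totals
    are a geometric series over levels 0..depth.
    """
    nodes_per_level = [branching ** level for level in range(depth + 1)]
    return sum(nodes_per_level), nodes_per_level[-1]


total, leaves = balanced_tree_size(branching=3, depth=5)
print(total, leaves)  # 364 243
```

Under these assumptions, the example tree has 243 distinct action sequences, which is small enough to enumerate exhaustively when checking an algorithm.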


Documentation:

Overview:

This package provides environments whose API is similar to that of the environments provided by DeepMind and OpenAI, for interoperability.

Every environment provides the following interface:

class BaseEnvironment(object):
    """
    Implements the following methods inspired by both OpenAI Gym and DeepMind bsuite (dm_env).
    :initialise() -> observation, resets and initialises the environment and returns first observation:
    :step(action) -> reward(float), observation(Optional[Any]), is_finished(bool), state(Optional[Any]):
    :reset() -> "resets the environment":
    :undo() -> "goes to the previous state of the environment" reward, observation, is_finished(bool), state(Optional[Any]):
    :go_to_state(state) -> "goes to a specific state of the environment" is_finished(bool):
    :seed(int) -> "sets the seed":
    :render() -> "renders the environment":
    :get_specs() -> returns the custom specs of the environment:
    """

Troubleshooting / FAQs:

Requirements:

In the future this will hopefully be configurable.

python >= 3.6
networkx
graphviz
...

Copyright (C) - Nikolai Rozanov 2020-Present
