Loose building blocks to create agent-environment loops.
🌐 Environment Framework
This repository contains the Python package environment-framework. The project aims to provide loose building blocks to manage the logic, observation, estimation, and visualization of an agent-environment loop. It can be used to implement problems that might be solved with reinforcement learning or dynamic programming algorithms.
A wrapper around gymnasium is provided to connect to well-known frameworks in the field.
The wrapper for gymnasium uses the gymnasium>=0.26 API structure!
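As a rough illustration of what connecting a Simulator to gymnasium could look like, here is a minimal sketch; the wrapper class name GymnasiumWrapper and its import path are assumptions made for this example, not the package's confirmed API, so check the package source for the actual names.

from environment_framework.gym import GymnasiumWrapper  # hypothetical name and path

# Assume `simulator` is a Simulator instance, e.g. the one built in the
# GridWorld example below.
env = GymnasiumWrapper(simulator)
observation, info = env.reset(seed=42)  # gymnasium>=0.26 reset signature
for _ in range(100):
    action = env.action_space.sample()
    observation, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        observation, info = env.reset()
env.close()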
🤔 Why create this project?
The project emerged from a previous project of mine, where it was used to separate the different elements of that project's agent-environment loop.
🚀 Get Started
Installation
pip3 install environment-framework
👩‍🏫 GridWorld Example
The implemented GridWorld example can also be found in the Jupyter notebook grid_world.ipynb.
pip3 install "environment-framework[extra]"
jupyter lab
import math
from random import randint
from typing import Any

import numpy as np
from numpy.typing import NDArray
from gymnasium.spaces import Box, Discrete, Space

# Exact import paths within environment-framework may differ between versions.
from environment_framework import Level, PygameHumanVisualizer, Simulator


class Action:
    UP = 0
    DOWN = 1
    RIGHT = 2
    LEFT = 3


class GridWorldGame:
    def __init__(self, size: int) -> None:
        self.size = size
        self.player_position = (0, 0)
        self.target_position = (0, 0)
        self.reset()

    @property
    def done(self) -> bool:
        return self.player_position == self.target_position

    @property
    def space(self) -> Space:
        return Discrete(4)

    def act(self, action: int, **_: Any) -> None:
        if action == Action.UP:
            self.player_position = (self.player_position[0], self.player_position[1] - 1)
        if action == Action.DOWN:
            self.player_position = (self.player_position[0], self.player_position[1] + 1)
        if action == Action.RIGHT:
            self.player_position = (self.player_position[0] + 1, self.player_position[1])
        if action == Action.LEFT:
            self.player_position = (self.player_position[0] - 1, self.player_position[1])
        # Clamp the position to the grid boundaries.
        corrected_x = max(0, min(self.size - 1, self.player_position[0]))
        corrected_y = max(0, min(self.size - 1, self.player_position[1]))
        self.player_position = (corrected_x, corrected_y)

    def reset(self) -> None:
        def get_random_position() -> int:
            return randint(0, self.size - 1)

        self.player_position = (get_random_position(), get_random_position())
        self.target_position = (get_random_position(), get_random_position())
        # Re-sample if the player starts on the target.
        if self.done:
            self.reset()
class GridWorldObserver:
    def __init__(self, game: GridWorldGame) -> None:
        self.game = game

    @property
    def space(self) -> Space:
        return Box(shape=(4,), low=-math.inf, high=math.inf)

    def observe(self) -> NDArray:
        # Observation: player (x, y) followed by target (x, y).
        return np.array(
            [*self.game.player_position, *self.game.target_position],
            dtype=np.float32,
        )
class GridWorldEstimator:
    def __init__(self, game: GridWorldGame) -> None:
        self.game = game

    def estimate(self) -> float:
        # Reward: -1 for every step, 0 once the target is reached.
        return -1 + float(self.game.done)
class GridWorldVisualizer(PygameHumanVisualizer):
    BLUE = [0, 0, 255]
    GREEN = [0, 255, 0]

    def __init__(self, game: GridWorldGame) -> None:
        super().__init__(50)
        self.game = game

    def render_rgb(self) -> NDArray[np.uint8]:
        # Black frame with the player drawn in blue and the target in green.
        frame = [[[0 for _ in range(3)] for _ in range(self.game.size)] for _ in range(self.game.size)]
        frame[self.game.player_position[1]][self.game.player_position[0]] = self.BLUE
        frame[self.game.target_position[1]][self.game.target_position[0]] = self.GREEN
        return np.array(frame, dtype=np.uint8)
class GridWorldLevel(Level):
    _game: GridWorldGame
    _observer: GridWorldObserver
    _estimator: GridWorldEstimator
    _visualizer: GridWorldVisualizer

    def reset(self) -> None:
        self._game.reset()

    def step(self, action: int) -> Any:
        self._game.act(action)
game = GridWorldGame(7)
level = GridWorldLevel(
    game,
    GridWorldObserver(game),
    GridWorldEstimator(game),
    GridWorldVisualizer(game),
)
simulator = Simulator(level, 50)

FPS = 4
DONE = False
while not DONE:
    action = simulator.action_space.sample()
    simulator.step(action)
    obs = simulator.observe()
    reward = simulator.estimate()
    simulator.render_human(FPS)
    DONE = simulator.truncated or simulator.done
simulator.close()
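In this loop the Simulator ties the pieces together: step applies the sampled action to the game through the level, observe and estimate return the current observation and reward from the observer and estimator, and render_human draws a frame at the given FPS until the episode is done or truncated.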
📃 Documentation
Some docstrings are already in place. Documentation is a work in progress and will be updated from time to time.
💃🕺 Contribution
I welcome everybody to contribute to this project. Please read CONTRIBUTING.md for more information, and feel free to open an issue on the project if you have any further questions.
💻 Development
The repository provides tools for development using hatch.
All of the project's dependencies can also be found in a requirements file.
Install the development dependencies.
pip3 install -r requirements/dev.txt
or
pip3 install "environment-framework[dev]"
Tools
To run all development tools (type checking, linting, and tests), hatch is required.
make all
Type checking
Type checking with mypy.
make mypy
Linting
Linting with pylint.
make lint
Tests
Run tests with pytest.
make test
Update dependencies
Update Python requirements with pip-compile.
make update
🧾 License
This repository is licensed under the MIT License.