Loose building blocks to create agent-environment loops.
🌐 Environment Framework
This repository contains the Python package `environment-framework`. The project aims to provide loose building blocks to manage the logic, observation, estimation and visualization of an agent-environment loop. It can be used to implement problems which might be solved with reinforcement learning or dynamic programming algorithms.
A wrapper around gymnasium is provided to connect to well-known frameworks in the field. The wrapper uses the `gymnasium>=0.26` API structure!
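For reference, the `gymnasium>=0.26` API structure means `reset` returns `(observation, info)` and `step` returns `(observation, reward, terminated, truncated, info)`. A minimal sketch of that calling convention (plain Python, no `gymnasium` import; the `CountdownEnv` class is a made-up toy, not part of this package):

```python
from typing import Any, Dict, Optional, Tuple


class CountdownEnv:
    """Toy environment following the gymnasium>=0.26 call signatures."""

    def __init__(self, start: int = 3) -> None:
        self.start = start
        self.state = start

    def reset(self, *, seed: Optional[int] = None) -> Tuple[int, Dict[str, Any]]:
        # gymnasium>=0.26: reset returns (observation, info).
        self.state = self.start
        return self.state, {}

    def step(self, action: int) -> Tuple[int, float, bool, bool, Dict[str, Any]]:
        # gymnasium>=0.26: step returns (obs, reward, terminated, truncated, info).
        self.state -= 1
        terminated = self.state <= 0
        return self.state, -1.0, terminated, False, {}


env = CountdownEnv(start=2)
obs, info = env.reset()
total = 0.0
done = False
while not done:
    obs, reward, terminated, truncated, info = env.step(0)
    total += reward
    done = terminated or truncated
```

The split of the old `done` flag into `terminated` and `truncated` is the main difference from the pre-0.26 API.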
🤔 Why create this project?
The project emerged from a previous project of mine, where it was used to separate the different elements of that project's agent-environment loop.
🚀 Get Started
Installation
```shell
pip3 install environment-framework
```
👩‍🏫 GridWorld Example
The implemented GridWorld example can also be found in the Jupyter notebook `grid_world.ipynb`.

```shell
pip3 install "environment-framework[extra]"
jupyter lab
```
```python
from enum import Enum
from random import randint
from typing import Any, List, Tuple

import numpy as np

# Assumed to be exposed at the package top level.
from environment_framework import ActionSpace, Level, ObservationSpace, Simulator


class Action(Enum):
    UP = 0
    DOWN = 1
    RIGHT = 2
    LEFT = 3


class GridWorldGame:
    def __init__(self, size: int) -> None:
        self.size = size
        self.player_position = (0, 0)
        self.target_position = (0, 0)
        self.reset()

    @property
    def done(self) -> bool:
        return self.player_position == self.target_position

    def act(self, action: Action, **_: Any) -> None:
        if action == Action.UP:
            self.player_position = (self.player_position[0], self.player_position[1] - 1)
        if action == Action.DOWN:
            self.player_position = (self.player_position[0], self.player_position[1] + 1)
        if action == Action.RIGHT:
            self.player_position = (self.player_position[0] + 1, self.player_position[1])
        if action == Action.LEFT:
            self.player_position = (self.player_position[0] - 1, self.player_position[1])
        # Clamp the position so the player stays inside the grid.
        corrected_x = max(0, min(self.size - 1, self.player_position[0]))
        corrected_y = max(0, min(self.size - 1, self.player_position[1]))
        self.player_position = (corrected_x, corrected_y)

    def reset(self) -> None:
        def get_random_position() -> int:
            return randint(0, self.size - 1)

        self.player_position = (get_random_position(), get_random_position())
        self.target_position = (get_random_position(), get_random_position())
        # Resample if player and target spawned on the same cell.
        if self.done:
            self.reset()


class GridWorldObserver:
    def __init__(self, game: GridWorldGame) -> None:
        self.game = game

    @property
    def shape(self) -> Tuple[int, ...]:
        return (4,)

    def observe(self, _: Any) -> List[float]:
        return [*self.game.player_position, *self.game.target_position]


class GridWorldEstimator:
    def __init__(self, game: GridWorldGame) -> None:
        self.game = game

    def estimate(self, _: Any) -> float:
        # -1 reward per step, 0 once the target is reached.
        return -1 + float(self.game.done)


class GridWorldVisualizer:
    # We use BGR
    BLUE = [255, 0, 0]
    GREEN = [0, 255, 0]

    def __init__(self, game: GridWorldGame) -> None:
        self.game = game

    def render(self, _: Any) -> Any:
        frame = [[[0 for k in range(3)] for j in range(self.game.size)] for i in range(self.game.size)]
        frame[self.game.player_position[1]][self.game.player_position[0]] = self.BLUE
        frame[self.game.target_position[1]][self.game.target_position[0]] = self.GREEN
        return frame


class GridWorldLevel(Level):
    _game: GridWorldGame
    _observer: GridWorldObserver
    _estimator: GridWorldEstimator
    _visualizer: GridWorldVisualizer

    def __init__(
        self,
        game: GridWorldGame,
        observer: GridWorldObserver,
        estimator: GridWorldEstimator,
        visualizer: GridWorldVisualizer,
    ) -> None:
        super().__init__(game, observer, estimator, visualizer)

    def reset(self) -> None:
        self._game.reset()

    def step(self, action: Action) -> Any:
        if isinstance(action, np.int64):  # handle integer inputs
            action = Action(action)
        self._game.act(action)

    @property
    def action_space(self) -> ActionSpace:
        return ActionSpace("discrete", 4)

    @property
    def observation_space(self) -> ObservationSpace:
        return ObservationSpace("discrete", (4,))


class GridWorldSimulation:
    def __init__(self, level: GridWorldLevel) -> None:
        self.level = level
        self.level_settings = None


game = GridWorldGame(7)
level = GridWorldLevel(game, GridWorldObserver(game), GridWorldEstimator(game), GridWorldVisualizer(game))
simulator = Simulator(GridWorldSimulation(level))
while not simulator.done:
    action = Action(randint(0, 3))
    simulator.step(action)
```
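Because the estimator yields -1 per step and 0 on reaching the target, the return of an optimal policy is minus the Manhattan distance between player and target. A standalone sketch (re-implementing the clamping rule from `GridWorldGame.act` for illustration, without the package; `move` and `SIZE` are made up here):

```python
from typing import Tuple

SIZE = 7  # same grid size as the example above


def move(pos: Tuple[int, int], delta: Tuple[int, int]) -> Tuple[int, int]:
    # Same clamping rule as GridWorldGame.act: stay inside the SIZE x SIZE grid.
    x = max(0, min(SIZE - 1, pos[0] + delta[0]))
    y = max(0, min(SIZE - 1, pos[1] + delta[1]))
    return (x, y)


# Moving LEFT (-1, 0) from the left edge keeps the player on the board.
assert move((0, 3), (-1, 0)) == (0, 3)

# A greedy policy walks the Manhattan distance, collecting -1 per step:
# 6 steps from (0, 0) to (4, 2), so the episode return is -6.
player, target = (0, 0), (4, 2)
steps = 0
while player != target:
    dx = 0 if target[0] == player[0] else (1 if target[0] > player[0] else -1)
    dy = 0 if dx != 0 else (1 if target[1] > player[1] else -1)
    player = move(player, (dx, dy))
    steps += 1
```

This also shows why the random agent in the loop above always terminates: the clamping keeps the walk on a finite grid, so the target is reached with probability 1.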
📃 Documentation
Some docstrings are already added. Documentation is a work in progress and will be updated from time to time.
💃🕺 Contribution
Everybody is welcome to contribute to this project. Please read the CONTRIBUTING.md for more information, and feel free to open an issue on the project if you have any further questions.
💻 Development
The repository provides tools for development using hatch. All dependencies for the project can also be found in requirements files.
Install the development dependencies:

```shell
pip3 install -r requirements/dev.txt
```

or

```shell
pip3 install "environment-framework[dev]"
```
Tools
To run all development tools (type checking, linting and tests), hatch is required.

```shell
make all
```
Type checking
Type checking with mypy.

```shell
make mypy
```
Linting
Linting with pylint.

```shell
make lint
```
Tests
Run tests with pytest.

```shell
make test
```
Update dependencies
Update the Python requirements with pip-compile.

```shell
make update
```
🧾 License
This repository is licensed under the MIT License.