A reinforcement learning environment for the Search Race CG puzzle based on Gymnasium
Gymnasium Search Race
Gymnasium environment for the Search Race CodinGame optimization puzzle.
https://github.com/user-attachments/assets/1862b04b-9e33-4f55-a309-ad665a1db2f1
Action Space | Box([-1, 0], [1, 1], float64)
Observation Space | Box([0, 0, 0, 0, 0, 0, -1, -1, 0], [1, 1, 1, 1, 1, 1, 1, 1, 1], float64)
Import | gymnasium.make("gymnasium_search_race:gymnasium_search_race/SearchRace-v0")
Installation
To install gymnasium-search-race with pip, execute:
pip install gymnasium_search_race
From source:
git clone https://github.com/Quentin18/gymnasium-search-race
cd gymnasium-search-race/
pip install -e .
Environment
Action Space
The action is an ndarray with 2 continuous variables:
- The rotation angle between -18 and 18 degrees, normalized between -1 and 1.
- The thrust between 0 and 200, normalized between 0 and 1.
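The normalization described above can be sketched as a plain linear rescaling. This is only an illustration: the helper name `denormalize_action` and the two constants are hypothetical, and both the clipping of out-of-range values and the strictly linear mapping are assumptions rather than the package's confirmed behaviour.

```python
# Ranges taken from the action space description above.
MAX_ROTATION_DEGREES = 18.0
MAX_THRUST = 200.0


def denormalize_action(action):
    """Map a normalized (rotation, thrust) pair to game units.

    Hypothetical helper: assumes linear scaling and clips values
    that fall outside the declared Box bounds.
    """
    rotation_norm, thrust_norm = action
    # Clip to the bounds of Box([-1, 0], [1, 1]).
    rotation_norm = max(-1.0, min(1.0, float(rotation_norm)))
    thrust_norm = max(0.0, min(1.0, float(thrust_norm)))
    return rotation_norm * MAX_ROTATION_DEGREES, thrust_norm * MAX_THRUST


rotation, thrust = denormalize_action([0.5, 0.25])
print(rotation, thrust)
```

Under these assumptions, an action of [0.5, 0.25] would correspond to a rotation of 9 degrees and a thrust of 50.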
Observation Space
The observation is an ndarray of 9 continuous variables:
- The x and y coordinates of the next checkpoint.
- The x and y coordinates of the checkpoint after the next one.
- The x and y coordinates of the car.
- The horizontal speed vx and vertical speed vy of the car.
- The facing angle of the car.
The values are normalized between 0 and 1, or between -1 and 1 where negative values are allowed (the speed components vx and vy).
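To make the layout easier to work with, the 9 values can be unpacked into named fields. The sketch below assumes the array follows the same order as the list above, which is consistent with the bounds of the observation space but is not confirmed here. The class `SearchRaceObservation` and the function `unpack_observation` are hypothetical names, not part of the package.

```python
from typing import NamedTuple, Sequence


class SearchRaceObservation(NamedTuple):
    """Hypothetical container; field names are illustrative only."""

    next_checkpoint_x: float
    next_checkpoint_y: float
    following_checkpoint_x: float
    following_checkpoint_y: float
    car_x: float
    car_y: float
    car_vx: float  # may be negative (normalized to [-1, 1])
    car_vy: float  # may be negative (normalized to [-1, 1])
    car_angle: float


def unpack_observation(observation: Sequence[float]) -> SearchRaceObservation:
    """Convert a 9-element observation into named fields.

    Assumes the element order matches the documented list order.
    """
    values = [float(v) for v in observation]
    if len(values) != 9:
        raise ValueError(f"expected 9 values, got {len(values)}")
    return SearchRaceObservation(*values)


obs = unpack_observation([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, -0.7, 0.8, 0.9])
print(obs.car_vx, obs.car_angle)
```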
Reward
The goal is to visit all checkpoints as quickly as possible. The agent is therefore penalised with a reward of -0.1 for each timestep.
When a checkpoint is visited, the agent receives a reward of 1000/total_checkpoints.
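As a rough worked example of how these two terms combine over an episode, the sketch below sums them. It assumes the per-step penalty applies on every timestep, including those on which a checkpoint is reached, and that the two terms are simply added. The function name `episode_return` and the constants are hypothetical.

```python
STEP_PENALTY = -0.1              # reward per timestep, from the text above
CHECKPOINT_REWARD_TOTAL = 1000.0  # split evenly across all checkpoints


def episode_return(steps, checkpoints_visited, total_checkpoints):
    """Estimate the undiscounted return of one episode.

    Assumption: the step penalty and the checkpoint reward are additive.
    """
    if total_checkpoints <= 0:
        raise ValueError("total_checkpoints must be positive")
    if not 0 <= checkpoints_visited <= total_checkpoints:
        raise ValueError("checkpoints_visited out of range")
    per_checkpoint = CHECKPOINT_REWARD_TOTAL / total_checkpoints
    return steps * STEP_PENALTY + checkpoints_visited * per_checkpoint


# A full run over 8 checkpoints in 200 steps versus a slower run in 400 steps.
print(episode_return(200, 8, 8))
print(episode_return(400, 8, 8))
```

Under these assumptions, the checkpoint rewards always sum to 1000 for a completed course, so two successful runs differ only by the time penalty: finishing in 200 steps rather than 400 is worth 20 more.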
Starting State
The starting state is generated by choosing a random CodinGame test case.
Episode End
The episode ends if either of the following happens:
- Termination: The car visits all checkpoints before time runs out.
- Truncation: Episode length is greater than 600.
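The two end conditions map onto the `terminated` and `truncated` flags returned by the standard Gymnasium `step` method. A minimal episode loop might look like the sketch below. The helper `run_episode` is a hypothetical name, and it falls back to random actions when no policy is given.

```python
def run_episode(env, policy=None, seed=None):
    """Run one episode on a Gymnasium-style env and report how it ended.

    Hypothetical helper: `policy` is any callable mapping an observation
    to an action; random actions are sampled when it is omitted.
    """
    observation, info = env.reset(seed=seed)
    total_reward = 0.0
    steps = 0
    while True:
        if policy is None:
            action = env.action_space.sample()
        else:
            action = policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += float(reward)
        steps += 1
        # terminated: all checkpoints visited; truncated: time limit reached.
        if terminated or truncated:
            break
    return {
        "steps": steps,
        "return": total_reward,
        "terminated": terminated,
        "truncated": truncated,
    }


# Usage, assuming the package is installed:
# import gymnasium as gym
# env = gym.make("gymnasium_search_race:gymnasium_search_race/SearchRace-v0")
# print(run_episode(env, seed=0))
```

Checking both flags separately is useful during evaluation, since only `terminated` indicates that the car actually completed the course.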
Arguments
test_id: test case id to generate the checkpoints (see choices here). The default value is None, which selects a test case randomly when the reset method is called.
import gymnasium as gym
gym.make("gymnasium_search_race:gymnasium_search_race/SearchRace-v0", test_id=1)
Usage
You can use RL Baselines3 Zoo to train and evaluate agents:
pip install rl_zoo3
Train an Agent
The hyperparameters are defined in hyperparams/ppo.yml.
To train a PPO agent for the Search Race game, execute:
python -m rl_zoo3.train \
--algo ppo \
--env gymnasium_search_race/SearchRace-v0 \
--tensorboard-log logs \
--eval-freq 10000 \
--eval-episodes 10 \
--gym-packages gymnasium_search_race \
--conf-file hyperparams/ppo.yml \
--progress
Enjoy a Trained Agent
To see a trained agent in action on random test cases, execute:
python -m rl_zoo3.enjoy \
--algo ppo \
--env gymnasium_search_race/SearchRace-v0 \
--n-timesteps 10000 \
--deterministic \
--gym-packages gymnasium_search_race \
--load-best \
--progress
Run Test Cases
To run test cases with a trained agent, execute:
python -m scripts.run_test_cases \
--path rl-trained-agents/ppo/best_model.zip \
--record-video \
--record-metrics
Tests
To run tests, execute:
pytest
Citing
To cite the repository in publications:
@misc{gymnasium-search-race,
author = {Quentin Deschamps},
title = {Gymnasium Search Race},
year = {2024},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/Quentin18/gymnasium-search-race}},
}