A reinforcement learning environment for the Search Race CG puzzle based on Gymnasium
Gymnasium Search Race
Gymnasium environments for the Search Race CodinGame optimization puzzle and Mad Pod Racing CodinGame bot programming game.
https://github.com/user-attachments/assets/766b4c79-1be7-48bd-a25b-2ff99de972f7
| Action Space | Box([-1, 0], [1, 1], float64) |
| Observation Space | Box(-1, 1, shape=(10,), float64) |
| import | gymnasium.make("gymnasium_search_race:gymnasium_search_race/SearchRace-v3") |
Installation
To install gymnasium-search-race with pip, execute:
pip install gymnasium_search_race
From source:
git clone https://github.com/Quentin18/gymnasium-search-race
cd gymnasium-search-race/
pip install -e .
Environment
Action Space
The action is an ndarray with 2 continuous variables:
- The rotation angle between -18 and 18 degrees, normalized between -1 and 1.
- The thrust between 0 and 200, normalized between 0 and 1.
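As a rough illustration of the normalization described above, the sketch below maps a normalized action back to degrees and thrust units. The helper name denormalize_action and its clipping behaviour are assumptions made for this example, not part of the package's API; only the constants 18 and 200 come from the description above.

```python
MAX_ROTATION_DEG = 18.0  # rotation range is [-18, 18] degrees
MAX_THRUST = 200.0       # default value of car_max_thrust


def denormalize_action(action, max_thrust=MAX_THRUST):
    """Map a normalized (rotation, thrust) pair to degrees and thrust units.

    Hypothetical helper: out-of-range values are clipped to the action
    space bounds, which is an assumption of this sketch.
    """
    rotation, thrust = action
    rotation = max(-1.0, min(1.0, float(rotation)))
    thrust = max(0.0, min(1.0, float(thrust)))
    return rotation * MAX_ROTATION_DEG, thrust * max_thrust


angle_deg, thrust = denormalize_action((0.5, 1.0))
print(angle_deg, thrust)  # 9.0 200.0
```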
Observation Space
The observation is an ndarray of 10 continuous variables:
- The relative x and y coordinates of the next two checkpoints in the car's frame.
- The sine and cosine of the relative angle to the next two checkpoints in the car's frame.
- The longitudinal and lateral speed in the car's frame.
The values are normalized between -1 and 1.
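To make "in the car's frame" concrete, here is a minimal sketch of how a checkpoint position could be expressed relative to the car and turned into sine and cosine features. The function names and the choice of x as the forward axis are assumptions for illustration; the scaling constants the package uses to normalize values to [-1, 1] are not given here, so the sketch leaves distances unnormalized.

```python
import math


def to_car_frame(car_x, car_y, car_angle_deg, target_x, target_y):
    """Express a target position in the car's frame (x forward, y lateral).

    Hypothetical helper; the package's own normalization is not reproduced.
    """
    dx, dy = target_x - car_x, target_y - car_y
    angle = math.radians(car_angle_deg)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    # Rotate the offset by -angle so the car's heading becomes the x axis.
    local_x = dx * cos_a + dy * sin_a
    local_y = -dx * sin_a + dy * cos_a
    return local_x, local_y


def relative_angle_features(local_x, local_y):
    """Return (sin, cos) of the angle to the target as seen from the car."""
    distance = math.hypot(local_x, local_y)
    if distance == 0.0:
        # Target coincides with the car; treat it as straight ahead.
        return 0.0, 1.0
    return local_y / distance, local_x / distance
```

For a car at the origin heading at 90 degrees, a checkpoint at (0, 10) lies straight ahead, so the sine feature is 0 and the cosine feature is 1.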
Reward
- +1 when a checkpoint is visited.
- 0 otherwise.
Starting State
The starting state is generated by choosing a random CodinGame test case.
Episode End
The episode ends if either of the following happens:
- Termination: The car visits all checkpoints before time runs out.
- Truncation: Episode length is greater than 600.
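The two end conditions map onto the terminated and truncated flags returned by the standard Gymnasium step method. The loop below is a minimal sketch of an episode rollout; run_episode and policy are placeholder names for this example, and the usage lines are left as comments because they require the package to be installed.

```python
def run_episode(env, policy, seed=None):
    """Run one episode and return (total_reward, steps, terminated, truncated).

    env is any object following the Gymnasium API and policy maps an
    observation to an action. Both names are placeholders for this sketch.
    """
    observation, info = env.reset(seed=seed)
    total_reward, steps = 0.0, 0
    terminated = truncated = False
    while not (terminated or truncated):
        action = policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += reward  # +1 per visited checkpoint, 0 otherwise
        steps += 1
    return total_reward, steps, terminated, truncated


# Example usage (requires gymnasium and gymnasium_search_race):
# import gymnasium as gym
# env = gym.make("gymnasium_search_race:gymnasium_search_race/SearchRace-v3")
# print(run_episode(env, lambda obs: env.action_space.sample()))
# env.close()
```

With the reward defined above, the returned total is simply the number of checkpoints visited during the episode.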
Arguments
- laps: number of laps. The default value is 3.
- car_max_thrust: maximum thrust. The default value is 200.
- test_id: test case id to generate the checkpoints (see choices here). The default value is None, which selects a test case randomly when the reset method is called.
- sequential_maps: if True, the maps are generated sequentially. The default value is False.
import gymnasium as gym
gym.make(
"gymnasium_search_race:gymnasium_search_race/SearchRace-v3",
laps=3,
car_max_thrust=200,
test_id=1,
sequential_maps=False,
)
Version History
- v3: Update observation with relative positions and angles in car's frame
- v2: Update observation with relative positions and angles
- v1: Add boolean to indicate if the next checkpoint is the last checkpoint in observation
- v0: Initial version
Discrete environment
The SearchRaceDiscrete environment is similar to the SearchRace environment, except that the action space is discrete.
import gymnasium as gym
gym.make(
"gymnasium_search_race:gymnasium_search_race/SearchRaceDiscrete-v3",
laps=3,
car_max_thrust=200,
test_id=1,
sequential_maps=False,
)
Action Space
There are 74 discrete actions, corresponding to every combination of an angle from -18 to 18 degrees and a thrust of either 0 or 200 (37 integer angles × 2 thrust values).
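The count of 74 can be checked by enumerating the combinations. The sketch below assumes one action per integer angle; the ordering of the table and the names DISCRETE_ACTIONS and decode_action are illustrative only and may not match how the environment indexes its actions.

```python
from itertools import product

ANGLES = range(-18, 19)  # 37 integer angles, in degrees
THRUSTS = (0, 200)       # no thrust or full thrust

# Ordering is an assumption of this sketch, not the environment's mapping.
DISCRETE_ACTIONS = list(product(ANGLES, THRUSTS))


def decode_action(index):
    """Return the (angle, thrust) pair for a discrete action index."""
    return DISCRETE_ACTIONS[index]


print(len(DISCRETE_ACTIONS))  # 74
```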
Version History
- v3: Update observation with relative positions and angles in car's frame
- v2: Update observation with relative positions and angles
- v1: Add all angles in action space
- v0: Initial version
Mad Pod Racing
Runner
The MadPodRacing and MadPodRacingDiscrete environments can be used to train a runner for
the Mad Pod Racing CodinGame bot programming game.
They are similar to the SearchRace and SearchRaceDiscrete environments, with the following differences:
- The maps are generated the same way CodinGame generates them.
- The car position is rounded rather than truncated.
import gymnasium as gym
gym.make("gymnasium_search_race:gymnasium_search_race/MadPodRacing-v2")
gym.make("gymnasium_search_race:gymnasium_search_race/MadPodRacingDiscrete-v2")
https://github.com/user-attachments/assets/2e2a748d-5bd8-459a-8ac2-a8420bae33b9
Blocker
The MadPodRacingBlocker and MadPodRacingBlockerDiscrete environments can be used to train a blocker for
the Mad Pod Racing CodinGame bot programming game.
import gymnasium as gym
gym.make("gymnasium_search_race:gymnasium_search_race/MadPodRacingBlocker-v2")
gym.make("gymnasium_search_race:gymnasium_search_race/MadPodRacingBlockerDiscrete-v2")
https://github.com/user-attachments/assets/3c71a487-9ec1-49cd-9b8b-70f7984a809a
Arguments
- opponent_path: path to the opponent PPO model. The default value is None, which means there is no opponent.
- boost_on_first_move: if True, the car is boosted on the first move. The default value is False.
- boost_opponent_on_first_move: if True, the opponent is boosted on the first move. The default value is False.
Version History
- v2: Update observation with relative positions and angles in car's frame and add boost options
- v1: Update observation with relative positions and angles and update maximum thrust
- v0: Initial version
Usage
You can use RL Baselines3 Zoo to train and evaluate agents:
pip install rl_zoo3
Train an Agent
The hyperparameters are defined in hyperparams/ppo.yml.
To train a PPO agent for the Search Race game, execute:
python -m rl_zoo3.train \
--algo ppo \
--env gymnasium_search_race/SearchRaceDiscrete-v3 \
--tensorboard-log logs \
--eval-freq 20000 \
--eval-episodes 50 \
--gym-packages gymnasium_search_race \
--env-kwargs "laps:1000" "sequential_maps:True" \
--conf-file hyperparams/ppo.yml \
--progress
[!IMPORTANT] The agent is evaluated once per test case with --eval-episodes 50 and --env-kwargs "sequential_maps:True" (there are 50 different test cases).
For the Mad Pod Racing game, you can add an opponent with the opponent_path argument:
python -m rl_zoo3.train \
--algo ppo \
--env gymnasium_search_race/MadPodRacingBlockerDiscrete-v2 \
--tensorboard-log logs \
--eval-freq 20000 \
--eval-episodes 52 \
--gym-packages gymnasium_search_race \
--env-kwargs \
"opponent_path:'rl-trained-agents/ppo/gymnasium_search_race-MadPodRacingDiscrete-v2_1/best_model.zip'" \
"laps:1000" \
"sequential_maps:True" \
"boost_opponent_on_first_move:True" \
--conf-file hyperparams/ppo.yml \
--progress
[!IMPORTANT] The agent is evaluated four times per test case with --eval-episodes 52 and --env-kwargs "sequential_maps:True" (there are 13 different test cases).
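The relationship between --eval-episodes and the number of test cases is plain division. The helper below is a small sketch for picking a value that covers every test case evenly; episodes_per_test_case is a hypothetical name, and the sketch assumes that sequential_maps:True cycles through the test cases in order.

```python
def episodes_per_test_case(eval_episodes, n_test_cases):
    """Return how many times each test case is evaluated.

    Assumes sequential maps cycle through the test cases in order, so
    coverage is even only when eval_episodes is a multiple of n_test_cases.
    """
    if n_test_cases <= 0:
        raise ValueError("n_test_cases must be positive")
    passes, remainder = divmod(eval_episodes, n_test_cases)
    if remainder:
        raise ValueError(
            f"{eval_episodes} episodes do not cover {n_test_cases} test cases evenly"
        )
    return passes


print(episodes_per_test_case(50, 50))  # 1 (Search Race)
print(episodes_per_test_case(52, 13))  # 4 (Mad Pod Racing)
```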
Enjoy a Trained Agent
To see a trained agent in action on random test cases, execute:
python -m rl_zoo3.enjoy \
--algo ppo \
--env gymnasium_search_race/SearchRaceDiscrete-v3 \
--n-timesteps 1000 \
--deterministic \
--gym-packages gymnasium_search_race \
--load-best \
--progress
Run Test Cases
To run test cases with a trained agent, execute:
python -m scripts.run_test_cases \
--path rl-trained-agents/ppo/gymnasium_search_race-SearchRaceDiscrete-v3_1/best_model.zip \
--env gymnasium_search_race:gymnasium_search_race/SearchRaceDiscrete-v3 \
--record-video \
--record-metrics
Record a Video of a Trained Agent
To record a video of a trained agent on Mad Pod Racing, execute:
python -m scripts.record_video \
--path rl-trained-agents/ppo/gymnasium_search_race-MadPodRacingDiscrete-v2_1/best_model.zip \
--env gymnasium_search_race:gymnasium_search_race/MadPodRacingDiscrete-v2
For Mad Pod Racing Blocker, execute:
python -m scripts.record_video \
--path rl-trained-agents/ppo/gymnasium_search_race-MadPodRacingBlockerDiscrete-v2_1/best_model.zip \
--opponent-path rl-trained-agents/ppo/gymnasium_search_race-MadPodRacingDiscrete-v2_1/best_model.zip \
--env gymnasium_search_race:gymnasium_search_race/MadPodRacingBlockerDiscrete-v2
Tests
To run tests, execute:
pytest
Citing
To cite the repository in publications:
@misc{gymnasium-search-race,
author = {Quentin Deschamps},
title = {Gymnasium Search Race},
year = {2024},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/Quentin18/gymnasium-search-race}},
}
References
- Gymnasium
- RL Baselines3 Zoo
- Stable Baselines3
- CGSearchRace
- CSB-Runner-Arena
- Coders Strike Back by Magus
Assets
- https://www.flaticon.com/free-icon/space-ship_751036
- https://www.flaticon.com/free-icon/space-ship_784925