A reinforcement learning environment for the Search Race CG puzzle based on Gymnasium

Gymnasium Search Race

Gymnasium environments for the Search Race CodinGame optimization puzzle and Mad Pod Racing CodinGame bot programming game.

https://github.com/user-attachments/assets/1862b04b-9e33-4f55-a309-ad665a1db2f1

Action Space: Box([-1, 0], [1, 1], float64)
Observation Space: Box([0, 0, 0, 0, 0, 0, 0, -1, -1, 0], [1, 1, 1, 1, 1, 1, 1, 1, 1, 1], float64)
Import: gymnasium.make("gymnasium_search_race:gymnasium_search_race/SearchRace-v1")

Installation

To install gymnasium-search-race with pip, execute:

pip install gymnasium_search_race

From source:

git clone https://github.com/Quentin18/gymnasium-search-race
cd gymnasium-search-race/
pip install -e .

Environment

Action Space

The action is an ndarray with 2 continuous variables:

  • The rotation angle between -18 and 18 degrees, normalized between -1 and 1.
  • The thrust between 0 and 200, normalized between 0 and 1.
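As a rough illustration of the normalization described above, the sketch below maps a normalized action back to game units. It assumes the mapping is a plain linear scaling and that out-of-range values are clipped to the Box bounds; `denormalize_action` is a hypothetical helper, not part of the package API, and the environment may apply additional rounding.

```python
MAX_ROTATION_DEG = 18.0  # from the text: rotation lies in [-18, 18] degrees
MAX_THRUST = 200.0       # from the text: thrust lies in [0, 200]


def denormalize_action(action):
    """Map a normalized (rotation, thrust) pair to game units.

    Hypothetical helper for illustration; assumes linear scaling.
    """
    rotation, thrust = action
    # Clip to the bounds of Box([-1, 0], [1, 1]) before scaling.
    rotation = max(-1.0, min(1.0, float(rotation)))
    thrust = max(0.0, min(1.0, float(thrust)))
    return rotation * MAX_ROTATION_DEG, thrust * MAX_THRUST


print(denormalize_action([0.5, 1.0]))    # (9.0, 200.0)
print(denormalize_action([-1.0, 0.25]))  # (-18.0, 50.0)
```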

Observation Space

The observation is an ndarray of 10 continuous variables:

  • 1 if the next checkpoint is the last one, 0 otherwise.
  • The x and y coordinates of the next checkpoint.
  • The x and y coordinates of the checkpoint after the next checkpoint.
  • The x and y coordinates of the car.
  • The horizontal speed vx and vertical speed vy of the car.
  • The facing angle of the car.

The values are normalized between 0 and 1, or -1 and 1 if negative values are allowed.
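To make the layout easier to work with, the sketch below attaches names to the 10 entries. It assumes the entries appear in the order of the list above, which is consistent with the Box bounds (only indices 7 and 8, the speeds, allow negative values). The field names and `label_observation` are hypothetical and chosen for illustration only.

```python
# Assumed order: same as the bullet list above.
OBSERVATION_FIELDS = (
    "is_last_checkpoint",
    "next_checkpoint_x",
    "next_checkpoint_y",
    "next_next_checkpoint_x",
    "next_next_checkpoint_y",
    "car_x",
    "car_y",
    "car_vx",
    "car_vy",
    "car_angle",
)


def label_observation(observation):
    """Return a dict mapping each assumed field name to its value."""
    values = [float(v) for v in observation]
    if len(values) != len(OBSERVATION_FIELDS):
        raise ValueError(
            f"expected {len(OBSERVATION_FIELDS)} values, got {len(values)}"
        )
    return dict(zip(OBSERVATION_FIELDS, values))


example = [0.0, 0.5, 0.25, 0.75, 0.5, 0.1, 0.2, -0.3, 0.4, 0.9]
labeled = label_observation(example)
print(labeled["car_vx"], labeled["car_angle"])  # -0.3 0.9
```

The function accepts any sequence of 10 numbers, so it also works on the ndarray returned by the environment.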

Reward

The goal is to visit all checkpoints as quickly as possible, so the agent is penalised with a reward of -0.1 at each timestep. When a checkpoint is visited, the agent receives a reward of 1000/total_checkpoints.
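The arithmetic can be sketched as follows. This assumes the -0.1 penalty applies on every timestep, including those on which a checkpoint is visited; the text does not say whether the two rewards are combined or exclusive, so treat the result as an approximation. `episode_return` is a hypothetical helper, not part of the package.

```python
STEP_PENALTY = -0.1         # from the text: reward per timestep
CHECKPOINT_BUDGET = 1000.0  # from the text: split evenly over all checkpoints


def episode_return(steps, checkpoints_visited, total_checkpoints):
    """Approximate undiscounted return of one episode.

    Assumption: the step penalty is also applied on timesteps where a
    checkpoint is visited.
    """
    checkpoint_reward = CHECKPOINT_BUDGET / total_checkpoints
    return steps * STEP_PENALTY + checkpoints_visited * checkpoint_reward


# A run that visits all 8 checkpoints in 200 timesteps.
print(episode_return(200, 8, 8))  # 980.0
```

Under this assumption a complete run always collects 1000 from checkpoints, so the return differs between complete runs only through the number of timesteps taken.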

Starting State

The starting state is generated by choosing a random CodinGame test case.

Episode End

The episode ends if either of the following happens:

  1. Termination: The car visits all checkpoints before time runs out.
  2. Truncation: Episode length is greater than 600.
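The two end conditions map onto the `terminated` and `truncated` flags returned by the standard Gymnasium `step` method. The sketch below is a minimal rollout loop that works with any Gymnasium-style environment; `run_episode` and `policy` are hypothetical names introduced for illustration.

```python
def run_episode(env, policy, seed=None):
    """Roll out one episode and report how it ended.

    `env` is any object with Gymnasium-style reset/step methods and
    `policy` is a callable mapping an observation to an action.
    """
    observation, info = env.reset(seed=seed)
    total_reward = 0.0
    steps = 0
    while True:
        action = policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += float(reward)
        steps += 1
        # terminated: all checkpoints visited; truncated: step limit reached.
        if terminated or truncated:
            return total_reward, steps, terminated, truncated
```

With the package installed, an environment created by `gym.make("gymnasium_search_race:gymnasium_search_race/SearchRace-v1")` can be passed as `env`, with for example `lambda obs: env.action_space.sample()` as a random policy.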

Arguments

  • test_id: test case id to generate the checkpoints (see choices here). The default value is None which selects a test case randomly when the reset method is called.

import gymnasium as gym

gym.make("gymnasium_search_race:gymnasium_search_race/SearchRace-v1", test_id=1)

Version History

  • v1: Add a boolean to the observation indicating whether the next checkpoint is the last one
  • v0: Initial version

Discrete environment

The SearchRaceDiscrete environment is similar to the SearchRace environment except the action space is discrete.

import gymnasium as gym

gym.make("gymnasium_search_race:gymnasium_search_race/SearchRaceDiscrete-v1", test_id=1)

Action Space

There are 74 discrete actions, one for each combination of an angle from -18 to 18 degrees (37 integer values) and a thrust of either 0 or 200.
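The count can be checked by enumerating the combinations. The sketch below assumes integer angles, which is what makes the total come to 74; the order of the resulting list is an arbitrary choice for illustration and may not match the index-to-action mapping used by the environment.

```python
from itertools import product

ANGLES = range(-18, 19)  # 37 integer angles, -18 to 18 inclusive
THRUSTS = (0, 200)       # the two thrust values named in the text

# Assumed ordering: angle varies slowest, thrust fastest.
ACTIONS = list(product(ANGLES, THRUSTS))

print(len(ACTIONS))  # 74
print(ACTIONS[0])    # (-18, 0)
print(ACTIONS[-1])   # (18, 200)
```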

Version History

  • v1: Add all angles in action space
  • v0: Initial version

Mad Pod Racing

Runner

The MadPodRacing and MadPodRacingDiscrete environments can be used to train a runner for the Mad Pod Racing CodinGame bot programming game. They are similar to the SearchRace and SearchRaceDiscrete environments, except for the following differences:

  • The maximum thrust value is 100 instead of 200.
  • The maps are generated the same way CodinGame generates them.
  • The car position is rounded and not truncated.

import gymnasium as gym

gym.make("gymnasium_search_race:gymnasium_search_race/MadPodRacing-v0")
gym.make("gymnasium_search_race:gymnasium_search_race/MadPodRacingDiscrete-v0")
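The difference between rounding and truncating the car position can be illustrated with plain Python. This is only a sketch of the two behaviours, not the package's code: both helper names are hypothetical, and Python's built-in `round` resolves exact .5 ties to the nearest even integer, which may differ from the rounding rule the environment uses.

```python
import math


def truncate_position(x, y):
    """Drop the fractional part (SearchRace-style, per the text)."""
    return math.trunc(x), math.trunc(y)


def round_position(x, y):
    """Round to the nearest integer (MadPodRacing-style, per the text)."""
    return round(x), round(y)


print(truncate_position(1234.7, 567.2))  # (1234, 567)
print(round_position(1234.7, 567.2))     # (1235, 567)
```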

https://github.com/user-attachments/assets/ce4b1837-4591-40dd-a203-9eec9146b94b

Blocker

The MadPodRacingBlocker environment can be used to train a blocker for the Mad Pod Racing CodinGame bot programming game.

import gymnasium as gym

gym.make("gymnasium_search_race:gymnasium_search_race/MadPodRacingBlocker-v0")

https://github.com/user-attachments/assets/57387372-823f-44a2-9a03-23a9332752ab

Usage

You can use RL Baselines3 Zoo to train and evaluate agents:

pip install rl_zoo3

Train an Agent

The hyperparameters are defined in hyperparams/ppo.yml.

To train a PPO agent for the Search Race game, execute:

python -m rl_zoo3.train \
  --algo ppo \
  --env gymnasium_search_race/SearchRace-v1 \
  --tensorboard-log logs \
  --eval-freq 20000 \
  --eval-episodes 10 \
  --gym-packages gymnasium_search_race \
  --conf-file hyperparams/ppo.yml \
  --progress

For the Mad Pod Racing game, you can add an opponent with the opponent_path argument:

python -m rl_zoo3.train \
  --algo ppo \
  --env gymnasium_search_race/MadPodRacingBlocker-v0 \
  --tensorboard-log logs \
  --eval-freq 20000 \
  --eval-episodes 10 \
  --gym-packages gymnasium_search_race \
  --env-kwargs "opponent_path:'rl-trained-agents/ppo/gymnasium_search_race-MadPodRacing-v0_1/best_model.zip'" \
  --conf-file hyperparams/ppo.yml \
  --progress

Enjoy a Trained Agent

To see a trained agent in action on random test cases, execute:

python -m rl_zoo3.enjoy \
  --algo ppo \
  --env gymnasium_search_race/SearchRace-v1 \
  --n-timesteps 1000 \
  --deterministic \
  --gym-packages gymnasium_search_race \
  --load-best \
  --progress

Run Test Cases

To run test cases with a trained agent, execute:

python -m scripts.run_test_cases \
  --path rl-trained-agents/ppo/gymnasium_search_race-SearchRace-v1_1/best_model.zip \
  --env gymnasium_search_race:gymnasium_search_race/SearchRace-v1 \
  --record-video \
  --record-metrics

Record a Video of a Trained Agent

To record a video of a trained agent on Mad Pod Racing, execute:

python -m scripts.record_video \
  --path rl-trained-agents/ppo/gymnasium_search_race-MadPodRacing-v0_1/best_model.zip \
  --env gymnasium_search_race:gymnasium_search_race/MadPodRacing-v0

For Mad Pod Racing Blocker, execute:

python -m scripts.record_video \
  --path rl-trained-agents/ppo/gymnasium_search_race-MadPodRacingBlocker-v0_1/best_model.zip \
  --opponent-path rl-trained-agents/ppo/gymnasium_search_race-MadPodRacing-v0_1/best_model.zip \
  --env gymnasium_search_race:gymnasium_search_race/MadPodRacingBlocker-v0

Tests

To run tests, execute:

pytest

Citing

To cite the repository in publications:

@misc{gymnasium-search-race,
  author = {Quentin Deschamps},
  title = {Gymnasium Search Race},
  year = {2024},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/Quentin18/gymnasium-search-race}},
}

Author

Quentin Deschamps
