A reinforcement learning environment for HiFight's Footsies game
FootsiesGym
Implementation of HiFight's Footsies game as a reinforcement learning environment. This environment serves as a benchmark for multi-agent reinforcement learning in a (relatively) complex two-player zero-sum game.
The environment is derived from the open-source Unity implementation, which has been augmented to run a gRPC server that can be controlled through a Python harness. Training is implemented using Ray's RLlib.
System Architecture
sequenceDiagram
participant RLlib as Ray RLlib
participant Env as FootsiesEnv
participant gRPC as gRPC Client
participant Server as Unity Game Server
participant Game as Footsies Game
Note over RLlib,Env: Python Environment
Note over gRPC: Communication Layer
Note over Server,Game: Unity Game
RLlib->>Env: step(action)
Env->>gRPC: SendAction(action)
gRPC->>Server: gRPC Request
Server->>Game: Update Game State
Game->>Server: Game State
Server->>gRPC: gRPC Response
gRPC->>Env: Game State
Env->>RLlib: (obs., rews., terms., truncs., infos)
Note over RLlib,Game: Training Loop
The diagram above shows how the different components interact during training:
- RLlib sends actions to the FootsiesEnv
- The environment converts these actions into gRPC requests
- The Unity Game Server processes the actions and updates the game state
- The game state is sent back through gRPC to the environment
- The environment processes the observation and returns it to RLlib
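The loop above can be sketched in a few lines of Python. This is a minimal illustration, not the actual FootsiesEnv: `GameState`, `StubGameClient`, `SketchEnv`, the `"p2"` agent id, and the observation contents are all hypothetical stand-ins, and the stub replaces the real gRPC round trip to the Unity server. Only the five-element return value and the `"__all__"` key follow RLlib's multi-agent convention.

```python
from dataclasses import dataclass
from typing import Any, Dict, Tuple


@dataclass
class GameState:
    """Hypothetical stand-in for the state message returned by the game server."""
    frame: int
    round_over: bool


class StubGameClient:
    """Replaces the gRPC client; a real client would send a request to the Unity server."""

    def __init__(self, round_length: int = 3) -> None:
        self.frame = 0
        self.round_length = round_length

    def send_action(self, actions: Dict[str, int]) -> GameState:
        # Pretend the game advanced one frame in response to the actions.
        self.frame += 1
        return GameState(frame=self.frame, round_over=self.frame >= self.round_length)


class SketchEnv:
    """Converts actions into client calls and game states into RLlib-style dicts."""

    AGENTS = ("p1", "p2")  # assumed agent ids

    def __init__(self, client: StubGameClient) -> None:
        self.client = client

    def step(self, actions: Dict[str, int]) -> Tuple[Dict[str, Any], ...]:
        state = self.client.send_action(actions)
        obs = {agent: [state.frame] for agent in self.AGENTS}  # placeholder observation
        rews = {agent: 0.0 for agent in self.AGENTS}           # placeholder reward
        terms = {agent: state.round_over for agent in self.AGENTS}
        terms["__all__"] = state.round_over
        truncs = {agent: False for agent in self.AGENTS}
        truncs["__all__"] = False
        infos = {agent: {} for agent in self.AGENTS}
        return obs, rews, terms, truncs, infos
```

Swapping `StubGameClient` for a client that talks to a running game server is the only change the loop itself would need under these assumptions.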
Installation
conda create -n footsiesgym python=3.10
conda activate footsiesgym
pip install -r requirements.txt
On a Mac, you may need to ensure you have cmake installed. You can install it using Homebrew:
brew install cmake
Training
Game Servers
If you are on a Linux system, run setup.sh to unpack the binaries, then skip to the training procedure. Otherwise, follow the steps below.
Before training, you'll need to launch the headless game servers. Scripts are provided to do so in scripts/start_local_{mac, linux}_servers.sh, but you must first unpack the included binaries into the binaries/ directory (the launch scripts assume this location). Important: if you are launching game servers manually, be sure to set launch_binaries to False in the environment configuration.
./scripts/start_local_{mac, linux}_servers.sh <num-train-servers> <num-eval-servers>
The two arguments correspond to num_env_runners and evaluation_num_env_runners, which can be specified in the experiment configuration. You must launch a corresponding number of servers for each. If you are running local debugging (see below; python -m experiments.train --debug), just launch one of each. If you're launching a full experiment, you'll need to match the number specified in the experiment configuration (defaults to 40 training and 5 evaluation env runners).
The scripts will start:
- Training servers from port 50051 (incrementing for each server)
- Evaluation servers from port 40051 (incrementing for each server)
Importantly, each environment runner is mapped to a single port, which means that you can only run a single environment per environment runner.
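The port layout can be made concrete with a small helper. This is a sketch under stated assumptions: the function names are hypothetical, and the zero-based runner index is an assumption about how runners are numbered (the project's own mapping may start from a different index). Only the two base ports come from the text above.

```python
from typing import Dict, List

TRAIN_BASE_PORT = 50051  # first training server port (from the text above)
EVAL_BASE_PORT = 40051   # first evaluation server port (from the text above)


def server_port(runner_index: int, evaluation: bool = False) -> int:
    """Return the port for a runner, assuming a zero-based runner index."""
    if runner_index < 0:
        raise ValueError("runner_index must be non-negative")
    base = EVAL_BASE_PORT if evaluation else TRAIN_BASE_PORT
    return base + runner_index


def port_plan(num_train: int, num_eval: int) -> Dict[str, List[int]]:
    """List every port that the given number of servers would occupy."""
    return {
        "train": [server_port(i) for i in range(num_train)],
        "eval": [server_port(i, evaluation=True) for i in range(num_eval)],
    }
```

With the default 40 training and 5 evaluation env runners, `port_plan(40, 5)` covers ports 50051 through 50090 and 40051 through 40055, so the two ranges do not overlap.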
Training Configuration
The default training configuration uses the APPO algorithm (see the corresponding IMPACT paper). The policy is a vanilla LSTM network with parameters described in the respective experiment files.
Training can utilize either the new RLModule stack or old-stack in RLlib. Some functionality has yet to be implemented in the new stack (see open issues).
Old Stack
python -m experiments.train --experiment-name <experiment-name>
New Stack
python -m experiments.train_rlmodule --experiment-name <experiment-name>
Add the --debug flag to use only a single env runner (and single evaluation env runner) and local mode. This will enable breakpoint usage for local debugging.
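To illustrate what the --debug flag changes, the sketch below parses the two flags mentioned above and derives the runner settings. It is not the repository's actual argument parser: `build_parser`, `runner_settings`, and the `local_mode` key are hypothetical names, while the defaults of 40 training and 5 evaluation env runners are taken from the Game Servers section.

```python
import argparse
from typing import Any, Dict

DEFAULT_TRAIN_RUNNERS = 40  # default stated in the Game Servers section
DEFAULT_EVAL_RUNNERS = 5    # default stated in the Game Servers section


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the two flags described in this README."""
    parser = argparse.ArgumentParser(description="Illustrative training CLI")
    parser.add_argument("--experiment-name", required=True)
    parser.add_argument("--debug", action="store_true")
    return parser


def runner_settings(args: argparse.Namespace) -> Dict[str, Any]:
    """Derive runner counts; debug mode uses one runner of each kind."""
    if args.debug:
        return {
            "num_env_runners": 1,
            "evaluation_num_env_runners": 1,
            "local_mode": True,  # hypothetical key name for local mode
        }
    return {
        "num_env_runners": DEFAULT_TRAIN_RUNNERS,
        "evaluation_num_env_runners": DEFAULT_EVAL_RUNNERS,
        "local_mode": False,
    }
```

In debug mode a single training server and a single evaluation server are therefore enough.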
Visualizing a Policy
To visualize gameplay:
- Unpack the windowed build binaries of your choice (Mac or Linux).
- Add the trained policy specification to the ModuleRepository in components/module_repository.py:
FootsiesModuleSpec(
module_name="<policy-nickname>",
experiment_name="<experiment-name>",
trial_id="<trial-id>", # specify if experiment has multiple trials
checkpoint_number=-1, # -1 for latest, otherwise specify checkpoint number
)
- Run the game with:
./footsies_linux_windowed_021725 --port 80051
- Configure policies in scripts/local_inference.py using the MODULES variable. Set "p1" to "human" to play against the AI (must install pygame).
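The "-1 for latest" convention for checkpoint_number can be sketched as follows. `ModuleSpecSketch` only mirrors the fields shown in the specification above and is not the project's FootsiesModuleSpec; `resolve_checkpoint` is a hypothetical helper, and the dictionary shape used for `MODULES` (including the `"p2"` key) is an assumption about how scripts/local_inference.py is configured.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModuleSpecSketch:
    """Illustrative stand-in mirroring the fields of the specification above."""
    module_name: str
    experiment_name: str
    trial_id: Optional[str] = None  # only needed when an experiment has several trials
    checkpoint_number: int = -1     # -1 selects the latest checkpoint


def resolve_checkpoint(spec: ModuleSpecSketch, available: Sequence[int]) -> int:
    """Pick the checkpoint to load from the checkpoint numbers found on disk."""
    if not available:
        raise ValueError(f"no checkpoints found for {spec.experiment_name!r}")
    if spec.checkpoint_number == -1:
        return max(available)
    if spec.checkpoint_number not in available:
        raise ValueError(f"checkpoint {spec.checkpoint_number} does not exist")
    return spec.checkpoint_number


# Assumed shape: one entry per player, where "human" means keyboard control.
MODULES = {"p1": "human", "p2": "my-policy"}
```

Failing loudly on a missing checkpoint is a design choice of this sketch; it avoids silently visualizing a different policy than the one requested.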
Project Architecture
Core Components
- Environment (footsies/): The main game environment implementation that interfaces with the Unity game through gRPC.
- Models (models/): Neural network architectures for the RL agents
- Experiments (experiments/): Training configurations and experiment management
- Callbacks (callbacks/): Custom RLlib callbacks for monitoring and evaluation
- Components (components/): Reusable components like the module repository for policy management
- Utils (utils/): Utility functions and helper classes
- Scripts (scripts/): Helper scripts for server management and visualization
Key Features
- Multi-agent reinforcement learning environment
- gRPC-based communication with Unity game server
- Support for both headless and windowed game modes
- Integration with Ray RLlib for distributed training
- Custom LSTM-based policy networks
- Support for self-play training
- Evaluation against baseline policies (random, noop, back)
- Wandb integration for experiment tracking
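The three baseline policies named above (random, noop, back) are simple enough to sketch directly. The action encoding here is entirely assumed: `NUM_ACTIONS`, `NOOP`, and `BACK` are hypothetical values chosen for illustration, and the real environment's action space may differ.

```python
import random
from typing import Any, Callable, Dict

NUM_ACTIONS = 4  # assumption: size of a discrete action space
NOOP = 0         # assumption: index of the "do nothing" action
BACK = 1         # assumption: index of the "move backward" action

Policy = Callable[[Any, random.Random], int]


def random_policy(obs: Any, rng: random.Random) -> int:
    """Ignore the observation and sample a uniformly random action."""
    return rng.randrange(NUM_ACTIONS)


def noop_policy(obs: Any, rng: random.Random) -> int:
    """Always stand still."""
    return NOOP


def back_policy(obs: Any, rng: random.Random) -> int:
    """Always retreat from the opponent."""
    return BACK


BASELINES: Dict[str, Policy] = {
    "random": random_policy,
    "noop": noop_policy,
    "back": back_policy,
}
```

Fixed opponents like these give a stable reference point for evaluation, since their behavior does not change while the learned policy improves.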
Development
gRPC / Protobuf Updates
If updating the proto definitions:
- Generate C# files (Windows):
.\protoc\bin\protoc.exe --csharp_out=.\env\game\proto\ --grpc_out=.\env\game\proto\ --plugin=protoc-gen-grpc=.\plugins\grpc_csharp_plugin.exe .\env\game\proto\footsies_service.proto
- Generate Python files:
python -m grpc_tools.protoc -I. --python_out=. --grpc_python_out=. .\env\game\proto\footsies_service.proto
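The Python command above uses a Windows-style path. As a hedged alternative for Linux or macOS, the sketch below builds the same invocation with a forward-slash path. `protoc_command` is a hypothetical helper, and the relative path assumes the command is run from the same directory as the original command.

```python
import sys
from pathlib import PurePosixPath
from typing import List

# Same proto file as in the command above, written with forward slashes.
PROTO_FILE = PurePosixPath("env/game/proto/footsies_service.proto")


def protoc_command(proto_file: PurePosixPath = PROTO_FILE, out_dir: str = ".") -> List[str]:
    """Build the grpc_tools.protoc invocation as an argument list."""
    return [
        sys.executable,
        "-m",
        "grpc_tools.protoc",
        "-I.",
        f"--python_out={out_dir}",
        f"--grpc_python_out={out_dir}",
        str(proto_file),
    ]
```

The list can be passed to `subprocess.run(protoc_command(), check=True)`, which requires the grpcio-tools package in the active environment.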
Project Structure
FootsiesGym/
├── binaries/ # Game server binaries
├── callbacks/ # RLlib callbacks
├── components/ # Reusable components
├── experiments/ # Training configurations
├── footsies/ # Core environment
├── models/ # Neural network architectures
├── protoc/ # Protocol buffer tools
├── scripts/ # Helper scripts
├── testing/ # Test files
└── utils/ # Utility functions
Contributing
- Install pre-commit hooks to maintain code quality
- Follow the existing code style and architecture
- Add tests for new features
- Update documentation as needed
License
This project is based on the open-source Footsies game by HiFight. Please refer to the original game's license for more information.