RL environments and learning code for traffic signal control in SUMO.
SUMO-RL
SUMO-RL provides a simple interface to instantiate Reinforcement Learning environments with SUMO for Traffic Signal Control.
The main class SumoEnvironment behaves like a MultiAgentEnv from RLlib.
If instantiated with the parameter 'single_agent=True', it behaves like a regular Gym Env from OpenAI.
Call env or parallel_env for PettingZoo environment support.
TrafficSignal is responsible for retrieving information and actuating the traffic lights using the TraCI API.
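For example, a minimal single-agent rollout could look like the sketch below (the file names are placeholders, and the classic Gym reset/step interface is assumed):
from sumo_rl import SumoEnvironment

# Minimal single-agent rollout (net/route file paths are placeholders)
env = SumoEnvironment(net_file='my-intersection.net.xml',
                      route_file='my-intersection.rou.xml',
                      single_agent=True,
                      use_gui=False,
                      num_seconds=3600)

obs = env.reset()
done = False
while not done:
    action = env.action_space.sample()  # replace with your trained policy
    obs, reward, done, info = env.step(action)
env.close()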
Goals of this repository:
- Provide a simple interface to work with Reinforcement Learning for Traffic Signal Control using SUMO
- Support Multiagent RL
- Compatibility with gym.Env and popular RL libraries such as stable-baselines3 and RLlib
- Easy customisation: state and reward definitions are easily modifiable
Install
Install the latest version of SUMO:
sudo add-apt-repository ppa:sumo/stable
sudo apt-get update
sudo apt-get install sumo sumo-tools sumo-doc
Don't forget to set the SUMO_HOME variable (the default SUMO installation path is /usr/share/sumo):
echo 'export SUMO_HOME="/usr/share/sumo"' >> ~/.bashrc
source ~/.bashrc
Important: for a huge performance boost (~8x) with Libsumo, you can declare the variable:
export LIBSUMO_AS_TRACI=1
Notice that you will not be able to run with sumo-gui or run multiple simulations in parallel if this variable is set.
Install SUMO-RL
The stable release version is available through pip:
pip install sumo-rl
Alternatively, you can install the latest (unreleased) version from source:
git clone https://github.com/LucasAlegre/sumo-rl
cd sumo-rl
pip install -e .
MDP - Observations, Actions and Rewards
Observation
The default observation for each traffic signal agent is a vector:
obs = [phase_one_hot, min_green, lane_1_density,...,lane_n_density, lane_1_queue,...,lane_n_queue]
- phase_one_hot is a one-hot encoded vector indicating the current active green phase
- min_green is a binary variable indicating whether min_green seconds have already passed in the current phase
- lane_i_density is the number of vehicles in incoming lane i divided by the total capacity of the lane
- lane_i_queue is the number of queued (speed below 0.1 m/s) vehicles in incoming lane i divided by the total capacity of the lane
You can define your own observation by changing the method 'compute_observation' of TrafficSignal.
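As a rough sketch, a custom observation could be written as follows (attribute and helper names such as green_phase, num_green_phases, get_lanes_density and get_lanes_queue are assumptions based on the default observation and may differ between versions):
import numpy as np
from sumo_rl.environment.traffic_signal import TrafficSignal

class MyTrafficSignal(TrafficSignal):
    def compute_observation(self):
        # One-hot encoding of the current active green phase
        phase_one_hot = [1 if self.green_phase == i else 0 for i in range(self.num_green_phases)]
        # Densities and queues of the incoming lanes (helper names are assumptions)
        density = self.get_lanes_density()
        queue = self.get_lanes_queue()
        return np.array(phase_one_hot + density + queue, dtype=np.float32)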
Actions
The action space is discrete. Every 'delta_time' seconds, each traffic signal agent can choose the next green phase configuration.
E.g.: in the 2-way single intersection there are |A| = 4 discrete actions, each corresponding to a different green phase configuration.
Important: every time a phase change occurs, the next phase is preceded by a yellow phase lasting yellow_time seconds.
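The timing parameters above are set when the environment is instantiated; a sketch (argument names follow the text, and the specific values are only illustrative):
from sumo_rl import SumoEnvironment

env = SumoEnvironment(net_file='my-intersection.net.xml',
                      route_file='my-intersection.rou.xml',
                      single_agent=True,
                      delta_time=5,    # choose a new green phase every 5 simulation seconds
                      yellow_time=2,   # yellow phase inserted before every phase change
                      min_green=5)     # minimum green time before a switch is allowed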
Rewards
The default reward function is the change in cumulative vehicle delay: the reward is how much the total delay (sum of the waiting times of all approaching vehicles) changed in relation to the previous time-step.
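In symbols, a sketch with D_t denoting the sum of the accumulated waiting times of all approaching vehicles at time-step t:
r_t = D_{t-1} - D_t
so the agent receives a positive reward whenever the total delay decreases.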
You can define your own reward function by changing the method 'compute_reward' of TrafficSignal.
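As a rough sketch, a reward that penalizes queue length could look like the following (the get_lanes_queue helper name is an assumption and may differ between versions):
from sumo_rl.environment.traffic_signal import TrafficSignal

class QueueRewardSignal(TrafficSignal):
    def compute_reward(self):
        # Penalize the total (normalized) queue on the incoming lanes
        return -sum(self.get_lanes_queue())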
Examples
PettingZoo API
import sumo_rl

env = sumo_rl.env(net_file='sumo_net_file.net.xml',
                  route_file='sumo_route_file.rou.xml',
                  use_gui=True,
                  num_seconds=3600)
env.reset()
for agent in env.agent_iter():
    observation, reward, done, info = env.last()
    action = policy(observation) if not done else None  # step with None once the agent is done
    env.step(action)
env.close()
RESCO Benchmarks
In the folder nets/RESCO you can find the network and route files from RESCO (Reinforcement Learning Benchmarks for Traffic Signal Control), which was built on top of SUMO-RL. See their paper for results.
Experiments
Check the experiments folder to see how to instantiate an environment and use it with your RL algorithm.
Q-learning in a one-way single intersection:
python3 experiments/ql_single-intersection.py
RLlib A3C multiagent in a 4x4 grid:
python3 experiments/a3c_4x4grid.py
stable-baselines3 DQN in a 2-way single intersection:
python3 experiments/dqn_2way-single-intersection.py
Plotting results:
python3 outputs/plot.py -f outputs/2way-single-intersection/a3c
Citation
If you use this repository in your research, please cite:
@misc{sumorl,
  author = {Lucas N. Alegre},
  title = {{SUMO-RL}},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/LucasAlegre/sumo-rl}},
}