PynBall
Python implementation of the classic Pinball domain.
The goal is to navigate a physically modelled ball around a number of obstacles to reach a target. The ball bounces elastically off obstacles. The agent can apply small forces to the ball, accelerating it along the $x$ or $y$ axis.
Dynamics
The domain has a 4-dimensional continuous state space and a 1-dimensional discrete action space. Transition dynamics are stochastic, with the amount of noise set through configuration.
State space:
State is represented as the ball's position and velocity: $(x, y, \dot{x}, \dot{y})$.
Action space:
There are five integer actions available to the agent in each state:
- 0: Increase velocity in $x$,
- 1: Increase velocity in $y$,
- 2: Decrease velocity in $x$,
- 3: Decrease velocity in $y$,
- 4: No-Operation (configurable).
Changes to velocity are stochastic, modelled as a normal distribution centred on the requested change, with a configurable standard deviation.
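As a minimal sketch of this sampling step (using the `stddev_x` and `stddev_y` parameters described under Configurations; this is illustrative, not the package's actual code):

```python
import random

def sample_velocity_change(requested_dx: float, requested_dy: float,
                           stddev_x: float, stddev_y: float) -> tuple[float, float]:
    """Sample the velocity change actually applied to the ball.

    The change is drawn from a normal distribution centred on the
    requested change; a standard deviation of 0.0 recovers
    deterministic dynamics. (Illustrative sketch only.)
    """
    dx = random.gauss(requested_dx, stddev_x)
    dy = random.gauss(requested_dy, stddev_y)
    return dx, dy
```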
Rewards:
- -1 for No-Operation action,
- -5 for all other actions,
- +10,000 for reaching the goal.
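This reward structure penalises force applications more heavily than coasting, so reaching the goal quickly with few actions is optimal. A hedged sketch of an episode loop follows; the `PynBall` class and its `reset`/`step` methods are assumed names for illustration and may differ from the package's actual API:

```python
import random

from pynball_rl import PynBall  # hypothetical import; check the package for the real API

env = PynBall()      # assumed constructor
state = env.reset()  # state is (x, y, x_dot, y_dot)

total_reward = 0.0
done = False
while not done:
    action = random.randint(0, 4)  # 0-3 apply forces, 4 is no-op (if enabled)
    state, reward, done = env.step(action)
    total_reward += reward         # -1 for no-op, -5 otherwise, +10,000 at the goal

print(f"Episode return: {total_reward}")
```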
Have a go
To play interactively, run `python -m pynball_rl` and select a difficulty between 1 and 3.
Configurations
A number of configuration files are provided in `pynball_rl.configs`. Configuration parameters are:
- `seed`: Seed for the random number generator.
- `step_duration`: Number of dynamics calculations per step. A larger value improves robustness but reduces FPS.
- `drag`: Drag coefficient. The ball velocity is multiplied by this at the end of each step. Setting it to 0.0 effectively makes the state space 2-dimensional, $(x, y)$.
- `stddev_x`: Standard deviation of the normal distribution from which the change in $x$-velocity is sampled. Set to 0.0 for deterministic dynamics.
- `stddev_y`: Standard deviation of the normal distribution from which the change in $y$-velocity is sampled. Set to 0.0 for deterministic dynamics.
- `allow_noop`: Whether to include the no-operation action in the action space.
Additionally, the ball's start location and radius, the target's location and radius, and obstacle placements can be set through configuration.
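For illustration, the documented parameters could be gathered as below. The key names for the ball, target, and obstacles are assumptions; check the bundled files in `pynball_rl.configs` for the real schema and file format:

```python
# Illustrative configuration using the documented parameters.
# The ball/target/obstacle keys are assumed, not taken from the package.
config = {
    "seed": 42,            # random number generator seed
    "step_duration": 20,   # dynamics calculations per step
    "drag": 0.995,         # velocity multiplier applied at the end of each step
    "stddev_x": 0.0,       # 0.0 => deterministic x-velocity changes
    "stddev_y": 0.0,       # 0.0 => deterministic y-velocity changes
    "allow_noop": True,    # include action 4 (no-operation)
    "ball": {"start": [0.2, 0.9], "radius": 0.02},       # assumed keys
    "target": {"location": [0.9, 0.2], "radius": 0.04},  # assumed keys
}
```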
Acknowledgements
The pinball domain was introduced in:
G.D. Konidaris and A.G. Barto. Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining. Advances in Neural Information Processing Systems 22, December 2009.
This implementation is based on:
- Original Java implementation at http://irl.cs.brown.edu/pinball/
- Python 2 implementation at https://github.com/amarack/python-rl