
A Python implementation of the classic reinforcement learning domain Pinball: http://irl.cs.brown.edu/pinball/


PynBall

Python implementation of the classic Pinball domain.

The goal is to navigate a physically modelled ball around a number of obstacles to reach a target. The ball bounces elastically off obstacles. The agent can apply small forces to the ball, accelerating it along the $x$ or $y$ axis.

Dynamics

The domain has a 4-dimensional continuous state space and a 1-dimensional discrete action space. Transition dynamics can be made stochastic or deterministic through configuration.

State space:

State is represented as the ball position and velocity: $(x, y, \dot{x}, \dot{y})$.

Action space:

There are five integer actions available to the agent in each state:

  • 0: Increase velocity in $x$,
  • 1: Increase velocity in $y$,
  • 2: Decrease velocity in $x$,
  • 3: Decrease velocity in $y$,
  • 4: No-Operation (configurable).

Changes to velocity are stochastic, modelled as a normal distribution centered on the requested change, with a configurable standard deviation.
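As a rough illustration of the action semantics above, the sketch below maps an integer action to a noisy velocity change. It is not taken from the pynball_rl source: the names `ACTION_DELTAS` and `apply_action` and the `impulse` magnitude are hypothetical. It also assumes that noise is sampled on both axes for every action, which the description does not specify. Only the action numbering and the normal noise model come from the text.

```python
import random

# Hypothetical lookup: action index -> direction of the requested (x, y) velocity change.
ACTION_DELTAS = {
    0: (1.0, 0.0),   # increase velocity in x
    1: (0.0, 1.0),   # increase velocity in y
    2: (-1.0, 0.0),  # decrease velocity in x
    3: (0.0, -1.0),  # decrease velocity in y
    4: (0.0, 0.0),   # no-operation
}


def apply_action(velocity, action, impulse=0.2, stddev_x=0.0, stddev_y=0.0, rng=None):
    """Return the new (x, y) velocity after one action.

    `impulse` is an assumed magnitude for the requested change; the real
    value used by the package is not stated in this description.
    """
    rng = rng or random.Random()
    dir_x, dir_y = ACTION_DELTAS[action]
    # Sample each change from a normal distribution centered on the request.
    # A standard deviation of 0.0 reproduces the request exactly.
    change_x = rng.gauss(dir_x * impulse, stddev_x)
    change_y = rng.gauss(dir_y * impulse, stddev_y)
    return (velocity[0] + change_x, velocity[1] + change_y)
```

Passing a seeded `random.Random` instance as `rng` makes a noisy rollout repeatable, which mirrors the role of the seed configuration parameter.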

Rewards:

  • -1 for No-Operation action,
  • -5 for all other actions,
  • +10,000 for reaching the goal.
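The reward scheme can be written as a small function. This is a sketch, not the package's implementation: `NOOP_ACTION` and `reward` are hypothetical names, and the sketch assumes the goal reward replaces the per-action cost on the final step rather than being added to it, which the list above does not specify.

```python
NOOP_ACTION = 4  # action index of No-Operation, per the action list


def reward(action, reached_goal):
    """Reward for one step, following the three cases listed above."""
    if reached_goal:
        # Assumption: the goal reward replaces the action cost on this step.
        return 10_000
    return -1 if action == NOOP_ACTION else -5
```

Because every non-goal step has a negative reward, an agent maximising return is pushed towards reaching the target in few steps and towards coasting where that is enough.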

Have a go

To play interactively, run python -m pynball_rl and select a difficulty between 1 and 3.

Configurations

A number of configuration files are provided in pynball_rl.configs. Configuration parameters are:

  • seed: Seed for random number generator
  • step_duration: Number of dynamics calculations per step. A larger value will improve robustness but reduce FPS.
  • drag: Drag coefficient. The ball velocity is multiplied by this at the end of each step. Setting to 0.0 will effectively make the state space 2-dimensional $(x,y)$.
  • stddev_x: The standard deviation of the normal distribution from which the change in $x$-velocity is sampled. Set to 0.0 for deterministic dynamics.
  • stddev_y: The standard deviation of the normal distribution from which the change in $y$-velocity is sampled. Set to 0.0 for deterministic dynamics.
  • allow_noop: Whether to include the no-operation action in the action space.

Additionally, the ball start location and radius, the target location and radius, and the obstacle placements can be set through configuration.
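To make the parameters concrete, the sketch below collects them in a Python dataclass and shows how two of them affect the domain. The on-disk format of the files in pynball_rl.configs is not shown here, so the class `PinballConfig`, the helpers `num_actions` and `apply_drag`, and all default values are hypothetical. Only the parameter names and their described meanings come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PinballConfig:
    """Hypothetical container for the documented parameters (defaults are made up)."""
    seed: int = 0
    step_duration: int = 20
    drag: float = 0.995
    stddev_x: float = 0.0   # 0.0 gives deterministic x-velocity changes
    stddev_y: float = 0.0   # 0.0 gives deterministic y-velocity changes
    allow_noop: bool = True


def num_actions(config):
    """Five actions when the no-operation action is allowed, otherwise four."""
    return 5 if config.allow_noop else 4


def apply_drag(velocity, config):
    """Multiply the velocity by the drag coefficient, as done at the end of each step."""
    return (velocity[0] * config.drag, velocity[1] * config.drag)
```

With `drag=0.0`, `apply_drag` zeroes the velocity after every step, which is why that setting effectively reduces the state to the position $(x, y)$.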

Acknowledgements

The pinball domain was introduced in:

G.D. Konidaris and A.G. Barto. Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining. Advances in Neural Information Processing Systems 22, December 2009.

This implementation is based on:
