A Python implementation of the classic reinforcement learning domain Pinball: http://irl.cs.brown.edu/pinball/

PynBall

Python implementation of the classic Pinball domain.

The goal is to navigate a physically modelled ball around a number of obstacles to reach a target. The ball will bounce elastically off obstacles. The agent can apply small forces to the ball, accelerating it in the $x$ or $y$ axes.
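As an illustration of what an elastic bounce means here, the sketch below reflects a velocity vector about an obstacle's surface normal using $v' = v - 2(v \cdot n)n$, which preserves speed. The function name `reflect` and its signature are hypothetical and are not taken from the package; the actual collision handling may differ.

```python
import math


def reflect(vx, vy, nx, ny):
    """Reflect velocity (vx, vy) off a surface with normal (nx, ny).

    Hypothetical helper for illustration; not part of pynball_rl.
    """
    norm = math.hypot(nx, ny)
    if norm == 0.0:
        raise ValueError("surface normal must be non-zero")
    # Normalise so the formula v' = v - 2 (v . n) n applies.
    nx, ny = nx / norm, ny / norm
    dot = vx * nx + vy * ny
    return vx - 2.0 * dot * nx, vy - 2.0 * dot * ny


# A ball moving right and down hits a horizontal surface whose normal points up.
print(reflect(1.0, -2.0, 0.0, 1.0))  # (1.0, 2.0)
```

Only the velocity component along the normal changes sign; the tangential component is untouched, so the ball's speed is the same before and after the bounce.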

Dynamics

The domain has a 4-dimensional continuous state space and a discrete action space of up to five actions. Transition dynamics can be made stochastic or deterministic through configuration.

State space:

State is represented as the ball position and velocity: $(x, y, \dot{x}, \dot{y})$.

Action space:

There are five integer actions available to the agent in each state:

  • 0: Increase velocity in $x$,
  • 1: Increase velocity in $y$,
  • 2: Decrease velocity in $x$,
  • 3: Decrease velocity in $y$,
  • 4: No-Operation (only available when allow_noop is enabled).

Changes to velocity are stochastic: each change is sampled from a normal distribution centred on the requested change, with a configurable standard deviation.
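The action semantics above can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's code: the names `ACTION_DELTAS` and `apply_action`, the `force` magnitude of 0.2, and the choice to sample noise on both axes for every action are all hypothetical. Only the action numbering and the roles of the two standard deviations come from the description above.

```python
import random

# Hypothetical mapping from action index to the requested (x, y) direction.
ACTION_DELTAS = {
    0: (1.0, 0.0),   # increase velocity in x
    1: (0.0, 1.0),   # increase velocity in y
    2: (-1.0, 0.0),  # decrease velocity in x
    3: (0.0, -1.0),  # decrease velocity in y
    4: (0.0, 0.0),   # no-operation
}


def apply_action(state, action, stddev_x=0.0, stddev_y=0.0, force=0.2, rng=random):
    """Return a new (x, y, vx, vy) after applying a noisy velocity change.

    `force` is an assumed magnitude; the real value is not given in this README.
    """
    x, y, vx, vy = state
    dx, dy = ACTION_DELTAS[action]
    # Each change is drawn from a normal distribution centred on the request.
    vx += rng.gauss(dx * force, stddev_x)
    vy += rng.gauss(dy * force, stddev_y)
    return (x, y, vx, vy)


# With both standard deviations at 0.0 the update is deterministic.
print(apply_action((0.5, 0.5, 0.0, 0.0), 0))
```

Passing a seeded `random.Random` instance as `rng` gives reproducible noise, which mirrors the role of the seed configuration parameter.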

Rewards:

  • -1 for No-Operation action,
  • -5 for all other actions,
  • +10,000 for reaching the goal.
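The reward scheme listed above can be written as a small function. The name `reward`, the constant `NOOP`, and the assumption that the goal reward replaces (rather than adds to) the per-action cost on the final step are hypothetical; the README does not say how the two combine.

```python
NOOP = 4  # index of the no-operation action, as listed under "Action space"


def reward(action, reached_goal):
    """Reward for one step, following the values listed above.

    Assumption: on reaching the goal, only the goal reward is returned.
    """
    if reached_goal:
        return 10_000
    return -1 if action == NOOP else -5


print(reward(0, False), reward(NOOP, False), reward(2, True))  # -5 -1 10000
```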

Have a go

To play interactively, run python -m pynball_rl and select a difficulty between 1 and 3.

Configurations

A number of configuration files are provided in pynball_rl.configs. Configuration parameters are:

  • seed: Seed for random number generator
  • step_duration: Number of dynamics calculations per step. A larger value will improve robustness but reduce FPS.
  • drag: Drag coefficient. The ball velocity is multiplied by this at the end of each step. Setting to 0.0 will effectively make the state space 2-dimensional $(x,y)$.
  • stddev_x: The standard deviation of the normal distribution from which the change in $x$-velocity is sampled. Set to 0.0 for deterministic dynamics.
  • stddev_y: The standard deviation of the normal distribution from which the change in $y$-velocity is sampled. Set to 0.0 for deterministic dynamics.
  • allow_noop: Whether to include the no-operation action in the action space.

Additionally, the ball start location and radius, the target location and radius, and the obstacle placements can be set through configuration.
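To make the interaction between step_duration and drag concrete, here is a rough sketch of one environment step without obstacles. The parameter names match the list above, but the `Config` class, its default values, the `step` function, and the choice of splitting a step into `step_duration` equal sub-steps are assumptions for illustration; the format of the files in pynball_rl.configs is not described here.

```python
from dataclasses import dataclass


@dataclass
class Config:
    """Hypothetical container; field names follow the README, values are made up."""
    seed: int = 0
    step_duration: int = 20
    drag: float = 0.995
    stddev_x: float = 0.0
    stddev_y: float = 0.0
    allow_noop: bool = True


def step(state, dv, cfg):
    """Advance (x, y, vx, vy) by one step given a velocity change dv = (dvx, dvy)."""
    x, y, vx, vy = state
    vx += dv[0]
    vy += dv[1]
    # Assumed: the step is integrated in step_duration equal sub-steps.
    dt = 1.0 / cfg.step_duration
    for _ in range(cfg.step_duration):
        x += vx * dt
        y += vy * dt
    # Velocity is multiplied by the drag coefficient at the end of the step.
    vx *= cfg.drag
    vy *= cfg.drag
    return (x, y, vx, vy)


# With drag=0.0 the velocity is wiped every step, so only (x, y) carries information.
print(step((0.0, 0.0, 0.0, 0.0), (1.0, 0.0), Config(drag=0.0, step_duration=4)))
```

In a full implementation the sub-step loop is where collisions with obstacles would be checked, which is why a larger step_duration improves robustness at the cost of speed.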

Acknowledgements

The pinball domain was introduced in:

G.D. Konidaris and A.G. Barto. Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining. Advances in Neural Information Processing Systems 22, December 2009.

This implementation is based on:
