A Python implementation of the classic reinforcement learning domain Pinball: http://irl.cs.brown.edu/pinball/

PynBall

Python implementation of the classic Pinball domain.

The goal is to navigate a physically modelled ball around a number of obstacles to reach a target. The ball bounces elastically off obstacles. The agent can apply small forces to the ball, accelerating it along the $x$ or $y$ axis.
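To illustrate what an elastic bounce means here, the following is a minimal sketch of the standard reflection formula $v' = v - 2(v \cdot n)n$ for a unit surface normal $n$. The function name `reflect` is hypothetical and this is not taken from the package's collision code, which may handle obstacles differently.

```python
import math


def reflect(velocity, normal):
    """Reflect a 2D velocity off a surface with the given normal.

    Hypothetical helper: an elastic bounce keeps the speed unchanged and
    flips the velocity component along the surface normal.
    """
    vx, vy = velocity
    nx, ny = normal
    length = math.hypot(nx, ny)
    nx, ny = nx / length, ny / length  # normalise so the formula holds
    dot = vx * nx + vy * ny
    return (vx - 2.0 * dot * nx, vy - 2.0 * dot * ny)


# A ball moving right and down hits a horizontal floor whose normal points up.
bounced = reflect((1.0, -2.0), (0.0, 1.0))
```

The speed before and after the bounce is the same; only the direction changes.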

Dynamics

The domain has a 4-dimensional continuous state space and a 1-dimensional discrete action space. Transition dynamics can be made stochastic or deterministic through configuration.

State space:

State is represented as the ball position and velocity: $(x, y, \dot{x}, \dot{y})$.

Action space:

There are five integer actions available to the agent in each state:

  • 0: Increase velocity in $x$,
  • 1: Increase velocity in $y$,
  • 2: Decrease velocity in $x$,
  • 3: Decrease velocity in $y$,
  • 4: No-Operation (configurable).
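The integer codes above could be given readable names with an enumeration. This is only a sketch: the names `Action` and `available_actions` are assumptions for illustration and are not part of the documented pynball_rl API. The `allow_noop` argument mirrors the configuration parameter of the same name.

```python
from enum import IntEnum


class Action(IntEnum):
    """Hypothetical names for the five integer actions listed above."""
    INCREASE_X = 0
    INCREASE_Y = 1
    DECREASE_X = 2
    DECREASE_Y = 3
    NOOP = 4


def available_actions(allow_noop=True):
    """Return the actions an agent may choose from (assumed behaviour)."""
    actions = list(Action)
    if not allow_noop:
        actions.remove(Action.NOOP)
    return actions
```

Because `IntEnum` members compare equal to plain integers, `Action.NOOP == 4` holds, so such names can be mixed with code that expects raw integer actions.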

Changes to velocity are stochastic, modelled as a normal distribution centred on the requested change, with a configurable standard deviation.
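The noisy update can be sketched as follows. Several details are assumptions made for illustration rather than facts about the package: the unit magnitude of each requested change, the `DELTAS` table, the `apply_action` function, and the choice to sample both axes on every step. The `stddev_x` and `stddev_y` arguments correspond to the configuration parameters of the same names.

```python
import random

# Hypothetical requested change in (x-velocity, y-velocity) for each action.
DELTAS = {
    0: (1.0, 0.0),   # increase velocity in x
    1: (0.0, 1.0),   # increase velocity in y
    2: (-1.0, 0.0),  # decrease velocity in x
    3: (0.0, -1.0),  # decrease velocity in y
    4: (0.0, 0.0),   # no-operation
}


def apply_action(velocity, action, stddev_x=0.0, stddev_y=0.0, rng=random):
    """Return the new velocity after a noisy change (illustrative sketch).

    The actual change on each axis is drawn from a normal distribution
    centred on the requested change. A standard deviation of 0.0 makes
    the update deterministic.
    """
    requested_dx, requested_dy = DELTAS[action]
    vx, vy = velocity
    actual_dx = rng.gauss(requested_dx, stddev_x)
    actual_dy = rng.gauss(requested_dy, stddev_y)
    return (vx + actual_dx, vy + actual_dy)
```

Passing a seeded `random.Random` instance as `rng` makes a stochastic run reproducible, which matches the purpose of a seed parameter.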

Rewards:

  • -1 for No-Operation action,
  • -5 for all other actions,
  • +10,000 for reaching the goal.
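The reward scheme above can be written as a small function. This is a sketch with hypothetical names (`reward`, `NOOP`, `reached_goal`). It assumes the goal reward replaces the action cost on the final step; the description above does not say whether the two are instead added together.

```python
NOOP = 4  # integer code of the no-operation action


def reward(action, reached_goal):
    """Reward for one step (illustrative; goal reward assumed to replace the cost)."""
    if reached_goal:
        return 10_000
    return -1 if action == NOOP else -5


# Return of a hypothetical episode: 100 thrust steps, 20 no-ops, then the goal.
episode_return = 100 * reward(0, False) + 20 * reward(NOOP, False) + reward(1, True)
```

Under these assumptions, every step before the goal carries a cost, so shorter paths that use fewer thrust actions earn a higher return.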

Have a go

To play interactively, run python -m pynball_rl and select a difficulty between 1 and 3.

Configurations

A number of configuration files are provided in pynball_rl.configs. Configuration parameters are:

  • seed: Seed for random number generator
  • step_duration: Number of dynamics calculations per step. A larger value will improve robustness but reduce FPS.
  • drag: Drag coefficient. The ball velocity is multiplied by this at the end of each step. Setting to 0.0 will effectively make the state space 2-dimensional $(x,y)$.
  • stddev_x: The standard deviation of the normal distribution from which the change in $x$-velocity is sampled. Set to 0.0 for deterministic dynamics.
  • stddev_y: The standard deviation of the normal distribution from which the change in $y$-velocity is sampled. Set to 0.0 for deterministic dynamics.
  • allow_noop: Whether to include the no-operation action in the action space.

Additionally, the ball start location and radius, the target location and radius, and the obstacle placements can be set through configuration.
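As a rough illustration of how these parameters fit together, the sketch below collects them in a dataclass and applies the drag rule described above. The class name `PinballConfig`, the function `apply_drag`, and all default values are assumptions; the format of the files in pynball_rl.configs is not described here and may differ.

```python
from dataclasses import dataclass


@dataclass
class PinballConfig:
    """Hypothetical container for the documented parameters (defaults are made up)."""
    seed: int = 0
    step_duration: int = 20
    drag: float = 0.995
    stddev_x: float = 0.0
    stddev_y: float = 0.0
    allow_noop: bool = True


def apply_drag(velocity, config):
    """Multiply the velocity by the drag coefficient, as done at the end of a step."""
    vx, vy = velocity
    return (vx * config.drag, vy * config.drag)


halved = apply_drag((2.0, -4.0), PinballConfig(drag=0.5))
stopped = apply_drag((2.0, -4.0), PinballConfig(drag=0.0))
```

With `drag=0.0` the velocity is reset to zero after every step, which is why only the position $(x, y)$ carries information in that setting.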

Acknowledgements

The pinball domain was introduced in:

G.D. Konidaris and A.G. Barto. Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining. Advances in Neural Information Processing Systems 22, December 2009.

This implementation is based on:
