Standalone reusable-booster landing environment for reinforcement learning.
Platform Lander
A standalone reusable-booster landing environment based on Gymnasium LunarLander v3 physics, but without importing Gymnasium. The task is to land a SpaceX-style booster upright on a moving floating platform. Missing the platform and falling into the ocean, or contacting the platform in a non-vertical position, terminates the episode as failure.
Install
After the package has been published to PyPI:
pip install platform_lander
Before the PyPI release is available, install the same package directly from the book repository subdirectory:
pip install "platform_lander @ git+https://github.com/aburkov/theDRLbook.git#subdirectory=test_environments/platform_lander"
For local development from this folder:
pip install -e .
Google Colab
Use the same install command in the first notebook cell. Colab usually needs swig before Box2D builds:
!apt-get -qq install swig
!pip install -q platform_lander
Then import normally:
from platform_lander import PlatformLander
env = PlatformLander(render_mode="rgb_array", enable_wind=True, wind_power=5.0)
obs, info = env.reset(seed=0)
obs, reward, terminated, truncated, info = env.step(2)
frame = env.render()
Display a rendered frame in Colab:
import matplotlib.pyplot as plt
plt.imshow(frame)
plt.axis("off")
plt.show()
Local Script
To watch the booster in a local Pygame window, install the package in editable mode and run the demo:
pip install -e .
python examples/demo.py
The test file is headless, so running pytest or python tests/test_platform_lander.py
will not open an animation window.
To train a discrete policy with the textbook single-trajectory REINFORCE algorithm and then show three animated runs:
pip install -e ".[train]"
python vanilla_reinforce.py
The repository also includes incremental REINFORCE variants:
python rtg_reinforce.py # vanilla + per-timestep reward-to-go
python average_reinforcement_baseline_reinforce.py # reward-to-go + running scalar RTG baseline
python value_function_baseline_reinforce.py # reward-to-go + learned value-function baseline
python batch_reinforce.py # vanilla + trajectory batches
python full_reinforce.py # batches + reward-to-go + selectable scalar baseline
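The variants above differ mainly in which return signal weights each log-probability term of the policy gradient. As a rough illustration of that difference, and not the repository's own code, the sketch below computes a whole-trajectory return, a per-timestep reward-to-go, and a baseline-adjusted signal in plain Python. The function names, the `gamma` discount parameter, and the use of a simple mean as the scalar baseline are assumptions made for this example; the training scripts may compute these quantities differently.

```python
from typing import List, Sequence


def trajectory_return(rewards: Sequence[float], gamma: float = 1.0) -> float:
    """Discounted return of the whole trajectory (one weight shared by every step)."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def rewards_to_go(rewards: Sequence[float], gamma: float = 1.0) -> List[float]:
    """Per-timestep discounted sum of the rewards from step t to the end."""
    result = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        result[t] = running
    return result


def subtract_baseline(signals: Sequence[float], baseline: float) -> List[float]:
    """Shift every signal by a scalar baseline (hypothetical helper)."""
    return [signal - baseline for signal in signals]


rewards = [1.0, 0.0, 2.0]

# Vanilla REINFORCE style: every step is weighted by the full return.
vanilla_weights = [trajectory_return(rewards)] * len(rewards)

# Reward-to-go style: each step is weighted only by rewards that follow it.
rtg_weights = rewards_to_go(rewards)

# Scalar-baseline style: center the reward-to-go values (mean used as an assumption).
baseline = sum(rtg_weights) / len(rtg_weights)
centered_weights = subtract_baseline(rtg_weights, baseline)

print(vanilla_weights)
print(rtg_weights)
print(centered_weights)
```

With these toy rewards the vanilla weights are identical at every step, while the reward-to-go weights shrink once the early reward is behind the agent.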
Each training script writes a log, per-episode CSV data, and a checkpoint under
runs/ by default, for example runs/full_reinforce.log,
runs/full_reinforce.csv, and runs/full_reinforce.pt. Override those paths
with --log-file, --csv-file, and --model-file.
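The column layout of the per-episode CSV files is not documented here, so a cautious first step is to inspect a file before plotting or post-processing it. The following sketch uses only the standard library to report the header names and the number of data rows; `summarize_csv` is a hypothetical helper, and the path in the usage section is simply the default mentioned above.

```python
import csv
from pathlib import Path
from typing import List, Tuple


def summarize_csv(path) -> Tuple[List[str], int]:
    """Return (column names, number of data rows) without assuming any schema."""
    with Path(path).open(newline="") as handle:
        reader = csv.DictReader(handle)
        row_count = sum(1 for _ in reader)
        # fieldnames is None for a completely empty file.
        columns = list(reader.fieldnames or [])
    return columns, row_count


if __name__ == "__main__":
    csv_path = Path("runs/full_reinforce.csv")  # default location named in the text
    if csv_path.exists():
        columns, row_count = summarize_csv(csv_path)
        print(f"{csv_path}: {row_count} rows, columns = {columns}")
    else:
        print(f"{csv_path} not found; run a training script first")
```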
To load the hardcoded runs/full_reinforce.pt checkpoint and watch several
animated policy rollouts:
python watch_trained_policy.py
To generate one side-by-side results graph per variant from the saved CSV files:
python plot_reinforce_results.py
For a quick smoke test without opening the animation window:
python vanilla_reinforce.py --episodes 3 --max-steps 20 --no-animation
To step the environment with random actions from Python:
from platform_lander import PlatformLander
env = PlatformLander(enable_wind=True, wind_direction=(1, 0.2), wind_power=5.0)
obs, info = env.reset(seed=0)
for _ in range(1000):
    action = env.action_space.sample()
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        print(info)
        break
env.close()
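When comparing policies it can be convenient to wrap the interaction loop in a function that returns the episode's total reward. The sketch below is one possible way to do that; `run_episode` is a hypothetical helper, not part of the package, and it relies only on the `reset(seed=...)` and five-value `step(...)` calls shown in the example above.

```python
from typing import Any, Callable, Optional, Tuple


def run_episode(
    env: Any,
    policy: Callable[[Any], Any],
    max_steps: int = 1000,
    seed: Optional[int] = None,
) -> Tuple[float, int, bool]:
    """Roll out one episode and return (total reward, steps taken, finished flag).

    `policy` maps an observation to an action. `finished` is True when the
    environment reported terminated or truncated before max_steps ran out.
    """
    obs, info = env.reset(seed=seed)
    total_reward = 0.0
    for step in range(1, max_steps + 1):
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total_reward += float(reward)
        if terminated or truncated:
            return total_reward, step, True
    return total_reward, max_steps, False
```

With the package installed, passing a PlatformLander instance together with `lambda obs: env.action_space.sample()` as the policy should behave like the random loop above, with the summed reward returned instead of the final info being printed.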
API Notes
- PlatformLander(continuous=False) uses Discrete(4) actions.
- Actions: 0 no-op, 1 upper-left attitude jet, 2 bottom engine, 3 upper-right attitude jet.
- continuous=True uses a two-value Box(-1, 1, shape=(2,)) action.
- Wind is controlled with enable_wind, wind_power, wind_direction, and set_wind(...).
- The booster has 100 available jet fires by default. After they are exhausted, engine commands have no effect and the booster continues ballistically.
- The observation includes the fraction of jet fires remaining.
- The package provides local Box and Discrete spaces and does not import Gymnasium.
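Bare integers such as `env.step(2)` are easy to misread, so user code may prefer named constants for the discrete actions. The enum below mirrors the action numbering listed above; the class and member names are made up for this example and are not exported by the package. The second helper clamps a two-value action into the documented [-1, 1] range and makes no assumption about what each component controls.

```python
from enum import IntEnum
from typing import Sequence, Tuple


class BoosterAction(IntEnum):
    """Hypothetical names for the four discrete actions listed in the API notes."""

    NOOP = 0
    UPPER_LEFT_JET = 1
    BOTTOM_ENGINE = 2
    UPPER_RIGHT_JET = 3


def clip_continuous_action(action: Sequence[float]) -> Tuple[float, float]:
    """Clamp a two-value action into the Box(-1, 1, shape=(2,)) range."""
    if len(action) != 2:
        raise ValueError("continuous actions need exactly two values")
    first, second = (max(-1.0, min(1.0, float(value))) for value in action)
    return first, second


# IntEnum members compare equal to plain ints, so they can be passed
# wherever an integer action index is expected.
print(int(BoosterAction.BOTTOM_ENGINE))
print(clip_continuous_action([1.7, -0.25]))
```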
Publishing
Build the package from this directory:
python -m build
Upload the generated dist/platform_lander-*.tar.gz and
dist/platform_lander-*.whl files to PyPI with a PyPI account that owns the
platform_lander project name:
python -m twine upload dist/*
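PyPI reports SHA256, MD5, and BLAKE2b-256 digests for each uploaded file. One optional check, sketched here with the standard library only, is to compute the same digests for the local artifacts and compare them with the values shown on the project page after upload. The `file_digests` helper is hypothetical, and the `dist` directory and file-name pattern follow the build step described above.

```python
import hashlib
from pathlib import Path
from typing import Dict


def file_digests(path, chunk_size: int = 65536) -> Dict[str, str]:
    """Compute SHA256, MD5, and BLAKE2b-256 hex digests of one file in a single pass."""
    hashers = {
        "SHA256": hashlib.sha256(),
        "MD5": hashlib.md5(),
        # BLAKE2b with a 32-byte (256-bit) digest.
        "BLAKE2b-256": hashlib.blake2b(digest_size=32),
    }
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


if __name__ == "__main__":
    dist_dir = Path("dist")
    if dist_dir.is_dir():
        for artifact in sorted(dist_dir.glob("platform_lander-*")):
            print(artifact.name)
            for name, digest in file_digests(artifact).items():
                print(f"  {name}: {digest}")
```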
Download files
File details
Details for the file platform_lander-0.1.1.tar.gz.
File metadata
- Download URL: platform_lander-0.1.1.tar.gz
- Upload date:
- Size: 18.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1a7cf4232efe855f908395fe4974683b8b46240c51ea10a6e5a06589480f4825 |
| MD5 | f04068d2faa18ad7e690dd20c49242cd |
| BLAKE2b-256 | 725c7bb43700589be652aa8aa57e5f72193789301d7da9e34146490591f86077 |
File details
Details for the file platform_lander-0.1.1-py3-none-any.whl.
File metadata
- Download URL: platform_lander-0.1.1-py3-none-any.whl
- Upload date:
- Size: 15.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 404a9a2139852b9961e56ff7a27ae53510f6a8de97d8fe3b7229e18cf98deb42 |
| MD5 | b4655328cb9341f538f5fd219e875224 |
| BLAKE2b-256 | e1fc2d7d17b127b4c3ec2fdf9fb1ff49895b57768d91d36c6ce1ad658fb9a6f8 |