
A turn-based, intent-level environment + scenario suite for evaluating LLM agents on FTL: Faster Than Light


ftl-bench

An agent-evaluation benchmark that has an LLM agent play FTL: Faster Than Light through a turn-based, intent-level interface built on the FTL-Hyperspace Lua API. The agent reads a decision-complete JSON observation and replies with one command; the harness scores how far it gets on a suite of reproducible, seed-pinned scenarios.
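As a rough illustration of the observe-then-command loop described above, here is a minimal sketch of a random legal-move agent. The observation schema is an assumption made for this example: the `legal_commands` field, the command strings, and the `wait` fallback are all hypothetical, and the real observation format is defined by the harness, not by this snippet.

```python
import json
import random
from typing import Any


def choose_command(observation_json: str, rng: random.Random) -> str:
    """Return exactly one command for a JSON observation.

    Assumes a hypothetical `legal_commands` list in the observation;
    the real ftl-bench schema may use different field names.
    """
    observation: dict[str, Any] = json.loads(observation_json)
    legal = observation.get("legal_commands", [])
    if not legal:
        # Hypothetical fallback command for an observation with no options.
        return "wait"
    return rng.choice(legal)


# A made-up observation, used only to exercise the function.
example = json.dumps({"turn": 3, "legal_commands": ["jump 2", "fire 0 shields"]})
command = choose_command(example, random.Random(0))
```

Passing the random generator in explicitly keeps the choice reproducible for a given seed, which matches the seed-pinned spirit of the scenario suite.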

This package ships the Python harness, the scenario suite, the agents, and the ftlbench command line. Driving the real game additionally needs FTL installed (via Steam) plus the bench Hyperspace mod.

Install

pip install ftl-bench

Use

ftlbench run --agent scripted              # run the scenario suite with the scripted baseline
ftlbench run --agent random --tier public  # the legal-move floor on the public tier
ftlbench run --agent llm --backend anthropic --model claude-sonnet-4-6   # a model plays the suite
ftlbench play obs                          # print the live observation the agent sees
ftlbench install-mod --url <release-asset> # install the prebuilt bench Hyperspace mod into FTL
ftlbench version
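To script several runs, the commands above can be assembled from Python. This is a sketch that uses only the flags shown in this section (`--agent`, `--tier`, `--backend`, `--model`); the helper names `build_run_command` and `run_suite` are hypothetical and not part of the package.

```python
import shlex
import subprocess
from typing import Optional


def build_run_command(
    agent: str,
    tier: Optional[str] = None,
    backend: Optional[str] = None,
    model: Optional[str] = None,
) -> list[str]:
    """Build an `ftlbench run` argument list from the flags listed above."""
    cmd = ["ftlbench", "run", "--agent", agent]
    if tier:
        cmd += ["--tier", tier]
    if backend:
        cmd += ["--backend", backend]
    if model:
        cmd += ["--model", model]
    return cmd


def run_suite(cmd: list[str]) -> int:
    """Run the command and return its exit code (needs ftlbench on PATH)."""
    return subprocess.run(cmd, check=False).returncode


if __name__ == "__main__":
    # Print the commands instead of running them, so this is safe to try.
    for cmd in (
        build_run_command("scripted"),
        build_run_command("random", tier="public"),
    ):
        print(shlex.join(cmd))
```

Passing the arguments as a list avoids shell quoting issues; call `run_suite` on a built command once ftlbench and the game are set up.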

ftlbench run --help and ftlbench play show the full options. Results and a reproducibility manifest are written under runs/benchmark/.
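After a run, the output directory can be inspected without knowing its layout in advance. The sketch below only assumes the runs/benchmark/ path mentioned above; it makes no assumption about file names or formats inside it, and `list_run_artifacts` is a hypothetical helper.

```python
from pathlib import Path


def list_run_artifacts(root: Path = Path("runs/benchmark")) -> list[Path]:
    """Return every file under the results directory, sorted by path.

    Returns an empty list when the directory does not exist yet.
    """
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    for path in list_run_artifacts():
        print(path)
```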

Platforms

The harness runs on native Windows, WSL, or macOS and launches FTL for you (via Steam on Windows). It reads and writes the FTL user folder, whose location is resolved per OS unless FTL_SAVE_DIR overrides it.
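The override-then-default lookup can be pictured roughly as follows. This is a sketch under stated assumptions, not the harness's actual code: it treats FTL_SAVE_DIR as an environment variable, the per-OS defaults are the locations FTL commonly uses for its user folder, and WSL (where the folder lives on the Windows side) is not handled. The function name `resolve_ftl_save_dir` is hypothetical.

```python
import os
import sys
from pathlib import Path
from typing import Mapping, Optional


def resolve_ftl_save_dir(
    env: Optional[Mapping[str, str]] = None,
    platform: Optional[str] = None,
) -> Path:
    """Pick the FTL user folder: explicit override first, OS default second."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform

    override = env.get("FTL_SAVE_DIR")
    if override:
        return Path(override).expanduser()

    home = Path.home()
    # Assumed default locations; the real harness may resolve these differently.
    if platform.startswith("win"):
        return home / "Documents" / "My Games" / "FasterThanLight"
    if platform == "darwin":
        return home / "Library" / "Application Support" / "FasterThanLight"
    return home / ".local" / "share" / "FasterThanLight"
```

Taking `env` and `platform` as parameters makes the lookup easy to test without touching the real environment.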

More

Full design, architecture, and the in-game bridge live in the project repository: https://github.com/ogabrielluiz/ftl_bench.



