
NetHack Prediction Benchmark

A dataset and dataloader for testing streaming learning algorithms on ordered NetHack game data.

Installation

Install the package from PyPI:

pip install nle-prediction

Or install from source:

git clone <repository-url>
cd nethack_prediction_benchmark
pip install -e .

Quick Start

1. Setup Dataset

Download the NetHack Learning NAO dataset and create the database:

# Download all files and create dataset
nle-prediction --data-dir ./data/nld-nao

# Download only 2 files (at least 2 files are needed for testing)
nle-prediction --data-dir ./data/nld-nao --num-files 2

2. Use the Dataloader

from nle_prediction import OrderedNetHackDataloader

# Create dataloader (ordered dataset will be created automatically if missing)
# Database is expected at {data_dir}/ttyrecs.db
dataloader = OrderedNetHackDataloader(
    data_dir="./data/nld-nao",
    batch_size=32,
    format="raw",  # or "one_hot"
)

# Iterate through batches
for batch in dataloader:
    # batch is a numpy array of shape (batch_size, 257, 24, 80) for one_hot format
    # or a dict with keys like 'tty_chars', 'tty_colors', etc. for raw format
    process_batch(batch)
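
Here, process_batch is a placeholder. A minimal sketch of one that handles both formats (the raw-format keys beyond tty_chars and tty_colors are an assumption):

def process_batch(batch):
    if isinstance(batch, dict):
        # "raw" format: NLE-style dict of per-field arrays
        chars = batch["tty_chars"]    # terminal characters per frame
        colors = batch["tty_colors"]  # terminal colors per frame
        print(chars.shape, colors.shape)
    else:
        # "one_hot" format: numpy array of shape (batch_size, 257, 24, 80)
        assert batch.shape[1:] == (257, 24, 80)
        print(batch.dtype, batch.shape)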

Overview

This repository provides tools for building a dataset from the NetHack Learning NAO Dataset for testing supervised streaming learning algorithms, along with a dataloader that serves the data in a fixed order. At each step, the dataloader returns an observation together with the change in score since the previous step of the game.
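
As an illustration of the prediction task, here is a minimal online-evaluation sketch against a running-mean baseline. The "score_delta" key is hypothetical, used only for illustration; consult the raw-format dict for the actual field name.

import numpy as np

# Online evaluation sketch: predict each step's score change with a
# running-mean baseline, then update the baseline from the observed targets.
# NOTE: "score_delta" is a hypothetical key name, for illustration only.
running_mean, n = 0.0, 0
for batch in dataloader:
    target = np.asarray(batch["score_delta"])   # change in score per step
    squared_error = ((target - running_mean) ** 2).mean()
    n += target.size
    running_mean += (target.sum() - target.size * running_mean) / n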

Dataset Creation

The NetHack Learning NAO dataset contains 1.5 million games played by humans on nethack.alt.org. Datasets like this are typically split and shuffled into an approximately i.i.d. training set; this repository instead builds a dataset that preserves a deliberate ordering.

Games are first grouped by player and sorted chronologically by start time, with the steps within each game kept in order so that temporal coherence is preserved. Players themselves are then ordered by their mean score.

The overall hierarchy of sorting from top to bottom is: player mean score -> player name -> game start time -> game step.
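
Conceptually, this is a single lexicographic sort. A toy illustration over hypothetical in-memory records (the real ordering is materialized inside ttyrecs.db):

# Each step record: (player_mean_score, player_name, game_start_time, step_index)
steps = [
    (812.5, "alice", 1_700_000_000, 1),
    (455.0, "bob",   1_690_000_000, 0),
    (812.5, "alice", 1_700_000_000, 0),
]
# Tuples sort lexicographically, matching the hierarchy above; ascending by
# mean score, so later data comes from more skilled players.
ordered = sorted(steps)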

The goal of the dataset is to pose a challenging non-stationary problem that mirrors many attributes of real-world problems. Ordered this way, NetHack exhibits several kinds of non-stationarity that change at different frequencies:

  • At the most granular level, the state of the game changes from step to step.
  • As the player progresses deeper into the game, new types of items, enemies, and scenarios are introduced.
  • Within the games of a single player, there may be consistent strategies used, but even those may change as a player progresses in skill.
  • As the games progress to more skilled players over time, the distribution of time spent at each floor level will change.

The many levels of non-stationarity in this dataset make it an excellent testbed for streaming learning algorithms.

Dataloader

The dataloader serves the dataset in the order described above. It provides options for:

  • Batch size: Number of samples per batch
  • Format: "raw" returns NLE-style dicts, "one_hot" returns preprocessed tensors (257, 24, 80)
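
For raw-format frames, tty_chars holds terminal character codes. A small helper for rendering one 24x80 screen as text (assuming byte-valued codes, as in NLE):

import numpy as np

def render_screen(tty_chars: np.ndarray) -> str:
    # tty_chars: (24, 80) array of character codes for a single terminal frame
    return "\n".join("".join(chr(c) for c in row) for row in tty_chars)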

API

OrderedNetHackDataloader(
    data_dir: str = "./data/nld-nao",
    dataset_name: str = "nld-nao-v0",
    batch_size: int = 1,
    format: Literal["raw", "one_hot"] = "raw",
    prefetch: int = 0,
    ordered_table: str = "ordered_games",
    auto_create_ordered: bool = True,
    min_games: Optional[int] = None,
)

Note: The database is expected to be at {data_dir}/ttyrecs.db. It will automatically be created there if you use the CLI setup tool.
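
The database is plain SQLite (as in the NLE dataset tooling), so you can inspect it with the standard library; table names other than ordered_games are not guaranteed:

import sqlite3

con = sqlite3.connect("./data/nld-nao/ttyrecs.db")
tables = [row[0] for row in con.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")]
con.close()
print(tables)  # should include 'ordered_games' once the ordered table exists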

Programmatic Usage

The CLI setup tool handles downloading and dataset creation automatically, which is everything the dataloader needs. If, however, you need to download or create the dataset manually, you can use the functions below:

from nle_prediction import download_nld_nao, create_dataset, create_ordered_dataset
from pathlib import Path

# Download data
download_nld_nao(data_dir="./data/nld-nao", num_files=10)

# Create dataset (database will be at ./data/nld-nao/ttyrecs.db)
create_dataset(
    data_dir="./data/nld-nao",
    dataset_name="nld-nao-v0"
)

# Create ordered dataset (optional, done automatically by dataloader)
# Database is at ./data/nld-nao/ttyrecs.db
data_dir = Path("./data/nld-nao")
create_ordered_dataset(
    db_path=str(data_dir / "ttyrecs.db"),
    min_games=3,
    output_table="ordered_games",
    force=False
)
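
Putting these together, a complete manual setup followed by iteration might look like this sketch:

from pathlib import Path

from nle_prediction import (
    OrderedNetHackDataloader,
    create_dataset,
    create_ordered_dataset,
    download_nld_nao,
)

data_dir = Path("./data/nld-nao")
download_nld_nao(data_dir=str(data_dir), num_files=2)  # 2 files: the minimum for testing
create_dataset(data_dir=str(data_dir), dataset_name="nld-nao-v0")
create_ordered_dataset(
    db_path=str(data_dir / "ttyrecs.db"),
    min_games=3,
    output_table="ordered_games",
    force=False,
)

dataloader = OrderedNetHackDataloader(data_dir=str(data_dir), batch_size=32)
first_batch = next(iter(dataloader))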
