NYU CTF Dataset loader package

NYU CTF Dataset

This repository hosts the NYU CTF Dataset, a collection of CTF challenges from the CSAW CTF competitions, designed for evaluation of LLM agents. The CTF challenges are dockerized and easily deployable to allow an LLM-based automation framework to interact with the challenge and attempt a solution. The main dataset contains 200 challenges across 6 CTF categories: web, binary exploitation (pwn), forensics, reverse engineering (rev), cryptography (crypto), and miscellaneous (misc).

Dataset structure

The test/ folder contains the main dataset of 200 challenges. A smaller development dataset of 55 challenges is in the development/ folder. Treat the development dataset as the equivalent of a "train" split: use it while building the agent, so that design decisions made to improve the agent do not bias the test scores.

The folder structure is as follows: <year>/<event>/<category>/<challenge>. <year> is the year of the competition, <event> is either "CSAW-Quals" or "CSAW-Finals", <category> is one of the 6 categories, and <challenge> is the challenge name. Note that the challenge name may contain spaces and single quotes, so it is advisable to wrap it in double quotes when using it in scripts.
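The layout above can be split back into its four components with a few lines of Python. This is a minimal sketch, not part of the nyuctf package: the helper name parse_challenge_path and the example path (including the challenge name "it's a demo") are hypothetical. It also shows shlex.quote as one way to pass a name containing spaces or single quotes to a shell safely.

```python
import shlex
from pathlib import PurePosixPath


def parse_challenge_path(path):
    """Split a path ending in <year>/<event>/<category>/<challenge>."""
    parts = PurePosixPath(path).parts
    if len(parts) < 4:
        raise ValueError(f"expected at least 4 path components, got: {path!r}")
    # Only the last four components matter; a leading split folder is ignored.
    year, event, category, challenge = parts[-4:]
    return {
        "year": year,
        "event": event,
        "category": category,
        "challenge": challenge,
    }


# Hypothetical path; the challenge name contains a space and a single quote.
example = "test/2021/CSAW-Quals/misc/it's a demo"
info = parse_challenge_path(example)

# shlex.quote produces a shell-safe form of the whole path.
quoted = shlex.quote(example)
```

Quoting the path programmatically avoids hand-written escaping when the challenge name itself contains quote characters.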

Each challenge directory contains a challenge.json file with the challenge metadata, along with the corresponding challenge files. Challenges that require a server to host some challenge files are set up with a docker image and a docker-compose.yaml file. The docker image is loaded directly using docker compose up.
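For working with a challenge directory without the loader package, the sketch below reads challenge.json and checks whether a docker-compose.yaml is present. The helper names are hypothetical, and no particular metadata fields are assumed, since the schema is not described here. The -d (detached) flag on docker compose up is an addition for scripted use.

```python
import json
from pathlib import Path


def load_challenge_metadata(challenge_dir):
    """Return the parsed contents of challenge.json as a Python object."""
    meta_path = Path(challenge_dir) / "challenge.json"
    with meta_path.open(encoding="utf-8") as fh:
        return json.load(fh)


def needs_server(challenge_dir):
    """True if the challenge ships a docker-compose.yaml file."""
    return (Path(challenge_dir) / "docker-compose.yaml").is_file()


def compose_up_command():
    """Command to start the challenge server; run it with cwd=challenge_dir."""
    # "-d" runs the containers in the background (not in the original text).
    return ["docker", "compose", "up", "-d"]
```

A caller could pass the command to subprocess.run(compose_up_command(), cwd=challenge_dir, check=True) when needs_server returns True, and inspect sorted(load_challenge_metadata(challenge_dir)) to see which metadata keys a given challenge provides.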

Setup

Install the python package:

pip install nyuctf

The repository is automatically cloned when the CTFDataset is first instantiated with the split argument. If needed, you can manually clone it by running:

python3 -m nyuctf.download

Usage

The following Python snippet shows how to load challenge details using the nyuctf module:

from nyuctf.dataset import CTFDataset
from nyuctf.challenge import CTFChallenge

# Clones the repository for the first time, which takes a while
ds = CTFDataset(split="test")
chal = CTFChallenge(ds.get("2021f-rev-maze"), ds.basedir)

print(chal.name)
print(chal.flag)
print(chal.files)

Tests

Run the tests to check the docker setup and network connection of the challenges. The docker network must be set up first.

cd python
python -m unittest -v test.test_challenges

Optionally filter the tests with the unittest -k flag.
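To illustrate how a -k filter selects tests, the sketch below applies the same kind of name pattern through unittest's TestLoader.testNamePatterns. The test class and its method names are hypothetical stand-ins, not the actual tests in test.test_challenges.

```python
import unittest


class DemoChallengeTests(unittest.TestCase):
    """Hypothetical stand-in for the real challenge tests."""

    def test_docker_setup(self):
        self.assertTrue(True)

    def test_network_connection(self):
        self.assertTrue(True)


def select_tests(pattern):
    """Return the DemoChallengeTests method names matching a -k style pattern."""
    # On the command line, a -k value without "*" is treated as a substring,
    # which is equivalent to wrapping it as "*value*".
    if "*" not in pattern:
        pattern = f"*{pattern}*"
    loader = unittest.TestLoader()
    loader.testNamePatterns = [pattern]
    suite = loader.loadTestsFromTestCase(DemoChallengeTests)
    # Each test id looks like "<module>.<class>.<method>"; keep the method name.
    return [test.id().rsplit(".", 1)[-1] for test in suite]
```

Under these assumptions, select_tests("network") keeps only the network test, in the same way that passing -k network on the command line would narrow a run.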
