NYU CTF Dataset loader package
NYU CTF Bench
This repository hosts the NYU CTF Bench, a collection of CTF challenges from the CSAW CTF competitions, designed for evaluation of LLM agents. The CTF challenges are dockerized and easily deployable to allow an LLM-based automation framework to interact with the challenge and attempt a solution. The main benchmark dataset contains 200 challenges across 6 CTF categories: web, binary exploitation (pwn), forensics, reverse engineering (rev), cryptography (crypto), and miscellaneous (misc).
Benchmark structure
The test/ folder contains the main benchmark dataset of 200 challenges. A smaller development set of 55 challenges is present in the development/ folder.
The development set can be treated as a "train" split and used while building the agent, so that design decisions made to improve the agent do not bias the test scores.
The folder structure is as follows: <year>/<event>/<category>/<challenge>.
<year> is the year of the competition, <event> is either "CSAW-Quals" or "CSAW-Finals", <category> is among the 6 categories, and <challenge> is the challenge name.
Note that the challenge name may contain spaces and single quotes, so it is advisable to wrap it in double quotes when using it in scripts.
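When a script is written in Python rather than shell, the quoting concern above can be handled with the standard library. The following is a minimal sketch, not part of the nyuctf package: the helper name `challenge_path` and the challenge name "it's a maze" are hypothetical, and it assumes the split folder (test/ or development/) is the first path component.

```python
import shlex
from pathlib import PurePosixPath


def challenge_path(split, year, event, category, challenge):
    """Build <split>/<year>/<event>/<category>/<challenge> (hypothetical helper)."""
    return PurePosixPath(split, str(year), event, category, challenge)


# Hypothetical challenge name containing a space and a single quote.
path = challenge_path("test", 2021, "CSAW-Finals", "rev", "it's a maze")

# shlex.quote returns a string that a POSIX shell reads back as one argument.
quoted = shlex.quote(str(path))
print(quoted)

# Passing arguments as a list to subprocess avoids shell quoting entirely.
cmd = ["ls", str(path)]
```

Building the path from components and quoting it only at the point where it reaches a shell keeps the raw challenge name intact everywhere else.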
Each challenge folder contains a challenge.json file with the metadata of the challenge, along with the corresponding challenge files.
Challenges that require a server to host some challenge files are set up with a docker image and a docker-compose.yaml file.
The docker image is loaded directly using docker compose up.
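The layout described above can be inspected with plain Python before involving the loader package. This is a rough sketch under stated assumptions: `inspect_challenge` is a hypothetical helper, the `"name"` key in the demo metadata is made up for illustration (the real challenge.json schema is not described here), and the demo runs on a temporary folder rather than on the dataset.

```python
import json
import tempfile
from pathlib import Path


def inspect_challenge(challenge_dir):
    """Return (metadata, needs_server) for one challenge folder (hypothetical helper)."""
    challenge_dir = Path(challenge_dir)
    with open(challenge_dir / "challenge.json", encoding="utf-8") as f:
        metadata = json.load(f)
    # A docker-compose.yaml next to challenge.json signals a server-backed challenge.
    needs_server = (challenge_dir / "docker-compose.yaml").is_file()
    return metadata, needs_server


# Demo on a throwaway folder so the sketch runs without the dataset.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp)
    # The "name" key is an assumption made for this demo only.
    (demo / "challenge.json").write_text(json.dumps({"name": "demo"}), encoding="utf-8")
    (demo / "docker-compose.yaml").write_text("services: {}\n", encoding="utf-8")
    metadata, needs_server = inspect_challenge(demo)
    print(sorted(metadata), needs_server)
```

For a challenge where `needs_server` is true, docker compose up would then be run from inside that challenge folder.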
Setup
Install the Python package:
pip install nyuctf
The repository is automatically cloned when the CTFDataset is first instantiated with the split argument.
If needed, you can manually clone it by running:
python3 -m nyuctf.download
Usage
The following Python snippet shows how to load challenge details using the nyuctf module:
from nyuctf.dataset import CTFDataset
from nyuctf.challenge import CTFChallenge
# Clones the repository for the first time, which takes a while
ds = CTFDataset(split="test")
chal = CTFChallenge(ds.get("2021f-rev-maze"), ds.basedir)
print(chal.name)
print(chal.flag)
print(chal.files)
Tests
Run the tests to check the docker setup and network connection of the challenges. This requires the docker network to be set up.
cd python
python -m unittest -v test.test_challenges
Optionally filter the tests with the unittest -k flag.
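The -k flag selects tests whose names contain a given substring, or match an fnmatch-style pattern when the pattern includes a `*`. The sketch below reproduces that selection programmatically with `unittest.TestLoader.testNamePatterns`. The test class and method names are hypothetical stand-ins and are not the tests shipped in test.test_challenges.

```python
import unittest


class DemoChallengeTests(unittest.TestCase):
    """Hypothetical stand-in for the real challenge tests."""

    def test_maze_docker(self):
        pass

    def test_other_network(self):
        pass


def select_tests(substring):
    """List the test method names that a `-k substring` filter would keep."""
    loader = unittest.TestLoader()
    # The command line wraps a plain substring in wildcards in the same way.
    loader.testNamePatterns = [f"*{substring}*"]
    suite = loader.loadTestsFromTestCase(DemoChallengeTests)
    return sorted(test.id().rsplit(".", 1)[-1] for test in suite)


print(select_tests("maze"))
```

Patterns are matched against the full dotted test name (module, class, and method), so a filter can target a class as well as a single method.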
File details
Details for the file nyuctf-1.1.1.tar.gz.
File metadata
- Download URL: nyuctf-1.1.1.tar.gz
- Upload date:
- Size: 5.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.16
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d2de54baff29e4b1f1a2030997e5527fa441ea80384e8c7dc47bab1ba91fe98d |
| MD5 | 849dfa86230e6e81892f06de13a23aeb |
| BLAKE2b-256 | 7d8764f88a445385a2c04e338912c96173bdb3c4e771a9bffa2df719fc5d97cb |
File details
Details for the file nyuctf-1.1.1-py3-none-any.whl.
File metadata
- Download URL: nyuctf-1.1.1-py3-none-any.whl
- Upload date:
- Size: 7.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.10.16
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | fdba1fbaa0c5fb396b134a82071569595d2e9123570ba6b872bb8616bd0d5db1 |
| MD5 | e139eaaf52b1af0c175b5f56516e815f |
| BLAKE2b-256 | cd0a1cdc94e0c8c4a44efcd6966726db6f4693dbce9fef81c552be94c73fcb2e |