SPaRC: Spatial Pathfinding and Reasoning Challenge

A comprehensive toolkit for spatial reasoning puzzle solving and model evaluation

Overview

SPaRC provides a framework for evaluating language models on spatial reasoning tasks inspired by the puzzle game "The Witness". The package includes tools for dataset processing, solution validation, and model evaluation, with rich terminal output.

Installation

Install the package from PyPI:

pip install sparc-puzzle

Or install from source:

git clone https://github.com/lkaesberg/SPaRC.git
cd SPaRC
pip install -e .

Quick Start

1. Testing a Model on the Dataset

Run the complete benchmark on your model:

sparc --api-key "your-openai-api-key" --model "gpt-4" --batch-size 5

Key Features:

🔄 Resume Support: Saves progress automatically and resumes where the previous run stopped
⚡ Batching: Processes multiple puzzles concurrently for faster evaluation
🎨 Rich Output: Terminal interface with progress tracking
🛑 Graceful Shutdown: Press Ctrl+C to stop after the current batch finishes
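
The resume and batching behaviour described above can be pictured with a small, self-contained sketch. This is not SPaRC's implementation: the record schema (an `id` field per JSONL line), the helper names `load_done_ids`, `batches`, and `run`, and the `solve` callback are all assumptions made for illustration. The sketch also processes each batch sequentially, whereas the CLI issues concurrent requests.

```python
import json
import tempfile
from pathlib import Path


def load_done_ids(results_file: Path) -> set:
    """Collect the ids already recorded in a JSONL results file (hypothetical schema)."""
    if not results_file.exists():
        return set()
    done = set()
    with results_file.open() as fh:
        for line in fh:
            line = line.strip()
            if line:
                done.add(json.loads(line)["id"])
    return done


def batches(items: list, size: int):
    """Yield consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run(puzzles: list, results_file: Path, batch_size: int, solve) -> int:
    """Process only puzzles missing from results_file; return how many were new."""
    done = load_done_ids(results_file)
    pending = [p for p in puzzles if p["id"] not in done]
    for batch in batches(pending, batch_size):
        records = [{"id": p["id"], "solved": solve(p)} for p in batch]
        # Append after every batch so an interrupted run loses at most one batch.
        with results_file.open("a") as fh:
            for record in records:
                fh.write(json.dumps(record) + "\n")
    return len(pending)


# Toy demonstration with made-up puzzle ids and a dummy solver.
toy_puzzles = [{"id": f"puzzle-{i}"} for i in range(7)]
with tempfile.TemporaryDirectory() as tmp:
    results = Path(tmp) / "demo-model.jsonl"
    first_run = run(toy_puzzles[:3], results, 2, solve=lambda p: True)
    second_run = run(toy_puzzles, results, 2, solve=lambda p: True)
    recorded = len(load_done_ids(results))

print(first_run, second_run, recorded)
```

The second call skips the three puzzles already written by the first call, which is the essence of resuming from a results file.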

Example with different endpoints:

# OpenAI API
sparc --api-key "sk-..." --model "gpt-4"

# Custom endpoint (e.g., local model)
sparc --api-key "your-key" --base-url "http://localhost:8080/v1" --model "llama-3.3-70b"

# Resume interrupted session
sparc --api-key "your-key" --model "gpt-4"  # Automatically resumes

# Fresh start (ignore previous results)
sparc --api-key "your-key" --model "gpt-4" --overwrite

2. Using the Validation API

Use SPaRC's validation functions in your own code:

from sparc.validation import extract_solution_path, validate_solution, analyze_path
from sparc.prompt import generate_prompt
from datasets import load_dataset

# Load the dataset
dataset = load_dataset("lkaesberg/SPaRC", "all", split="test")
puzzle = dataset[0]

# Generate prompt for your model
puzzle_prompt = [
    {
        "role": "system",
        "content": "You are an expert at solving puzzle games.",
    },
    {
        "role": "user",
        "content": generate_prompt(puzzle),
    },
]


# Your model generates a response
model_response = "... model response with path coordinates ..."

# Extract the path from model response
extracted_path = extract_solution_path(model_response, puzzle)
# Returns: [{"x": 0, "y": 2}, {"x": 0, "y": 1}, ...]

# Validate against ground truth
is_correct = validate_solution(extracted_path, puzzle)
# Returns: True/False

# Get detailed analysis
analysis = analyze_path(extracted_path, puzzle)
# Returns: {
#   "starts_at_start_ends_at_exit": True,
#   "connected_line": True,
#   "non_intersecting_line": True,
#   "no_rule_crossing": True,
#   "fully_valid_path": True
# }
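
When evaluating many puzzles, the per-puzzle dictionaries returned by `analyze_path` can be aggregated into pass rates. The sketch below is one possible way to do that, not part of the package: the `summarize` helper is hypothetical, and the sample dictionaries are hand-written stand-ins that only reuse the key names shown above.

```python
from collections import Counter
from typing import Dict, List

# Key names taken from the analyze_path example above.
CHECKS = [
    "starts_at_start_ends_at_exit",
    "connected_line",
    "non_intersecting_line",
    "no_rule_crossing",
    "fully_valid_path",
]


def summarize(analyses: List[Dict[str, bool]]) -> Dict[str, float]:
    """Return, for each check, the fraction of analyses in which it passed."""
    if not analyses:
        return {check: 0.0 for check in CHECKS}
    passed = Counter()
    for analysis in analyses:
        for check in CHECKS:
            if analysis.get(check, False):
                passed[check] += 1
    return {check: passed[check] / len(analyses) for check in CHECKS}


# Hand-written stand-ins for two analyze_path results (not real model output).
sample = [
    {check: True for check in CHECKS},
    {**{check: True for check in CHECKS},
     "no_rule_crossing": False, "fully_valid_path": False},
]
summary = summarize(sample)
for check, rate in summary.items():
    print(f"{check}: {rate:.0%}")
```

Breaking results down by check makes it easier to see whether a model fails on basic path construction or on the puzzle rules themselves.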

CLI Reference

Basic Usage

sparc --api-key "your-key" [OPTIONS]

Options

Option	Default	Description
`--api-key`	Required	OpenAI API key or your model's API key
`--base-url`	`https://api.openai.com/v1`	API endpoint URL
`--model`	`gpt-4`	Model name to evaluate
`--temperature`	`1.0`	Generation temperature
`--batch-size`	`5`	Number of concurrent requests
`--results-file`	`<model>.jsonl`	File to save results
`--overwrite`	`False`	Ignore existing results and start over
`--verbose`	`False`	Show detailed output for each puzzle
`--max-new`	`None`	Process at most this many new puzzles
`--gym`	`False`	Use step-by-step gym mode instead of single-shot
`--gym-traceback`	`False`	Enable traceback visualization in gym mode
`--run-name`	`None`	Suffix for output filename (e.g., `experiment1`)

Examples

# Basic evaluation
sparc --api-key "sk-..." --model "gpt-4"

# High throughput with larger batches
sparc --api-key "sk-..." --model "gpt-3.5-turbo" --batch-size 20

# Conservative approach with lower temperature
sparc --api-key "sk-..." --model "gpt-4" --temperature 0.1

# Verbose output to see each puzzle result
sparc --api-key "sk-..." --model "gpt-4" --verbose

# Custom results file
sparc --api-key "sk-..." --model "claude-3" --results-file "claude_results.jsonl"

# Process only 10 new puzzles
sparc --api-key "sk-..." --model "gpt-4" --max-new 10

# Step-by-step gym mode (agent receives feedback after each move)
sparc --api-key "sk-..." --model "gpt-4" --gym

# Gym mode with traceback (shows path history in observations)
sparc --api-key "sk-..." --model "gpt-4" --gym --gym-traceback

# Named experiment run
sparc --api-key "sk-..." --model "gpt-4" --run-name "experiment1"
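
To script several runs, the command line can be assembled in Python. The helper below is a hypothetical convenience wrapper, not something the package ships; it uses only flags listed in the options table and assumes that `--gym` and `--overwrite` are plain switches, as the examples above suggest.

```python
import shlex
from typing import List, Optional


def build_sparc_command(
    api_key: str,
    model: str,
    *,
    base_url: Optional[str] = None,
    temperature: Optional[float] = None,
    batch_size: Optional[int] = None,
    run_name: Optional[str] = None,
    gym: bool = False,
    overwrite: bool = False,
) -> List[str]:
    """Build an argument list for the sparc CLI from documented options."""
    cmd = ["sparc", "--api-key", api_key, "--model", model]
    if base_url is not None:
        cmd += ["--base-url", base_url]
    if temperature is not None:
        cmd += ["--temperature", str(temperature)]
    if batch_size is not None:
        cmd += ["--batch-size", str(batch_size)]
    if run_name is not None:
        cmd += ["--run-name", run_name]
    if gym:
        cmd.append("--gym")
    if overwrite:
        cmd.append("--overwrite")
    return cmd


# One command per temperature, each with its own run name.
commands = [
    build_sparc_command("your-key", "gpt-4", temperature=t, run_name=f"temp-{t}")
    for t in (0.1, 1.0)
]
for cmd in commands:
    print(shlex.join(cmd))
    # To execute for real: subprocess.run(cmd, check=True)
```

Passing the list to `subprocess.run` avoids shell quoting problems with keys or URLs that contain special characters.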

Core Functions

`extract_solution_path(solution_text: str, puzzle_data: Dict) -> List[Dict[str, int]]`

Extracts the coordinate path from the model's response text.
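
To give a feel for what path extraction involves, here is a minimal sketch that pulls `(x, y)` pairs out of free text with a regular expression. It assumes the response writes coordinates as parenthesised integer pairs; the function name `extract_xy_pairs` is hypothetical, and the library's own parser may accept other formats and also takes the puzzle into account.

```python
import re
from typing import Dict, List

# Matches "(3, 4)" with optional whitespace; assumes non-negative integers.
PAIR = re.compile(r"\(\s*(\d+)\s*,\s*(\d+)\s*\)")


def extract_xy_pairs(text: str) -> List[Dict[str, int]]:
    """Return every (x, y) pair found in text, in order of appearance."""
    return [{"x": int(x), "y": int(y)} for x, y in PAIR.findall(text)]


response = "I start at (0, 2), move up to (0, 1), then right to (1, 1)."
path = extract_xy_pairs(response)
print(path)
```

The output uses the same list-of-dictionaries shape as the return value documented above, so it can be compared directly with paths in that format.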

`validate_solution(extracted_path: List[Dict[str, int]], puzzle_data: Dict) -> bool`

Checks whether the extracted path matches any ground-truth solution.

`analyze_path(solution_path: List[Dict[str, int]], puzzle: Dict) -> Dict`

Provides detailed analysis of path validity and rule compliance.
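
Two of the reported properties, `connected_line` and `non_intersecting_line`, can be illustrated with a short standalone sketch. The definitions used here are assumptions, not taken from the package: a path counts as connected when every step moves to an orthogonally adjacent grid point, and as non-intersecting when no point is visited twice. The function names are hypothetical.

```python
from typing import Dict, List

Point = Dict[str, int]


def is_connected(path: List[Point]) -> bool:
    """True if each step moves exactly one unit horizontally or vertically."""
    return all(
        abs(a["x"] - b["x"]) + abs(a["y"] - b["y"]) == 1
        for a, b in zip(path, path[1:])
    )


def is_non_intersecting(path: List[Point]) -> bool:
    """True if no grid point appears more than once in the path."""
    points = [(p["x"], p["y"]) for p in path]
    return len(points) == len(set(points))


good = [{"x": 0, "y": 2}, {"x": 0, "y": 1}, {"x": 1, "y": 1}]
jump = [{"x": 0, "y": 2}, {"x": 2, "y": 2}]  # skips over (1, 2)
loop = [{"x": 0, "y": 0}, {"x": 1, "y": 0}, {"x": 0, "y": 0}]  # revisits (0, 0)

print(is_connected(good), is_non_intersecting(good))
print(is_connected(jump), is_non_intersecting(loop))
```

Note that an empty or single-point path passes both checks in this sketch, so a real validator would also need the start and exit checks.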

`generate_prompt(puzzle_data: Dict) -> str`

Generates the formatted prompt for a puzzle.

Contributing

Contributions are welcome! Please feel free to submit pull requests or open issues.

Citation

If you use SPaRC in your research, please cite:

@inproceedings{kaesberg-etal-2025-sparc,
    title = "{SP}a{RC}: A Spatial Pathfinding Reasoning Challenge",
    author = "Kaesberg, Lars Benedikt and Wahle, Jan Philip and Ruas, Terry and Gipp, Bela",
    booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2025",
    address = "Suzhou, China",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2025.emnlp-main.526/",
    doi = "10.18653/v1/2025.emnlp-main.526",
    pages = "10370--10401"
}

License

This project is licensed under the MIT License - see the LICENSE file for details.
