SPaRC: Spatial Pathfinding and Reasoning Challenge
A comprehensive toolkit for spatial reasoning puzzle solving and model evaluation
Overview
SPaRC provides a framework for evaluating language models on spatial reasoning tasks inspired by the puzzle game "The Witness". The package includes tools for dataset processing, solution validation, and model evaluation, with rich terminal output.
Installation
Install the package from PyPI:
pip install sparc-puzzle
Or install from source:
git clone https://github.com/lkaesberg/SPaRC.git
cd SPaRC
pip install -e .
Quick Start
1. Testing a Model on the Dataset
Run the complete benchmark on your model:
sparc --api-key "your-openai-api-key" --model "gpt-4" --batch-size 5
Key Features:
- 🔄 Resume Support: Automatically saves progress and resumes from where you left off
- ⚡ Batching: Process multiple puzzles concurrently for faster evaluation
- 🎨 Rich Output: Beautiful terminal interface with progress tracking
- 🛑 Graceful Shutdown: Press Ctrl+C to stop after current batch
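The resume and batching behaviour listed above can be pictured with a minimal sketch. This is not SPaRC's implementation: `load_results`, `run_in_batches`, and the `solve` callback are hypothetical names, the results file is assumed to be a JSON object keyed by string puzzle IDs, and the batch is processed sequentially where a real runner would issue the requests concurrently.

```python
import json
from pathlib import Path


def load_results(path):
    """Return previously saved results keyed by puzzle ID, or {} if none exist."""
    p = Path(path)
    if p.exists():
        return json.loads(p.read_text())
    return {}


def run_in_batches(puzzle_ids, solve, results_path, batch_size=5):
    """Process only unsolved puzzles, saving progress after every batch."""
    results = load_results(results_path)
    # Resume: skip every puzzle that already has a saved result.
    pending = [pid for pid in puzzle_ids if pid not in results]
    for start in range(0, len(pending), batch_size):
        batch = pending[start:start + batch_size]
        for pid in batch:  # hypothetical: a real runner would run these concurrently
            results[pid] = solve(pid)
        # Persist after each batch so an interrupted run loses at most one batch.
        Path(results_path).write_text(json.dumps(results))
    return results
```

Saving once per batch, rather than once per run, is what would make stopping after the current batch safe in a design like this.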
Example with different endpoints:
# OpenAI API
sparc --api-key "sk-..." --model "gpt-4"
# Custom endpoint (e.g., local model)
sparc --api-key "your-key" --base-url "http://localhost:8080/v1" --model "llama-3.3-70b"
# Resume interrupted session
sparc --api-key "your-key" --model "gpt-4" # Automatically resumes
# Fresh start (ignore previous results)
sparc --api-key "your-key" --model "gpt-4" --fresh-start
2. Using the Validation API
Use SPaRC's validation functions in your own code:
from sparc.validation import extract_solution_path, validate_solution, analyze_path
from sparc.prompt import prompt
from datasets import load_dataset
# Load the dataset
dataset = load_dataset("lkaesberg/SPaRC", "all", split="test")
puzzle = dataset[0]
# Generate prompt for your model
puzzle_prompt = prompt(puzzle)
# Your model generates a response
model_response = "... model response with path coordinates ..."
# Extract the path from model response
extracted_path = extract_solution_path(model_response, puzzle)
# Returns: [{"x": 0, "y": 2}, {"x": 0, "y": 1}, ...]
# Validate against ground truth
is_correct = validate_solution(extracted_path, puzzle)
# Returns: True/False
# Get detailed analysis
analysis = analyze_path(extracted_path, puzzle)
# Returns: {
# "starts_at_start_ends_at_exit": True,
# "connected_line": True,
# "non_intersecting_line": True,
# "no_rule_crossing": True,
# "fully_valid_path": True
# }
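To give a feel for what two of the analysis keys above might check, here is a self-contained sketch that works on the same list-of-dicts path format. The helper names `is_connected` and `is_non_intersecting` are hypothetical, and the logic is one plausible reading of `connected_line` and `non_intersecting_line`, not the library's code. The start/exit and rule checks depend on puzzle data and are left out.

```python
def is_connected(path):
    """True if every step moves exactly one cell horizontally or vertically."""
    return all(
        abs(a["x"] - b["x"]) + abs(a["y"] - b["y"]) == 1
        for a, b in zip(path, path[1:])
    )


def is_non_intersecting(path):
    """True if no coordinate is visited more than once."""
    seen = {(p["x"], p["y"]) for p in path}
    return len(seen) == len(path)


path = [{"x": 0, "y": 2}, {"x": 0, "y": 1}, {"x": 1, "y": 1}]
print(is_connected(path), is_non_intersecting(path))  # prints: True True
```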
CLI Reference
Basic Usage
sparc --api-key "your-key" [OPTIONS]
Options
| Option | Default | Description |
|---|---|---|
| --api-key | Required | OpenAI API key or your model's API key |
| --base-url | https://api.openai.com/v1 | API endpoint URL |
| --model | gpt-4 | Model name to evaluate |
| --temperature | 1.0 | Generation temperature |
| --batch-size | 5 | Number of concurrent requests |
| --results-file | sparc_results.json | File to save results |
| --fresh-start | False | Ignore existing results and start over |
| --verbose | False | Show detailed output for each puzzle |
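As an illustration of how the defaults in this table behave, the options can be mirrored in a small `argparse` parser. This is a sketch under the assumption that the flags act as documented above; `build_parser` is a hypothetical helper and is not taken from the SPaRC source.

```python
import argparse


def build_parser():
    """Build a parser whose flags and defaults mirror the options table."""
    parser = argparse.ArgumentParser(prog="sparc")
    parser.add_argument("--api-key", required=True)
    parser.add_argument("--base-url", default="https://api.openai.com/v1")
    parser.add_argument("--model", default="gpt-4")
    parser.add_argument("--temperature", type=float, default=1.0)
    parser.add_argument("--batch-size", type=int, default=5)
    parser.add_argument("--results-file", default="sparc_results.json")
    # Boolean switches: False unless the flag is present.
    parser.add_argument("--fresh-start", action="store_true")
    parser.add_argument("--verbose", action="store_true")
    return parser


args = build_parser().parse_args(["--api-key", "sk-...", "--batch-size", "20"])
print(args.model, args.batch_size, args.fresh_start)  # prints: gpt-4 20 False
```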
Examples
# Basic evaluation
sparc --api-key "sk-..." --model "gpt-4"
# High throughput with larger batches
sparc --api-key "sk-..." --model "gpt-3.5-turbo" --batch-size 20
# Conservative approach with lower temperature
sparc --api-key "sk-..." --model "gpt-4" --temperature 0.1
# Verbose output to see each puzzle result
sparc --api-key "sk-..." --model "gpt-4" --verbose
# Custom results file
sparc --api-key "sk-..." --model "claude-3" --results-file "claude_results.json"
Core Functions
extract_solution_path(solution_text: str, puzzle_data: Dict) -> List[Dict[str, int]]
Extracts coordinate path from model response text.
validate_solution(extracted_path: List[Dict[str, int]], puzzle_data: Dict) -> bool
Validates if the extracted path matches any ground truth solution.
analyze_path(solution_path: List[Dict[str, int]], puzzle: Dict) -> Dict
Provides detailed analysis of path validity and rule compliance.
prompt(puzzle_data: Dict) -> str
Generates the formatted prompt for a puzzle.
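For intuition about path extraction, the sketch below pulls coordinate pairs out of free text with a regular expression. It assumes the model writes its path as `(x, y)` pairs, which is an assumption made for illustration. The formats that `extract_solution_path` actually accepts are not documented here, and `extract_pairs` is a hypothetical helper, not part of the package.

```python
import re

# Matches "(x, y)" with non-negative integers and optional whitespace (assumed format).
PAIR = re.compile(r"\(\s*(\d+)\s*,\s*(\d+)\s*\)")


def extract_pairs(text):
    """Return every (x, y) pair found in the text, in order, as {"x", "y"} dicts."""
    return [{"x": int(x), "y": int(y)} for x, y in PAIR.findall(text)]


response = "I start at (0, 2), move up to (0,1), then right to (1, 1)."
print(extract_pairs(response))
```

The output uses the same list-of-dicts shape shown in the Quick Start, so a list like this could be passed to checks that expect that format.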
Contributing
Contributions are welcome! Please feel free to submit pull requests or open issues.
Citation
If you use SPaRC in your research, please cite:
@article{kaesberg2025sparc,
title = {SPaRC: A Spatial Pathfinding Reasoning Challenge},
author = {Kaesberg, Lars Benedikt and Wahle, Jan Philip and Ruas, Terry and Gipp, Bela},
year = {2025},
url = {https://arxiv.org/abs/2505.16686}
}
License
This project is licensed under the MIT License - see the LICENSE file for details.
Links
- 🌐 Website: sparc.gipplab.org
- 📚 Dataset: Hugging Face
- 🐛 Issues: GitHub Issues
- 📖 Documentation: GitHub Repository