LLM-powered Jupyter notebook grading tool with nbgrader compatibility

sglnbgrader: LLM-Assisted Jupyter Notebook Grader

This project provides an automated grading system for Jupyter notebooks that uses Large Language Models (LLMs) to assess student answers against instructor-provided reference solutions. It's compatible with the nbgrader metadata format.

Features

Grade notebooks that carry nbgrader metadata
Compare student answers to reference solutions using LLMs
Generate detailed feedback and add it directly to notebook cells
Report per-question scores, totals, and percentages
Grade a single notebook or a whole batch of submissions
Analyze consistency and fairness across multiple submissions
Export grading results to JSON or HTML-enhanced notebooks
Customize LLM prompts and grading criteria

Installation

Prerequisites

Python 3.12 or higher
An OpenAI API key for the GPT-4 family of models, or credentials for another provider supported through LiteLLM

Installation Options

Option 1: Install as a standalone tool with uv (recommended)

The quickest way to install is with uv tool install:

# Install as a standalone tool; the command is then available globally
# without activating any environment
uv tool install sglnbgrader

Option 2: Install from PyPI

# Using pip
pip install sglnbgrader

# Using uv pip
uv pip install sglnbgrader

Option 3: Install from source

Clone the repository:

git clone https://github.com/yourusername/sglnbgrader.git
cd sglnbgrader

Install the package:

# Using pip
pip install -e .

# Using uv
uv pip install -e .

API Key Setup

Set up your OpenAI API key as an environment variable:

# Linux/macOS
export OPENAI_API_KEY=your_api_key_here

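# Windows (Command Prompt)
set OPENAI_API_KEY=your_api_key_here

# Windows (PowerShell)
$env:OPENAI_API_KEY = "your_api_key_here"

# Linux/macOS: persist across sessions (use ~/.zshrc instead if your shell is zsh)
echo 'export OPENAI_API_KEY=your_api_key_here' >> ~/.bashrc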
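
Because a missing key usually only surfaces once the first LLM call fails, it can help to check for it before starting a long grading run. The following is a minimal sketch; require_api_key is a hypothetical helper and not part of sglnbgrader.

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper: sglnbgrader itself is not known to expose this.
    `env` defaults to os.environ and can be replaced by a dict for testing.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the grader.")
    return key
```

Calling require_api_key() at the top of a grading script stops the run early with a readable message instead of a provider error part-way through a batch.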

Usage

Command-line Interface

The grader provides a command-line interface with two main commands, single and batch:

Grade a Single Notebook

sglnbgrader single --answer path/to/answer_notebook.ipynb --student path/to/student_notebook.ipynb --output results.json --verbose

Or using the Python module:

python -m sglnbgrader single --answer path/to/answer_notebook.ipynb --student path/to/student_notebook.ipynb --output results.json --verbose

Options:

--answer: Path to instructor's answer notebook (required)
--student: Path to student notebook (required)
--model: LLM model to use for grading (default: gpt-4.1-nano)
--output: Path to save grading results as JSON (optional)
--verbose, -v: Show detailed grading information
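
If you want to drive the single command from a script, the options above map directly onto an argument list. This is a sketch only: build_single_command is a hypothetical helper, and it uses just the flags documented in this section.

```python
def build_single_command(answer, student, output=None, model=None, verbose=False):
    """Build the argument list for `sglnbgrader single`.

    Hypothetical helper; it only assembles the flags listed above.
    """
    cmd = ["sglnbgrader", "single", "--answer", str(answer), "--student", str(student)]
    if model:
        cmd += ["--model", model]
    if output:
        cmd += ["--output", str(output)]
    if verbose:
        cmd.append("--verbose")
    return cmd
```

The resulting list can be passed to subprocess.run(cmd, check=True), which avoids shell quoting problems with paths that contain spaces.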

Grade Multiple Notebooks

sglnbgrader batch --answer path/to/answer_notebook.ipynb --submissions path/to/submissions_dir --output path/to/results_dir --verbose

Or using the Python module:

python -m sglnbgrader batch --answer path/to/answer_notebook.ipynb --submissions path/to/submissions_dir --output path/to/results_dir --verbose

Options:

--answer: Path to instructor's answer notebook (required)
--submissions: Directory containing student submissions (required)
--model: LLM model to use for grading (default: gpt-4.1-nano)
--output: Directory to save grading results (optional)
--verbose, -v: Show detailed grading information
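
This README does not say how the batch command discovers notebooks inside the submissions directory. If you need to preview which files would be candidates, a sketch like the following may help. It assumes that every .ipynb file below the directory counts as a submission and that Jupyter's .ipynb_checkpoints copies should be skipped; both function names are hypothetical.

```python
from pathlib import Path


def is_submission(path):
    """True for .ipynb files that are not Jupyter checkpoint copies.

    Assumption: checkpoint copies should never be graded.
    """
    path = Path(path)
    return path.suffix == ".ipynb" and ".ipynb_checkpoints" not in path.parts


def find_submissions(submissions_dir):
    """Return candidate notebooks below submissions_dir, sorted for stable order."""
    return sorted(p for p in Path(submissions_dir).rglob("*.ipynb") if is_submission(p))
```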

API Usage

from sglnbgrader import Grader

# Initialize the grader with the instructor's answer notebook
grader = Grader("path/to/answer_notebook.ipynb", model="gpt-4.1-nano")

# Grade a single student notebook
results = grader.grade_user_notebook("path/to/student_notebook.ipynb")

# Print the results
print(f"Total score: {results['total_score']}/{results['max_score']} ({results['percentage']}%)")

# Access individual question results
for result in results["results"]:
    print(f"Question {result['grade_id']}: {result['score']}/{result['max_score']}")
    print(f"Feedback: {result['feedback']}")
    
# Generate feedback in the notebook
feedback_notebook_path = grader.write_feedback_to_notebook(
    "path/to/student_notebook.ipynb", results
)
print(f"Feedback notebook created at: {feedback_notebook_path}")

# Compare multiple submissions
submission_paths = [
    "path/to/student1_notebook.ipynb",
    "path/to/student2_notebook.ipynb",
    "path/to/student3_notebook.ipynb",
]
comparison_results = grader.compare_student_submissions(submission_paths)

# Run benchmarks on the grading system
benchmark_results = grader.run_benchmarks(results, submission_paths)
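
The results dictionary used above can be post-processed with plain Python. As a sketch, the hypothetical helper below lists questions where a student scored under a given fraction of the available points. It relies only on the keys shown in the example (results, grade_id, score, max_score).

```python
def weakest_questions(results, threshold=0.5):
    """Return (grade_id, fraction) pairs scoring below `threshold`, lowest first.

    Hypothetical helper that works on the dictionary shape shown in the example above.
    """
    weak = []
    for item in results["results"]:
        if item["max_score"] <= 0:
            continue  # avoid dividing by zero for ungraded or zero-point cells
        fraction = item["score"] / item["max_score"]
        if fraction < threshold:
            weak.append((item["grade_id"], fraction))
    return sorted(weak, key=lambda pair: pair[1])
```

For example, weakest_questions(results, threshold=0.6) could be used to decide which answers deserve a manual second look.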

Notebook Format Requirements

This grader works with notebooks that use the nbgrader metadata format:

Cells to be graded must carry nbgrader metadata
Required metadata fields: grade_id, points, and grade set to true

Example cell metadata:

{
  "metadata": {
    "nbgrader": {
      "grade": true,
      "grade_id": "question-1",
      "points": 10,
      "solution": true
    }
  }
}
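
Before grading, it can be useful to confirm that a notebook actually contains cells with this metadata. The sketch below reads the notebook as plain JSON with the standard library and lists the graded cells. The function names are hypothetical and this is not how sglnbgrader itself is known to parse notebooks.

```python
import json


def graded_cells(notebook):
    """Return (grade_id, points) for each cell whose nbgrader metadata has grade set to true."""
    found = []
    for cell in notebook.get("cells", []):
        meta = cell.get("metadata", {}).get("nbgrader", {})
        if meta.get("grade") and "grade_id" in meta:
            found.append((meta["grade_id"], meta.get("points", 0)))
    return found


def graded_cells_from_file(path):
    """Load an .ipynb file (which is JSON) and list its graded cells."""
    with open(path, encoding="utf-8") as fh:
        return graded_cells(json.load(fh))
```

An empty list from graded_cells_from_file("path/to/answer_notebook.ipynb") suggests the notebook is missing the required metadata.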

Advanced Features

Writing Feedback to Notebooks

The write_feedback_to_notebook method adds HTML-formatted feedback directly into the notebook cell outputs:

feedback_notebook_path = grader.write_feedback_to_notebook(
    "path/to/student_notebook.ipynb", results
)

This creates a new notebook that contains:

An HTML feedback box in each graded cell
A summary cell at the end with the total score and a per-question breakdown
All of the original content, unchanged
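
To illustrate what an HTML feedback box in a cell output can look like, here is a minimal sketch that builds a display_data output in the notebook v4 format. The markup, styling, and function name are assumptions for illustration; the output that write_feedback_to_notebook actually produces may differ.

```python
import html


def feedback_output(grade_id, score, max_score, feedback):
    """Build a notebook v4 display_data output holding an HTML feedback box.

    Illustrative only: the real layout used by sglnbgrader is not documented here.
    """
    body = (
        '<div style="border:1px solid #888;padding:8px">'
        f"<b>{html.escape(grade_id)}</b>: {score}/{max_score}<br>"
        f"{html.escape(feedback)}</div>"
    )
    return {"output_type": "display_data", "data": {"text/html": body}, "metadata": {}}
```

Escaping the feedback text matters because LLM output can contain characters such as < or & that would otherwise be interpreted as markup.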

Comparing Student Submissions

The compare_student_submissions method analyzes results across multiple submissions:

comparison_results = grader.compare_student_submissions([
    "path/to/student1_notebook.ipynb",
    "path/to/student2_notebook.ipynb",
])

This provides:

Statistics for each question (mean, median, standard deviation)
Overall class performance metrics
Consistency measures between different submissions
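
The per-question statistics listed above can be sketched with the standard library. The example assumes each submission's results have the dictionary shape shown under API Usage; question_statistics is a hypothetical helper, the structure returned by compare_student_submissions may be different, and the use of the population standard deviation is a choice made here.

```python
from statistics import mean, median, pstdev


def question_statistics(all_results):
    """Aggregate scores per grade_id across several result dictionaries.

    Uses the population standard deviation (pstdev); swap in stdev for the
    sample standard deviation if that suits your class size better.
    """
    scores_by_question = {}
    for results in all_results:
        for item in results["results"]:
            scores_by_question.setdefault(item["grade_id"], []).append(item["score"])
    return {
        grade_id: {"mean": mean(scores), "median": median(scores), "stdev": pstdev(scores)}
        for grade_id, scores in scores_by_question.items()
    }
```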

Benchmarking Grading Quality

The run_benchmarks method validates the grading system's consistency and fairness:

benchmark_results = grader.run_benchmarks(reference_results, submission_paths)

This analyzes:

Consistency relative to reference results
Fairness of scoring across different questions
Performance metrics for the grading system
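
One simple way to think about consistency relative to reference results is the average absolute difference between reference scores and the scores from a second grading run of the same notebook. The sketch below shows that idea only; it is an assumption about what such a measure could look like, not the metric run_benchmarks is known to compute.

```python
def mean_absolute_deviation(reference, regraded):
    """Average |reference score - regraded score| over questions present in both.

    Illustrative metric; both arguments use the results shape shown under API Usage.
    """
    reference_scores = {item["grade_id"]: item["score"] for item in reference["results"]}
    diffs = [
        abs(reference_scores[item["grade_id"]] - item["score"])
        for item in regraded["results"]
        if item["grade_id"] in reference_scores
    ]
    return sum(diffs) / len(diffs) if diffs else 0.0
```

A value of 0 means the second run reproduced the reference scores exactly; larger values indicate that repeated grading of the same answer drifts.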

Configuration

To customize the prompt, subclass Grader and override the prompt property. The model is selected with the model argument when you construct the grader:

class CustomGrader(Grader):
    @property
    def prompt(self):
        return """
        Your custom prompt template here.
        Question: {question}
        Reference Answer: {reference_answer}
        Student Answer: {student_answer}
        Points: {points}
        """

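A custom template that drops one of the placeholders would silently lose that piece of context. The sketch below checks a template for the four placeholder names used above. It assumes the template is filled with Python's str.format-style fields, which this README implies but does not state; missing_placeholders is a hypothetical helper.

```python
from string import Formatter

# Placeholder names taken from the example template above.
REQUIRED_FIELDS = {"question", "reference_answer", "student_answer", "points"}


def missing_placeholders(template):
    """Return the required placeholder names that do not appear in `template`.

    Assumes str.format-style fields such as {question}.
    """
    used = {field for _, field, _, _ in Formatter().parse(template) if field}
    return REQUIRED_FIELDS - used
```

If the str.format assumption holds, literal braces in a prompt (for instance a JSON example) would need to be doubled as {{ and }}.
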
Development

Testing

Run the tests using pytest:

# Using pytest directly
pytest

# Using uv
uv run pytest

Project Structure

sglnbgrader/ - Main package
    __init__.py - Package exports
    grader.py - Core Grader class implementation
    cli.py - Command-line interface
    __main__.py - Entry point for running as a module
tests/ - Test suite

License

MIT

Acknowledgements

Uses nbgrader metadata format for identifying graded cells
Powered by OpenAI and other LLM providers through LiteLLM
CLI built with Typer and Rich for beautiful console output

This version: 0.1.0, released Apr 28, 2025