A Python library for rubric-guided grading with LLMs

rubric-grader

A CLI tool and Python package to grade code submissions using LLM-based rubrics and ensemble code evaluation.

Features

  • Scoring modes:
    • A: One-step rubric-based evaluation (default).
    • B: Two-step rubric-based evaluation.
    • C: Ensemble code evaluation.
    • D: AI-O one-shot evaluation.
  • Programmatic API via the eval_submissions() function.
  • CLI entry point: rubric-grader.
  • Example smoke-test script in test/tester.py.

Installation

From PyPI

Install the latest released version from PyPI:

pip install rubric-grader

From source (editable)

To install the latest development version directly from the GitHub repository, clone the repo and install in editable mode:

git clone https://github.com/arnavthestud/Rubric-Grader.git
cd Rubric-Grader
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -e .

Usage

Environment Variables

Before running the CLI or using the programmatic API, set your OpenAI API key:

export OPENAI_API_KEY="your_openai_api_key"
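When calling the grader from Python, it can help to fail early with a clear message if the key is missing. The following is a minimal sketch; the helper name require_api_key is hypothetical and not part of rubric-grader, and it only checks the environment variable named above.

```python
import os


def require_api_key(env=None):
    """Return the OpenAI API key, or raise a clear error if it is missing.

    Hypothetical helper (not part of rubric-grader); `env` defaults to
    os.environ and can be replaced with a plain dict for testing.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; export it before grading.")
    return key
```

Calling require_api_key() before eval_submissions() surfaces a missing key immediately instead of partway through a grading run.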

CLI

rubric-grader RUBRIC_FILE MODEL_SOLUTION_FILE PROBLEM_STATEMENT_FILE SUBMISSIONS_DIR [OPTIONS]

Example:

rubric-grader test/rubric.txt test/sol.txt test/prob.txt test/sub

Options:

--scoring_type {A,B,C,D}    Scoring mode: A = one-step rubric evaluation (default); B = two-step rubric evaluation; C = ensemble code evaluation; D = AI-O one-shot evaluation
--output_csv OUTPUT_CSV      Path for CSV results (default: results.csv)
--log_file LOG_FILE          Path for log file (default: evaluation.log)
--syntaxMarks SYNTAXMARKS    Maximum syntax marks (default: 5)
--penalty PENALTY            Penalty per syntax error (default: 1)
--ensemble_size ENSEMBLE     Ensemble size for scoring type C (default: 5)
--file_ext FILE_EXT          Submission file extension (default: .txt)
--debug                      Enable debug output
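To script the CLI, for example from a batch job, the command line can be assembled in Python and passed to subprocess. This is a sketch that uses only the positional arguments and flags listed above; build_command and run_grader are hypothetical helper names, and the sketch assumes the rubric-grader executable is on PATH.

```python
import subprocess


def build_command(rubric, solution, problem, submissions_dir, **options):
    """Assemble a rubric-grader command line as a list of arguments.

    Keyword names map directly to the documented flags
    (e.g. scoring_type="C" becomes --scoring_type C).
    """
    cmd = ["rubric-grader", rubric, solution, problem, submissions_dir]
    for name, value in options.items():
        flag = f"--{name}"
        if value is True:
            # Boolean switches such as --debug take no value.
            cmd.append(flag)
        elif value is not None and value is not False:
            cmd.extend([flag, str(value)])
    return cmd


def run_grader(cmd):
    """Run the assembled command; raises CalledProcessError on failure."""
    return subprocess.run(cmd, check=True)


cmd = build_command(
    "test/rubric.txt", "test/sol.txt", "test/prob.txt", "test/sub",
    scoring_type="C", ensemble_size=3, file_ext=".java", debug=True,
)
```

Passing cmd to run_grader() executes the grader; passing the arguments as a list avoids shell quoting problems with paths that contain spaces.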

Programmatic API

You can also call the grading function directly from Python:

from llm_grader.code_evaluator import eval_submissions

eval_submissions(
    rubric_filepath='test/rubric.txt',
    model_solution_filepath='test/sol.txt',
    problem_statement_filepath='test/prob.txt',
    submissions_dir='test/sub',
    scoring_type='C',      # one of 'A', 'B', 'C', 'D'; defaults to 'A' (one-step rubric evaluation) if omitted
    output_csv='output/results.csv',
    log_file='output/evaluation.log',
    syntaxMarks=5,
    penalty=1,
    ensemble_size=5,
    debug=False,
    file_ext='.java'
)
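The call above writes to output/results.csv and output/evaluation.log. This README does not say whether the output directory is created automatically, so the sketch below creates it up front and builds per-mode file names. prepare_output_paths is a hypothetical helper, not part of the package.

```python
from pathlib import Path


def prepare_output_paths(base_dir, scoring_type):
    """Create base_dir if needed and return per-mode output paths.

    Hypothetical helper: the returned keys match the output_csv and
    log_file parameters of eval_submissions() shown above.
    """
    out_dir = Path(base_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    return {
        "output_csv": str(out_dir / f"results_{scoring_type}.csv"),
        "log_file": str(out_dir / f"evaluation_{scoring_type}.log"),
    }
```

The result can be unpacked into the call, for example eval_submissions(..., scoring_type='C', **prepare_output_paths('output', 'C')), so that runs with different scoring modes do not overwrite each other's results.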

Example Test Script

A simple smoke-test script is provided at test/tester.py. Run it with:

python test/tester.py

Inspect test/tester.py to see how it imports eval_submissions() and sets up example file paths.

Scoring Types

  • A: One-step rubric evaluation. Assigns scores based on a single rubric-based prompting phase.
  • B: Two-step rubric evaluation. Uses the rubric file to assign scores in two phases (one-step parsing, then detailed rubric criteria).
  • C: Ensemble code evaluation. Runs an ensemble of LLM queries (default size 5) to judge each submission’s correctness and syntax.
  • D: AI-O one-shot evaluation. Performs a one-shot evaluation using the AI-O prompt for logical correctness and syntax.
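This README does not spell out how the syntaxMarks and penalty options combine, or how scoring type C merges its ensemble of judgments. The sketch below shows one plausible reading, purely as an illustration: a syntax score that loses penalty marks per error with a floor of zero, and an ensemble score taken as the mean of the individual scores. Both function names and both formulas are assumptions, not the package's confirmed behavior.

```python
from statistics import mean


def syntax_score(num_errors, syntax_marks=5, penalty=1):
    """Assumed rule: start from syntax_marks, subtract penalty per error, floor at 0."""
    return max(0, syntax_marks - penalty * num_errors)


def ensemble_score(scores):
    """Assumed rule: average the scores returned by the ensemble members."""
    if not scores:
        raise ValueError("ensemble produced no scores")
    return mean(scores)
```

Under these assumptions and the default options, a submission with two syntax errors keeps 3 of 5 syntax marks, and one with seven errors gets 0 rather than a negative score.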

Contributing

Contributions, issues, and feature requests are welcome. Feel free to open a pull request.
