OpenJudge: A Next-Generation Evaluation System for AI Model Assessment
Project description
Holistic Evaluation, Quality Rewards: Driving Application Excellence
News
- 2025-10-20 - Auto-Rubric: Learning to Extract Generalizable Criteria for Reward Modeling - We released a new paper on learning generalizable reward criteria for robust modeling.
- 2025-10-17 - Taming the Judge: Deconflicting AI Feedback for Stable Reinforcement Learning - We introduced techniques to align judge feedback and improve RL stability.
- 2025-07-09 - Released OpenJudge v0.1.0 on PyPI
Evaluation and reward signals are the cornerstones of application excellence. Holistic evaluation enables the systematic analysis of shortcomings to drive rapid iteration, while high-quality rewards provide the essential foundation for advanced optimization and fine-tuning. Open-Judge unifies reward signals and evaluation metrics into one Grader interface—with pre-built graders, flexible customization, and seamless framework integration.
Key Features
- Systematic & Quality-Assured Grader Library: Access N+ production-ready graders organized in a comprehensive taxonomy and rigorously validated for reliable performance.
  - Multi-Scenario Coverage: Extensive support for diverse domains, including agent, text, code, math, and multimodal tasks, via specialized graders.
  - Holistic Agent Evaluation: Beyond final outcomes, we assess the entire lifecycle, including trajectories and specific components (Memory, Reflection, Tool Use).
  - Quality Assurance: Built for reliability. Every grader ships with benchmark datasets and pytest integration for immediate quality validation.
- Flexible Grader Building Methods: Choose the build method that fits your requirements:
  - Customization: Easily extend or modify pre-defined graders to fit your specific needs.
  - Data-Driven Rubrics: Have a few examples but no clear rules? Use our tools to automatically generate white-box evaluation criteria (rubrics) from your data.
  - Trainable Judge Models: For high-scale scenarios, train dedicated judge models as graders. We support SFT, Bradley-Terry models, and reinforcement learning workflows.
- Easy Integration: Seamlessly connect with mainstream evaluation platforms (e.g., LangSmith, LangFuse) and training frameworks (e.g., VERL) using our comprehensive tutorials and flexible APIs.
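The unified Grader interface boils down to one shape: a grader takes evaluation inputs (such as a query and a response) and returns a score with a reason. The sketch below illustrates that shape with a self-contained toy grader; `KeywordGrader` and this `GradeResult` dataclass are hypothetical stand-ins for illustration, not the actual open_judge API (real graders are model-backed).

```python
import asyncio
from dataclasses import dataclass


@dataclass
class GradeResult:
    """Hypothetical result type mirroring the score/reason pair a grader returns."""
    score: int
    reason: str


class KeywordGrader:
    """Toy stand-in for a grader: scores a response by keyword coverage (0-5)."""

    def __init__(self, keywords):
        self.keywords = [k.lower() for k in keywords]

    async def aevaluate(self, query: str, response: str) -> GradeResult:
        # A real grader would await a judge-model call here.
        hits = [k for k in self.keywords if k in response.lower()]
        score = round(5 * len(hits) / len(self.keywords)) if self.keywords else 0
        return GradeResult(score=score, reason=f"matched keywords: {hits}")


result = asyncio.run(
    KeywordGrader(["data", "learn"]).aevaluate(
        query="What is machine learning?",
        response="Machine learning lets computers learn from data.",
    )
)
print(result.score)  # 5 (both keywords matched)
```

Because every grader exposes the same call shape, the same result can serve as an evaluation metric offline or as a reward signal during training.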
Installation
```
pip install open_judge
```
More installation methods can be found here.
Quickstart
```python
import asyncio

from open_judge.models import OpenAIChatModel
from open_judge.graders.common.relevance import RelevanceGrader


async def main():
    # Step 1: create the model client
    model = OpenAIChatModel(model="qwen3-32b")

    # Step 2: choose and initialize the appropriate grader
    grader = RelevanceGrader(model=model)

    # Step 3: prepare the data
    data = {
        "query": "What is machine learning?",
        "response": "Machine learning is a subset of AI that enables computers to learn from data.",
    }

    # Step 4: evaluate using the data
    result = await grader.aevaluate(**data)
    print(f"Score: {result.score}")  # Score: 5
    print(f"Reason: {result.reason}")


asyncio.run(main())
```
The complete quickstart can be found here.
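Since `aevaluate` is async, a batch of items can be graded concurrently with `asyncio.gather`. The snippet below sketches that pattern with a `StubGrader` stand-in so it runs standalone; in a real run you would pass an actual grader such as `RelevanceGrader` instead.

```python
import asyncio


class StubGrader:
    """Hypothetical stand-in with the same async aevaluate(query, response) shape."""

    async def aevaluate(self, query: str, response: str):
        await asyncio.sleep(0)  # a real grader awaits a model call here
        return {"score": 1 if response else 0, "reason": "non-empty response"}


async def grade_batch(grader, items):
    # Fan out one aevaluate call per item and await them all concurrently.
    return await asyncio.gather(
        *(grader.aevaluate(**item) for item in items)
    )


items = [
    {"query": "What is ML?", "response": "Learning from data."},
    {"query": "What is RL?", "response": "Learning from rewards."},
]
results = asyncio.run(grade_batch(StubGrader(), items))
print(len(results))  # 2
```

`asyncio.gather` preserves input order, so `results[i]` corresponds to `items[i]`, which makes it easy to join scores back onto a dataset.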
Integrations
| Integration | Documentation |
|---|---|
| LangSmith | LangSmith |
| LangFuse | LangFuse |
| Arize Phoenix | Arize Phoenix |
Contributing
We welcome contributions from the community!
- Raise and comment on Issues.
- Open a PR - Whether you're fixing bugs, adding new features, improving documentation, or sharing ideas, your contributions help make Open-Judge better for everyone. See Contributing for more details.
Citation
If you use Open-Judge in your research, please cite:
```bibtex
@software{
  title  = {OpenJudge: XXXX},
  author = {The Open-Judge Team},
  url    = {https://github.com/modelscope/Open-Judge},
  month  = {07},
  year   = {2025}
}
```
Download files
Download the file for your platform.
Source Distribution
Built Distribution
File details
Details for the file py_openjudge-0.1.7.tar.gz.
File metadata
- Download URL: py_openjudge-0.1.7.tar.gz
- Upload date:
- Size: 276.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.7.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 958a19f2af630ede91e82d729b3ea20ad7ba088a7987532222974733d75af812 |
| MD5 | 24fb79cb1b7116533a385a6c4404fe53 |
| BLAKE2b-256 | 9a0c08e62db8b9a99e80223d1c0f061bbf9666a862cf7552f0fc95fd39b00be2 |
File details
Details for the file py_openjudge-0.1.7-py3-none-any.whl.
File metadata
- Download URL: py_openjudge-0.1.7-py3-none-any.whl
- Upload date:
- Size: 433.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.7.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 42a8fa08a1ce68cace47bb3f4161eb573fefddfd8a795e2247790e6a5672b13a |
| MD5 | ef9b536b15c78b8fc5a3abecab335b66 |
| BLAKE2b-256 | 93e9dfd6889e022df6960d7c872b2300e0dc0104ae4cf7b1d1cfa98a7569bd0a |