A multi-model ensemble system using triangular review for fact-checking and verification

Model Court: A Multi-Model Ensemble Framework for Verification

Project Overview (V0.0.2)

Model Court is an open-source framework designed to make cross-verification and fact-checking with multiple models easier. Model Court is inspired by concepts from the U.S. courtroom system, using the roles of Prosecutor, Jury, and Judge to verify facts, with support for internet search, RAG-based retrieval, and more.
The current version is 0.0.2, released on 2025-11-30, and provides the basic core functionality.

Model Court performs AI content verification using a courtroom-style process:

Prosecutor: Preprocesses the case and queries historical precedents. If a valid, unexpired precedent exists, its result is returned directly without a full trial.
Jury: Multiple independent LLM evaluators. To keep jurors independent, each one should ideally use a different retrieval tool and a model from a different provider.
Judge: Aggregates votes and produces the final verdict, which is then stored as a precedent in the precedent database.

This courtroom-style process can improve reliability in scenarios where LLM outputs need to be verified, such as:

Fact-checking: Determining the factual accuracy of news and social media content.
Content moderation: Detecting harmful, violating, or misleading content.
Knowledge Q&A: Verifying the correctness of AI-generated answers.
Academic research: Improving robustness via multi-model ensemble.
Compliance checking: Verifying whether content complies with certain rules or standards.

The basic courtroom flow is as follows:

Case Input → Prosecutor → [Juror1, Juror2, ..., JurorN] → Judge → Verdict
                ↓                         ↓
         Precedent Database          Reference Sources
           (Past Rulings)              (Evidence)

For the full courtroom process, see the detailed introduction below.

Installation

This project is published on PyPI and can be installed via pip.

Install

# Install core package (minimal dependencies)
pip install model-court

# Or install the full version (includes all LLM, RAG, search features)
# Quotes keep shells such as zsh from expanding the square brackets
pip install "model-court[full]"

# Development install (from source)
pip install -e .
pip install -e ".[full]"  # Full version from source

Note: The package name is model-court (with a hyphen), but the import name is model_court (with an underscore).

Detailed Introduction

Full Courtroom Workflow

The full courtroom workflow is as follows:

┌───────────────────────────────────────────┐
│               🏛️ Model Court             │
│              Main Courtroom Flow         │
└───────────────────────────────────────────┘
                     │
                     ▼
        ┌──────────────────────────┐
        │     Input Case Text      │
        └──────────────────────────┘
                     │
                     ▼
   ┌───────────────────────────────────────┐
   │             1. Prosecutor             │
   ├───────────────────────────────────────┤
   │ • Optionally split the case into      │
   │   multiple claims (if enabled)        │
   │ • Query precedent DB (SQL + Vector)   │
   │   to avoid redundant evaluation       │
   │     - Cache hit → return past ruling  │
   │     - Similar precedent → provide     │
   │       as reference to the Judge       │
   └───────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────┐
│     2. Launch Multiple Juries in Parallel   │
└─────────────────────────────────────────────┘
                     │
                     ▼
   ┌──────────────────────────────────────────────┐
   │           🧑‍⚖️  Jury Voting Process           │
   ├──────────────────────────────────────────────┤
   │ Cross-validate claims using independent LLMs │
   │ to reduce hallucinations.                    │
   │ Each jury can focus on different criteria,   │
   │ either purely model-based or using pluggable │
   │ reference sources:                           │
   │                                              │
   │ ① Logical review: evaluate based on logic    │
   │    and common sense only.                    │
   │ ② Web search: validate claims using real-    │
   │    time web search (supports iterative       │
   │    verification).                            │
   │ ③ RAG: use integrated RAG pipeline; Model    │
   │    Court handles creation, embedding, and    │
   │    retrieval.                                │
   │ ④ Text document store: a basic fact store    │
   │    providing textual factual references.     │
   │                                              │
   │ All jury members choose exactly one of:      │
   │      • "no_objection"     (support)          │
   │      • "suspicious_fact"  (insufficient      │
   │                            evidence)         │
   │      • "reasonable_doubt" (counter-evidence) │
   │ Note: if a jury member fails or cannot       │
   │ provide a conclusion, it is counted as       │
   │ "abstains".                                  │
   └──────────────────────────────────────────────┘
                     │
                     ▼
       ┌───────────────────────────────────────┐
       │               3. Judge                │
       ├───────────────────────────────────────┤
       │ • Aggregates jury votes               │
       │ • References similar precedents       │
       │ • Rule-based verdict logic; requires  │
       │   reaching a minimum quorum of valid  │
       │   votes                               │
       │       ▶ supported  (no objections)    │
       │       ▶ suspicious (some objections)  │
       │       ▶ refuted   (majority oppose)   │
       │ • Outputs Judge reasoning             │
       └───────────────────────────────────────┘
                     │
                     ▼
     ┌───────────────────────────────────────┐
     │      4. Court Generates CaseReport    │
     ├───────────────────────────────────────┤
     │ • Structured list of claims           │
     │ • Jury votes and rationales           │
     │ • Referenced precedents (if any)      │
     │ • Final judgment                      │
     │ • Persisted into precedent DB         │
     │   (SQL + Vector)                      │
     └───────────────────────────────────────┘
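
The voting and quorum step in the diagram can be sketched in plain Python. This is a minimal illustration, not Model Court's implementation: the helper name `tally_votes`, the choice to count both `suspicious_fact` and `reasonable_doubt` as objections, and the `None` return when the quorum is missed are all assumptions.

```python
from collections import Counter
from typing import Optional

# The three decisions a jury member may return, as listed in the diagram.
VALID_DECISIONS = {"no_objection", "suspicious_fact", "reasonable_doubt"}


def tally_votes(decisions: list[str], quorum: int) -> Optional[float]:
    """Return the fraction of valid votes that object, or None below quorum.

    Hypothetical helper: anything outside VALID_DECISIONS (e.g. "abstains")
    is ignored, and both non-support decisions count as objections.
    """
    valid = [d for d in decisions if d in VALID_DECISIONS]
    if not valid or len(valid) < quorum:
        return None  # not enough valid votes to reach a verdict
    counts = Counter(valid)
    objections = counts["suspicious_fact"] + counts["reasonable_doubt"]
    return objections / len(valid)


votes = ["no_objection", "reasonable_doubt", "abstains", "no_objection"]
print(tally_votes(votes, quorum=3))  # one objection among three valid votes
print(tally_votes(votes, quorum=4))  # quorum missed: only three valid votes
```

The resulting ratio is the kind of value a rule-based verdict step could compare against thresholds.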

Full Example

You must configure the Prosecutor, Jury, and Judge before you can use Model Court. The setup is simple: specify which models to use and configure their APIs.
We recommend OpenRouter, which gives access to many LLMs through a single API key. The native OpenAI, Google Gemini, and Anthropic Claude APIs are also supported; see Project Configuration for details.

Note: the courtroom process must be run asynchronously.

import asyncio
import os
from pathlib import Path
from dotenv import load_dotenv  # Load .env for environment variables

from model_court import Court, Prosecutor, Jury, Judge
from model_court.code import SqliteCourtCode
from model_court.references import SimpleTextStorage, LocalRAGReference

# Load environment variables
load_dotenv()

# ----------------------------------------------------------------------
# 0. Preparation
# ----------------------------------------------------------------------
# Before running this demo, please make sure you have completed:
#
# 1. Environment configuration (.env)
#    - Create a .env file in the current directory
#    - Add API key, e.g.:
#      OPENROUTER_API_KEY=sk-or-v1-xxxx...
#
# 2. Virtual environment (recommended)
#    - python -m venv .venv
#    - source .venv/bin/activate  (Windows: .venv\Scripts\activate)
#
# 3. Install dependencies
#    - pip install model-court python-dotenv
#    - pip install model-court[full]  # Recommended if using RAG
#
# 4. Prepare data files (paths used in the code; below we use RAG jury
#    and text-based jury as examples)
#    Make sure the directory structure looks like:
#
#    .
#    ├── .env
#    ├── example_court.py (this file)
#    └── data/
#        ├── rag_init_files/           <-- initialization corpus for RAG jury
#        │   └── rumors_2024.txt       (any text files as knowledge base)
#        └── text_documents/           <-- reference files for text-based jury
#            └── basic_facts.txt       (basic factual text such as policies,
#                                      legal clauses, etc.)
# ----------------------------------------------------------------------


# ----------------------------------------------------------------------
# 1. Initialize Court: configure Prosecutor, Juries, and Judge
# ----------------------------------------------------------------------
def build_court() -> Court:
    # 1. Initialize precedent store (persistent storage)
    court_code = SqliteCourtCode(
        db_path="./fact_check_history.db",
        enable_vector_search=True
    )

    # 2. Initialize Prosecutor (check precedents and split claims)
    prosecutor = Prosecutor(
        court_code=court_code,
        auto_claim_splitting=False,  # Set True to split case into multiple claims
        model={
            "provider": "openai_compatible",
            "base_url": "https://openrouter.ai/api/v1",
            "api_key": os.getenv("OPENROUTER_API_KEY"),
            "model_name": "openai/gpt-3.5-turbo",
            "temperature": 0.1
        },
        prosecutor_prompt=(
            "You are a strict prosecutor. Break the input case into "
            "independent, verifiable factual claims."
        )
    )

    # 3. Initialize juries (ensure diversity to keep them independent)

    # [Logical perspective]
    jury_logic = Jury(
        name="Logic_Jury",
        model={
            "provider": "openai_compatible",
            "base_url": "https://openrouter.ai/api/v1",
            "api_key": os.getenv("OPENROUTER_API_KEY"),
            "model_name": "openai/gpt-4",
            "temperature": 0.0
        },
        reference=None,
        jury_prompt=(
            "Evaluate whether the statement is reasonable based on logic "
            "and common sense only. Do not fabricate information."
        )
    )

    # [Web search perspective]
    jury_web = Jury(
        name="Web_Jury",
        model={
            "provider": "openai_compatible",
            "base_url": "https://openrouter.ai/api/v1",
            "api_key": os.getenv("OPENROUTER_API_KEY"),
            "model_name": "perplexity/sonar",  # This model has built-in web access
            "temperature": 0.0
        },
        reference=None,
        jury_prompt=(
            "Use web search to verify each claim and base your judgment "
            "on the latest information."
        )
    )

    # [Local RAG perspective]
    jury_rag = Jury(
        name="RAG_Jury",
        model={
            "provider": "openai_compatible",
            "base_url": "https://openrouter.ai/api/v1",
            "api_key": os.getenv("OPENROUTER_API_KEY"),
            "model_name": "meta-llama/llama-3-70b-instruct",
            "temperature": 0.2
        },
        reference=LocalRAGReference(
            collection_name="common_rumors",
            persist_directory="./rag_db_storage",
            source_folder="./data/rag_init_files",
            embedding_model="MiniLM",
            mode="append",
            top_k=2
        ),
        jury_prompt="Query the local rumor knowledge base to see if related records exist."
    )

    # [Text document perspective]
    basic_facts_path = Path("./data/text_documents/basic_facts.txt")

    # Basic file check for demo convenience
    if not basic_facts_path.exists():
        raise FileNotFoundError(
            f"Demo failed: please create file {basic_facts_path} first."
        )

    jury_facts = Jury(
        name="Facts_Jury",
        model={
            "provider": "openai_compatible",
            "base_url": "https://openrouter.ai/api/v1",
            "api_key": os.getenv("OPENROUTER_API_KEY"),
            "model_name": "openai/gpt-3.5-turbo",
            "temperature": 0.1
        },
        reference=SimpleTextStorage(text=basic_facts_path.read_text(encoding="utf-8")),
        jury_prompt="Compare each claim against the basic facts text to decide if it is true."
    )

    # 4. Initialize Judge
    judge = Judge(
        model={
            "provider": "openai_compatible",
            "base_url": "https://openrouter.ai/api/v1",
            "api_key": os.getenv("OPENROUTER_API_KEY"),
            "model_name": "openai/gpt-4",
            "temperature": 0.2
        }
    )

    # 5. Assemble the Court
    return Court(
        prosecutor=prosecutor,
        juries=[jury_logic, jury_web, jury_rag, jury_facts],
        judge=judge,
        verdict_rules={
            "supported": {"operator": "eq", "value": 0},
            "suspicious": {"operator": "lt", "value": 0.5},
            "refuted": "default"
        },
        quorum=3,
        concurrency_limit=4
    )


# ----------------------------------------------------------------------
# 2. Run a demo
# ----------------------------------------------------------------------
async def demo():
    # Instantiate the court; RAG models will be loaded on first run
    court = build_court()

    # Case input
    case_text = "China and the United States have already had diplomatic relations for 300 years, and the two governments held a celebration for this."

    # Hear the case asynchronously
    report = await court.hear(case_text)

    # Display contents of the Report object
    print(f"\n{'='*20} Case Report (ID: {report.case_id}) {'='*20}")

    for i, res in enumerate(report.claims, 1):
        print(f"\n[Claim {i}] {res.claim.text}")

        # Print detailed jury votes
        for vote in res.jury_votes:
            print(f"  - {vote.jury_name}: {vote.decision}")
            if vote.reason:
                print(f"    Reason: {vote.reason[:60]}...")

        print(f"\n  => Judge Verdict: [{res.verdict}]")
        print(f"  => Judge Reasoning: {res.judge_reasoning}")

    print(f"\n{'='*60}")


if __name__ == "__main__":
    # The court process must be run asynchronously
    asyncio.run(demo())
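
For logging or downstream processing, the report can be collapsed into plain dictionaries. The sketch below assumes only the attributes used in the example above (`claims`, `claim.text`, `jury_votes`, `decision`, `verdict`); the helper name `summarize` is hypothetical, and the `SimpleNamespace` object is a stand-in so the snippet runs without calling any model.

```python
from collections import Counter
from types import SimpleNamespace


def summarize(report) -> list[dict]:
    """Collapse each claim result into its text, verdict, and vote counts."""
    rows = []
    for res in report.claims:
        counts = Counter(vote.decision for vote in res.jury_votes)
        rows.append(
            {"claim": res.claim.text, "verdict": res.verdict, "votes": dict(counts)}
        )
    return rows


# Stand-in for a real report; attribute names mirror the example above.
fake_report = SimpleNamespace(
    case_id="demo",
    claims=[
        SimpleNamespace(
            claim=SimpleNamespace(text="The Earth is flat."),
            verdict="refuted",
            jury_votes=[
                SimpleNamespace(jury_name="Logic_Jury", decision="reasonable_doubt"),
                SimpleNamespace(jury_name="Web_Jury", decision="reasonable_doubt"),
                SimpleNamespace(jury_name="Facts_Jury", decision="suspicious_fact"),
            ],
        )
    ],
)
print(summarize(fake_report))
```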

More examples can be found under the example folder in the project:

Full CLI example – Command-line script demonstrating all major features (similar to the example above).
Web app example – A fact-checking web application that shows how to integrate Model Court into a web UI.

Project Configuration

LLM

The project supports the following LLM providers:

Provider	Description	Example Models
`openai`	Native OpenAI API	gpt-4, gpt-3.5-turbo
`google`	Google Gemini	gemini-pro, gemini-1.5-pro
`anthropic`	Anthropic Claude	claude-3-5-sonnet, claude-3-opus
`openai_compatible`	OpenAI-compatible API (recommended)	Access all models via OpenRouter
`custom`	Custom provider	Local models or self-hosted service

Recommended: openai_compatible + OpenRouter

OpenRouter provides a unified interface to many LLMs. With a single API key, you can access over 100 models, some of them free (e.g., DeepSeek).

# Environment variable
export OPENROUTER_API_KEY="sk-or-v1-..."

# In code
model_config = {
    "provider": "openai_compatible",
    "base_url": "https://openrouter.ai/api/v1",
    "api_key": os.getenv("OPENROUTER_API_KEY"),
    "model_name": "openai/gpt-4",  # Or any other supported model
}

Supported models list: https://openrouter.ai/models
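
Because every component in the full example repeats the same model dictionary, a small factory can reduce duplication and fail early when the key is missing. This is a sketch: `openrouter_config` is a hypothetical helper, not part of Model Court, and it only reuses the dictionary keys shown above.

```python
import os


def openrouter_config(model_name: str, temperature: float = 0.0) -> dict:
    """Build a model config dict in the shape shown above.

    Raises RuntimeError if OPENROUTER_API_KEY is not set, so a missing key
    surfaces before any request is made.
    """
    api_key = os.getenv("OPENROUTER_API_KEY")
    if not api_key:
        raise RuntimeError("OPENROUTER_API_KEY is not set")
    return {
        "provider": "openai_compatible",
        "base_url": "https://openrouter.ai/api/v1",
        "api_key": api_key,
        "model_name": model_name,
        "temperature": temperature,
    }


# Placeholder key for this demo only; a real key would come from .env.
os.environ.setdefault("OPENROUTER_API_KEY", "sk-or-v1-placeholder")
print(openrouter_config("openai/gpt-4")["model_name"])
```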

Reference

The project supports the following built-in reference sources and modes:

Reference Type	Description	Typical Use Case
`SimpleTextStorage`	Plain text docs	Simple fact lists, rule descriptions
`LocalRAGReference`	Local RAG KB	Semantic search over large corpora
`GoogleSearchReference`	Google Custom Search	Need real-time web verification
`None`	Blind mode	Pure logical reasoning without external sources

1. Simple text storage

from model_court.references import SimpleTextStorage
from pathlib import Path

# Read from file
facts_file = Path("./data/text_documents/basic_facts.txt")
with open(facts_file, "r", encoding="utf-8") as f:
    facts_text = f.read()

reference = SimpleTextStorage(text=facts_text)

# Or directly pass a small text block (for quick tests)
# reference = SimpleTextStorage(text="Fact 1: The Earth is round\nFact 2: The chemical formula of water is H2O")

2. Local RAG knowledge base

from model_court.references import LocalRAGReference

reference = LocalRAGReference(
    collection_name="my_knowledge",
    persist_directory="./vector_db",
    source_folder="./documents",  # Folder with txt/md files
    embedding_model="MiniLM",     # "MiniLM", "BGE", or "OpenAI"
    mode="append",                # "overwrite", "append", or "read_only"
    top_k=3                       # Return top 3 most relevant chunks
)
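
To illustrate what `top_k` controls, here is a toy retrieval sketch. It is not how `LocalRAGReference` works internally: the bag-of-words "embedding" stands in for a real model such as MiniLM, and the function names are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real pipeline would use an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], top_k: int = 3) -> list[str]:
    """Return the top_k chunks ranked by similarity to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


chunks = [
    "water boils at 100 degrees celsius",
    "paris is the capital of france",
    "earth orbits the sun once a year",
]
print(retrieve("what is the capital of france", chunks, top_k=2))
```

A smaller `top_k` keeps the juror's prompt short; a larger one trades prompt length for recall.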

3. Google Search

from model_court.references import GoogleSearchReference

reference = GoogleSearchReference(
    api_key="your-google-api-key",
    search_engine_id="your-search-engine-id",
    num_results=5
)

4. Blind mode (no reference)

from model_court import Jury

# model_config is the dict defined in the LLM section above
jury = Jury(
    name="Logic_Checker",
    model=model_config,
    reference=None,  # No external references
    jury_prompt="Judge only based on logic and common sense."
)

Project Structure

model_court/
├── model_court/             # Core package
│   ├── core/                # Core components
│   │   ├── models.py        # Data models
│   │   ├── court.py         # Court main class
│   │   ├── prosecutor.py    # Prosecutor class
│   │   ├── jury.py          # Jury class
│   │   └── judge.py         # Judge class
│   ├── llm/                 # LLM provider layer
│   │   ├── base.py          # Abstract base classes
│   │   ├── openai_provider.py
│   │   ├── google_provider.py
│   │   ├── anthropic_provider.py
│   │   ├── custom_provider.py
│   │   └── factory.py       # Provider factory
│   ├── references/          # Reference sources
│   │   ├── base.py          # Abstract base classes
│   │   ├── google_search.py
│   │   ├── web_search.py
│   │   ├── rag_reference.py
│   │   └── text_storage.py
│   ├── embeddings/          # Embedding models
│   │   ├── base.py          # Abstract base classes
│   │   ├── minilm.py
│   │   ├── bge.py
│   │   └── openai_embedding.py
│   ├── code/                # Court Code (precedent store)
│   │   ├── base.py          # Abstract base classes
│   │   └── sqlite_code.py
│   └── utils/               # Helper utilities
│       └── helpers.py
├── example/                 # Usage examples
│   ├── example_full.py      # Full CLI example
│   ├── backend/             # Web API server
│   ├── frontend/            # Web frontend
│   └── data/                # Example data
├── api_docs.md              # API documentation
├── README.md                # Project description
├── CHANGELOG.md             # Changelog
├── CONTRIBUTING.md          # Contribution guide
├── LICENSE                  # License
├── pyproject.toml           # Project configuration
├── setup.py                 # Setup script
└── requirements.txt         # Dependencies

Advanced Features

Custom Verdict Rules

You can customize verdict rules according to your business requirements:

# Example 1: Strict mode (single veto)
court_strict = Court(
    prosecutor=prosecutor,
    juries=[jury_logic, jury_web, jury_rag, jury_facts],
    judge=judge,
    verdict_rules={
        "supported": {"operator": "eq", "value": 0},   # Must have 0 opposing votes
        "refuted": "default"  # Any opposing vote → refuted
    }
)

# Example 2: Lenient mode (majority rule)
court_lenient = Court(
    prosecutor=prosecutor,
    juries=[jury_logic, jury_web, jury_rag, jury_facts],
    judge=judge,
    verdict_rules={
        "supported": {"operator": "lt", "value": 0.25},   # Opposition < 25%
        "suspicious": {"operator": "lt", "value": 0.75},  # Opposition < 75%
        "refuted": "default"  # Opposition >= 75%
    }
)

# Example 3: Multi-level rating
court_detailed = Court(
    prosecutor=prosecutor,
    juries=[jury_logic, jury_web, jury_rag, jury_facts],
    judge=judge,
    verdict_rules={
        "clearly_true": {"operator": "eq", "value": 0},     # 0 opposition
        "likely_true": {"operator": "lt", "value": 0.3},    # < 30% opposition
        "uncertain": {"operator": "lt", "value": 0.6},      # < 60% opposition
        "likely_false": {"operator": "lt", "value": 0.9},   # < 90% opposition
        "clearly_false": "default"  # >= 90% opposition
    }
)
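
One plausible reading of these rules is that they are checked in order against the fraction of opposing votes, with the `"default"` label as the fallback. The sketch below implements that reading in plain Python; it is an assumption about the semantics, not Model Court's actual code, and `pick_verdict` is a hypothetical name. Only the `eq` and `lt` operators that appear above are handled.

```python
import operator
from typing import Union

# Operators that appear in the verdict_rules examples above.
OPERATORS = {"eq": operator.eq, "lt": operator.lt}

Rule = Union[dict, str]


def pick_verdict(rules: dict[str, Rule], opposition: float) -> str:
    """Return the first label whose rule matches, else the "default" label."""
    default_label = None
    for label, rule in rules.items():  # dicts preserve insertion order
        if rule == "default":
            default_label = label
            continue
        if OPERATORS[rule["operator"]](opposition, rule["value"]):
            return label
    if default_label is None:
        raise ValueError("no rule matched and no default label was given")
    return default_label


lenient = {
    "supported": {"operator": "lt", "value": 0.25},
    "suspicious": {"operator": "lt", "value": 0.75},
    "refuted": "default",
}
for ratio in (0.0, 0.5, 1.0):
    print(ratio, "->", pick_verdict(lenient, ratio))
```

Under this reading, rule order matters: a looser threshold listed first would shadow the stricter ones after it.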

Automatic Claim Splitting

For complex statements, you can automatically split them into multiple independent claims:

prosecutor = Prosecutor(
    court_code=court_code,
    auto_claim_splitting=True,  # Enable auto splitting
    model={
        "provider": "openai_compatible",
        "base_url": "https://openrouter.ai/api/v1",
        "api_key": os.getenv("OPENROUTER_API_KEY"),
        "model_name": "openai/gpt-3.5-turbo",
    },
    prosecutor_prompt="Split the case into independent, verifiable factual claims."
)

# Input: "The Earth is flat, and the Sun orbits the Earth."
# Automatically split into:
# Claim 1: "The Earth is flat."
# Claim 2: "The Sun orbits the Earth."

Precedent Caching System

Automatically cache past rulings to avoid repeated evaluation:

from datetime import timedelta

court_code = SqliteCourtCode(
    db_path="./court_history.db",
    enable_vector_search=True,              # Vector search for similar cases
    default_validity_period=timedelta(days=30)  # Precedent validity period
)

# The calls below must run inside an async function, with `court` built from this court_code.

# First check: full pipeline, typically 10–30 seconds
report1 = await court.hear("The Earth is flat.")

# Second check with same content: directly return cached result, < 1 second
report2 = await court.hear("The Earth is flat.")
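
The idea of a precedent with a validity period can be sketched with the standard library alone. This toy cache does exact-match lookups only (no vector search), and its class name, table name, and columns are hypothetical; it does not reflect the schema that `SqliteCourtCode` uses.

```python
import sqlite3
from datetime import datetime, timedelta, timezone
from typing import Optional


class PrecedentCache:
    """Toy exact-match precedent cache with an expiry time per entry."""

    def __init__(self, db_path: str = ":memory:",
                 validity: timedelta = timedelta(days=30)):
        self.validity = validity
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS precedents ("
            "claim TEXT PRIMARY KEY, verdict TEXT NOT NULL, "
            "expires_at TEXT NOT NULL)"
        )

    def store(self, claim: str, verdict: str,
              now: Optional[datetime] = None) -> None:
        """Save a verdict that stays valid for self.validity from `now`."""
        now = now or datetime.now(timezone.utc)
        expires_at = (now + self.validity).isoformat()
        self.conn.execute(
            "INSERT OR REPLACE INTO precedents VALUES (?, ?, ?)",
            (claim, verdict, expires_at),
        )
        self.conn.commit()

    def lookup(self, claim: str,
               now: Optional[datetime] = None) -> Optional[str]:
        """Return the cached verdict, or None on a miss or expired entry."""
        now = now or datetime.now(timezone.utc)
        row = self.conn.execute(
            "SELECT verdict, expires_at FROM precedents WHERE claim = ?",
            (claim,),
        ).fetchone()
        if row is None or datetime.fromisoformat(row[1]) <= now:
            return None
        return row[0]


cache = PrecedentCache()
t0 = datetime(2025, 1, 1, tzinfo=timezone.utc)
cache.store("The Earth is flat.", "refuted", now=t0)
print(cache.lookup("The Earth is flat.", now=t0 + timedelta(days=1)))
print(cache.lookup("The Earth is flat.", now=t0 + timedelta(days=31)))
```

The first lookup falls inside the 30-day window and returns the stored verdict; the second falls outside it and is treated as a miss, which would trigger a full trial.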

FAQ

Q: Why are the package name and import name different?

This is intentional:

Installation: pip install model-court (PyPI package name, with hyphen)
Import: from model_court import ... (Python module name, with underscore)

This is a common pattern in Python because module names cannot contain hyphens.
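
The two spellings can be related with a few lines of Python. PyPI compares distribution names after the normalization described in PEP 503, so hyphens, underscores, and dots are interchangeable when installing, whereas an import name must be a valid Python identifier. The helper name below is hypothetical.

```python
import re


def normalize_distribution_name(name: str) -> str:
    """Normalize a distribution name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


print(normalize_distribution_name("model_court"))  # model-court
print(normalize_distribution_name("Model.Court"))  # model-court
print("model-court".isidentifier())  # False: cannot be used in an import
print("model_court".isidentifier())  # True
```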

Q: I get `ModuleNotFoundError: No module named 'model_court'`

Please ensure the package is installed correctly:

# From project root (where pyproject.toml is located)
pip install -e .

# Or install from PyPI
pip install model-court

Q: How do I use different LLMs?

Recommended: use OpenRouter as a unified entrypoint:

model_config = {
    "provider": "openai_compatible",
    "base_url": "https://openrouter.ai/api/v1",
    "api_key": os.getenv("OPENROUTER_API_KEY"),
    "model_name": "MODEL_NAME",  # e.g., openai/gpt-4, anthropic/claude-3-5-sonnet
}

Supported model list: https://openrouter.ai/models

You can of course also use the official APIs for ChatGPT, Gemini, Claude, or school/corporate APIs that are OpenAI-compatible.

Q: How can I reduce API costs?

Suggestions:

Use cheaper or free models when possible (e.g., gpt-3.5-turbo).
Use smaller or local models (local inference is supported).
Use the precedent caching system to avoid repeated evaluation.
Reduce the number of juries.
Disable automatic claim splitting (auto_claim_splitting=False).

Q: What if evaluation is slow?

Normally, evaluating multiple models in parallel takes about 10–30 seconds. To speed up:

Enable and leverage precedent caching (second run on the same content is < 1 second).
Reduce the number of juries.
Choose faster models.
Tune the concurrency_limit parameter.
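
A concurrency limit of this kind is commonly implemented with a semaphore around each call. The sketch below shows that general pattern with `asyncio`; it is an assumption about the technique, not Model Court's internals, and `run_limited` and `fake_juror` are hypothetical names.

```python
import asyncio


async def run_limited(coros, limit: int):
    """Run coroutines concurrently with at most `limit` in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(coro):
        async with semaphore:
            return await coro

    # gather returns results in the same order as the inputs
    return await asyncio.gather(*(guarded(c) for c in coros))


async def fake_juror(name: str, delay: float) -> str:
    await asyncio.sleep(delay)  # stands in for an LLM API call
    return f"{name}: no_objection"


async def main() -> list[str]:
    jurors = [fake_juror(f"juror_{i}", 0.01) for i in range(4)]
    return await run_limited(jurors, limit=2)


results = asyncio.run(main())
print(results)
```

A higher limit shortens wall-clock time until provider rate limits become the bottleneck; a lower limit is gentler on rate-limited API keys.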

License & Citation

This project is licensed under the MIT License and can be used freely, including for commercial purposes.

If you use Model Court in your research, please cite:

@software{model-court,
  title={Model Court: A Multi-Model Ensemble Framework for Verification},
  author={Jeff Liu},
  year={2025},
  url={https://github.com/LogicGate-AI-Lab/model-court}
}
