Concise Logic and Explanation Analysis Reports (CLEAR) for ML models

CLEAR: Concise Logic and Explanation Analysis Reports

CLEAR is a Python tool for generating model cards and risk reports from machine learning evaluation outputs. It transforms structured evaluation data into readable, standardized documentation that can be shared with stakeholders.

What This Tool Does

CLEAR takes ML evaluation metrics and metadata as input and generates:

Model Cards: Standardized documentation including model overview, intended use, dataset summaries, performance metrics, limitations, and ethical considerations.
Risk Reports: Analysis of identified risks, their mitigation strategies, and severity levels.
Markdown Output: Human-readable markdown files suitable for documentation repositories or sharing with teams.

The tool processes JSON or YAML input files and applies templating to produce consistent, well-structured reports.

What It Does NOT Do

CLEAR does not:

Automatically compute ML metrics. You must provide pre-calculated evaluation results.
Generate visualizations or charts. Output is text-based markdown only.
Store or manage model artifacts, checkpoints, or weights.
Connect to external APIs, cloud services, or model registries.
Enforce particular ML frameworks or tooling choices.
Make judgment calls about model safety or regulatory compliance. It documents what you provide.
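
Because CLEAR expects pre-calculated results, the metrics have to be computed before the input file is written. The following is a minimal sketch of one way to do that for a binary classifier using only the standard library. The helper name `binary_metrics`, the toy labels, and the choice to report 0.0 when a denominator is zero are assumptions made for illustration; they are not part of the tool.

```python
import json


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    # Assumption: report 0.0 when a denominator is zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1_score": f1,
    }


# Toy labels for illustration only (1 = spam, 0 = legitimate).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(json.dumps(metrics, indent=2))
```

The resulting dictionary uses the metric field names from the input schema, so it can be merged with the model and dataset fields before the file is saved.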

Installation

Install from PyPI:

pip install modelcardgen

Or install from source with development dependencies:

git clone https://github.com/ghostcipher1/modelcardgen.git
cd modelcardgen
pip install -e ".[dev]"

Requires Python 3.10 or later.

CLI Usage Examples

Generate a model card

modelcardgen generate --metrics evaluation.json --output-dir .

Using YAML input

modelcardgen generate --metrics metrics.yaml --output-dir ./reports

Validate metrics file

modelcardgen validate --metrics evaluation.json

View help

modelcardgen --help
modelcardgen generate --help
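
The commands above can also be driven from a Python script, for example to validate a file before generating a card. This is a sketch only: it uses the `generate` and `validate` subcommands and the `--metrics` and `--output-dir` flags shown above, while the helper names and the assumption that the CLI returns a non-zero exit code on failure are illustrative.

```python
import subprocess
import sys


def build_command(action, metrics_path, output_dir=None):
    """Build the argument list for a modelcardgen CLI call."""
    if action not in ("generate", "validate"):
        raise ValueError(f"unsupported action: {action}")
    cmd = ["modelcardgen", action, "--metrics", str(metrics_path)]
    if action == "generate" and output_dir is not None:
        cmd += ["--output-dir", str(output_dir)]
    return cmd


def validate_then_generate(metrics_path, output_dir):
    """Run validate first; run generate only if validation succeeds."""
    for action in ("validate", "generate"):
        cmd = build_command(action, metrics_path, output_dir)
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            # Assumption: the CLI signals failure with a non-zero exit code.
            print(result.stderr, file=sys.stderr)
            return False
    return True
```

Calling `validate_then_generate("evaluation.json", "./reports")` would run the two commands in sequence, provided `modelcardgen` is installed and on the PATH.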

Input Data Schema

The tool expects JSON or YAML input files containing model metadata, dataset information, evaluation metrics, and risk assessments. Below is the complete input schema specification.

Required Top-Level Fields

model_name: string              # Name of the model
model_version: string           # Semantic version (e.g., "1.0.0")
model_description: string       # High-level overview
model_owner: string             # Person or team responsible
model_license: string           # License type (e.g., "Apache-2.0")
model_framework: string         # ML framework used (e.g., "scikit-learn")

accuracy: float                 # 0.0 to 1.0
precision: float                # 0.0 to 1.0
recall: float                   # 0.0 to 1.0
f1_score: float                 # 0.0 to 1.0
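
Since all four metrics are supplied by hand, a quick consistency check can catch transcription mistakes. For a single-class (binary) evaluation, F1 is the harmonic mean of precision and recall, so the three values can be cross-checked. The sketch below assumes that setting; macro- or weighted-averaged scores need not satisfy this relation. The function names and the tolerance of 0.001 (to allow for values rounded to three decimals) are illustrative choices.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def f1_is_consistent(metrics, tolerance=0.001):
    """Check that the reported f1_score matches precision and recall."""
    expected = f1_from(metrics["precision"], metrics["recall"])
    return abs(expected - metrics["f1_score"]) <= tolerance


# Metric values taken from the spam classifier example in this document.
reported = {"precision": 0.951, "recall": 0.945, "f1_score": 0.948}
print(f1_is_consistent(reported))
```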

Optional Fields

model_release_date: YYYY-MM-DD  # Model release date (defaults to today)

roc_auc: float                  # 0.0 to 1.0 (optional)
confusion_matrix: [[int]]       # 2D array of prediction counts (optional)
custom_metrics: {}              # Dictionary of domain-specific metrics (optional)

training_data_name: string
training_data_description: string
training_data_size: integer     # Number of samples
training_data_features: [string]  # List of feature names
training_data_target: string    # Target variable name
training_data_source_url: url   # Optional URL to dataset source

eval_data_name: string
eval_data_description: string
eval_data_size: integer
eval_data_features: [string]
eval_data_target: string
eval_data_source_url: url

unsuitable_inputs: [string]     # List of input types where model fails
environmental_constraints: string  # Hardware/software requirements
out_of_scope_uses: [string]     # Scenarios to avoid

intended_users: [string]        # Target audience personas
intended_use_cases: [string]    # Specific tasks designed for
prohibited_uses: [string]       # Forbidden uses (ethical/legal)
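
When a confusion_matrix is supplied alongside the scalar metrics, the two can be compared before the file is submitted. The sketch below assumes a binary 2x2 matrix laid out as [[TN, FP], [FN, TP]] (rows are actual classes, columns are predicted classes); the schema above does not specify a layout, so adjust the unpacking if your matrix differs. The function names and the sample counts are hypothetical.

```python
def metrics_from_matrix(matrix):
    """Derive accuracy, precision, and recall from a 2x2 confusion matrix.

    Assumed layout: [[TN, FP], [FN, TP]] (rows = actual, columns = predicted).
    """
    (tn, fp), (fn, tp) = matrix
    total = tn + fp + fn + tp
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def mismatched_metrics(data, tolerance=0.001):
    """Return names of reported metrics that disagree with the matrix."""
    derived = metrics_from_matrix(data["confusion_matrix"])
    return [
        name
        for name, value in derived.items()
        if name in data and abs(data[name] - value) > tolerance
    ]


# Hypothetical counts for illustration only.
report = {
    "accuracy": 0.85,
    "precision": 0.778,
    "recall": 0.875,
    "confusion_matrix": [[50, 10], [5, 35]],
}
print(mismatched_metrics(report))
```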

Risks Array (Optional)

risks:
  - risk_type: string           # Category (e.g., "Data Bias")
    description: string         # Detailed explanation
    mitigation_strategy: string # Mitigation approach
    severity: string            # "Low", "Medium", or "High"

Complete JSON Example

{
  "model_name": "Email Spam Classifier",
  "model_version": "2.1.0",
  "model_description": "Classifies emails as spam or legitimate",
  "model_owner": "ML Team",
  "model_license": "Apache-2.0",
  "model_framework": "scikit-learn",
  "accuracy": 0.963,
  "precision": 0.951,
  "recall": 0.945,
  "f1_score": 0.948,
  "roc_auc": 0.985,
  "training_data_name": "Enron Email Corpus",
  "training_data_description": "Real email messages with labels",
  "training_data_size": 755000,
  "training_data_features": ["subject_line", "body_text"],
  "training_data_target": "spam_label",
  "eval_data_name": "Recent Email Dataset",
  "eval_data_description": "Holdout test set",
  "eval_data_size": 50000,
  "eval_data_features": ["subject_line", "body_text"],
  "eval_data_target": "spam_label",
  "unsuitable_inputs": ["Non-English emails", "Encrypted content"],
  "out_of_scope_uses": ["Real-time filtering without review"],
  "intended_users": ["Email administrators", "IT security teams"],
  "intended_use_cases": ["Spam detection"],
  "prohibited_uses": ["Discriminatory filtering"],
  "risks": [
    {
      "risk_type": "Data Distribution Shift",
      "description": "Production data may differ from training",
      "mitigation_strategy": "Monitor metrics in production",
      "severity": "Medium"
    }
  ]
}

Complete YAML Example

model_name: Email Spam Classifier
model_version: "2.1.0"
model_description: Classifies emails as spam or legitimate
model_owner: ML Team
model_license: Apache-2.0
model_framework: scikit-learn

accuracy: 0.963
precision: 0.951
recall: 0.945
f1_score: 0.948
roc_auc: 0.985

training_data_name: Enron Email Corpus
training_data_description: Real email messages with labels
training_data_size: 755000
training_data_features:
  - subject_line
  - body_text
training_data_target: spam_label

eval_data_name: Recent Email Dataset
eval_data_description: Holdout test set
eval_data_size: 50000
eval_data_features:
  - subject_line
  - body_text
eval_data_target: spam_label

unsuitable_inputs:
  - Non-English emails
  - Encrypted content
out_of_scope_uses:
  - Real-time filtering without review

intended_users:
  - Email administrators
  - IT security teams
intended_use_cases:
  - Spam detection
prohibited_uses:
  - Discriminatory filtering

risks:
  - risk_type: Data Distribution Shift
    description: Production data may differ from training
    mitigation_strategy: Monitor metrics in production
    severity: Medium

Validation Rules

Metric values (accuracy, precision, recall, f1_score, roc_auc) must be between 0.0 and 1.0
All model_* fields are required except model_release_date (optional, defaults to today)
All training_data_* and eval_data_* fields are required except the *_source_url fields (optional)
The accuracy, precision, recall, and f1_score metrics are required; roc_auc is optional
Lists (features, unsuitable_inputs, etc.) can be empty but must be arrays
risks is optional; if provided, each risk must have all four fields
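
Several of these rules can be pre-checked in a few lines of plain Python before running `modelcardgen validate`. The sketch below is not the tool's own validator; it covers only the metric ranges, the six required model_* fields, the array rule, and the shape of each risk entry. The function name `check_metrics_file` and the error wording are hypothetical.

```python
REQUIRED_MODEL_FIELDS = (
    "model_name", "model_version", "model_description",
    "model_owner", "model_license", "model_framework",
)
REQUIRED_METRICS = ("accuracy", "precision", "recall", "f1_score")
LIST_FIELDS = (
    "training_data_features", "eval_data_features", "unsuitable_inputs",
    "out_of_scope_uses", "intended_users", "intended_use_cases",
    "prohibited_uses",
)
RISK_FIELDS = ("risk_type", "description", "mitigation_strategy", "severity")
SEVERITIES = ("Low", "Medium", "High")


def check_metrics_file(data):
    """Return a list of human-readable problems; empty means no issue found."""
    errors = []
    for name in REQUIRED_MODEL_FIELDS + REQUIRED_METRICS:
        if name not in data:
            errors.append(f"missing required field: {name}")
    for name in REQUIRED_METRICS + ("roc_auc",):
        if name not in data:
            continue
        value = data[name]
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or not 0.0 <= value <= 1.0:
            errors.append(f"{name} must be a number between 0.0 and 1.0")
    for name in LIST_FIELDS:
        if name in data and not isinstance(data[name], list):
            errors.append(f"{name} must be an array")
    for index, risk in enumerate(data.get("risks", [])):
        if not isinstance(risk, dict):
            errors.append(f"risks[{index}] must be an object")
            continue
        missing = [field for field in RISK_FIELDS if field not in risk]
        if missing:
            errors.append(f"risks[{index}] is missing: {', '.join(missing)}")
        elif risk["severity"] not in SEVERITIES:
            errors.append(f"risks[{index}].severity must be Low, Medium, or High")
    return errors
```

An empty list from this function does not guarantee that the tool accepts the file; it only rules out the mistakes listed here.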

Common Errors

Error	Solution
`Invalid JSON`	Check file syntax using `jq` or a JSON validator
`Invalid YAML`	Check indentation (use spaces, not tabs); use a YAML linter
`Validation failed: accuracy`	Ensure metric values are between 0.0 and 1.0
`File not found`	Verify the file path and ensure the file exists
`Missing required field`	Check that all required model_* and eval_* fields are present
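
The first and fourth rows of the table can be reproduced locally with the standard library, which helps pinpoint where a JSON file is malformed. This is a sketch under stated assumptions: the function name `load_metrics_json` is hypothetical, and the messages it returns only mirror the labels in the table rather than the tool's exact output.

```python
import json
from pathlib import Path


def load_metrics_json(path):
    """Return (data, problem); problem is None when the file loads cleanly."""
    try:
        text = Path(path).read_text(encoding="utf-8")
    except FileNotFoundError:
        return None, f"File not found: {path}"
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        # lineno and colno point at the first character the parser rejected.
        return None, f"Invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"
    if not isinstance(data, dict):
        return None, "Invalid JSON: top-level value must be an object"
    return data, None
```

A call such as `data, problem = load_metrics_json("evaluation.json")` returns the parsed dictionary on success, or a short description of what went wrong.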

Python API Example

Use CLEAR as a library in your code:

from modelcardgen.core.models import (
    ModelMetadata,
    DatasetMetadata,
    EvaluationMetrics,
    RiskAssessment,
)
from modelcardgen.reports.markdown import MarkdownCardGenerator

# Model identity and ownership (corresponds to the model_* input fields)
metadata = ModelMetadata(
    name="My Classifier",
    version="1.0.0",
    description="Classifies text documents.",
    owner="ML Team",
    license="Apache-2.0",
    framework="scikit-learn"
)

# Pre-computed evaluation metrics, each between 0.0 and 1.0
metrics = EvaluationMetrics(
    accuracy=0.92,
    precision=0.90,
    recall=0.94,
    f1_score=0.92,
    roc_auc=0.96
)

# Summary of the training dataset
training_data = DatasetMetadata(
    name="Training Set",
    description="Internal labeled dataset",
    size=10000,
    features=["text_features"],
    target="label"
)

# Known risks, each with a mitigation strategy and a severity level
risks = [
    RiskAssessment(
        risk_type="Data Distribution Shift",
        description="Production data may differ from training distribution.",
        mitigation_strategy="Monitor performance metrics in production.",
        severity="Medium"
    )
]

# Render the model card to a markdown file
generator = MarkdownCardGenerator()
generator.generate(
    metadata=metadata,
    metrics=metrics,
    training_data=training_data,
    risks=risks,
    output_path="MODEL_CARD.md"
)

CI/CD Usage Example

Integrate model card generation into your CI/CD pipeline:

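# Example GitHub Actions workflow
name: Generate Model Card
on:
  push:
    paths:
      - 'model/evaluation_results.json'

jobs:
  generate:
    runs-on: ubuntu-latest
    permissions:
      contents: write  # needed to push the regenerated card
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.10'
      - run: pip install modelcardgen
      # Same flags as in the CLI usage examples; reports are written to docs/
      - run: |
          modelcardgen generate \
            --metrics model/evaluation_results.json \
            --output-dir docs
      # Assumes the generated file is named MODEL_CARD.md
      - run: |
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"
          git add docs/MODEL_CARD.md
          git diff --cached --quiet || git commit -m "Update model card"
          git push
        if: ${{ github.event_name == 'push' }}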

Design Philosophy

CLEAR follows these principles:

Offline First: No external API calls or cloud dependencies. Everything runs locally.
Data Driven: The quality of a report depends on the quality of the input data. Garbage in, garbage out.
Template Based: Uses Jinja2 templating for flexibility. Customize output by modifying templates.
No Magic: Explicit over implicit. The tool documents what you tell it; it doesn't infer or assume.
Minimal Dependencies: Relies on standard, well-maintained Python libraries (Jinja2, Pydantic, Pandas).
Language Agnostic: Works with any ML framework or language, as long as you can generate JSON/YAML evaluation output.
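
To illustrate the template-based principle without depending on CLEAR's own templates (which are not shown here), the sketch below renders a very small markdown card from an input dictionary. It deliberately swaps in the standard library's `string.Template` in place of Jinja2 so that it runs with no extra packages; the template text, the layout, and the function name `render_card` are all hypothetical.

```python
from string import Template

# A tiny stand-in template; CLEAR's real Jinja2 templates are not shown here.
CARD_TEMPLATE = Template(
    "# Model Card: $model_name\n"
    "\n"
    "Version: $model_version\n"
    "\n"
    "## Performance\n"
    "\n"
    "| Metric | Value |\n"
    "| --- | --- |\n"
    "$metric_rows\n"
)

METRIC_NAMES = ("accuracy", "precision", "recall", "f1_score", "roc_auc")


def render_card(data):
    """Fill the template with values taken directly from the input data."""
    rows = "\n".join(
        f"| {name} | {data[name]:.3f} |" for name in METRIC_NAMES if name in data
    )
    return CARD_TEMPLATE.substitute(
        model_name=data["model_name"],
        model_version=data["model_version"],
        metric_rows=rows,
    )


card = render_card({
    "model_name": "Email Spam Classifier",
    "model_version": "2.1.0",
    "accuracy": 0.963,
    "precision": 0.951,
    "recall": 0.945,
    "f1_score": 0.948,
})
print(card)
```

The function only substitutes the values it is given, which mirrors the "No Magic" principle: a metric missing from the input, such as roc_auc here, is simply left out of the table.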

License

Licensed under the Apache License 2.0. See the LICENSE file for details.

This version

0.1.0

Jan 2, 2026

Download files

Source Distribution

modelcardgen-0.1.0.tar.gz (39.3 kB view details)

Uploaded Jan 2, 2026 Source

Built Distribution

modelcardgen-0.1.0-py3-none-any.whl (30.1 kB view details)

Uploaded Jan 2, 2026 Python 3

File details

Details for the file modelcardgen-0.1.0.tar.gz.

File metadata

Download URL: modelcardgen-0.1.0.tar.gz
Upload date: Jan 2, 2026
Size: 39.3 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for modelcardgen-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`35c395768cb4c94ee1c1aefcf5f3aa01785cba0e175d62b6993516ad3dd778bc`
MD5	`22f33bb04066c01b27254147d3b61109`
BLAKE2b-256	`d20949cf1291b7ae571b4977de9c2fc5d973eaa29e1b82e09e13859caaeb7551`

Provenance

The following attestation bundles were made for modelcardgen-0.1.0.tar.gz:

Publisher: test-and-publish.yml on ghostcipher1/modelcardgen

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: modelcardgen-0.1.0.tar.gz
- Subject digest: 35c395768cb4c94ee1c1aefcf5f3aa01785cba0e175d62b6993516ad3dd778bc
- Sigstore transparency entry: 788373959
- Sigstore integration time: Jan 2, 2026
Source repository:
- Permalink: ghostcipher1/modelcardgen@252425fd0df60f489744c7a639fa9cbafa938fc9
- Branch / Tag: refs/heads/main
- Owner: https://github.com/ghostcipher1
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: test-and-publish.yml@252425fd0df60f489744c7a639fa9cbafa938fc9
- Trigger Event: workflow_dispatch

File details

Details for the file modelcardgen-0.1.0-py3-none-any.whl.

File metadata

Download URL: modelcardgen-0.1.0-py3-none-any.whl
Upload date: Jan 2, 2026
Size: 30.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for modelcardgen-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`4a222782ac1042c80e4543eca57e725978a658b7c74b69f01e7323e92c332ed5`
MD5	`06edaa98d9636d1a0cb97618dc042210`
BLAKE2b-256	`cad8194bf9e85300b243188ab4caf6a0129b21c51f2b783f3f0ed11cb8a95ecf`

Provenance

The following attestation bundles were made for modelcardgen-0.1.0-py3-none-any.whl:

Publisher: test-and-publish.yml on ghostcipher1/modelcardgen

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: modelcardgen-0.1.0-py3-none-any.whl
- Subject digest: 4a222782ac1042c80e4543eca57e725978a658b7c74b69f01e7323e92c332ed5
- Sigstore transparency entry: 788373960
- Sigstore integration time: Jan 2, 2026
Source repository:
- Permalink: ghostcipher1/modelcardgen@252425fd0df60f489744c7a639fa9cbafa938fc9
- Branch / Tag: refs/heads/main
- Owner: https://github.com/ghostcipher1
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: test-and-publish.yml@252425fd0df60f489744c7a639fa9cbafa938fc9
- Trigger Event: workflow_dispatch
