Uncertainty estimation for open-source generative models

Klarity

Toolkit for LLM behavior analysis & uncertainty mitigation

🐳 Now with reasoning-model support: analyze chain-of-thought (CoT) entropy and improve RL datasets

🎯 Overview

Klarity is a toolkit for inspecting and debugging AI decision-making processes. By combining uncertainty analysis with reasoning insights, it helps you understand how models think and fix issues before they reach production.

Dual Entropy Analysis: Measure model confidence through raw entropy and semantic similarity metrics
Reasoning Analysis: Extract and evaluate step-by-step thinking patterns in model outputs
Semantic Clustering: Group similar predictions to reveal decision-making pathways
Structured Insights: Get detailed JSON analysis of both uncertainty patterns and reasoning steps
AI-powered Report: Leverage capable models to interpret generation patterns and provide human-readable insights
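
To make the raw-entropy idea concrete, here is a minimal sketch of Shannon entropy computed over a next-token probability distribution. It is illustrative only and is not Klarity's internal implementation; the function name `shannon_entropy` and the two example distributions are assumptions.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    # Renormalize so truncated (e.g. top-k) distributions still sum to 1.
    normalized = [p / total for p in probs if p > 0]
    return -sum(p * math.log(p) for p in normalized)

confident = [0.97, 0.01, 0.01, 0.01]  # one dominant token
uncertain = [0.25, 0.25, 0.25, 0.25]  # four equally likely tokens

print(f"confident: {shannon_entropy(confident):.4f}")
print(f"uncertain: {shannon_entropy(uncertain):.4f}")
```

The uniform distribution gives the maximum entropy for four outcomes, ln(4), while the peaked one scores much lower. That gap is the signal a token-level uncertainty metric relies on.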

Reasoning Analysis Example - Understanding model's step-by-step thinking process

Entropy Analysis Example - Analyzing token-level uncertainty patterns

🚀 Quick Start Hugging Face

Install directly from GitHub:

pip install git+https://github.com/klara-research/klarity.git

📝 Reasoning LLM Usage Example

To get insights and uncertainty analytics on a model's reasoning patterns, use the ReasoningAnalyzer:

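from transformers import AutoModelForCausalLM, AutoTokenizer, LogitsProcessorList
from klarity import UncertaintyEstimator
from klarity.core.analyzer import ReasoningAnalyzer

# Initialize your reasoning model (any tested reasoning model should work)
model_name = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Create estimator with reasoning analyzer
estimator = UncertaintyEstimator(
    top_k=100,
    analyzer=ReasoningAnalyzer(
        min_token_prob=0.01,
        insight_model="together:meta-llama/Llama-3.3-70B-Instruct-Turbo",
        insight_api_key="your_api_key",
        reasoning_start_token="<think>",  # Change these if your model uses different reasoning tokens
        reasoning_end_token="</think>"
    )
)

# Get the logits processor that is passed to generate()
uncertainty_processor = estimator.get_logits_processor()

# Generate with reasoning analysis
prompt = "Your prompt <think>"
inputs = tokenizer(prompt, return_tensors="pt")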

generation_output = model.generate(
    **inputs,
    max_new_tokens=200,
    temperature=0.6,
    logits_processor=LogitsProcessorList([uncertainty_processor]),
    return_dict_in_generate=True,
    output_scores=True,
)

result = estimator.analyze_generation(
    generation_output,
    tokenizer,
    uncertainty_processor,
    prompt
)

# Print reasoning analysis
print("\nReasoning Analysis:")
if result.overall_insight and "reasoning_analysis" in result.overall_insight:
    analysis = result.overall_insight["reasoning_analysis"]
    for step in analysis["steps"]:
        print(f"\nStep {step['step_number']}:")
        print(f"Content: {step['step_info']['content']}")
        
        if 'analysis' in step:
            step_analysis = step['analysis']['training_insights']
            print("\nQuality Metrics:")
            for metric, score in step_analysis['step_quality'].items():
                print(f"  {metric}: {score}")

📝 Standard LLM Usage Example

To catch the most common uncertainty scenarios and route requests to stronger models, use the EntropyAnalyzer:

from transformers import AutoModelForCausalLM, AutoTokenizer, LogitsProcessorList
from klarity import UncertaintyEstimator
from klarity.core.analyzer import EntropyAnalyzer

# Initialize your model
model_name = "Qwen/Qwen2.5-7B-Instruct"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Create estimator
estimator = UncertaintyEstimator(
    top_k=100,
    analyzer=EntropyAnalyzer(
        min_token_prob=0.01,
        insight_model=model,
        insight_tokenizer=tokenizer
    )
)

uncertainty_processor = estimator.get_logits_processor()

# Set up generation
prompt = "Your prompt"
inputs = tokenizer(prompt, return_tensors="pt")

# Generate with uncertainty analysis
generation_output = model.generate(
    **inputs,
    max_new_tokens=20,
    temperature=0.7,
    top_p=0.9,
    logits_processor=LogitsProcessorList([uncertainty_processor]),
    return_dict_in_generate=True,
    output_scores=True,
)

# Analyze the generation
result = estimator.analyze_generation(
    generation_output,
    tokenizer,
    uncertainty_processor
)

generated_text = tokenizer.decode(generation_output.sequences[0], skip_special_tokens=True)

# Inspect results
print(f"\nPrompt: {prompt}")
print(f"Generated text: {generated_text}")

print("\nDetailed Token Analysis:")
for idx, metrics in enumerate(result.token_metrics):
    print(f"\nStep {idx}:")
    print(f"Raw entropy: {metrics.raw_entropy:.4f}")
    print(f"Semantic entropy: {metrics.semantic_entropy:.4f}")
    print("Top 3 predictions:")
    for i, pred in enumerate(metrics.token_predictions[:3], 1):
        print(f"  {i}. {pred.token} (prob: {pred.probability:.4f})")

# Show comprehensive insight
print("\nComprehensive Analysis:")
print(result.overall_insight)
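
One way to act on these metrics is to escalate a prompt to a stronger model when the average token entropy is high. The sketch below works on a plain list of per-token entropy values. The function name `should_escalate` and the threshold of 1.0 are illustrative assumptions, not part of Klarity's API, and the threshold would need tuning on your own data.

```python
from statistics import mean

def should_escalate(entropies, threshold=1.0):
    """Return True when mean per-token entropy exceeds the threshold."""
    if not entropies:
        # No generated tokens: nothing to judge, so do not escalate.
        return False
    return mean(entropies) > threshold

# Hypothetical per-token raw entropies from two generations
low_uncertainty = [0.05, 0.10, 0.20, 0.08]
high_uncertainty = [1.9, 2.3, 0.4, 1.7]

print(should_escalate(low_uncertainty))
print(should_escalate(high_uncertainty))
```

With a Klarity result, the input list could be built as `[m.raw_entropy for m in result.token_metrics]`.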

📊 Analysis Output

Klarity provides two types of analysis output:

Reasoning Analysis

You'll get detailed insights into the model's reasoning process:

{
    "reasoning_analysis": {
        "steps": [
            {
                "step_number": 1,
                "step_info": {
                    "content": "Step reasoning content",
                    "type": "analysis"
                },
                "analysis": {
                    "training_insights": {
                        "step_quality": {
                            "coherence": "0.8",
                            "relevance": "0.9",
                            "confidence": "0.7"
                        },
                        "improvement_targets": [
                            {
                                "aspect": "conciseness",
                                "importance": "0.8",
                                "current_issue": "verbose response",
                                "training_suggestion": "reduce explanation steps"
                            }
                        ]
                    }
                }
            }
        ]
    }
}
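
Because the scores in this structure arrive as strings, a small helper can convert them and flag weak steps. This is a minimal sketch that assumes the JSON shape shown above; `flag_weak_steps` and the 0.75 cutoff are hypothetical choices.

```python
def flag_weak_steps(analysis, cutoff=0.75):
    """Return (step_number, metric, score) for every score below cutoff."""
    weak = []
    for step in analysis["reasoning_analysis"]["steps"]:
        quality = (
            step.get("analysis", {})
            .get("training_insights", {})
            .get("step_quality", {})
        )
        for metric, raw_score in quality.items():
            try:
                score = float(raw_score)  # scores are serialized as strings
            except (TypeError, ValueError):
                continue  # skip values that are not numeric
            if score < cutoff:
                weak.append((step["step_number"], metric, score))
    return weak

example = {
    "reasoning_analysis": {
        "steps": [
            {
                "step_number": 1,
                "step_info": {"content": "Step reasoning content", "type": "analysis"},
                "analysis": {
                    "training_insights": {
                        "step_quality": {
                            "coherence": "0.8",
                            "relevance": "0.9",
                            "confidence": "0.7",
                        }
                    }
                },
            }
        ]
    }
}

print(flag_weak_steps(example))
```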

Entropy Analysis

For standard language models you will get a general uncertainty report:

{
    "scores": {
        "overall_uncertainty": "<0-1>",  
        "confidence_score": "<0-1>",     
        "hallucination_risk": "<0-1>"    
    },
    "uncertainty_analysis": {
        "high_uncertainty_parts": [
            {
                "text": "",
                "why": ""
            }
        ],
        "main_issues": [
            {
                "issue": "",
                "evidence": ""
            }
        ],
        "key_suggestions": [
            {
                "what": "",
                "how": ""
            }
        ]
    }
}
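
A report in this shape can drive a simple accept-or-review decision. The sketch below assumes the scores parse as numbers in the range 0 to 1; `needs_review`, the 0.5 limit, and the sample values are illustrative and not part of Klarity.

```python
def needs_review(report, risk_limit=0.5):
    """Flag a report whose hallucination risk is missing, invalid, or high."""
    raw = report.get("scores", {}).get("hallucination_risk")
    try:
        risk = float(raw)
    except (TypeError, ValueError):
        return True  # treat an unparseable score as needing review
    if not 0.0 <= risk <= 1.0:
        return True  # out-of-range scores are also suspicious
    return risk > risk_limit

# Hypothetical report values for illustration
report = {
    "scores": {
        "overall_uncertainty": "0.35",
        "confidence_score": "0.70",
        "hallucination_risk": "0.62",
    },
    "uncertainty_analysis": {
        "high_uncertainty_parts": [],
        "main_issues": [],
        "key_suggestions": [],
    },
}

print(needs_review(report))
```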

🤖 Supported Frameworks & Models

Model Frameworks

Currently supported:

✅ Hugging Face Transformers -> Full uncertainty analysis with raw and semantic entropy metrics
✅ Together AI -> Uncertainty analysis with raw log-probability metrics

Planned support:

⏳ PyTorch

Analysis Model (for the insights) Frameworks

Currently supported:

✅ Hugging Face Transformers
✅ Together AI API

Planned support:

⏳ PyTorch

Tested Target Models

Model	Type	Status	Notes
Qwen2.5-0.5B	Base	✅ Tested	Full Support
Qwen2.5-0.5B-Instruct	Instruct	✅ Tested	Full Support
Qwen2.5-7B	Base	✅ Tested	Full Support
Qwen2.5-7B-Instruct	Instruct	✅ Tested	Full Support
Llama-3.2-3B-Instruct	Instruct	✅ Tested	Full Support
DeepSeek-R1-Distill-Qwen-1.5B	Reasoning	✅ Tested	Together API Insights
DeepSeek-R1-Distill-Qwen-7B	Reasoning	✅ Tested	Together API Insights

Analysis Models

Model	Type	Status	JSON Reliability	Notes
Qwen2.5-0.5B-Instruct	Instruct	✅ Tested	⚡ Low	Consistently outputs unstructured analysis instead of JSON. Best used with structured prompting and validation.
Qwen2.5-7B-Instruct	Instruct	✅ Tested	⚠️ Moderate	Sometimes outputs well-formed JSON analysis.
Llama-3.3-70B-Instruct-Turbo	Instruct	✅ Tested	✅ High	Reliably outputs well-formed JSON analysis. Recommended for production use.

JSON Output Reliability Guide:

✅ High: Consistently outputs valid JSON (>80% of responses)
⚠️ Moderate: Usually outputs valid JSON (50-80% of responses)
⚡ Low: Inconsistent JSON output (<50% of responses)
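
Since smaller analysis models often wrap or replace the JSON with free text, it can help to validate the response before using it. The following is a minimal sketch using only the standard library; `extract_json_object` is a hypothetical helper, and the required keys are taken from the entropy report format in the Analysis Output section.

```python
import json

def extract_json_object(text, required_keys=("scores", "uncertainty_analysis")):
    """Return the first JSON object in text that has the required keys, else None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at the given index.
            obj, _ = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict) and all(key in obj for key in required_keys):
            return obj
        start = text.find("{", start + 1)
    return None

wrapped = (
    "Here is my analysis:\n"
    '{"scores": {"overall_uncertainty": "0.4"}, "uncertainty_analysis": {}}\n'
    "Hope this helps."
)

print(extract_json_object(wrapped))
print(extract_json_object("The model seems fairly confident overall."))
```

A `None` result can then trigger a retry or a fallback to a more reliable analysis model.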

🔍 Advanced Features

Custom Analysis Configuration

You can customize the analysis parameters:

from klarity.core.analyzer import EntropyAnalyzer

analyzer = EntropyAnalyzer(
    min_token_prob=0.01,  # Minimum probability threshold
    semantic_similarity_threshold=0.8  # Threshold for semantic grouping
)
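
To illustrate what a similarity threshold for semantic grouping can mean, the sketch below greedily clusters candidate tokens whose embedding cosine similarity meets the threshold, then computes entropy over the cluster probabilities. This is an illustration under stated assumptions rather than Klarity's actual algorithm: the toy 2-D embeddings, the greedy strategy, and the function names are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def semantic_entropy(candidates, threshold=0.8):
    """candidates: list of (token, probability, embedding) tuples."""
    clusters = []  # each cluster keeps an anchor embedding, summed prob, and tokens
    for token, prob, emb in candidates:
        for cluster in clusters:
            if cosine(emb, cluster["anchor"]) >= threshold:
                cluster["prob"] += prob
                cluster["tokens"].append(token)
                break
        else:
            # No existing cluster is similar enough: start a new one.
            clusters.append({"anchor": emb, "prob": prob, "tokens": [token]})
    total = sum(c["prob"] for c in clusters)
    entropy = -sum(
        (c["prob"] / total) * math.log(c["prob"] / total)
        for c in clusters
        if c["prob"] > 0
    )
    return entropy, [c["tokens"] for c in clusters]

# Toy candidates: two near-synonyms and one token with a different meaning
candidates = [
    ("big", 0.4, (1.0, 0.1)),
    ("large", 0.3, (0.9, 0.2)),
    ("tiny", 0.3, (0.1, 1.0)),
]
entropy, groups = semantic_entropy(candidates)
print(groups)
print(f"{entropy:.4f}")
```

Grouping the near-synonyms lowers the entropy relative to treating the three tokens as distinct outcomes, which is why a semantic measure can separate harmless wording variation from real disagreement about meaning.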

🤝 Contributing

Contributions are welcome! Areas we're looking to improve:

Additional framework support
More tested models
Enhanced semantic analysis
Additional analysis metrics
Documentation and examples

Please see our Contributing Guide for details.

📝 License

Apache 2.0 License. See LICENSE for more information.

📫 Community & Support

Website
Discord Community for discussions & chatting
GitHub Issues for bugs and features

Project details

These details have been verified by PyPI

Maintainers

davide221

Release history

This version

0.1.0

Feb 9, 2025

Download files

Source Distribution

klarity-0.1.0.tar.gz (19.7 kB)

Uploaded Feb 9, 2025 Source

Built Distribution

klarity-0.1.0-py3-none-any.whl (16.8 kB)

Uploaded Feb 9, 2025 Python 3

File details

Details for the file klarity-0.1.0.tar.gz.

File metadata

Download URL: klarity-0.1.0.tar.gz
Upload date: Feb 9, 2025
Size: 19.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for klarity-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`7bdb18fccd602b44a5c554c92bf75ce7f49cee8a23262efe540a53bcdb799ed2`
MD5	`26324aa7016afc260a01372329564d7f`
BLAKE2b-256	`0ced32770e090011edfa5df7f3f85d4e16b049bceda3d85f2d1ead52e9445b5a`
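
To check a downloaded file against the digests above, you can compute its SHA256 locally. This is a minimal sketch using Python's standard `hashlib`; the helper names `sha256_of` and `matches_expected` and the local file path are assumptions.

```python
import hashlib

# SHA256 digest listed above for klarity-0.1.0.tar.gz
EXPECTED_SHA256 = "7bdb18fccd602b44a5c554c92bf75ce7f49cee8a23262efe540a53bcdb799ed2"

def sha256_of(path, chunk_size=1 << 20):
    """Stream the file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_expected(path, expected=EXPECTED_SHA256):
    """Compare the file's digest with the expected hex string."""
    return sha256_of(path) == expected.lower()
```

For example, `matches_expected("klarity-0.1.0.tar.gz")` should return `True` when the archive in the current directory is the published one.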


Provenance

The following attestation bundles were made for klarity-0.1.0.tar.gz:

Publisher: python-publish.yml on klara-research/klarity

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: klarity-0.1.0.tar.gz
- Subject digest: 7bdb18fccd602b44a5c554c92bf75ce7f49cee8a23262efe540a53bcdb799ed2
- Sigstore transparency entry: 169865407
- Sigstore integration time: Feb 9, 2025
Source repository:
- Permalink: klara-research/klarity@a80fbfdc4fa161c28ab61b44207782d929ae641c
- Branch / Tag: refs/tags/v0.1
- Owner: https://github.com/klara-research
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@a80fbfdc4fa161c28ab61b44207782d929ae641c
- Trigger Event: release

File details

Details for the file klarity-0.1.0-py3-none-any.whl.

File metadata

Download URL: klarity-0.1.0-py3-none-any.whl
Upload date: Feb 9, 2025
Size: 16.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for klarity-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`53fc28ed14f5010577394b25853f7aef89d9f72fce619277f1b9506372c6d600`
MD5	`ac1a73ba73e63cbcb24fc78c7e8c7dea`
BLAKE2b-256	`090c30167498242f519ebe9d6f6b4353278689a93589d689972d6ae734d892c8`


Provenance

The following attestation bundles were made for klarity-0.1.0-py3-none-any.whl:

Publisher: python-publish.yml on klara-research/klarity

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: klarity-0.1.0-py3-none-any.whl
- Subject digest: 53fc28ed14f5010577394b25853f7aef89d9f72fce619277f1b9506372c6d600
- Sigstore transparency entry: 169865408
- Sigstore integration time: Feb 9, 2025
Source repository:
- Permalink: klara-research/klarity@a80fbfdc4fa161c28ab61b44207782d929ae641c
- Branch / Tag: refs/tags/v0.1
- Owner: https://github.com/klara-research
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: python-publish.yml@a80fbfdc4fa161c28ab61b44207782d929ae641c
- Trigger Event: release
