Advanced relevance scoring for LLM outputs

LlamaCalc

LlamaCalc is an advanced relevance scoring tool for evaluating how well answers address questions, specifically designed for assessing LLM-generated responses.

![LlamaCalc Demo](https://llamasearch.ai)

Features

Multi-Factor Scoring: Evaluates answer relevance based on proximity, concept coverage, conciseness, and logical flow
MLX Acceleration: Optimized for Apple Silicon using the MLX framework, with fallback to NumPy
Command-Line Interface: Interactive and colorful CLI powered by Rich
Batch Processing: Evaluate multiple question-answer pairs at once
Caching: In-memory and persistent disk caching for faster repeated evaluations
Customizable Weights: Fine-tune the importance of different scoring components
Python API: Easy integration into your Python applications

Installation

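# Basic installation (the PyPI distribution is named llamacalc-llamasearch; the import name is llamacalc)
pip install llamacalc-llamasearch

# With interactive UI features (quotes keep shells such as zsh from expanding the brackets)
pip install "llamacalc-llamasearch[ui]"

# With MLX acceleration (for Apple Silicon)
pip install "llamacalc-llamasearch[mlx]"

# With development dependencies
pip install "llamacalc-llamasearch[dev]"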

Quick Start

Command Line

# Simple usage
llamacalc --question "What is Python?" --answer "Python is a high-level programming language known for its readability and versatility."

# Interactive mode
llamacalc --interactive

# Batch processing
llamacalc --batch-file qa_pairs.json --json --output-file results.json

# Custom weights
llamacalc -q "What is Python?" -a "Python is a programming language." --proximity-weight 0.4 --coverage-weight 0.3 --conciseness-weight 0.1 --logical-flow-weight 0.2

Python API

from llamacalc import calculate_relevance_score

# Calculate a score
result = calculate_relevance_score(
    question="What is Python?",
    answer="Python is a versatile programming language used in web development, data science, and AI."
)

# Display results
print(f"Total Score: {result.total_score:.2f}")
print(f"Proximity Score: {result.proximity_score:.2f}")
print(f"Coverage Score: {result.coverage_score:.2f}")
print(f"Conciseness Score: {result.conciseness_score:.2f}")
print(f"Logical Flow Score: {result.logical_flow_score:.2f}")

# Custom weights
weights = {
    "proximity": 0.4,
    "coverage": 0.3,
    "conciseness": 0.1,
    "logical_flow": 0.2
}
custom_result = calculate_relevance_score(
    question="What is Python?",
    answer="Python is a programming language.",
    weights=weights
)
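The README does not say whether custom weights must sum to 1.0. Both weight sets shown in this document do, and if every component score lies in [0, 1], non-negative weights that sum to 1.0 keep the weighted total in that range as well. The helper below is a hypothetical convenience (`normalize_weights` is not part of the LlamaCalc API) that rescales a weights dictionary before it is passed on:

```python
import math

# Component names taken from the weights dictionary in the README example.
COMPONENTS = ("proximity", "coverage", "conciseness", "logical_flow")


def normalize_weights(weights: dict[str, float]) -> dict[str, float]:
    """Return a copy of `weights` rescaled so the values sum to 1.0.

    Hypothetical helper, not part of the LlamaCalc API.
    """
    missing = [name for name in COMPONENTS if name not in weights]
    if missing:
        raise ValueError(f"missing weights: {missing}")
    if any(weights[name] < 0 for name in COMPONENTS):
        raise ValueError("weights must be non-negative")
    total = sum(weights[name] for name in COMPONENTS)
    if total == 0:
        raise ValueError("at least one weight must be positive")
    return {name: weights[name] / total for name in COMPONENTS}


# Relative importances (4:3:1:2) become 0.4, 0.3, 0.1, 0.2.
weights = normalize_weights(
    {"proximity": 4, "coverage": 3, "conciseness": 1, "logical_flow": 2}
)
assert math.isclose(sum(weights.values()), 1.0)
```

The returned dictionary has the same keys as the `weights` argument shown above, so it can be passed to `calculate_relevance_score` unchanged.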

Scoring Methodology

LlamaCalc evaluates relevance using four components:

Proximity Score (default weight: 0.35): Measures how directly the answer addresses the question using vector similarity.
Coverage Score (default weight: 0.30): Evaluates how well the answer covers key concepts from the question.
Conciseness Score (default weight: 0.15): Assesses whether the answer is appropriately detailed without being too verbose or too brief.
Logical Flow Score (default weight: 0.20): Analyzes the logical structure and coherence of the answer.

The total score is a weighted sum of these components. Scores range from 0.0 to 1.0, with higher scores indicating better relevance.
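As a rough illustration of the arithmetic, the sketch below combines four component scores with the default weights listed above. How LlamaCalc builds its vectors is not described here, so the proximity value uses a simple bag-of-words cosine similarity as a stand-in; `cosine_similarity`, `weighted_total`, and the other three scores are assumptions made up for the example, not the library's implementation.

```python
import math
from collections import Counter

# Default weights as listed in the README; they sum to 1.0.
DEFAULT_WEIGHTS = {
    "proximity": 0.35,
    "coverage": 0.30,
    "conciseness": 0.15,
    "logical_flow": 0.20,
}


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity of bag-of-words count vectors (illustrative stand-in)."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def weighted_total(scores: dict[str, float],
                   weights: dict[str, float] = DEFAULT_WEIGHTS) -> float:
    """Weighted sum of component scores, each expected to lie in [0, 1]."""
    return sum(weights[name] * scores[name] for name in weights)


# Made-up component scores, for illustration only.
scores = {
    "proximity": cosine_similarity("what is python", "python is a language"),
    "coverage": 0.8,
    "conciseness": 0.9,
    "logical_flow": 0.7,
}
print(f"Total Score: {weighted_total(scores):.2f}")
```

Because the default weights sum to 1.0, a set of perfect component scores yields a total of 1.0 and a set of zeros yields 0.0, which matches the stated score range.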

Advanced Usage

Batch Processing

Process multiple question-answer pairs:

from llamacalc import batch_calculate_relevance

qa_pairs = [
    ("What is Python?", "Python is a programming language."),
    ("How does a neural network work?", "Neural networks process data through layers of interconnected nodes."),
    # Add more pairs...
]

results = batch_calculate_relevance(qa_pairs, max_workers=4)

for result in results:
    print(f"Score: {result.total_score:.2f} - Q: {result.question[:30]}...")

Using Caching

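from llamacalc import calculate_relevance_score
from llamacalc.cache import MemoryCache, DiskCache

# Example inputs used by the lookups below
question = "What is Python?"
answer = "Python is a programming language."

# Memory cache (faster, not persistent)
memory_cache = MemoryCache(max_size=1000, ttl=3600)  # 1 hour TTL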

# Check cache first
cached_result = memory_cache.get(question, answer)
if cached_result:
    print("Cache hit!")
    result = cached_result
else:
    # Calculate and cache
    result = calculate_relevance_score(question, answer)
    memory_cache.put(result)

# Persistent disk cache
disk_cache = DiskCache(cache_dir="~/.my_app_cache", ttl=86400*7)  # 1 week TTL
disk_cache.put(result)
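To make the `max_size` and `ttl` parameters concrete, here is a minimal sketch of an in-memory cache keyed by the (question, answer) pair. `TinyTTLCache` is a hypothetical class written for this illustration. It is not LlamaCalc's `MemoryCache`, whose internals and eviction policy are not documented here, and its `put` takes the key explicitly rather than a result object. The least-recently-used eviction is likewise an assumption.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Optional


class TinyTTLCache:
    """Hypothetical TTL + LRU cache; not LlamaCalc's implementation."""

    def __init__(self, max_size: int = 1000, ttl: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_size = max_size
        self.ttl = ttl
        self._clock = clock
        # Maps (question, answer) -> (time stored, cached value).
        self._entries = OrderedDict()

    def get(self, question: str, answer: str) -> Optional[Any]:
        key = (question, answer)
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at > self.ttl:
            del self._entries[key]  # entry is older than the TTL
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, question: str, answer: str, value: Any) -> None:
        key = (question, answer)
        self._entries[key] = (self._clock(), value)
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_size:
            self._entries.popitem(last=False)  # evict least recently used


cache = TinyTTLCache(max_size=2, ttl=60)
cache.put("What is Python?", "A programming language.", 0.87)
print(cache.get("What is Python?", "A programming language."))
```

The injectable `clock` argument exists only so that expiry can be exercised in tests without sleeping.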

Performance Optimization

LlamaCalc automatically uses MLX for accelerated computation if available:

from llamacalc.core import HAS_MLX

print(f"Using MLX acceleration: {HAS_MLX}")
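How `HAS_MLX` is set inside the library is not shown in this README. A common pattern for this kind of optional acceleration is to check whether the package can be found and fall back otherwise. The sketch below does that with the standard library only; `MLX_AVAILABLE` and `pick_backend` are hypothetical names for illustration.

```python
import importlib.util

# True when the `mlx` package can be located, without importing it.
# This mirrors the idea behind HAS_MLX; it is an assumption, not the
# library's actual detection code.
MLX_AVAILABLE = importlib.util.find_spec("mlx") is not None


def pick_backend() -> str:
    """Return the name of the array backend this sketch would select."""
    return "mlx" if MLX_AVAILABLE else "numpy"


print(f"Selected backend: {pick_backend()}")
```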

For large batches, consider adjusting the number of worker threads:

from llamacalc import batch_calculate_relevance

# large_qa_batch is a list of (question, answer) tuples, like qa_pairs above
# Use 8 worker threads for large batches
results = batch_calculate_relevance(large_qa_batch, max_workers=8)
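For intuition about what `max_workers` controls, the following sketch scores pairs on a thread pool from the standard library. It assumes the batch helper behaves like a `ThreadPoolExecutor` map, which this README suggests by calling them worker threads but does not confirm. `score_pair` is a made-up stand-in scorer and `score_batch` is a hypothetical name.

```python
from concurrent.futures import ThreadPoolExecutor


def score_pair(pair: tuple[str, str]) -> float:
    """Stand-in scorer: fraction of question words that appear in the answer."""
    question, answer = pair
    q_words = set(question.lower().split())
    a_words = set(answer.lower().split())
    return len(q_words & a_words) / len(q_words) if q_words else 0.0


def score_batch(pairs: list[tuple[str, str]], max_workers: int = 4) -> list[float]:
    """Score pairs on a thread pool; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(score_pair, pairs))


pairs = [
    ("what is python", "python is a language"),
    ("how do caches work", "caches store results"),
]
print(score_batch(pairs, max_workers=2))
```

In CPython, extra threads tend to pay off only while the per-pair work releases the GIL or waits on I/O, so it is worth measuring a few `max_workers` values on a representative batch rather than assuming that more is faster.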

Contributing

Contributions are welcome! Please check our Contributing Guidelines for more details.

License

LlamaCalc is licensed under the MIT License. See the LICENSE file for details.

Acknowledgments

LlamaCalc is developed by the [LlamaSearch.AI](https://llamasearch.ai) team
Special thanks to the MLX team for their excellent framework
Logo design by LlamaSearch.AI

Release history

0.1.0 (this version), Apr 4, 2025

Download files

Source Distribution

llamacalc_llamasearch-0.1.0.tar.gz (30.7 kB)

Uploaded Apr 4, 2025 Source

Built Distribution

llamacalc_llamasearch-0.1.0-py3-none-any.whl (28.5 kB)

Uploaded Apr 4, 2025 Python 3

File details

Details for the file llamacalc_llamasearch-0.1.0.tar.gz.

File metadata

Download URL: llamacalc_llamasearch-0.1.0.tar.gz
Upload date: Apr 4, 2025
Size: 30.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.3

File hashes

Hashes for llamacalc_llamasearch-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`9245552f3687a694be2f071d4a30ed2bd70033372527128c84ea69cf2bb8b993`
MD5	`5efaf9e1905cfb50a7e18eb9e778dac0`
BLAKE2b-256	`fdb74dc754340ef1c1b909f66588d00362e207e0225f12d452183512e2ba6f79`

File details

Details for the file llamacalc_llamasearch-0.1.0-py3-none-any.whl.

File metadata

Download URL: llamacalc_llamasearch-0.1.0-py3-none-any.whl
Upload date: Apr 4, 2025
Size: 28.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.12.3

File hashes

Hashes for llamacalc_llamasearch-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`49da0e0b2b48969fc09d4d0d4bccf7feb21812c2d0200926634daa8a5dd8fb64`
MD5	`24fa2926d08ccff087155a4b3dd7f33a`
BLAKE2b-256	`a510330f24153e2bafd05fc883eeed41697e4b95460c70fc7621a13fce0020fc`
