Advanced relevance scoring for LLM outputs
LlamaCalc
LlamaCalc is an advanced relevance scoring tool for evaluating how well answers address questions, specifically designed for assessing LLM-generated responses.

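Installation
# With MLX acceleration (quotes keep shells such as zsh from expanding the brackets)
pip install "llamacalc[mlx]"
# With development dependencies
pip install "llamacalc[dev]"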
Quick Start
Command Line
# Simple usage
llamacalc --question "What is Python?" --answer "Python is a high-level programming language known for its readability and versatility."
# Interactive mode
llamacalc --interactive
# Batch processing
llamacalc --batch-file qa_pairs.json --json --output-file results.json
# Custom weights
llamacalc -q "What is Python?" -a "Python is a programming language." --proximity-weight 0.4 --coverage-weight 0.3 --conciseness-weight 0.1 --logical-flow-weight 0.2
Python API
from llamacalc import calculate_relevance_score

# Calculate a score
result = calculate_relevance_score(
    question="What is Python?",
    answer="Python is a versatile programming language used in web development, data science, and AI."
)

# Display results
print(f"Total Score: {result.total_score:.2f}")
print(f"Proximity Score: {result.proximity_score:.2f}")
print(f"Coverage Score: {result.coverage_score:.2f}")
print(f"Conciseness Score: {result.conciseness_score:.2f}")
print(f"Logical Flow Score: {result.logical_flow_score:.2f}")

# Custom weights
weights = {
    "proximity": 0.4,
    "coverage": 0.3,
    "conciseness": 0.1,
    "logical_flow": 0.2
}
custom_result = calculate_relevance_score(
    question="What is Python?",
    answer="Python is a programming language.",
    weights=weights
)
Scoring Methodology
LlamaCalc evaluates relevance using four components:
- Proximity Score (default weight: 0.35): Measures how directly the answer addresses the question using vector similarity.
- Coverage Score (default weight: 0.30): Evaluates how well the answer covers key concepts from the question.
- Conciseness Score (default weight: 0.15): Assesses whether the answer is appropriately detailed without being too verbose or too brief.
- Logical Flow Score (default weight: 0.20): Analyzes the logical structure and coherence of the answer.
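To make "vector similarity" in the proximity component concrete, here is a minimal sketch of cosine similarity over bag-of-words count vectors. The source does not say which embedding or similarity measure LlamaCalc uses, so the `bag_of_words` tokenizer, the `cosine_similarity` helper, and the count-vector representation are all illustrative assumptions, not the library's implementation.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric tokens (hypothetical tokenizer)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors; 0.0 if either is empty."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


question = bag_of_words("What is Python?")
on_topic = bag_of_words("Python is a programming language.")
off_topic = bag_of_words("Neural networks process data through layers.")

proximity_on = cosine_similarity(question, on_topic)
proximity_off = cosine_similarity(question, off_topic)
print(f"on-topic: {proximity_on:.2f}, off-topic: {proximity_off:.2f}")
```

An answer that shares vocabulary with the question scores above zero, while one with no shared tokens scores exactly 0.0. A real scorer would more likely use dense embeddings so that paraphrases also score well.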
The total score is a weighted sum of these components. Scores range from 0.0 to 1.0, with higher scores indicating better relevance.
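The weighted sum can be sketched in a few lines. The default weights are taken from the list above; the `weighted_total` helper and the component values are hypothetical and only illustrate the arithmetic, not LlamaCalc's internal code.

```python
DEFAULT_WEIGHTS = {
    "proximity": 0.35,
    "coverage": 0.30,
    "conciseness": 0.15,
    "logical_flow": 0.20,
}


def weighted_total(components: dict, weights: dict = DEFAULT_WEIGHTS) -> float:
    """Weighted sum of component scores, each expected to lie in [0.0, 1.0]."""
    missing = set(weights) - set(components)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(weights[name] * components[name] for name in weights)


# Made-up component scores, for illustration only.
components = {
    "proximity": 0.80,
    "coverage": 0.60,
    "conciseness": 0.90,
    "logical_flow": 0.70,
}
total = weighted_total(components)
print(f"Total Score: {total:.2f}")
```

Because the default weights sum to 1.0, the total stays within 0.0 to 1.0 whenever each component does. The source does not state whether custom weights that sum to something else are normalized, so keeping them at a sum of 1.0 is the safe choice.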
Advanced Usage
Batch Processing
Process multiple question-answer pairs:
from llamacalc import batch_calculate_relevance

qa_pairs = [
    ("What is Python?", "Python is a programming language."),
    ("How does a neural network work?", "Neural networks process data through layers of interconnected nodes."),
    # Add more pairs...
]
results = batch_calculate_relevance(qa_pairs, max_workers=4)
for result in results:
    print(f"Score: {result.total_score:.2f} - Q: {result.question[:30]}...")
Using Caching
from llamacalc import calculate_relevance_score
from llamacalc.cache import MemoryCache, DiskCache

# Example inputs
question = "What is Python?"
answer = "Python is a programming language."

# Memory cache (faster, not persistent)
memory_cache = MemoryCache(max_size=1000, ttl=3600)  # 1 hour TTL

# Check cache first
cached_result = memory_cache.get(question, answer)
if cached_result:
    print("Cache hit!")
    result = cached_result
else:
    # Calculate and cache
    result = calculate_relevance_score(question, answer)
    memory_cache.put(result)

# Persistent disk cache
disk_cache = DiskCache(cache_dir="~/.my_app_cache", ttl=86400 * 7)  # 1 week TTL
disk_cache.put(result)
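To show what the `max_size` and `ttl` parameters typically mean, here is a minimal, self-contained sketch of a size-capped cache with per-entry expiry. `TinyTTLCache` is a hypothetical class, not LlamaCalc's `MemoryCache`; its `put` takes the question and answer explicitly, whereas the library's `put` takes the result object. The least-recently-used eviction policy is also an assumption.

```python
import time
from collections import OrderedDict


class TinyTTLCache:
    """Hypothetical in-memory cache with a size cap and per-entry expiry."""

    def __init__(self, max_size=1000, ttl=3600, clock=time.monotonic):
        self.max_size = max_size
        self.ttl = ttl
        self._clock = clock            # injectable so expiry can be tested
        self._entries = OrderedDict()  # key -> (stored_at, value)

    def get(self, question, answer):
        key = (question, answer)
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at > self.ttl:
            del self._entries[key]     # expired: drop and report a miss
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, question, answer, value):
        key = (question, answer)
        self._entries[key] = (self._clock(), value)
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_size:
            self._entries.popitem(last=False)  # evict least recently used


now = [0.0]  # fake clock, advanced by hand
cache = TinyTTLCache(max_size=2, ttl=10, clock=lambda: now[0])
cache.put("q1", "a1", 0.9)
hit = cache.get("q1", "a1")
now[0] = 11.0  # move past the TTL
expired = cache.get("q1", "a1")
```

Here `hit` holds the stored value and `expired` is `None`, because the second lookup happens after the TTL has elapsed.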
Performance Optimization
LlamaCalc automatically uses MLX for accelerated computation if available:
from llamacalc.core import HAS_MLX
print(f"Using MLX acceleration: {HAS_MLX}")
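A flag like `HAS_MLX` is commonly derived by checking whether an optional package is installed. The sketch below shows one standard-library way to do that; how `llamacalc.core` actually sets the flag is not shown in the source, and `module_available` and the `"fallback"` label are hypothetical names.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


has_mlx = module_available("mlx")
backend = "mlx" if has_mlx else "fallback"  # "fallback" is a placeholder label
print(f"Using MLX acceleration: {has_mlx} (backend: {backend})")
```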
For large batches, consider adjusting the number of worker threads:
from llamacalc import batch_calculate_relevance
# Use 8 worker threads for large batches
results = batch_calculate_relevance(large_qa_batch, max_workers=8)
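For intuition about what `max_workers` controls, the following sketch runs a placeholder scorer over a thread pool from the standard library. It assumes a `ThreadPoolExecutor`-style design; the source does not show how `batch_calculate_relevance` is implemented, and `score_pair` and `batch_score` are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor


def score_pair(pair):
    """Placeholder scorer: fraction of question words that appear in the answer."""
    question, answer = pair
    q_words = set(question.lower().split())
    a_words = set(answer.lower().split())
    return len(q_words & a_words) / len(q_words) if q_words else 0.0


def batch_score(pairs, max_workers=4):
    """Score pairs on a thread pool; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(score_pair, pairs))


pairs = [
    ("what is python", "python is a programming language"),
    ("how do networks work", "layers of interconnected nodes"),
]
scores = batch_score(pairs, max_workers=8)
```

More threads help mainly when the per-pair work waits on I/O or runs in native code that releases the interpreter lock, so it is worth measuring a few values of `max_workers` rather than assuming that a larger number is faster.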
Contributing
Contributions are welcome! Please check our Contributing Guidelines for more details.
License
LlamaCalc is licensed under the MIT License. See the LICENSE file for details.
Acknowledgments
- LlamaCalc is developed by the [LlamaSearch.AI](https://llamasearch.ai) team
- Special thanks to the MLX team for their excellent framework
- Logo design by LlamaSearch.AI
File details
Details for the file llamacalc_llamasearch-0.1.0.tar.gz.
File metadata
- Download URL: llamacalc_llamasearch-0.1.0.tar.gz
- Upload date:
- Size: 30.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9245552f3687a694be2f071d4a30ed2bd70033372527128c84ea69cf2bb8b993 |
| MD5 | 5efaf9e1905cfb50a7e18eb9e778dac0 |
| BLAKE2b-256 | fdb74dc754340ef1c1b909f66588d00362e207e0225f12d452183512e2ba6f79 |
File details
Details for the file llamacalc_llamasearch-0.1.0-py3-none-any.whl.
File metadata
- Download URL: llamacalc_llamasearch-0.1.0-py3-none-any.whl
- Upload date:
- Size: 28.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 49da0e0b2b48969fc09d4d0d4bccf7feb21812c2d0200926634daa8a5dd8fb64 |
| MD5 | 24fa2926d08ccff087155a4b3dd7f33a |
| BLAKE2b-256 | a510330f24153e2bafd05fc883eeed41697e4b95460c70fc7621a13fce0020fc |