A DSPy metric function learning package

DSPy Metric Learning

A package for learning and optimizing metric functions for DSPy. It uses language models to score predictions, so the evaluation metrics for your generative AI applications can be tuned against human labels.

Note: This package is currently in pre-alpha stage. The API is likely to change significantly in future releases.

🌟 Features

LLM-based Evaluation: Define metric functions as DSPy modules using language models
Custom Scoring: Pass your preferred language models for rating predictions
Data Management: Store and manage scored outputs in an organized directory structure
Interactive Labeling: Simple REPL interface for human labeling of examples
Optimization: DSPy-powered optimization for metric function modules
Multi-metric Support: Create and manage multiple specialized metric functions
Comprehensive Testing: Extensive test suite with 92% code coverage

📦 Installation

⚠️ Pre-Alpha Release: This package is in very early development. The API is unstable and major architectural changes are expected.

pip install dspy-metric-learning

You can also install directly from the repository for the latest development version:

git clone https://github.com/tom-doerr/dspy_metric_learning.git
cd dspy_metric_learning
pip install -e .

🚀 Quick Start

import dspy
from metric_learner import MetricModule, MetricDataManager

# Initialize a language model
lm = dspy.OpenAI(model="gpt-3.5-turbo")

# Create a metric module
metric = MetricModule(lm=lm)

# Score a prediction
score = metric(
    input="What is the capital of France?",
    prediction="Paris is the capital of France.",
    gold="Paris"
)

print(f"Score: {score}")  # Example output (varies by model and run): Score: 0.92
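The score comes from a language model, so the raw reply is text that has to be turned into a number. How the package parses replies is not documented in this README. The following is a minimal sketch, assuming the model answers in free text that contains a rating. `parse_score` is a hypothetical helper and not part of `metric_learner`.

```python
import re
from typing import Optional

# Matches an optionally signed integer or decimal such as "7", "0.92" or "-.5".
_NUMBER = re.compile(r"[-+]?\d*\.?\d+")


def parse_score(reply: str) -> Optional[float]:
    """Extract the first number from an LM reply and clamp it to [0, 1].

    Hypothetical helper for illustration; not part of metric_learner.
    Returns None when the reply contains no number.
    """
    match = _NUMBER.search(reply)
    if match is None:
        return None
    value = float(match.group())
    return min(1.0, max(0.0, value))


print(parse_score("Score: 0.92"))          # 0.92
print(parse_score("I would rate this 7"))  # 1.0 (clamped)
print(parse_score("no rating given"))      # None
```

Returning `None` instead of a default score keeps unparseable replies visible, so they can be retried or excluded rather than silently counted as zero.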

📚 Usage Examples

1. Creating a Metric Module

from metric_learner import MetricModule

# Create with a custom prompt template
# (`lm` is the language model initialized in the Quick Start)
metric = MetricModule(
    lm=lm,
    prompt_template=(
        "Rate the factual accuracy of the answer '{prediction}' "
        "for the question '{input}' on a scale from 0 to 1."
    )
)
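The placeholders in the template use Python's brace syntax. How `MetricModule` fills them is not shown in this README. Assuming plain `str.format` substitution, the rendered prompt would look like the sketch below. `render_prompt` is a hypothetical helper used only for illustration.

```python
PROMPT_TEMPLATE = (
    "Rate the factual accuracy of the answer '{prediction}' "
    "for the question '{input}' on a scale from 0 to 1."
)


def render_prompt(template: str, input: str, prediction: str) -> str:
    """Fill the {input} and {prediction} placeholders.

    Assumes str.format semantics; the package may substitute differently.
    """
    return template.format(input=input, prediction=prediction)


prompt = render_prompt(
    PROMPT_TEMPLATE,
    input="What is the capital of France?",
    prediction="Paris is the capital of France.",
)
print(prompt)
```

Under this assumption, a literal brace in a template would have to be doubled (`{{` or `}}`) to survive formatting.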

2. Managing Data

from metric_learner import MetricDataManager

# Create a data manager
data_manager = MetricDataManager(metric_name="factual_accuracy")

# Save an instance
data_manager.save_instance(
    input="What is the tallest mountain?",
    prediction="Mount Everest is the tallest mountain on Earth.",
    gold="Mount Everest",
    score=0.9
)

# Load instances
instances = data_manager.load_instances()
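The on-disk format used by `MetricDataManager` is not described in this README. As a rough mental model only, the sketch below stores each instance as one JSON file, named by its timestamp, inside a per-metric directory. The class name `JsonInstanceStore`, the file naming, and the field names are all assumptions.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Dict, List, Optional


class JsonInstanceStore:
    """Hypothetical stand-in for a data manager: one JSON file per instance."""

    def __init__(self, metric_name: str, data_dir: Path):
        self.dir = Path(data_dir) / metric_name
        self.dir.mkdir(parents=True, exist_ok=True)

    def save_instance(self, input: str, prediction: str,
                      gold: Optional[str] = None,
                      score: Optional[float] = None) -> str:
        # A UTC timestamp doubles as the instance identifier.
        stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%S%f")
        record = {"datetime": stamp, "input": input, "prediction": prediction,
                  "gold": gold, "score": score, "user_score": None}
        (self.dir / f"{stamp}.json").write_text(json.dumps(record, indent=2))
        return stamp

    def load_instances(self) -> List[Dict]:
        # Sorted file names give chronological order.
        return [json.loads(p.read_text()) for p in sorted(self.dir.glob("*.json"))]


with tempfile.TemporaryDirectory() as tmp:
    store = JsonInstanceStore("factual_accuracy", Path(tmp))
    store.save_instance(
        input="What is the tallest mountain?",
        prediction="Mount Everest is the tallest mountain on Earth.",
        gold="Mount Everest",
        score=0.9,
    )
    instances = store.load_instances()
    print(len(instances), instances[0]["score"])
```

Keeping the model's `score` separate from a human `user_score` field lets the same record hold both values, which is what later comparison between the two requires.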

3. Optimizing a Metric

from metric_learner import optimize_metric_module

# Get the labeled dataset
# (`data_manager` and `metric` are the objects created in the examples above)
dataset = data_manager.get_labeled_dataset()

# Optimize the metric
optimized_metric = optimize_metric_module(metric, dataset)
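Optimization needs an objective that says how well the module's score agrees with a human label. `optimize_metric_module` takes an optional `metric_fn` (listed in the API reference), but its expected signature is not documented here. The sketch below follows the general DSPy convention of `metric(example, pred, trace=None)` and assumes both objects expose a numeric `score` attribute. That field name is an assumption.

```python
from types import SimpleNamespace


def score_agreement(example, pred, trace=None) -> float:
    """Return 1.0 for perfect agreement, falling linearly to 0.0.

    Assumes `example.score` is the human label and `pred.score` is the
    module's output, both in [0, 1]. These field names are hypothetical.
    """
    error = abs(float(example.score) - float(pred.score))
    return max(0.0, 1.0 - error)


# SimpleNamespace stands in for DSPy example and prediction objects here.
human = SimpleNamespace(score=0.9)
model = SimpleNamespace(score=0.7)
print(round(score_agreement(human, model), 2))  # 0.8
```

If the package accepts this signature, a function like this could be passed as `metric_fn=score_agreement`.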

🔍 Examples

The examples/ directory contains several example scripts:

Example	Description
`basic_usage.py`	Simple demonstration of core functionality
`multiple_metrics.py`	Using multiple specialized metrics
`streamlit_app.py`	Interactive web interface for labeling and optimization
`complete_workflow.py`	End-to-end workflow from data collection to optimization

Run the complete workflow example:

python examples/complete_workflow.py

Run the Streamlit app (in headless mode):

streamlit run examples/streamlit_app.py --server.headless=true

📖 API Reference

Core Components

Component	Description
`MetricModule`	Core class for defining and using metric functions
`MetricDataManager`	Manages storage and retrieval of labeled instances
`optimize_metric_module`	Function to optimize a metric module using labeled data
`MetricEvaluator`	Evaluates the performance of a metric module
`label_instances`	Interactive REPL interface for labeling instances

MetricModule

class MetricModule(dspy.Module):
    """Module for evaluating predictions using a language model."""

Parameters:

lm: Language model to use for scoring
prompt_template: Optional custom prompt template for the metric
demonstrations: Optional list of demonstration examples

Methods:

__call__(input, prediction, gold=None): Score a prediction

MetricDataManager

class MetricDataManager:
    """Manages storage and retrieval of metric data."""

Parameters:

metric_name: Name of the metric
data_dir: Optional directory for storing data

Methods:

save_instance(input, prediction, gold=None, score=None): Save an instance
load_instances(): Load all instances
update_user_score(datetime, score): Update user score for an instance
get_labeled_dataset(): Get a dataset of labeled instances
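The signature `update_user_score(datetime, score)` suggests that instances are identified by their timestamp, and `get_labeled_dataset()` suggests a filter on human-scored instances. The in-memory sketch below illustrates that reading. The record layout and the `user_score` key are assumptions, and these free functions are not the package's methods.

```python
from typing import Dict, List


def update_user_score(instances: List[Dict], datetime: str, score: float) -> bool:
    """Set `user_score` on the instance whose `datetime` matches; report success."""
    for record in instances:
        if record["datetime"] == datetime:
            record["user_score"] = score
            return True
    return False


def get_labeled_dataset(instances: List[Dict]) -> List[Dict]:
    """Keep only instances that a human has scored."""
    # Compare against None explicitly: a human score of 0.0 is a valid label.
    return [r for r in instances if r.get("user_score") is not None]


instances = [
    {"datetime": "2025-03-04T10:00:00", "prediction": "Paris", "user_score": None},
    {"datetime": "2025-03-04T10:05:00", "prediction": "Lyon", "user_score": None},
]
update_user_score(instances, "2025-03-04T10:05:00", 0.0)
print(get_labeled_dataset(instances))
```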

Optimization Functions

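# Assumes MetricEvaluator is exported at the package top level,
# like the other core components
from metric_learner import optimize_metric_module, MetricEvaluator

# Optimize a metric module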
optimized_module = optimize_metric_module(
    metric_module,    # MetricModule to optimize
    dataset,          # Dataset of labeled examples
    metric_fn=None,   # Optional custom metric function
    optimizer_class=None  # Optional custom optimizer class
)

# Evaluate a metric module
evaluator = MetricEvaluator(metric_module, data_manager)
metrics = evaluator.evaluate()  # Returns MSE, correlation, etc.
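The evaluator is described as returning MSE and correlation between the module's scores and the human labels. The exact definitions it uses are not stated here. The sketch below shows the standard mean squared error and, as an assumption about which correlation is meant, the Pearson coefficient. Both function names are hypothetical.

```python
import math
from typing import Sequence


def mean_squared_error(predicted: Sequence[float], human: Sequence[float]) -> float:
    """Average squared difference between metric scores and human labels."""
    if len(predicted) != len(human) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((p - h) ** 2 for p, h in zip(predicted, human)) / len(predicted)


def pearson_correlation(predicted: Sequence[float], human: Sequence[float]) -> float:
    """Pearson r; returns 0.0 when either side has no variance."""
    n = len(predicted)
    if n != len(human) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    mean_p = sum(predicted) / n
    mean_h = sum(human) / n
    cov = sum((p - mean_p) * (h - mean_h) for p, h in zip(predicted, human))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_h = sum((h - mean_h) ** 2 for h in human)
    if var_p == 0 or var_h == 0:
        return 0.0
    return cov / math.sqrt(var_p * var_h)


metric_scores = [0.2, 0.4, 0.9]
human_scores = [0.0, 0.5, 1.0]
print(mean_squared_error(metric_scores, human_scores))
print(pearson_correlation(metric_scores, human_scores))
```

The two numbers answer different questions: MSE penalizes a metric whose scores are offset from the human scale, while correlation only checks that the two rise and fall together.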

Interactive Labeling

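# Assumes label_instances is exported at the package top level,
# like the other core components
from metric_learner import label_instances

# Start an interactive labeling session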
label_instances(
    data_manager,     # MetricDataManager instance
    quit_after=None,  # Optional number of instances to label
    skip_labeled=True # Whether to skip already labeled instances
)
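To make the `quit_after` and `skip_labeled` parameters concrete, here is a minimal sketch of a labeling loop with the same two options. It is not the package's `label_instances`. The function name `label_loop`, the `ask` parameter, the `q` shortcut, and the record layout are all assumptions.

```python
from typing import Callable, Dict, List, Optional


def label_loop(
    instances: List[Dict],
    quit_after: Optional[int] = None,
    skip_labeled: bool = True,
    ask: Callable[[str], str] = input,
) -> int:
    """Prompt for a 0-1 score per instance; return how many were labeled.

    Typing 'q' stops the session. Any reply that is not a number in
    [0, 1] is asked again.
    """
    labeled = 0
    for record in instances:
        if quit_after is not None and labeled >= quit_after:
            break
        if skip_labeled and record.get("user_score") is not None:
            continue
        while True:
            reply = ask(f"Score for {record['prediction']!r} (0-1, q to quit): ").strip()
            if reply.lower() == "q":
                return labeled
            try:
                score = float(reply)
            except ValueError:
                continue
            if 0.0 <= score <= 1.0:
                record["user_score"] = score
                labeled += 1
                break
    return labeled


# Scripted answers replace the keyboard so the example runs unattended.
answers = iter(["abc", "0.8", "q"])
data = [
    {"prediction": "Paris", "user_score": 0.9},  # already labeled, skipped
    {"prediction": "Lyon", "user_score": None},
    {"prediction": "Nice", "user_score": None},
]
count = label_loop(data, ask=lambda prompt: next(answers))
print(count, data[1]["user_score"], data[2]["user_score"])  # 1 0.8 None
```

Passing the prompt function as a parameter keeps the loop testable without a terminal.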

🧪 Testing

Run unit tests:

python -m pytest tests/

Run integration tests:

python -m pytest integration_tests/

Run specific test categories:

python -m pytest -m "integration and not slow"
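The `-m` expression selects tests by marker. The marker names `integration` and `slow` come from the command above, but how this repository registers them is not shown. One common approach, sketched here as an assumption, is a `conftest.py` hook that calls pytest's `config.addinivalue_line`. The marker descriptions are invented for illustration.

```python
# conftest.py (sketch; the repository's actual configuration may differ)

MARKERS = [
    "integration: tests that call external services such as a language model",
    "slow: tests that take a long time to run",
]


def pytest_configure(config):
    """Register custom markers so `pytest -m` can select on them without warnings."""
    for line in MARKERS:
        config.addinivalue_line("markers", line)


# In a test module, a test would then be tagged like this:
#
#     import pytest
#
#     @pytest.mark.integration
#     @pytest.mark.slow
#     def test_full_optimization_run():
#         ...
```

With markers set up this way, `-m "integration and not slow"` would run tests tagged `integration` and leave out any that are also tagged `slow`.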

👥 Contributing

We welcome contributions! Please see CONTRIBUTING.md for guidelines.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


This version

0.1.0rc1 pre-release

Mar 4, 2025

Download files

Source Distribution

dspy_metric_learning-0.1.0rc1.tar.gz (8.8 kB)

Uploaded Mar 4, 2025 Source

Built Distribution

dspy_metric_learning-0.1.0rc1-py3-none-any.whl (10.9 kB)

Uploaded Mar 4, 2025 Python 3

File details

Details for the file dspy_metric_learning-0.1.0rc1.tar.gz.

File metadata

Download URL: dspy_metric_learning-0.1.0rc1.tar.gz
Upload date: Mar 4, 2025
Size: 8.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/1.8.4 CPython/3.11.10 Linux/5.15.0-131-generic

File hashes

Hashes for dspy_metric_learning-0.1.0rc1.tar.gz
Algorithm	Hash digest
SHA256	`ce519fa67babce0c12876ecf2367fd96f871c106c7790ac6371bb8c1928520dc`
MD5	`7583ff57f5dd0e8cb26a66994e598fc1`
BLAKE2b-256	`365c83b1b05f986686c9e97907c2e03666cdd8b6236d97a17a3af60f2e6d40e4`

File details

Details for the file dspy_metric_learning-0.1.0rc1-py3-none-any.whl.

File metadata

Download URL: dspy_metric_learning-0.1.0rc1-py3-none-any.whl
Upload date: Mar 4, 2025
Size: 10.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/1.8.4 CPython/3.11.10 Linux/5.15.0-131-generic

File hashes

Hashes for dspy_metric_learning-0.1.0rc1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`b37ff0d71a98bfa3bf54e1e8a47d9f53f99a26a5b8d1ca440ede3e9c15209353`
MD5	`b8158a01b443d915705b610b7b9ec70c`
BLAKE2b-256	`ef3fb6cd1d8d66575cc750b87fc5cf984fe96c8d9622811003f499e8d5176568`
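The published digests can be used to check a downloaded file before installing it. The sketch below computes a file's SHA256 with the standard library's `hashlib` and compares it with an expected hex string. The helper names `sha256_of` and `matches_published` are hypothetical, and the demonstration runs on a throwaway file rather than the real archive.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with a published one, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()


# Demonstration on a throwaway file. For a real check, pass the downloaded
# archive and the SHA256 value listed for that file above.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"hello")
    expected = hashlib.sha256(b"hello").hexdigest()
    demo_ok = matches_published(sample, expected)
    print(demo_ok)  # True
```

pip can enforce the same check at install time through its hash-checking mode, where each requirement line carries a `--hash=sha256:...` option.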
