A Python framework designed for both generating and evaluating hints.

HintEval is a framework for generating and evaluating hints: subtle clues that guide users toward the correct answer without revealing it directly. As the first tool of its kind, it lets users create hints and assess them from several perspectives.

🖥️ Installation

It's recommended to install HintEval in a virtual environment using Python 3.11.9. If you're not familiar with Python virtual environments, check out this user guide. Alternatively, you can create a new environment using Conda.

Set up the virtual environment

First, create and activate a virtual environment with Python 3.11.9:

conda create -n hinteval_env python=3.11.9 --no-default-packages
conda activate hinteval_env

Install PyTorch 2.4.0

You'll need PyTorch 2.4.0 for HintEval. Refer to the PyTorch installation page for platform-specific installation commands. If you have access to GPUs, it's recommended to install the CUDA version of PyTorch, as many of the evaluation metrics are optimized for GPU use.

Install HintEval

Once PyTorch 2.4.0 is installed, you can install HintEval via pip:

pip install hinteval

For the latest features, you can install the most recent version from the main branch:

pip install git+https://github.com/DataScienceUIBK/HintEval
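
After installing, it can help to confirm that the environment matches the versions recommended above. The following is a minimal sketch that uses only the Python standard library; the helper names (`installed_version`, `check_environment`, `mismatches`) are hypothetical and not part of HintEval, and the sketch assumes the distributions are published as `torch` and `hinteval`.

```python
import platform
from importlib.metadata import PackageNotFoundError, version

# Versions recommended by the installation steps above.
RECOMMENDED = {"python": "3.11.9", "torch": "2.4.0"}


def installed_version(package):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def check_environment():
    """Collect the versions that matter for a HintEval installation."""
    return {
        "python": platform.python_version(),
        "torch": installed_version("torch"),
        "hinteval": installed_version("hinteval"),
    }


def mismatches(report):
    """List human-readable differences from the recommended setup."""
    problems = []
    for name, wanted in RECOMMENDED.items():
        found = report.get(name)
        # PyTorch builds may carry a local suffix such as "2.4.0+cu121".
        if found is None or found.split("+")[0] != wanted:
            problems.append(f"{name}: found {found}, recommended {wanted}")
    if report.get("hinteval") is None:
        problems.append("hinteval: not installed")
    return problems


if __name__ == "__main__":
    report = check_environment()
    print(report)
    for line in mismatches(report):
        print("warning:", line)
```

A mismatch is only a warning here: other versions may work, but 3.11.9 and 2.4.0 are the ones this guide recommends.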

🏃 Quickstart

This small example program shows HintEval in action:

from hinteval.cores import Instance, Question, Hint, Answer
from hinteval.evaluation.convergence import LlmBased

llm = LlmBased(model_name='llama-3-70b', together_ai_api_key='your_api_key', enable_tqdm=True)
instance_1 = Instance(
    question=Question('What is the capital of Austria?'),
    answers=[Answer('Vienna')],
    hints=[Hint('This city, once home to Mozart and Beethoven, is the capital of Austria.')])
instance_2 = Instance(
    question=Question('Who was the president of USA in 2009?'),
    answers=[Answer('Barack Obama')],
    hints=[Hint('He was the first African-American president in U. S. history.')])
instances = [instance_1, instance_2]
results = llm.evaluate(instances)
print(results)
# [[0.91], [1.0]]
metrics = [
    f'{metric_key}: {metric_value.value}'
    for instance in instances
    for hint in instance.hints
    for metric_key, metric_value in hint.metrics.items()
]
print(metrics)
# ['convergence-llm-llama-3-70b: 0.91', 'convergence-llm-llama-3-70b: 1.0']
scores = [hint.metrics['convergence-llm-llama-3-70b'].metadata['scores'] for inst in instances for hint in inst.hints]
print(scores[0])
# {'Salzburg': 1, 'Graz': 0, 'Innsbruck': 0, 'Linz': 0, 'Klagenfurt': 0, 'Bregenz': 0, 'Wels': 0, 'St. Pölten': 0, 'Eisenstadt': 0, 'Sankt Johann impong': 0, 'Vienna': 1}
print(scores[1])
# {'George W. Bush': 0, 'Bill Clinton': 0, 'Jimmy Carter': 0, 'Donald Trump': 0, 'Joe Biden': 0, 'Ronald Reagan': 0, 'Richard Nixon': 0, 'Gerald Ford': 0, 'Franklin D. Roosevelt': 0, 'Theodore Roosevelt': 0, 'Barack Obama': 1}
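
One way to read the per-candidate dictionaries is sketched below. This is an inference from the example output only, not a statement of how `LlmBased` computes convergence: if each flag records whether the hint still fits a candidate, then the share of candidates whose flag agrees with "is this the correct answer?" reproduces the printed values of 0.91 and 1.0. The function `agreement_share` is a hypothetical helper, not part of HintEval.

```python
def agreement_share(scores, answer):
    """Share of candidates whose flag matches whether they are the correct answer.

    Assumption: a flag of 1 means the hint still fits the candidate and 0 means
    the hint rules it out. This reading is inferred from the quickstart output.
    """
    hits = sum(
        1 for candidate, flag in scores.items()
        if flag == int(candidate == answer)
    )
    return round(hits / len(scores), 2)


# Dictionaries copied from the quickstart output above.
austria = {'Salzburg': 1, 'Graz': 0, 'Innsbruck': 0, 'Linz': 0, 'Klagenfurt': 0,
           'Bregenz': 0, 'Wels': 0, 'St. Pölten': 0, 'Eisenstadt': 0,
           'Sankt Johann impong': 0, 'Vienna': 1}
usa = {'George W. Bush': 0, 'Bill Clinton': 0, 'Jimmy Carter': 0, 'Donald Trump': 0,
       'Joe Biden': 0, 'Ronald Reagan': 0, 'Richard Nixon': 0, 'Gerald Ford': 0,
       'Franklin D. Roosevelt': 0, 'Theodore Roosevelt': 0, 'Barack Obama': 1}

print(agreement_share(austria, 'Vienna'))    # 0.91 (10 of 11 candidates agree)
print(agreement_share(usa, 'Barack Obama'))  # 1.0 (all 11 candidates agree)
```

Under this reading, the first hint loses some convergence because it does not rule out Salzburg, while the second hint fits only the correct answer.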

Refer to our documentation to learn more.

🤝 Contributors

Community contributions are essential to our project, and we value every effort to improve it. From bug fixes to feature enhancements and documentation updates, your involvement makes a big difference, and we’re thrilled to have you join us! For more details, please refer to development.

How to Add Your Own Dataset

If you have a hint dataset that you'd like to share with the community, we'd love to help make it available within HintEval! Adding new, high-quality datasets enriches the framework and supports other users' research and study efforts.

To contribute your dataset, please reach out to us. We’ll review its quality and suitability for the framework, and if it meets the criteria, we’ll include it in our preprocessed datasets, making it readily accessible to all users.

To view the available preprocessed datasets, use the following code:

from hinteval import Dataset

available_datasets = Dataset.available_datasets(show_info=True, update=True)

Thank you for considering this valuable contribution! Expanding HintEval's resources with your work benefits the entire community.

How to Contribute

Follow these steps to get involved:

  1. Fork this repository to your GitHub account.

  2. Create a new branch for your feature or fix:

    git checkout -b feature/YourFeatureName
    
  3. Make your changes and commit them:

    git commit -m "Add YourFeatureName"
    
  4. Push the changes to your branch:

    git push origin feature/YourFeatureName
    
  5. Submit a Pull Request to propose your changes.

Thank you for helping make this project better!

🪪 License

This project is licensed under the Apache-2.0 License - see the LICENSE file for details.

🙏 Acknowledgments

Thanks to our contributors and the University of Innsbruck for supporting this project.
