ReFeR: Improving Evaluation and Reasoning through Hierarchy of Models

License
- OSI Approved :: MIT License
Operating System
- OS Independent
Programming Language
- Python :: 3
- Python :: 3.12

Project description

ReFeR: Reason Feedback Review

ReFeR (Reason Feedback Review) is a framework of LLM and VLM agents for evaluation and reasoning tasks. It combines a peer review mechanism with a hierarchy of models: several peer models each respond to the input, and one or more AC (Area Chair) models review those responses to produce the final output. You can set the prompts, the hyperparameters, and the number of peers and ACs.
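
The peer-then-AC flow described above can be pictured with a small conceptual sketch. This is not ReFeR's implementation: the `Model` alias, the `peer_review` function, and the stub models are hypothetical stand-ins for real LLM calls, and the way the AC prompt is assembled here is an assumption made purely for illustration.

```python
from typing import Callable, List

# Hypothetical alias: a "model" here is just a function from prompt text to response text.
Model = Callable[[str], str]


def peer_review(user_input: str, peers: List[Model], area_chair: Model) -> str:
    """Collect one response per peer, then let the area chair decide."""
    peer_responses = [peer(user_input) for peer in peers]
    # The AC sees the original input plus every peer response.
    ac_prompt = "Input: " + user_input + "\nPeer responses:\n" + "\n".join(
        f"- {r}" for r in peer_responses
    )
    return area_chair(ac_prompt)


# Stub models standing in for real LLM calls (assumed for illustration).
peers = [lambda x: "score 4", lambda x: "score 5", lambda x: "score 4"]
area_chair = lambda prompt: f"final verdict after {prompt.count('- score')} reviews"

print(peer_review("Rate this answer.", peers, area_chair))
```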

Key Features

- Multi-Platform Support: Integrates with multiple AI platforms including OpenAI, Mistral, TogetherAI, Google (Gemini), and Groq.
- Flexible Model Configuration: Easily set up multiple peer models and AC models with customizable parameters.
- Optimized Prompt Generation: Utilizes AutoPrompt to generate optimized prompts based on user-provided task instructions and examples.
- Batch Processing: Supports batch inference with optional multi-threading for improved performance.
- Multimodal Capabilities: Handles both text and image inputs (multimodal inputs are currently supported only for OpenAI and Google models).
- Customizable Response Processing: Allows for regex patterns or custom functions to process peer responses before passing them to the AC model.
- Comprehensive Logging: Detailed logging and error handling for easy debugging and monitoring.

Installation

You can install ReFeR from PyPI with pip:

pip install refer-agents

Alternatively, install it from source by cloning this repository:

git clone https://github.com/yaswanth-iitkgp/ReFeR
cd ReFeR
pip install .

Requirements

Python 3.12 or later
API keys for supported platforms (OpenAI, Mistral, TogetherAI, Google, Groq)
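
Rather than hard-coding the API keys listed above, you may prefer to read them from environment variables. The following is a minimal sketch under stated assumptions: the environment variable names and the `load_api_keys` helper are hypothetical and not part of ReFeR, while the platform names match those used in the examples in this README.

```python
import os
from typing import Dict, Mapping

# Assumed environment variable names; adjust them to match your own setup.
ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "mistral": "MISTRAL_API_KEY",
    "togetherai": "TOGETHER_API_KEY",
    "google": "GOOGLE_API_KEY",
    "groq": "GROQ_API_KEY",
}


def load_api_keys(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return {platform: key} for every platform whose variable is set and non-empty."""
    return {platform: env[var] for platform, var in ENV_VARS.items() if env.get(var)}


# Demo with an explicit mapping instead of the real environment.
keys = load_api_keys({"OPENAI_API_KEY": "sk-example", "GROQ_API_KEY": ""})
print(keys)  # {'openai': 'sk-example'}
```

Each resulting pair can then be passed to `refer.set_api_key(platform, key)`, as in the Basic Usage example.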

Basic Usage

Here's a simple example of how to use ReFeR:

from refer_agents.core import ReFeR

# Initialize ReFeR
refer = ReFeR(log_level='INFO')

# Set API keys
refer.set_api_key('openai', 'your-openai-api-key')
refer.set_api_key('mistral', 'your-mistral-api-key')
refer.set_api_key('togetherai', 'your-togetherai-api-key')
refer.set_api_key('groq', 'your-groq-api-key')

# Configure models
refer.set_num_peers(3)
refer.set_num_acs(1)

refer.add_peer(model_name='meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo', platform='togetherai')
refer.add_peer(model_name='open-mistral-nemo', platform='mistral')
refer.add_peer(model_name='gemma2-9b-it', platform='groq')

refer.set_ac_model(model_name='gpt-4o-mini', platform='openai')

# Set prompts and generate optimized versions
prompt = "Your peer prompt here"
refer.set_peer_prompt(prompt)

optimized_peer_prompt, optimized_ac_prompt = refer.generate_optimized_prompts()

# Alternatively, skip optimization and assign prompts you have already optimized.
# The peer prompt must contain the {{user_input}} placeholder; the AC prompt must
# contain both {{user_input}} and {{peer_response}}.
optimized_peer_prompt = "Your optimized peer prompt here"
optimized_ac_prompt = "Your optimized AC prompt here"

# Run inference
user_input = "The content to be evaluated"
result = refer.infer(user_input, optimized_peer_prompt, optimized_ac_prompt)
print(result)

# Run batch inference
user_inputs = ["Input 1", "Input 2", "Input 3"]
results = refer.batch_infer(
    user_inputs,
    optimized_peer_prompt,
    optimized_ac_prompt,
    use_threading=True,
    max_workers=4,
    output_file='results.json'
)
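
The placeholder requirement noted in the comments above is easy to get wrong, so it can help to check prompts before running inference. The sketch below is an assumption-laden illustration: `missing_placeholders` and `fill` are hypothetical helpers, not ReFeR functions, and `fill` only demonstrates what substitution might look like, since this README does not show how ReFeR fills the slots internally.

```python
PEER_PLACEHOLDERS = ("{{user_input}}",)
AC_PLACEHOLDERS = ("{{user_input}}", "{{peer_response}}")


def missing_placeholders(prompt: str, required: tuple) -> list:
    """Return the required placeholders that do not appear in the prompt."""
    return [p for p in required if p not in prompt]


def fill(prompt: str, **values: str) -> str:
    """Illustrative substitution only; ReFeR's own filling logic is not shown in this README."""
    for name, value in values.items():
        prompt = prompt.replace("{{" + name + "}}", value)
    return prompt


peer_prompt = "Rate the following answer from 1 to 5.\n{{user_input}}"
ac_prompt = "Answer: {{user_input}}\nPeer reviews: {{peer_response}}\nGive a final score."

assert missing_placeholders(peer_prompt, PEER_PLACEHOLDERS) == []
assert missing_placeholders("No slot here", AC_PLACEHOLDERS) == list(AC_PLACEHOLDERS)
print(fill(peer_prompt, user_input="2 + 2 = 4"))
```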

Advanced Usage

Multimodal Evaluation

ReFeR supports multimodal inputs, allowing you to evaluate image-text pairs:

# Configure multimodal models
refer.add_peer(model_name='gpt-4o-mini', platform='openai')
refer.add_peer(model_name='gemini-1.5-flash', platform='google')

refer.set_ac_model(model_name='gpt-4o', platform='openai')

# Use your optimized prompts.
# The peer prompt must contain the {{user_input}} placeholder; the AC prompt must
# contain both {{user_input}} and {{peer_response}}.
optimized_peer_prompt = "Your optimized peer prompt here"
optimized_ac_prompt = "Your optimized AC prompt here"

# Prepare inputs
inputs = ["Text description 1", "Text description 2"]
image_paths = ["path/to/image1.jpg", "path/to/image2.jpg"]

# Run multimodal batch inference
results = refer.batch_infer_multimodal(
    inputs, 
    image_paths, 
    optimized_peer_prompt, 
    optimized_ac_prompt, 
    sleep_time=1, 
    output_file='multimodal_results.json'
)
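
Because the example above passes texts and image paths as two parallel lists, a mismatch in length or a missing file is an easy mistake. The following sketch validates the pairs before the batch call. `check_pairs` is a hypothetical helper, and the assumption that the two lists must be the same length is inferred from the example rather than stated by the library.

```python
import tempfile
from pathlib import Path
from typing import List, Sequence, Tuple


def check_pairs(inputs: Sequence[str], image_paths: Sequence[str]) -> List[Tuple[str, str]]:
    """Pair each text with its image; raise early on mismatched lengths or missing files."""
    if len(inputs) != len(image_paths):
        raise ValueError(f"{len(inputs)} inputs but {len(image_paths)} image paths")
    missing = [p for p in image_paths if not Path(p).is_file()]
    if missing:
        raise FileNotFoundError(f"Image files not found: {missing}")
    return list(zip(inputs, image_paths))


# Demo with a temporary, empty stand-in file (not a real image).
with tempfile.TemporaryDirectory() as tmp:
    img = Path(tmp) / "image1.jpg"
    img.write_bytes(b"")
    print(check_pairs(["Text description 1"], [str(img)]))
```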

Custom Response Processing

You can set a custom function to process peer responses:

def custom_processor(response):
    # Replace this pass-through with your own logic for processing a peer
    # response before it is passed to the AC.
    processed_response = response
    return processed_response

refer.set_peer_response_processing_function(custom_processor)
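
As a concrete illustration of such a function, the sketch below uses a regular expression to keep only a numeric score from a peer response. It assumes the function receives a single peer response as a string, which this README does not state explicitly; the `extract_score` name and the score pattern are hypothetical.

```python
import re

# Matches e.g. "Score: 4.5", "score=3", or "score 4" (case-insensitive).
SCORE_PATTERN = re.compile(r"score\s*[:=]?\s*(\d+(?:\.\d+)?)", re.IGNORECASE)


def extract_score(response: str) -> str:
    """Keep only the first numeric score; fall back to the stripped text if none is found."""
    match = SCORE_PATTERN.search(response)
    return match.group(1) if match else response.strip()


print(extract_score("The reply is on topic. Score: 4.5 out of 5"))  # 4.5
print(extract_score("  no rating given  "))  # no rating given
```

Under that assumption, it would be registered the same way: `refer.set_peer_response_processing_function(extract_score)`.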

Setting AC Mode

Choose between 'Lite' and 'Turbo' modes for the AC model:

refer.set_ac_mode('Lite')  # or 'Turbo'
# 'Turbo' is supported only when the Area Chair is an OpenAI model; it generates multiple AC responses (20 by default).
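
This README does not say how the multiple Turbo responses are combined. One common approach when sampling several judgments is to average the numeric scores. The sketch below shows that idea only as an assumption for illustration; `average_score` is a hypothetical helper and does not describe ReFeR's documented behavior.

```python
from statistics import mean
from typing import Iterable, Optional


def average_score(raw_scores: Iterable[str]) -> Optional[float]:
    """Average the responses that parse as numbers; ignore the rest."""
    parsed = []
    for raw in raw_scores:
        try:
            parsed.append(float(raw))
        except ValueError:
            continue  # skip responses that are not numeric
    return mean(parsed) if parsed else None


# Three numeric responses and one that cannot be parsed.
print(average_score(["4", "5", "4", "n/a"]))
```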

Setting Hyperparameters

You can set hyperparameters for the AC model:

refer.set_hyperparameters(temperature=0.7)

Example Use Cases

ReFeR can be applied to various evaluation tasks, such as:

- Mathematical Problem Solving: Evaluate solutions to complex math problems (see example_usage_gsm8k.py).
- Conversational Engagement: Rate the engagingness of responses in a conversation (see example_usage_topicalchat.py).
- Image-Text Alignment: Assess how well text descriptions match given images (see example_multimodal.py).

Error Handling and Logging

ReFeR includes comprehensive error handling and logging. Set the logging level when initializing:

refer = ReFeR(log_level='INFO')  # Options: 'INFO', 'WARNING', 'ERROR'

Contributing

We welcome contributions! For major changes, please open an issue first to discuss what you'd like to change.

License

MIT

Credits

The codebase was developed by Yaswanth Narsupalli and Sreevatsa Muppirala.

For issues or questions about the codebase, please contact us (yasshu.yaswanth@gmail.com, sreevatsa2002@gmail.com). We are happy to help with any concerns or clarifications.

Citation

If you use this software in your research, please cite the paper as follows:

@misc{narsupalli2024reviewfeedbackreasonrefernovelframework,
    title={ReFeR: Improving Evaluation and Reasoning through Hierarchy of Models},
    author={Yaswanth Narsupalli and Abhranil Chandra and Sreevatsa Muppirala and Manish Gupta and Pawan Goyal},
    year={2024},
    eprint={2407.12877},
    archivePrefix={arXiv},
    primaryClass={cs.CL},
    url={https://arxiv.org/abs/2407.12877},
}


Release history

- 0.1.4 (this version): Oct 3, 2024
- 0.1.2 (yanked): Oct 3, 2024

Download files

Source Distribution

refer-agents-0.1.4.tar.gz (17.0 kB)

Uploaded Oct 3, 2024 Source

Built Distribution

refer_agents-0.1.4-py3-none-any.whl (18.6 kB)

Uploaded Oct 3, 2024 Python 3

File details

Details for the file refer-agents-0.1.4.tar.gz.

File metadata

Download URL: refer-agents-0.1.4.tar.gz
Upload date: Oct 3, 2024
Size: 17.0 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.10.11

File hashes

Hashes for refer-agents-0.1.4.tar.gz
Algorithm	Hash digest
SHA256	`0cc7e23a257423988ed47e603a0e588463c1b1eda6c4fc46c1d13f69ca8c9464`
MD5	`438773c86e49debd8e23f37c0461dabd`
BLAKE2b-256	`6f3c70a4ecbd62af50992906072f76fd9740a4bdfef57c0a1f62e0e46d5f88b4`

File details

Details for the file refer_agents-0.1.4-py3-none-any.whl.

File metadata

Download URL: refer_agents-0.1.4-py3-none-any.whl
Upload date: Oct 3, 2024
Size: 18.6 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/5.1.1 CPython/3.10.11

File hashes

Hashes for refer_agents-0.1.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`10c9741f5f823ffb277c911fdaa9e2456ea82b735efbcce8af678a24a3d3117a`
MD5	`5d6c79f4e4ed4fd8fe861354a8b18ed5`
BLAKE2b-256	`9e0a67e7c5759d132c2d465b06d66ad0b81be74ed6615e532b953062d767928f`
