YiVal is an open-source project designed to revolutionize the way developers and researchers evaluate and refine AI models.
YiVal
Website · Producthunt · Documentation
Build any Generative AI application with evaluation and improvement
What is YiVal?
YiVal is a GenAI-Ops framework that lets you iteratively tune your Generative AI application's metadata, parameters, prompts, and retrieval configs all at once, using your preferred choice of test-dataset generation, evaluation algorithms, and improvement strategies.
Check out our quickstart guide!
What's Next?
Expected features in September:
- Add ROUGE and BERTScore evaluators
- Add support for Midjourney
- Add support for LLaMA2-70B, LLaMA2-7B, and Falcon-40B
- Support LoRA fine-tuning for open-source models
Features
| | Experiment Mode | Agent Mode (Auto-prompting) |
|---|---|---|
| Workflow | Define your AI/ML application → Define test dataset → Evaluate → Improve → Prompt-related artifacts built | Define your AI/ML application → Auto-prompting → Prompt-related artifacts built |
| Features | Streamlined prompt development process; support for multimedia and multiple models; CSV upload and GPT-4-generated test data; dashboard tracking latency, price, and evaluator results; human (RLHF) and algorithm-based improvers; service with detailed web view; customizable evaluators and improvers | No-code Gen-AI application building; watch your Gen-AI application be born and improve with just one click |
Model Support Matrix
We support 100+ LLMs (e.g. gpt-4, gpt-3.5-turbo, llama).
Support across model sources is shown below:
| Model | LLM Evaluate | Human Evaluate | Variation Generate | Custom Func |
|---|---|---|---|---|
| OpenAI | ✅ | ✅ | ✅ | ✅ |
| Azure | ✅ | ✅ | ✅ | ✅ |
| TogetherAI | ✅ | ✅ | ✅ | ✅ |
| Cohere | ✅ | ✅ | ✅ | ✅ |
| Huggingface | ✅ | ✅ | ✅ | ✅ |
| Anthropic | ✅ | ✅ | ✅ | ✅ |
| MidJourney | ✅ | ✅ | | |
To support different models in a custom function (e.g. model comparison), follow our example.
To support different models in evaluators and generators, check our config.
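As a rough sketch of the model-comparison idea, a custom function can run the same prompt through several backends and return the answers side by side for an evaluator to score. Everything below is illustrative: `compare_models` and the stub backends are assumptions for this sketch, not YiVal's actual API; a real setup would call actual model clients (OpenAI, TogetherAI, etc.).

```python
# Illustrative sketch only: compare_models and the stub backends are
# hypothetical, not YiVal's API; real backends would call model APIs.
from typing import Callable, Dict


def compare_models(prompt: str, backends: Dict[str, Callable[[str], str]]) -> Dict[str, str]:
    """Run one prompt through every backend so outputs can be scored side by side."""
    return {name: call(prompt) for name, call in backends.items()}


# Stub backends standing in for real clients.
backends = {
    "gpt-4": lambda p: f"[gpt-4 reply to: {p}]",
    "llama-2-70b": lambda p: f"[llama-2-70b reply to: {p}]",
}

results = compare_models("Summarize YiVal in one sentence.", backends)
```

An evaluator can then consume `results` to rank or score each model's output on the same input.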
Installation
pip install yival
Demo
Colab
| Demo | Supported Features | Colab Link |
|---|---|---|
| Craft your AI story with ChatGPT and MidJourney | Multi-modal support for text and images. | |
| Evaluate different LLM performance with your own Q&A test dataset | Easy model evaluation and comparison against 100+ models, thanks to LiteLLM. Provides a benchmark of model performance tailored to your use case or test data. | |
| Startup Company Headline Generation Bot | Automated prompt evolution. | |
| Build Your Customized Travel Guide Bot | Automated prompt generation by retrieving the most relevant popular prompts from the community, e.g. awesome-chatgpt-prompts. | |
| Build a Cheaper Translator: Let GPT-4 Teach Llama2 to Create a Cheaper Translator | Use GPT-4-generated test data to fine-tune a Llama2 translation bot with Replicate: a 6% drop in performance for an 18x cost saving. | |
| Chat with Your Favorite Characters | Give your character a soul with automated prompt generation and character-script retrieval. | |
Multimodal Mode
YiVal has multimodal capabilities and handles AIGC-generated images well.
More information is in the Animal Story demo we provide:
yival run demo/configs/animal_story.yml
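For orientation, a YiVal experiment config generally ties together the application under test, a data source, and evaluators. The fragment below is a rough sketch only: the key names are assumptions modeled on the demo configs and may not match the actual `animal_story.yml`.

```yaml
# Rough sketch of an experiment config; key names are assumptions,
# not copied from animal_story.yml.
custom_function: demo.animal_story.animal_story  # the app under test
dataset:
  source_type: machine_generated                 # e.g. GPT-4-generated test cases
evaluators:
  - evaluator_type: individual
    name: openai_prompt_based_evaluator
```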
Basic Interactive Mode
To start a demo of YiVal's basic interactive mode, run the following command:
yival demo --basic_interactive
Once started, navigate to the following address in your web browser:
http://127.0.0.1:8073/interactive
For more details on this demo, check out the Basic Interactive Mode Demo.
Question Answering with expected result evaluator
yival demo --qa_expected_results
Once started, navigate to the following address in your web browser: http://127.0.0.1:8073/
For more details, check out the Question Answering with expected result evaluator.
Automatically generate prompts with an evaluator
yival demo --auto_prompts
Once started, navigate to the following address in your web browser: http://127.0.0.1:8073/
Contributors
YiVal welcomes your contributions!
Thanks so much to all of our amazing contributors!
Paper / Algorithm Implementation
| Paper | Author | Topics | YiVal Contributor | Data Generator | Variation Generator | Evaluator | Selector | Evolver | Config |
|---|---|---|---|---|---|---|---|---|---|
| Large Language Models Are Human-Level Prompt Engineers | Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han | YiVal Evolver, Auto-Prompting | @Tao Feng | OpenAIPromptDataGenerator | OpenAIPromptVariationGenerator | OpenAIPromptEvaluator, OpenAIEloEvaluator | AHPSelector | OpenAIPromptBasedCombinationImprover | config |
| BERTScore: Evaluating Text Generation with BERT | Tianyi Zhang, Varsha Kishore, Felix Wu | YiVal Evaluator, BERTScore, ROUGE | @crazycth | - | - | BertScoreEvaluator | - | - | - |
| AlpacaEval | Xuechen Li, Tianyi Zhang, Yann Dubois, et al. | YiVal Evaluator | @Tao Feng | - | - | AlpacaEvalEvaluator | - | - | config |
| Chain of Density | Griffin Adams, Alexander R. Fabbri, et al. | Prompt Engineering | @Tao Feng | ChainOfDensityGenerator | - | - | - | - | config |