
Open Chai


Guanaco Banner

Pull Requests Welcome first-timers-only Friendly

Chai Guanaco is part of the Chai Guanaco Competition, accelerating community AGI.

It's the world's first open community challenge with real-user evaluations. Your models will be deployed directly on the Chai App, where our 500K+ daily active users provide live feedback. Get to the top of the leaderboard and share the $1 million cash prize!

Quick Start

Chai Guanaco Jupyter Notebook Quickstart

The Guanaco Guide

🥇 Evaluation & Prizes: Depending on the phase of the competition, a suite of user-level evaluation metrics will be used (e.g. thumbs up / thumbs down rate). Your model will be ranked in real time against other models; you can view the leaderboard at any time with the pip package.

🕵️ Real-time user feedback: After your model is deployed, it will go through a safety + integrity check; once passed, it will be served directly to our users, who will provide written feedback that you can view via the pip package.

🤖 Model requirements: Currently, we support any model based on LLaMA 2 (i.e. 7B/13B parameters with the LLaMA tokenizer). All you need to do is push your model directly to Hugging Face. Support for more model types is coming soon!

⚙️ Sampling parameters: During submission, we allow custom model generation parameters such as temperature. Once your model is deployed on our platform, it will use the parameters you've provided to generate chat completions.

📚 Rules: By default, you get 1 developer key per person, and each key can deploy 1 model to users at a time. Message us on Discord if you would like the limit bumped up 😀

How Does It Work?

  • The chai_guanaco pip package provides a way to easily submit your language model; all you need to do is ensure it is on Hugging Face 🤗
  • We will automatically Tritonize your model for fast inference and host it in our internal GPU cluster 🚀
  • Once deployed, Chai users on our platform who enter the arena mode will be rating your model directly, providing you with both quantitative and verbal feedback 📈
  • Both the public leaderboard and user feedback for your model can be directly downloaded via the chai_guanaco package 🧠
  • Cash prizes will be allocated according to your position in the leaderboard 💰

Chai Pipeline
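
For orientation, here is a minimal end-to-end sketch of that workflow, stitched together from the calls documented in the sections below (submission, feedback, leaderboard). Treat it as an outline rather than a copy-paste recipe.

import chai_guanaco as chai

# 1. Submit a Hugging Face model (see "Model Submission" below)
submission_parameters = {
    "model_repo": "NousResearch/Llama-2-7b-chat-hf",
    "generation_params": {"temperature": 1.0, "top_p": 0.2, "top_k": 40},
    "model_name": "my-awesome-llama",
}
submission_id = chai.ModelSubmitter().submit(submission_parameters)

# 2. Once deployed, pull user feedback (see "Getting User Feedback")
feedback = chai.get_feedback(submission_id)
print(feedback.df)

# 3. Check where you stand (see "Getting Live Leaderboard")
chai.display_leaderboard(detailed=True)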

🚀 Getting Started

Getting Developer Key

Join the competition Discord, introduce yourself, and ask for a developer key. Login-based authentication is coming next 🤗

Installation

Use pip to install the Chai Guanaco package:

pip install chai-guanaco

For one-off authentication run the following in your terminal:

chai-guanaco login

Pass in your developer key when prompted. You can always log out using chai-guanaco logout.

Model Submission

Upload any LLaMA-based language model with a tokenizer to Hugging Face, e.g. NousResearch/Llama-2-7b-chat-hf. Read this guide if you are unsure. Click the Use in Transformers button on Hugging Face to get your Hugging Face model ID (i.e. "NousResearch/Llama-2-7b-chat-hf").

To submit a model, simply run:

import chai_guanaco as chai

model_url = "NousResearch/Llama-2-7b-chat-hf"  # Your model URL

generation_params = {
    "temperature": 1.0,
    "repetition_penalty": 1.13,
    "top_p": 0.2,
    "top_k": 40,
    "stopping_words": ["\n"],
    "presence_penalty": 0.0,
    "frequency_penalty": 0.0,
}

submission_parameters = {
    "model_repo": model_url,
    "generation_params": generation_params,
    "model_name": "my-awesome-llama",
}

submitter = chai.ModelSubmitter()
submission_id = submitter.submit(submission_parameters)

This will display an animation while your model is being deployed; a typical deployment takes approximately 10 minutes. Note that the model_name parameter is used for showcasing your model on the leaderboard and should help you identify your model.

Chat With Your Model Submission

Once your model is deployed, you can verify its behaviour and raw input by running:

chatbot = chai.SubmissionChatbot(submission_id)
chatbot.chat('nerd_girl', show_model_input=False)

Here you can have a dialog with one of the bots we have provided. To quit the chat, simply enter "exit". Note that, in order to prevent spamming, each model submission is limited to 1000 chat messages from the Chai Guanaco pip package.

You can get a list of available bots by running:

chatbot.show_avaliable_bots()

Finally, to enter a chat session that prints out the raw input that was fed into your model at each turn of the conversation, you can run:

chatbot.chat('nerd_girl', show_model_input=True)

Getting User Feedback

Once your model has been submitted, it is automatically deployed to the Chai Platform, where real users will evaluate your model's performance. To view their feedback, run:

model_feedback = chai.get_feedback(submission_id)
model_feedback.sample()

This will print out one of the user conversations, together with the meta information associated with it (i.e. rating and user feedback).

To get all the feedback for your model, run:

df = model_feedback.df
print(df)

This outputs a Pandas DataFrame, where each row corresponds to a user conversation with your model, together with their feedback.
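
As a quick way to summarise that DataFrame, something like the sketch below works; note that the column names used here (e.g. thumbs_up) are assumptions, so inspect df.columns for the actual schema.

# A rough sketch for summarising the feedback DataFrame from above.
# The 'thumbs_up' column name is an assumption -- check df.columns
# for the exact schema returned by chai.get_feedback().
print(f"Total conversations: {len(df)}")
if "thumbs_up" in df.columns:
    print(f"Thumbs-up rate: {df['thumbs_up'].mean():.2%}")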

Getting Live Leaderboard

To view the public leaderboard used to determine prizes (which only shows the best model submitted by each developer):

df = chai.display_leaderboard(detailed=False)

To see how your model performs against other models, run:

df = chai.display_leaderboard(detailed=True)

which prints out the current leaderboard according to the most recent competition metrics. The raw leaderboard is also returned as a DataFrame in df.
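
For example, to locate your own entry in that DataFrame you could filter on your submission ID; the submission_id column name is an assumption, so check df.columns for the exact schema.

# A minimal sketch: find your own row in the detailed leaderboard above.
# The 'submission_id' column name is an assumption -- check df.columns.
if "submission_id" in df.columns:
    print(df[df["submission_id"] == submission_id])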

Re-Submitting Models

Because it is a competition, you are only allowed to test a single model at any given time. However, you can deactivate a model and submit a new one. To do this, simply run:

chai.deactivate_model(submission_id)

This will deactivate your model. Don't worry, all the model feedback will still be saved; it just means the model will no longer be exposed to users. You can then re-submit by repeating the model submission step.
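
Putting the two steps together, a re-submission might look like the sketch below, reusing model_url and generation_params from the Model Submission section (the tweaked temperature and new model_name are just illustrative).

# Deactivate the current model, tweak a generation parameter, and
# submit again under a new model_name (values here are illustrative).
chai.deactivate_model(submission_id)

generation_params["temperature"] = 0.9
submission_parameters = {
    "model_repo": model_url,
    "generation_params": generation_params,
    "model_name": "my-awesome-llama-v2",
}
submission_id = chai.ModelSubmitter().submit(submission_parameters)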

Retrieve Your Model Submission IDs

In case you have forgotten your submission IDs or want to view all past submissions, run:

submission_ids = chai.get_my_submissions()
print(submission_ids)

Here you will see all your model submission_ids along with their status, which is either failed, inactive or deployed.
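
If the statuses come back as a mapping of submission ID to status (an assumption about the return type; adapt this to whatever get_my_submissions() actually returns), you could filter for deployed models like this:

# Assumes get_my_submissions() returns a mapping of submission_id -> status;
# adjust if the actual return type differs.
submission_ids = chai.get_my_submissions()
deployed = [sid for sid, status in submission_ids.items() if status == "deployed"]
print(deployed)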

Advanced Usage

  • This package caches various data, such as your developer key, in the folder ~/.chai-guanaco. To change this, you can set the environment variable GUANACO_DATA_DIR to point to a different folder. You may need to re-run chai-guanaco login to update the cached developer key.
  • You can also access the raw feedback data by running
     model_feedback = chai.get_feedback(submission_id)
     raw_data = model_feedback.raw_data
    
  • To submit your model with custom formatting, you can create your own PromptFormatter. For more details and examples, please see here.
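
As a rough illustration of what a custom formatter might look like, the sketch below assumes a PromptFormatter base class exposing string templates and a formatter field on the submission; the import path, attribute names, and submission key are all assumptions, so follow the linked guide for the actual interface.

# Illustrative only: the import path, the template attribute names, and the
# 'formatter' submission key are assumptions -- see the PromptFormatter guide.
from chai_guanaco.formatters import PromptFormatter

class MyFormatter(PromptFormatter):
    memory_template = "{bot_name}'s Persona: {memory}\n####\n"
    prompt_template = "### Instruction:\n{prompt}\n"
    bot_template = "{bot_name}: {message}\n"
    user_template = "{user_name}: {message}\n"
    response_template = "{bot_name}:"

submission_parameters["formatter"] = MyFormatter()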

Resources

📒 Fine tuning guide: guide on language model finetuning
💾 Datasets: curated list of open-sourced datasets to get started with finetuning
💖 Guanaco Discord: our Guanaco competition Discord
🚀 Deepspeed Guide: guide for training with DeepSpeed (faster training without a GPU bottleneck)
💬 Example Conversations: 100 example conversations from the Chai Platform
⚒️ Build with us: if you think what we are building is cool, join us!

🦙 Hosted & Sponsored By

Chai · CoreWeave

