An open platform for training, serving, and evaluating large language model based chatbots.

FastChat

Release

🔥 We released Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90% ChatGPT Quality. Check out the blog post and demo.

Join our Discord server and follow our Twitter to get the latest updates.

Announcement: Thank you for checking out our project and for your interest! We plan to release the model weights once we have addressed all legal concerns and have a low-resource version of the inference code ready. Based on our current timeline, they will be available by early next week. Please stay tuned! 🦙

Install
Serving
Evaluation
Fine-tuning

Install

Clone this repository and navigate to the FastChat folder

git clone https://github.com/lm-sys/FastChat.git
cd FastChat

Install Package

pip3 install --upgrade pip  # enable PEP 660 support
pip3 install -e .

Install the latest main branch of huggingface/transformers

pip3 install git+https://github.com/huggingface/transformers
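
After the install steps, a quick import check can confirm that the packages are visible to the interpreter. The sketch below is illustrative and not part of FastChat: the helper names `check_installed` and `report` are hypothetical, and it assumes the editable install exposes a top-level `fastchat` module (the module name used by the serving commands in this README).

```python
import importlib.util


def check_installed(module_names):
    """Return {module_name: bool} telling whether each top-level module can be found."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in module_names
    }


def report(module_names=("fastchat", "transformers")):
    """Print one status line per module (hypothetical convenience wrapper)."""
    for name, found in check_installed(module_names).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

Calling `report()` in the environment where you ran the pip3 commands prints one line per module; a MISSING entry suggests the install went to a different interpreter.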

Serving

We plan to release the model weights as delta weights that build on the original LLaMA weights, but we are still working out a proper way to do so. In this example, we demonstrate our distributed serving system using OPT models. Later, you can apply similar commands to serve Vicuna, as shown in our demo.

Command Line Interface

python3 -m fastchat.serve.cli --model-name facebook/opt-1.3b

Web UI

Launch a controller

python3 -m fastchat.serve.controller

Launch a model worker

python3 -m fastchat.serve.model_worker --model-path facebook/opt-1.3b

Send a test message

python3 -m fastchat.serve.test_message

Launch a gradio web server.

python3 -m fastchat.serve.gradio_web_server

You can open your browser and chat with a model now.
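
The Web UI steps above can also be scripted. The following is a minimal sketch, not a FastChat utility: `build_commands` and `launch_all` are hypothetical helpers, and the `startup_delay` value is a guess rather than a documented setting. It only assembles the same commands shown above and starts them in the same order.

```python
import subprocess
import sys
import time


def build_commands(model_path="facebook/opt-1.3b", python=sys.executable):
    """Return the controller, worker, and web server commands in launch order."""
    return [
        [python, "-m", "fastchat.serve.controller"],
        [python, "-m", "fastchat.serve.model_worker", "--model-path", model_path],
        [python, "-m", "fastchat.serve.gradio_web_server"],
    ]


def launch_all(model_path="facebook/opt-1.3b", startup_delay=10.0):
    """Start each process in order, pausing between launches.

    Assumption: each component needs the previous one to be up before it
    starts, which is why the README lists them in this order. The delay is
    an arbitrary placeholder, not a value taken from FastChat.
    """
    processes = []
    for command in build_commands(model_path):
        processes.append(subprocess.Popen(command))
        time.sleep(startup_delay)
    return processes
```

`launch_all()` returns the `Popen` handles so the caller can terminate the processes later; running the three commands in separate terminals, as described above, remains the simpler option for debugging.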

Evaluation

Our AI-enhanced evaluation pipeline is based on GPT-4. Here are some high-level instructions for using the pipeline:

First, generate answers from different models. Use qa_baseline_gpt35.py for ChatGPT, or specify the model checkpoint and run model_qa.py for Vicuna and other models.

Then, use GPT-4 to generate reviews automatically. If the GPT-4 API is not available to you, this step can be done manually. Once you have your evaluation data, visualize the results by running generate_webpage_data_from_table.py, which generates data for a static website.

Finally, serve a static website under the webpage directory. You can simply use python3 -m http.server to serve the website locally.
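
For reference, the same local serving step can be done from Python with the standard library. This is a sketch under assumptions: `make_server` is a hypothetical helper, and the default directory name `webpage`, host, and port are illustrative choices.

```python
import functools
import http.server


def make_server(directory="webpage", host="127.0.0.1", port=8000):
    """Create (but do not start) an HTTP server rooted at `directory`."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    return http.server.ThreadingHTTPServer((host, port), handler)
```

Calling `make_server().serve_forever()` blocks and serves the files until interrupted. Like `python3 -m http.server`, this is intended for local viewing only.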

Besides the evaluation workflow, we also document the data format used for evaluation. The data is encoded as JSON Lines and includes information on models, prompts, reviewers, questions, answers, and reviews. You can use this data to customize the evaluation process or to contribute to our project.
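
JSON Lines stores one JSON object per line, so the files are easy to read and write with the standard library. The sketch below shows a round trip. The field names in `EXAMPLE_ANSWERS` are hypothetical and chosen for illustration only; the actual schema is described in the evaluation documentation.

```python
import json

# Hypothetical records; the real field names may differ.
EXAMPLE_ANSWERS = [
    {"question_id": 1, "model_id": "model-a", "text": "First answer."},
    {"question_id": 2, "model_id": "model-a", "text": "Second answer."},
]


def write_jsonl(path, records):
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read a JSON Lines file into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Because each line is independent, records can be appended or filtered without parsing the whole file.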

Check the evaluation documentation for detailed instructions.

Fine-tuning

Data

Vicuna is created by fine-tuning a LLaMA base model on approximately 70K user-shared conversations gathered from ShareGPT.com through public APIs. To ensure data quality, we convert the HTML back to markdown and filter out some inappropriate or low-quality samples. Additionally, we divide lengthy conversations into smaller segments that fit the model's maximum context length. Detailed instructions for cleaning the ShareGPT data are available in the repository.
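
To illustrate the segmentation step, here is a minimal sketch of greedily packing conversation turns into segments that stay within a token budget. It is not the project's cleaning script. The turn format (`{"from": ..., "value": ...}`), the helper names, and the whitespace-based token counter are all assumptions; a real pipeline would count tokens with the model tokenizer.

```python
def count_tokens(text):
    """Stand-in token counter (whitespace split), used only for illustration."""
    return len(text.split())


def split_conversation(turns, max_length=2048, token_counter=count_tokens):
    """Greedily pack consecutive turns into segments of at most max_length tokens.

    A single turn longer than max_length is kept as its own oversized
    segment; truncating it is left to the caller.
    """
    segments, current, current_len = [], [], 0
    for turn in turns:
        n = token_counter(turn["value"])
        # Close the current segment if this turn would push it over budget.
        if current and current_len + n > max_length:
            segments.append(current)
            current, current_len = [], 0
        current.append(turn)
        current_len += n
    if current:
        segments.append(current)
    return segments
```

Turns are never split in the middle, so every segment remains a sequence of whole messages.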

Due to some concerns, we may not release the data at the moment. If you would like to try the fine-tuning code, you can try to run it with our preprocessed alpaca dataset (originally from here).

Code and Hyperparameters

We fine-tune the model using the code from Stanford Alpaca, with some modifications to support gradient checkpointing and Flash Attention. Our hyperparameters are similar to those of Stanford Alpaca.

Hyperparameter	Global Batch Size	Learning rate	Epochs	Max length	Weight decay
Vicuna-13B	128	2e-5	3	2048	0

Fine-tuning on Any Cloud with SkyPilot

SkyPilot is a framework built by UC Berkeley for running ML workloads easily and cost-effectively on any cloud (AWS, GCP, Azure, Lambda, etc.). To use SkyPilot, install it with the following command and set up the cloud credentials locally by following the SkyPilot setup instructions.

# Install skypilot from the master branch
pip install git+https://github.com/skypilot-org/skypilot.git

Vicuna

Vicuna can be trained on 8 A100 GPUs with 80GB memory. The following command automatically launches a node that satisfies this requirement, then sets up and runs the training job on it.

sky launch -c vicuna -s scripts/train-vicuna.yaml --env WANDB_API_KEY

Other options are also valid:

# Launch it on managed spot to save 3x cost (train Vicuna-13B with around $300)
sky spot launch -n vicuna scripts/train-vicuna.yaml --env WANDB_API_KEY

# Train a 7B model
sky launch -c vicuna -s scripts/train-vicuna.yaml --env WANDB_API_KEY --env MODEL_SIZE=7

Note: Make sure WANDB_API_KEY has been set up on your local machine. You can find the API key on your wandb profile page. To train the model without wandb, replace the --env WANDB_API_KEY flag with --env WANDB_MODE=offline.

Alpaca

Launch the training job with the following command. It runs on a single node with 4 A100-80GB GPUs.

sky launch -c alpaca -s scripts/train-alpaca.yaml --env WANDB_API_KEY

Fine-tuning with Local GPUs

Vicuna can also be trained locally on 8 A100 GPUs with 80GB memory using the following command. To train on fewer GPUs, reduce per_device_train_batch_size and increase gradient_accumulation_steps accordingly to keep the global batch size the same. To set up the environment, see the setup section in scripts/train-vicuna.yaml.

torchrun --nnodes=1 --nproc_per_node=8 --master_port=<your_random_port> \
    fastchat/train/train_mem.py \
    --model_name_or_path <path-to-llama-model-weight> \
    --data_path <path-to-data> \
    --bf16 True \
    --output_dir ./checkpoints \
    --num_train_epochs 3 \
    --per_device_train_batch_size 4 \
    --per_device_eval_batch_size 4 \
    --gradient_accumulation_steps 1 \
    --evaluation_strategy "no" \
    --save_strategy "steps" \
    --save_steps 1200 \
    --save_total_limit 100 \
    --learning_rate 2e-5 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --fsdp "full_shard auto_wrap" \
    --fsdp_transformer_layer_cls_to_wrap 'LlamaDecoderLayer' \
    --tf32 True \
    --model_max_length 2048 \
    --gradient_checkpointing True \
    --lazy_preprocess True
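
When adapting the command to fewer GPUs, the arithmetic can be checked with a small helper. This sketch assumes the usual data-parallel relationship, where the global batch size is the per-device batch size times the number of GPUs times the gradient accumulation steps. The function names are hypothetical.

```python
def global_batch_size(per_device_batch_size, num_gpus, gradient_accumulation_steps):
    """Samples contributing to one optimizer step across all GPUs."""
    return per_device_batch_size * num_gpus * gradient_accumulation_steps


def accumulation_steps_for(target_global_batch_size, per_device_batch_size, num_gpus):
    """Gradient accumulation steps needed to reach the target global batch size."""
    per_step = per_device_batch_size * num_gpus
    if target_global_batch_size % per_step != 0:
        raise ValueError(
            f"target {target_global_batch_size} is not a multiple of "
            f"{per_device_batch_size} x {num_gpus} = {per_step}"
        )
    return target_global_batch_size // per_step
```

With the values in the command above (per-device batch size 4, 8 GPUs, 1 accumulation step), `global_batch_size(4, 8, 1)` is 32. Keeping that value on 4 GPUs requires `accumulation_steps_for(32, 4, 4)`, which is 2. Reaching the 128 listed in the hyperparameter table with the same per-device batch size on 8 GPUs would take 4 accumulation steps.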

Release history

0.2.36

Feb 11, 2024

0.2.35

Jan 17, 2024

0.2.34

Dec 9, 2023

0.2.33

Nov 22, 2023

0.2.32

Nov 1, 2023

0.2.31

Oct 14, 2023

0.2.30

Oct 2, 2023

0.2.29

Sep 20, 2023

0.2.28

Sep 11, 2023

0.2.27

Sep 10, 2023

0.2.26

Aug 28, 2023

0.2.24

Aug 13, 2023

0.2.23

Aug 2, 2023

0.2.21

Aug 1, 2023

0.2.20

Jul 21, 2023

0.2.18

Jul 5, 2023

0.2.17

Jul 2, 2023

0.2.16

Jun 29, 2023

0.2.15

Jun 18, 2023

0.2.14

Jun 14, 2023

0.2.13

Jun 12, 2023

0.2.12

Jun 9, 2023

0.2.11

May 24, 2023

0.2.10

May 21, 2023

0.2.9

May 15, 2023

0.2.8

May 12, 2023

0.2.7

May 8, 2023

0.2.6

May 8, 2023

0.2.5

Apr 30, 2023

0.2.4

Apr 28, 2023

0.2.3

Apr 21, 2023

0.2.2

Apr 16, 2023

0.2.1

Apr 12, 2023

0.2.0

Apr 12, 2023

0.1.10

Apr 11, 2023

0.1.9

Apr 8, 2023

0.1.8

Apr 7, 2023

0.1.7

Apr 6, 2023

0.1.6

Apr 6, 2023

0.1.5

Apr 6, 2023

0.1.4

Apr 6, 2023

0.1.3

Apr 3, 2023

This version

0.1.2

Apr 3, 2023

0.1.1

Mar 31, 2023

Download files

Source Distribution

fschat-0.1.2.tar.gz (41.9 kB)

Uploaded Apr 3, 2023 Source

Built Distribution

fschat-0.1.2-py3-none-any.whl (49.1 kB)

Uploaded Apr 3, 2023 Python 3

File details

Details for the file fschat-0.1.2.tar.gz.

File metadata

Download URL: fschat-0.1.2.tar.gz
Upload date: Apr 3, 2023
Size: 41.9 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.9.16

File hashes

Hashes for fschat-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`714448cbacb6e49a3d1b469a46834343047a400215177309bc2c46921ee567fb`
MD5	`4968c2593eaafc82629e6cde31cf36f3`
BLAKE2b-256	`d83e380dc18642957eb86080d60a6cb1f65a7b53d57fd5ca865453282181d6be`

File details

Details for the file fschat-0.1.2-py3-none-any.whl.

File metadata

Download URL: fschat-0.1.2-py3-none-any.whl
Upload date: Apr 3, 2023
Size: 49.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/4.0.2 CPython/3.9.16

File hashes

Hashes for fschat-0.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a451422fe250d0578c26b8798823976eab7817c6d2aaab189c388c4734107ff9`
MD5	`3befb79c1de729909edf15015985100e`
BLAKE2b-256	`1fefd935a7d4aad800fc670930c4bf70aa466a127b988810a68179570badc360`

fschat 0.1.2
