
Tools for fine-tuning and serving LLMs

LLM-ATC (Air Traffic Controller) is a CLI for fine-tuning and serving open-source models using your own cloud credentials. We hope this project lowers the cognitive overhead of orchestrating fine-tuning and model serving.

Refer to the docs for the most up-to-date usage information; this README is updated less frequently.

Installation

Follow the instructions here to install SkyPilot and provide cloud credentials; we use SkyPilot for cloud orchestration. The steps to set up an environment are shown below.

# create a fresh environment
conda create -n "sky" python=3.10 
conda activate sky

# install llm-atc
pip install llm-atc

# For Macs: macOS >= 10.15 has a conflict with grpcio, so reinstall it from conda-forge
pip uninstall -y grpcio; conda install -c conda-forge grpcio=1.43.0 --force-reinstall

# Configure your cloud credentials. This is a GCP example; see https://skypilot.readthedocs.io/en/latest/getting-started/installation.html for examples with other cloud providers.
pip install google-api-python-client
conda install -c conda-forge google-cloud-sdk
gcloud init
gcloud auth application-default login

# double check that your credentials are properly set for your desired provider(s)
sky check

From PyPI

pip install llm-atc

From source

pip install -e .

Fine-tuning

Supported fine-tuning methods:

  • Vicuna-Llama (chat fine-tuning; see the dataset sketch below)
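
llm-atc's chat fine-tuning is built on FastChat (see "How does it work?" below), so the training data most likely follows FastChat's conversation JSON schema; treat the exact schema as an assumption and check the docs for the authoritative format. A minimal sketch of a dataset like the ./vicuna_test.json used below:

# write a tiny chat dataset in FastChat's conversation format
# (schema assumed from the FastChat backend; verify in the docs)
cat > vicuna_test.json <<'EOF'
[
  {
    "id": "identity_0",
    "conversations": [
      { "from": "human", "value": "What is your name?" },
      { "from": "gpt", "value": "My name is vicuna." }
    ]
  }
]
EOF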

To start fine-tuning a model, use llm-atc train. For example:

# start training
llm-atc train --model_type vicuna --finetune_data ./vicuna_test.json --name myvicuna --description "This is a finetuned model that just says its name is vicuna" -c mycluster --cloud gcp --envs "MODEL_SIZE=7 WANDB_API_KEY=<my wandb key>" --accelerator A100-80G:4

# shutdown cluster when done
sky down mycluster

If your client disconnects during training, the run will continue on the cluster. You can check its status and stream logs with SkyPilot, for example:
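
# list jobs running on the cluster
sky queue mycluster

# stream the logs of the most recent job
sky logs mycluster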

When training completes, your model is saved by default to an object store corresponding to the cloud provider that launched the training instance. For example:

# s3 location
s3://llm-atc/myvicuna
# gcs location
gs://llm-atc/myvicuna
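
To pull a checkpoint down locally, the standard cloud CLIs work against these paths (the local destination directory here is just an example):

# copy the checkpoint locally with the CLI matching your provider
aws s3 cp --recursive s3://llm-atc/myvicuna ./myvicuna
gsutil -m cp -r gs://llm-atc/myvicuna ./myvicuna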

Serving

llm-atc can serve both models from HuggingFace and models you've trained with llm-atc, via llm-atc serve. For example:

# serve an llm-atc finetuned model, requires `llm-atc/` prefix and grabs model checkpoint from object store
llm-atc serve --name llm-atc/myvicuna --accelerator A100:1 -c servecluster --cloud gcp --region asia-southeast1 --envs "HF_TOKEN=<HuggingFace_token>"

# serve a HuggingFace model, e.g. `lmsys/vicuna-13b-v1.3`
llm-atc serve --name lmsys/vicuna-13b-v1.3 --accelerator A100:1 -c servecluster --cloud gcp --region asia-southeast1 --envs "HF_TOKEN=<HuggingFace_token>"

This creates an OpenAI-compatible API server on port 8000 of the cluster head, along with one model worker. Make a request from your laptop with:

# get the ip address of the OpenAI server
ip=$(grep -A1 "Host servecluster" ~/.ssh/config | grep "HostName" | awk '{print $2}')

# test which models are available
curl http://$ip:8000/v1/models
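
# send a chat completion request to the OpenAI-compatible endpoint
# (the model name here is an assumption; use one listed by /v1/models)
curl http://$ip:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llm-atc/myvicuna", "messages": [{"role": "user", "content": "What is your name?"}]}'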

# stop model server cluster
sky stop servecluster

You can connect to this server and build with your fine-tuned models using other LLM frameworks such as LlamaIndex. Look at examples/ to see how to interact with your API endpoint.

Telemetry

By default, LLM-ATC collects anonymized data about when a train or serve request is made, using PostHog. Telemetry helps us identify where users are engaging with LLM-ATC. If you would like to disable telemetry, set:

export LLM_ATC_DISABLE=1

How does it work?

Training, serving, and orchestration are powered by SkyPilot, FastChat, and vLLM. We made this choice because we believe it lets people train and deploy custom LLMs without cloud lock-in.

We currently rely on default hyperparameters from the upstream training repositories. We will add options to override these so that users have more control over training; for now, the defaults should suffice for most use cases.
