Frequency
Efficiently serve LoRA tuned models.
Frequency hot-swaps LoRA layers in ML models at inference time, so a large base model can be loaded once and used efficiently with different adapters.
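To make the idea of hot-swapping concrete, here is a minimal, dependency-free sketch of the standard LoRA formulation, `y = W x + scale * B (A x)`. It is not Frequency's implementation: the `LoRALinear` class, its toy `cache_adapter` method, and the tiny matrices are all hypothetical and exist only for illustration. It shows one base weight that is never modified while small named adapters are selected per call.

```python
# Illustrative sketch only: NOT Frequency's implementation.
# All names here (LoRALinear, matvec, the toy adapters) are hypothetical.
from typing import Dict, List, Optional, Tuple

Matrix = List[List[float]]
Vector = List[float]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


class LoRALinear:
    """A frozen base weight W plus named low-rank adapters (A, B, scale)."""

    def __init__(self, weight: Matrix) -> None:
        self.weight = weight  # shared by every adapter, never modified
        self.adapters: Dict[str, Tuple[Matrix, Matrix, float]] = {}

    def cache_adapter(self, name: str, a: Matrix, b: Matrix, scale: float = 1.0) -> None:
        """Store a low-rank pair: A has shape (r, in), B has shape (out, r)."""
        self.adapters[name] = (a, b, scale)

    def forward(self, x: Vector, adapter: Optional[str] = None) -> Vector:
        """Compute W x, plus scale * B (A x) when an adapter is selected."""
        y = matvec(self.weight, x)
        if adapter is not None:
            a, b, scale = self.adapters[adapter]
            delta = matvec(b, matvec(a, x))
            y = [yi + scale * di for yi, di in zip(y, delta)]
        return y


# One base layer (identity weight) and two rank-1 adapters.
layer = LoRALinear([[1.0, 0.0], [0.0, 1.0]])
layer.cache_adapter("dog", a=[[1.0, 0.0]], b=[[0.0], [1.0]])
layer.cache_adapter("cat", a=[[0.0, 1.0]], b=[[1.0], [0.0]])

x = [2.0, 3.0]
print(layer.forward(x))                 # [2.0, 3.0]  (base model only)
print(layer.forward(x, adapter="dog"))  # [2.0, 5.0]
print(layer.forward(x, adapter="cat"))  # [5.0, 3.0]
```

Switching adapters only changes which small `(A, B)` pair is applied, which is why the base weight can stay loaded while adapters are swapped.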
Install
pip install frequency-ai
Install the server component on Kubernetes
helm install frequency oci://artifact.frequency.ai/frequency-server:0.0.1
Usage
Load a HuggingFace model and use adapters
from transformers import AutoModelForCausalLM, AutoTokenizer
from frequency import Client
# Connect to the frequency server
client = Client("localhost:9000")
# Load a Hugging Face model onto the server
model = client.load_model(name="qwen-vl-chat", hf_repo="Qwen/Qwen-VL-Chat", type=AutoModelForCausalLM)
# Cache an adapter on the server that was trained on dog images
resp = model.cache_adapter(name="dog", hf_repo="Anima-ai/dog_lora")
# Qwen expects a specific format for describing images
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-VL-Chat", trust_remote_code=True)
query = tokenizer.from_list_format([
    {'image': 'https://hips.hearstapps.com/ghk.h-cdn.co/assets/17/30/pembroke-welsh-corgi.jpg'},
    {'text': 'What is this?'},
])
# Chat with the model using the dog adapter
response, history = model.chat(query=query, adapters=["dog"])
#> Here is a picture of a Corgi
# Cache an adapter on the server that was trained on cat images
resp = model.cache_adapter(name="cat", hf_repo="Anima-ai/cat_lora")
print(resp)
query = tokenizer.from_list_format([
    {'image': 'https://www.catster.com/wp-content/uploads/2023/11/Brown-tabby-cat-that-curls-up-outdoors_viper-zero_Shutterstock-800x533.jpg'},
    {'text': 'What is this?'},
])
# Chat with the same model using the new cat adapter
response, history = model.chat(query=query, adapters=["cat"])
#> Here is a picture of a tabby cat
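The two requests above differ only in the image URL and the adapter name, so the pattern can be wrapped in a small helper. The sketch below is hypothetical: `describe_image` is not part of the `frequency` package, and it assumes only what the example above shows, namely that the tokenizer exposes `from_list_format` and that `model.chat(query=..., adapters=[...])` returns a `(response, history)` pair.

```python
from typing import Any


def describe_image(
    model: Any,
    tokenizer: Any,
    image_url: str,
    adapter: str,
    question: str = "What is this?",
) -> str:
    """Hypothetical helper: ask one question about one image via one adapter.

    `model` and `tokenizer` are expected to behave like the objects in the
    usage example above; nothing else about their API is assumed.
    """
    # Build the query in the list format that Qwen-VL's tokenizer expects.
    query = tokenizer.from_list_format([
        {"image": image_url},
        {"text": question},
    ])
    # Select the adapter per request; the history is discarded here.
    response, _history = model.chat(query=query, adapters=[adapter])
    return response
```

With a helper like this, switching adapters between requests becomes a matter of changing a single argument, for example `describe_image(model, tokenizer, url, adapter="cat")`.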
Roadmap
- Tenancy
Download files
File details
Details for the file frequency_ai-0.1.8.tar.gz.
File metadata
- Download URL: frequency_ai-0.1.8.tar.gz
- Upload date:
- Size: 42.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.5.1 CPython/3.10.1 Darwin/22.6.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 53c2abb09ba2e2bfc586532d52e63820f9969dffec5b4e30d8f4e9a9956bb523 |
| MD5 | 141fa75f706095794e8da0c170672ae8 |
| BLAKE2b-256 | 254dae6bd7d28e4ce094ba9a46653d0e03781608a45d91182685f614817a04b9 |
File details
Details for the file frequency_ai-0.1.8-py3-none-any.whl.
File metadata
- Download URL: frequency_ai-0.1.8-py3-none-any.whl
- Upload date:
- Size: 55.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.5.1 CPython/3.10.1 Darwin/22.6.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 51b26bc9621fe67d03ba4ce7b6cd37f3d9a7fc4c4a6d2ee56c95ee2313cff51d |
| MD5 | 000c05bd6987cfca4be13eca3c183629 |
| BLAKE2b-256 | 7ee10322fccfbe449ecde268f9c0816684b896dd1cda9dd39a195aba493c7f21 |
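
A downloaded file can be checked against the SHA256 digests listed above using Python's standard `hashlib` module. This is a minimal sketch: the function name `sha256_matches` is hypothetical, and the path in the usage note assumes the file was saved to the current directory.

```python
import hashlib


def sha256_matches(path: str, expected_hex: str, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file at `path` has the given SHA256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.strip().lower()
```

For example, `sha256_matches("frequency_ai-0.1.8.tar.gz", "53c2abb09ba2e2bfc586532d52e63820f9969dffec5b4e30d8f4e9a9956bb523")` should return `True` for an intact copy of the source distribution.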