Efficiently serve LoRA tuned models
Frequency
Efficiently serve LoRA tuned models.
Frequency hot-swaps LoRA layers in ML models at inference time, so a single large base model can be used efficiently across many adapters.
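To make the idea of hot-swapping concrete, here is a minimal conceptual sketch in plain Python. It is not Frequency's implementation: the `HotSwapLinear` class, its methods, and the toy weights are all hypothetical and exist only to show how one shared base weight can be combined with different low-rank LoRA updates per request.

```python
# Conceptual sketch only: not Frequency's actual implementation.
# A LoRA adapter stores two small matrices A (r x d_in) and B (d_out x r);
# the adapted layer computes y = W x + scale * B (A x), leaving W untouched.

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class HotSwapLinear:
    """A linear layer with one shared base weight and named LoRA adapters."""

    def __init__(self, base_weight):
        self.base_weight = base_weight  # shared by every request
        self.adapters = {}              # name -> (A, B, scale)

    def cache_adapter(self, name, a, b, scale=1.0):
        # Caching stores only the small low-rank factors, never a copy of W.
        self.adapters[name] = (a, b, scale)

    def forward(self, x, adapter=None):
        y = matvec(self.base_weight, x)
        if adapter is None:
            return y
        a, b, scale = self.adapters[adapter]
        delta = matvec(b, matvec(a, x))  # low-rank update B (A x)
        return [yi + scale * di for yi, di in zip(y, delta)]


# Toy 2x2 base weight and two rank-1 adapters (made-up numbers).
layer = HotSwapLinear(base_weight=[[1.0, 0.0], [0.0, 1.0]])
layer.cache_adapter("dog", a=[[1.0, 1.0]], b=[[1.0], [0.0]])
layer.cache_adapter("cat", a=[[1.0, -1.0]], b=[[0.0], [2.0]])

x = [3.0, 2.0]
print(layer.forward(x))                 # base model only
print(layer.forward(x, adapter="dog"))  # same base weights, "dog" adapter
print(layer.forward(x, adapter="cat"))  # same base weights, "cat" adapter
```

The point of the design is that switching adapters changes only which small factors are applied; the large base weight is loaded once and reused.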
Install
pip install frequency-ai
Install the server component on Kubernetes
helm install frequency oci://artifact.frequency.ai/frequency-server:0.0.1
Usage
Load a HuggingFace model and use adapters
from transformers import AutoModelForCausalLM, AutoTokenizer
from frequency import Client
# Connect to the frequency server
client = Client("localhost:9000")
# Load an hf model onto the server
model = client.load_model(name="qwen-vl-chat", hf_repo="Qwen/Qwen-VL-Chat", type=AutoModelForCausalLM)
# Cache an adapter on the server that was trained on dog images
resp = model.cache_adapter(name="dog", hf_repo="Anima-ai/dog_lora")
# Qwen expects a specific format for describing images
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-VL-Chat", trust_remote_code=True)
query = tokenizer.from_list_format([
    {'image': 'https://hips.hearstapps.com/ghk.h-cdn.co/assets/17/30/pembroke-welsh-corgi.jpg'},
    {'text': 'What is this?'},
])
# Chat with the model using the dog adapter
response, history = model.chat(query=query, adapters=["dog"])
#> Here is a picture of a Corgi
# Cache an adapter on the server that was trained on cat images
resp = model.cache_adapter(name="cat", hf_repo="Anima-ai/cat_lora")
print(resp)
query = tokenizer.from_list_format([
    {'image': 'https://www.catster.com/wp-content/uploads/2023/11/Brown-tabby-cat-that-curls-up-outdoors_viper-zero_Shutterstock-800x533.jpg'},
    {'text': 'What is this?'},
])
# Chat with the same model using the new cat adapter
response, history = model.chat(query=query, adapters=["cat"])
#> Here is a picture of a tabby cat
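The two chat calls above repeat the same steps, so they could be wrapped in a small helper. The `ask` function below is a hypothetical convenience, not part of the frequency package. It uses only the calls shown in the usage example (`tokenizer.from_list_format` and `model.chat(query=..., adapters=[...])`), and it assumes the adapter has already been cached on the server and that `chat` returns a `(response, history)` pair as shown.

```python
def ask(model, tokenizer, image_url, question, adapter):
    """Ask one question about one image using a named, already-cached adapter.

    Hypothetical helper: `model` and `tokenizer` are the objects created in
    the usage example; nothing here is provided by the frequency package.
    """
    # Build the Qwen-style query from an image URL and a text prompt.
    query = tokenizer.from_list_format([
        {"image": image_url},
        {"text": question},
    ])
    # Assumes chat returns (response, history), as in the usage example.
    response, _history = model.chat(query=query, adapters=[adapter])
    return response
```

With the objects from the example, a call might look like `ask(model, tokenizer, image_url, "What is this?", "dog")`, where `image_url` is any image the model can fetch.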
Roadmap
- Tenancy