Extract datasets from models and train slimmer LoRAs on them
Project description
Easily extract prompt/completion datasets from models and generate Axolotl configs to auto-distill smaller, slimmer LoRAs from the original models.
Example
from unfat.datasets import hub_prompts, HubSplit, Dataset, Prompts
from unfat.extract import Extractor, ClientOpts
from unfat.lora import LoraSettings
import os
output_dir = "output"
extractor = Extractor(
    # Extract from Qwen2.5-Coder-32B-Instruct
    teacher="hf:Qwen/Qwen2.5-Coder-32B-Instruct",
    # Make up to 10 concurrent requests at a time
    max_concurrent=10,
    output_dir=output_dir,
    # Use glhf.chat for the API
    client_opts=ClientOpts(
        base_url="https://glhf.chat/api/openai/v1",
        api_key=os.environ["GLHF_API_KEY"],
    ),
    # Pull the prompts from a coding dataset
    dataset=Dataset(
        train=[
            hub_prompts(
                name="perlthoughts/coding-prompts-small",
                text_field="instruction",
                split=HubSplit(name="train"),
            ),
        ],
    ),
)
# Runs the coding prompts through Qwen2.5-Coder-32B-Instruct and saves the
# completions to the output dir
extractor.run()
# Training hyperparameters
lora_settings = LoraSettings(
    lora_r=32,
    lora_alpha=16,
    lora_dropout=0.01,
    num_epochs=2,
    learning_rate=4e-4,
    warmup_steps=10,
)
# Save the Axolotl config to train a LoRA for Llama-3.1-70B-Instruct
axolotl_config = lora_settings.llama_70b_axolotl(extractor.output_dataset())
axolotl_config.save(output_dir)
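To make the role of `max_concurrent` more concrete, here is a minimal sketch of bounded-concurrency extraction using only the standard library. It is not unfat's actual implementation: `fake_teacher`, `extract`, `to_jsonl`, and the prompt/completion JSONL layout are all hypothetical names and assumptions chosen for illustration, and the teacher call is a local stub rather than a real API request.

```python
import asyncio
import json


async def fake_teacher(prompt: str) -> str:
    """Hypothetical stand-in for a chat-completion request to the teacher model."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"completion for: {prompt}"


async def extract(prompts, teacher, max_concurrent=10):
    """Run every prompt through `teacher`, at most `max_concurrent` at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def worker(prompt):
        # Waits here while `max_concurrent` teacher calls are already in flight.
        async with semaphore:
            completion = await teacher(prompt)
        return {"prompt": prompt, "completion": completion}

    # gather() returns results in input order, so records line up with prompts.
    return await asyncio.gather(*(worker(p) for p in prompts))


def to_jsonl(records) -> str:
    """Serialize records as one JSON object per line (an assumed output format)."""
    return "\n".join(json.dumps(record) for record in records)


records = asyncio.run(extract([f"prompt {i}" for i in range(25)], fake_teacher))
print(to_jsonl(records[:2]))
```

The semaphore caps the number of in-flight requests, which is one common way to stay within an API provider's rate limits while still overlapping network latency.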
Download files
Source Distribution
unfat-0.0.1.tar.gz (5.3 kB)
Built Distribution
unfat-0.0.1-py3-none-any.whl (6.9 kB)
File details
Details for the file unfat-0.0.1.tar.gz.
File metadata
- Download URL: unfat-0.0.1.tar.gz
- Upload date:
- Size: 5.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.0.1 CPython/3.13.1 Linux/6.12.10-200.fc41.x86_64
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f7470939bedbb89e0c6a0e35622b9a3d78aa04cb32d256739d6518bac50a8117 |
| MD5 | e0ce2f41a5796e5acd32d976764744dc |
| BLAKE2b-256 | 37c868bc37126c0386c38fa0a1d187189b196808a7d35230507766b17b8c454d |
File details
Details for the file unfat-0.0.1-py3-none-any.whl.
File metadata
- Download URL: unfat-0.0.1-py3-none-any.whl
- Upload date:
- Size: 6.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.0.1 CPython/3.13.1 Linux/6.12.10-200.fc41.x86_64
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 985455279a8a198af64541919c986234cd64a5bbf5749552637d0703cad5cc81 |
| MD5 | c15adda102f694fadb770f6b019f7184 |
| BLAKE2b-256 | f24a65ade1dc3e6625b04a18f13b1ecc3a465beb03335e89e6d27510332c771c |