Extract datasets from models and train slimmer LoRAs on them
Project description
Easily extract prompt/completion datasets from models and generate Axolotl configs to auto-distill smaller, slimmer LoRAs from the original models.
Example
from unfat.datasets import hub_prompts, HubSplit, Dataset, Prompts
from unfat.extract import Extractor, ClientOpts
from unfat.lora import LoraSettings
import os
output_dir = "output"
extractor = Extractor(
    # Extract from Qwen2.5-Coder-32B-Instruct
    teacher="hf:Qwen/Qwen2.5-Coder-32B-Instruct",
    # Make up to 10 concurrent requests at a time
    max_concurrent=10,
    output_dir=output_dir,
    # Use glhf.chat for the API
    client_opts=ClientOpts(
        base_url="https://glhf.chat/api/openai/v1",
        api_key=os.environ["GLHF_API_KEY"],
    ),
    # Pull the prompts from a coding dataset
    dataset=Dataset(
        train=[
            hub_prompts(
                name="perlthoughts/coding-prompts-small",
                text_field="instruction",
                split=HubSplit(name="train"),
            ),
        ],
    ),
)
# Run the coding prompts through Qwen2.5-Coder-32B-Instruct and save the
# resulting prompt/completion pairs to the output dir
extractor.run()
# Training hyperparameters
lora_settings = LoraSettings(
    lora_r=32,
    lora_alpha=16,
    lora_dropout=0.01,
    num_epochs=2,
    learning_rate=4e-4,
    warmup_steps=10,
)
# Save the Axolotl config to train a LoRA for Llama-3.1-70B-Instruct
axolotl_config = lora_settings.llama_70b_axolotl(extractor.output_dataset())
axolotl_config.save(output_dir)
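To make the `max_concurrent=10` setting above more concrete, here is a minimal sketch of bounded-concurrency extraction using only the standard library. It is not unfat's implementation: `fake_teacher`, `extract`, and `run_extraction` are hypothetical names, and `fake_teacher` stands in for a real chat-completion request to the teacher model.

```python
import asyncio


async def fake_teacher(prompt: str) -> str:
    """Hypothetical stand-in for a request to the teacher model's API."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"completion for: {prompt}"


async def extract(prompts, max_concurrent=10):
    """Collect prompt/completion pairs with at most max_concurrent calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def one(prompt):
        # The semaphore blocks here once max_concurrent requests are running.
        async with semaphore:
            completion = await fake_teacher(prompt)
        return {"prompt": prompt, "completion": completion}

    # gather() returns results in the same order as the input prompts.
    return await asyncio.gather(*(one(p) for p in prompts))


def run_extraction(prompts, max_concurrent=10):
    """Synchronous entry point for the sketch above."""
    return asyncio.run(extract(prompts, max_concurrent))
```

Calling `run_extraction(["a", "b", "c"], max_concurrent=2)` yields three prompt/completion dictionaries while never running more than two simulated requests at once.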
Download files
Source Distribution
unfat-0.0.3.tar.gz (5.3 kB)
Built Distribution
unfat-0.0.3-py3-none-any.whl (6.9 kB)
File details
Details for the file unfat-0.0.3.tar.gz.
File metadata
- Download URL: unfat-0.0.3.tar.gz
- Upload date:
- Size: 5.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.0.1 CPython/3.13.1 Linux/6.12.10-200.fc41.x86_64
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ca8b880e295b30a34606dc1fe92212f5807326d8123bd7d8f96d55e121aa9057 |
| MD5 | 25ebca6a247911c6b267023ea4c874a5 |
| BLAKE2b-256 | ce7b7a70fabe13516cb2d8f464b2bfa12b3be7c4ceb7c56f6728b4a0ee5ea617 |
File details
Details for the file unfat-0.0.3-py3-none-any.whl.
File metadata
- Download URL: unfat-0.0.3-py3-none-any.whl
- Upload date:
- Size: 6.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.0.1 CPython/3.13.1 Linux/6.12.10-200.fc41.x86_64
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 78d45ecbf1205fce6aea1c0bcc756fe1f19ef09b1dff30c01f5e7093fdd8350a |
| MD5 | d7ab93cf9d16095eb91cae15a391decf |
| BLAKE2b-256 | c281a4b344e9453a48d1cb80c38ea8d436da0481babe2e6149c493483413cee6 |
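The published digests can be used to check a downloaded file before installing it. The following is a minimal sketch using Python's standard `hashlib`; the helper names `sha256_of_file` and `matches_published_hash` are illustrative, and the file path is assumed to be wherever you saved the download.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("unfat-0.0.3.tar.gz", "ca8b880e295b30a34606dc1fe92212f5807326d8123bd7d8f96d55e121aa9057")` should return `True` for an intact copy of the source distribution.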