Extract datasets from models and train slimmer LoRAs on them
Project description
Easily extract prompt/completion datasets from models and generate Axolotl configs to auto-distill smaller, slimmer LoRAs from the original models.
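For context, a prompt/completion dataset of the kind described here is typically stored as JSONL, one JSON object per line. A minimal sketch of such a record (the field names are illustrative assumptions, not unfat's documented on-disk schema):

```python
import json

# Hypothetical illustration of a distillation record: a prompt sent to the
# teacher model paired with the completion it returned. Field names are
# assumptions for the sketch, not unfat's actual schema.
records = [
    {
        "prompt": "Write a function that reverses a string.",
        "completion": "def reverse(s):\n    return s[::-1]",
    },
]

# JSONL: one JSON object per line, easy to stream during training.
with open("distill.jsonl", "w") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")
```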
Example
```python
from unfat.datasets import hub_prompts, HubSplit, Dataset
from unfat.extract import Extractor, ClientOpts
from unfat.lora import LoraSettings
import os

output_dir = "output"
extractor = Extractor(
    # Extract from Qwen2.5-Coder-32B-Instruct
    teacher="hf:Qwen/Qwen2.5-Coder-32B-Instruct",
    # Make up to 10 concurrent requests at a time
    max_concurrent=10,
    output_dir=output_dir,
    # Use glhf.chat for the API
    client_opts=ClientOpts(
        base_url="https://glhf.chat/api/openai/v1",
        api_key=os.environ["GLHF_API_KEY"],
    ),
    # Pull the prompts from a coding dataset
    dataset=Dataset(
        train=[
            hub_prompts(
                name="perlthoughts/coding-prompts-small",
                text_field="instruction",
                split=HubSplit(name="train"),
            ),
        ],
    ),
)
# Run the coding prompts through Qwen2.5-Coder-32B-Instruct and save the
# completions to the output dir
extractor.run()

# Training hyperparameters
lora_settings = LoraSettings(
    lora_r=32,
    lora_alpha=16,
    lora_dropout=0.01,
    num_epochs=2,
    learning_rate=4e-4,
    warmup_steps=10,
)
# Save the Axolotl config to train a LoRA for Llama-3.1-70B-Instruct
axolotl_config = lora_settings.llama_70b_axolotl(extractor.output_dataset())
axolotl_config.save(output_dir)
```
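As a rough guide to the hyperparameters above: a LoRA adapter learns a low-rank update `B @ A` that is scaled by `lora_alpha / lora_r`, so the settings shown apply a 0.5 scaling, and the adapter stays far smaller than the frozen weights it modifies. A back-of-the-envelope check (the layer size is an illustrative assumption, not something unfat computes):

```python
# LoRA replaces a frozen weight update dW with B @ A, where A is (r x d_in)
# and B is (d_out x r), scaled by alpha / r.
lora_r, lora_alpha = 32, 16
scaling = lora_alpha / lora_r        # 0.5 with the settings above

# Illustrative square layer; 8192 is Llama-70B's hidden size.
d_in = d_out = 8192
full = d_in * d_out                  # parameters in the frozen weight
adapter = lora_r * (d_in + d_out)    # trainable parameters in A and B

# The rank-32 adapter trains 1/128 of the layer's parameter count.
fraction = adapter / full
```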
Download files
Source Distribution: unfat-0.0.2.tar.gz (5.3 kB)
Built Distribution: unfat-0.0.2-py3-none-any.whl (6.9 kB)
File details
Details for the file unfat-0.0.2.tar.gz.

File metadata
- Download URL: unfat-0.0.2.tar.gz
- Upload date:
- Size: 5.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.0.1 CPython/3.13.1 Linux/6.12.10-200.fc41.x86_64

File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 43a8fb4a84ec72dbd9b418b539d301e8a7cec50f3f8f22e70914d8c1f156ec4c |
| MD5 | 68d325656c4d862c478d2f772537166c |
| BLAKE2b-256 | acffc9c3a49b4943e1da5a7b7685b4cbc20b85e15891286f4f478816890ce591 |
File details
Details for the file unfat-0.0.2-py3-none-any.whl.

File metadata
- Download URL: unfat-0.0.2-py3-none-any.whl
- Upload date:
- Size: 6.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.0.1 CPython/3.13.1 Linux/6.12.10-200.fc41.x86_64

File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a4fa3656e24ad13dda61169fa1f6176eadc2844ccf86ae3d2c72937d929ac285 |
| MD5 | abbff711e5d2ddd8e9a5112a60437df0 |
| BLAKE2b-256 | 266d26afe3dbe69226a6b4232633ab7c84b598d534d072f7782d36ef4192ecae |
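The hashes above can be used to verify a downloaded distribution before installing it. A minimal sketch using only the standard library (the file is assumed to sit in the current directory):

```python
import hashlib

def sha256_of(path: str) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

# Compare against the SHA256 listed above, e.g. for the wheel:
expected = "a4fa3656e24ad13dda61169fa1f6176eadc2844ccf86ae3d2c72937d929ac285"
# assert sha256_of("unfat-0.0.2-py3-none-any.whl") == expected
```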