unfat

Extract datasets from models and train slimmer LoRAs on them

Project description

Easily extract prompt/completion datasets from models and generate Axolotl configs to auto-distill smaller, slimmer LoRAs from the original models.

Example

from unfat.datasets import hub_prompts, HubSplit, Dataset
from unfat.extract import Extractor, ClientOpts
from unfat.lora import LoraSettings
import os

output_dir = "output"
extractor = Extractor(
    # Extract from Qwen2.5-Coder-32B-Instruct
    teacher="hf:Qwen/Qwen2.5-Coder-32B-Instruct",
    # Make up to 10 concurrent requests at a time
    max_concurrent=10,
    output_dir=output_dir,
    # Use glhf.chat for the API
    client_opts=ClientOpts(
        base_url="https://glhf.chat/api/openai/v1",
        api_key=os.environ["GLHF_API_KEY"],
    ),
    # Pull the prompts from a coding dataset
    dataset=Dataset(
        train=[
            hub_prompts(
                name="perlthoughts/coding-prompts-small",
                text_field="instruction",
                split=HubSplit(name="train"),
            ),
        ],
    ),
)

# Runs the coding prompts through Qwen2.5-Coder-32B-Instruct and saves the
# completions to the output dir
extractor.run()

# Training hyperparameters
lora_settings = LoraSettings(
    lora_r=32,
    lora_alpha=16,
    lora_dropout=0.01,
    num_epochs=2,
    learning_rate=4e-4,
    warmup_steps=10,
)
# Save the Axolotl config to train a LoRA for Llama-3.1-70B-Instruct
axolotl_config = lora_settings.llama_70b_axolotl(extractor.output_dataset())
axolotl_config.save(output_dir)
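
The max_concurrent setting in the example caps how many requests are in flight at once. The sketch below illustrates that general idea with an asyncio semaphore. It is not unfat's implementation: fake_completion, extract_all, and the prompt/completion row shape are hypothetical stand-ins for a real API call and output format.

```python
import asyncio


async def fake_completion(prompt: str) -> str:
    """Hypothetical stand-in for a request to a teacher model's API."""
    await asyncio.sleep(0.01)  # simulate network latency
    return prompt.upper()


async def extract_all(prompts, max_concurrent=10):
    """Collect prompt/completion rows with at most max_concurrent calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)
    in_flight = 0
    peak = 0  # highest number of simultaneous calls observed

    async def worker(prompt):
        nonlocal in_flight, peak
        async with semaphore:  # waits here once the cap is reached
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                completion = await fake_completion(prompt)
            finally:
                in_flight -= 1
        return {"prompt": prompt, "completion": completion}

    # gather returns results in the same order as the input prompts
    rows = await asyncio.gather(*(worker(p) for p in prompts))
    return rows, peak


rows, peak = asyncio.run(
    extract_all([f"prompt {i}" for i in range(25)], max_concurrent=10)
)
print(len(rows), "rows collected; peak concurrency was", peak)
```

A semaphore keeps the code simple because every prompt gets its own task, yet only max_concurrent of them can be past the gate at any moment.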

Release history

Version	Release date
0.0.13	Mar 10, 2025
0.0.12	Mar 9, 2025
0.0.11	Mar 9, 2025
0.0.10	Mar 9, 2025
0.0.9	Mar 9, 2025
0.0.8	Mar 8, 2025
0.0.7	Mar 7, 2025
0.0.6	Mar 7, 2025
0.0.5	Mar 6, 2025
0.0.4 (this version)	Feb 9, 2025
0.0.3	Feb 9, 2025
0.0.2	Feb 6, 2025
0.0.1	Feb 6, 2025

Download files

Source Distribution

unfat-0.0.4.tar.gz (5.7 kB)

Uploaded Feb 9, 2025, Source

Built Distribution

unfat-0.0.4-py3-none-any.whl (7.3 kB)

Uploaded Feb 9, 2025, Python 3

File details

Details for the file unfat-0.0.4.tar.gz.

File metadata

Download URL: unfat-0.0.4.tar.gz
Upload date: Feb 9, 2025
Size: 5.7 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.0.1 CPython/3.13.1 Linux/6.12.10-200.fc41.x86_64

File hashes

Hashes for unfat-0.0.4.tar.gz
Algorithm	Hash digest
SHA256	`4bcd9cbde2e314015cef33b9fd543ab63ee3172573e069a1fc52e6d73a2c68b2`
MD5	`abfc828a1af1be42138e8f2aefe91954`
BLAKE2b-256	`b675efb2994c27e30fbe84dfe64d9c91632a4793ff4c584d8a2e48fa3e9049c3`
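
A downloaded file can be checked against the SHA256 digests listed on this page. The following is a minimal sketch using only the Python standard library; the helper names sha256_of and verify_download are illustrative, and it assumes the files sit in the current directory under their original names.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the hash tables on this page.
EXPECTED_SHA256 = {
    "unfat-0.0.4.tar.gz":
        "4bcd9cbde2e314015cef33b9fd543ab63ee3172573e069a1fc52e6d73a2c68b2",
    "unfat-0.0.4-py3-none-any.whl":
        "7ee62205ef890a2212a316e275d4148efc08d816350874eee825da4665d75ce7",
}


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path):
    """Compare a file's digest with the expected value for its file name."""
    path = Path(path)
    return sha256_of(path) == EXPECTED_SHA256[path.name]


if __name__ == "__main__":
    for name in EXPECTED_SHA256:
        if Path(name).exists():
            print(name, "OK" if verify_download(name) else "MISMATCH")
        else:
            print(name, "not found, skipping")
```

pip can enforce the same check at install time when a requirements file pins hashes and is installed with the --require-hashes option.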

File details

Details for the file unfat-0.0.4-py3-none-any.whl.

File metadata

Download URL: unfat-0.0.4-py3-none-any.whl
Upload date: Feb 9, 2025
Size: 7.3 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.0.1 CPython/3.13.1 Linux/6.12.10-200.fc41.x86_64

File hashes

Hashes for unfat-0.0.4-py3-none-any.whl
Algorithm	Hash digest
SHA256	`7ee62205ef890a2212a316e275d4148efc08d816350874eee825da4665d75ce7`
MD5	`b5eeb42f1dfdf6bcb24c4ce306659586`
BLAKE2b-256	`92b06ff2a510931a80dd3b17c6163d0a8023cf7b463dd61c8bda4876df48e9fd`
