Fine-tune LLMs (LoRA/QLoRA) and image classifiers locally — browser UI + optional TUI

These details have not been verified by PyPI

Project description

llmtune

Fine-tune AI models (LLMs and image classifiers) on your own computer — no cloud, no subscription, no data leaving your machine.

pip install llmtune-local
llmtune run

The PyPI package is llmtune-local; the command you run is llmtune.

What is this?

llmtune is a command-line tool that lets you take any open-source AI language model and train it further on your own data. This process is called fine-tuning.

Think of it like this: a language model (like Llama, Mistral, or Qwen) comes pre-trained on billions of words from the internet. It knows a lot about everything. But maybe you want it to:

Answer questions specifically about your company's product
Write in your brand's tone of voice
Speak like a customer support agent trained on your FAQ
Generate code in your specific codebase style
Behave like a domain expert in your field

Instead of paying OpenAI or Anthropic, you take an open-source model and teach it yourself, using your own data. That's fine-tuning.

llmtune makes that process as simple as pointing to a file.

Why "local"?

Everything runs 100% on your computer. That means:

Your data never leaves your machine. No API calls, no cloud uploads, no third party sees it.
No ongoing cost. You pay nothing per query, per token, or per training run.
You own the result. The fine-tuned model is a file on your disk. Use it however you want.
Works offline. Once you have completed the one-time sign-in and downloaded the base model, training itself needs no internet connection.

Who is this for?

Developers who want to customize an AI model without deep ML knowledge
Small teams who can't afford enterprise AI costs but have their own data
Researchers experimenting with model behavior
Hobbyists who want to run their own AI locally
Companies with sensitive data that must stay on-premises

How it works — plain English

You pick a model. You choose from a list of popular open-source models (TinyLlama, Llama 3, Mistral, Qwen, etc.) or point to one already on your disk.
You provide a dataset. This is a file — JSONL, JSON, CSV, or plain text — containing the examples you want the model to learn from. Each example is a piece of text: a question+answer pair, a document, a conversation, whatever you want the model to get good at.
You set some numbers. A few settings control how long and how intensely training runs. Beginners can leave everything at defaults. Advanced users get full control over every knob.
Training runs. The tool loads the model into your GPU/CPU memory and runs training in the background. You watch a live progress screen showing each step, the loss (a measure of how wrong the model still is — lower is better), and elapsed time.
You get a file. When training finishes, a small adapter file is saved to your disk. This adapter is not a full copy of the model — it's a compact set of modifications (typically 10–100 MB) that, when applied on top of the base model, makes it behave according to your data.
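To see why the adapter is so small, a back-of-the-envelope calculation helps. The sketch below assumes a Llama-style 7B model (hidden size 4096, 32 layers), LoRA applied to two square projection matrices per layer, rank 16, and float16 storage. These numbers are illustrative assumptions, not llmtune's documented defaults, and lora_adapter_size_mb is a hypothetical helper.

```python
def lora_adapter_size_mb(hidden_size, num_layers, modules_per_layer, rank, bytes_per_param=2):
    """Estimate LoRA adapter parameter count and size in MB (assumed shapes)."""
    # Each adapted (hidden x hidden) matrix gets two small matrices:
    # A with shape (rank, hidden) and B with shape (hidden, rank).
    params_per_module = rank * hidden_size + hidden_size * rank
    total_params = params_per_module * modules_per_layer * num_layers
    return total_params, total_params * bytes_per_param / 1e6


# Assumed configuration: 7B Llama-style model, 2 adapted matrices per layer, rank 16.
params, size_mb = lora_adapter_size_mb(hidden_size=4096, num_layers=32,
                                       modules_per_layer=2, rank=16)
print(f"{params:,} trainable parameters, about {size_mb:.1f} MB")
```

Under these assumptions the adapter holds about 8.4 million parameters (roughly 17 MB), which sits inside the 10–100 MB range quoted above. Higher ranks or more adapted matrices scale the size proportionally.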

What is LoRA / QLoRA?

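Fully fine-tuning a language model means updating every one of its parameters. For a 7-billion-parameter model that can take 100 GB or more of GPU memory once gradients and optimizer state are counted, plus days or weeks of compute time. That's not feasible on a laptop.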

LoRA (Low-Rank Adaptation) is a technique that sidesteps this. Instead of adjusting every single number inside the model (billions of parameters), LoRA adds pairs of small trainable matrices alongside selected weight matrices, usually in the attention layers. Only those small matrices are trained. The rest of the model stays frozen.
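The idea can be sketched in a few lines of plain Python. This is a toy illustration of the LoRA update rule (output = W x + (alpha / rank) * B A x), not llmtune's or PEFT's actual code. The sizes, the alpha value, and the helper names are assumptions chosen for readability.

```python
import random

random.seed(0)
d, rank, alpha = 8, 2, 4  # toy sizes; real models use d in the thousands


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


# Frozen pretrained weight W (d x d): never updated during LoRA training.
W = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]
# Trainable low-rank factors: A is (rank x d), B is (d x rank).
A = [[random.gauss(0, 0.01) for _ in range(d)] for _ in range(rank)]
B = [[0.0] * rank for _ in range(d)]  # zero init, so the adapter starts as a no-op


def lora_forward(x):
    """Base output plus the scaled low-rank update."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


x = [random.gauss(0, 1) for _ in range(d)]
starts_as_noop = lora_forward(x) == matvec(W, x)
frozen_params = d * d
trainable_params = rank * d + d * rank
print(starts_as_noop, frozen_params, trainable_params)
```

Because B starts at zero, the adapted model initially behaves exactly like the base model; training then moves only A and B. In this toy case the trainable share is large (32 of 64), but it shrinks quickly as d grows while the rank stays small.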

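The result: fine-tuning a 7-billion-parameter model becomes feasible on a single consumer machine, in minutes to hours instead of weeks. Memory needs depend on precision: roughly 14–16 GB for LoRA in float16 (see Hardware requirements), and considerably less when the base model is quantized to 4 bits.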

QLoRA takes it further: it first compresses ("quantizes") the base model's weights to 4-bit numbers instead of 16-bit, cutting the memory they occupy to roughly a quarter. This is how you fine-tune large models on consumer hardware.
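A toy example shows what quantizing to 4 bits means. The sketch below uses plain absmax rounding to integers in the range -7 to 7, which is a simplification: QLoRA itself uses a more refined 4-bit data type (NF4), and real libraries pack two 4-bit codes into each byte. The function names and sample weights are made up for illustration.

```python
def quantize_block(values):
    """Map floats to integer codes in [-7, 7] (representable in 4 bits) plus one scale."""
    absmax = max(abs(v) for v in values)
    scale = absmax / 7 if absmax > 0 else 1.0
    codes = [round(v / scale) for v in values]
    return codes, scale


def dequantize_block(codes, scale):
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


weights = [0.12, -0.53, 0.98, -0.09, 0.31, -0.88, 0.44, 0.02]
codes, scale = quantize_block(weights)
restored = dequantize_block(codes, scale)

# Rounding to the nearest code means each value is off by at most half a step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes)
print(f"largest rounding error: {max_error:.4f} (half a step is {scale / 2:.4f})")
```

Each weight now needs 4 bits instead of 16, at the cost of a small rounding error. The LoRA matrices trained on top are kept in higher precision.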

On Apple Silicon (M1/M2/M3 Macs), llmtune automatically falls back to regular LoRA in float16, because QLoRA's quantization library doesn't support Apple chips.
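A hardware check along these lines could drive that decision. This is a sketch, not llmtune's actual logic: pick_training_mode and detect_hardware are hypothetical helpers, and the CPU fallback (LoRA in float32) is an assumption. The two PyTorch calls used, torch.cuda.is_available() and torch.backends.mps.is_available(), are real.

```python
def pick_training_mode(has_cuda, has_mps):
    """Choose a fine-tuning mode from the available accelerators (assumed policy)."""
    if has_cuda:
        # NVIDIA GPU: 4-bit quantization is available, so QLoRA can be used.
        return {"device": "cuda", "method": "qlora", "dtype": "4bit"}
    if has_mps:
        # Apple Silicon: no 4-bit quantization, so regular LoRA in float16.
        return {"device": "mps", "method": "lora", "dtype": "float16"}
    # Assumption: plain CPU falls back to LoRA in float32.
    return {"device": "cpu", "method": "lora", "dtype": "float32"}


def detect_hardware():
    """Return (has_cuda, has_mps); both False if PyTorch is not installed."""
    try:
        import torch
    except ImportError:
        return False, False
    return torch.cuda.is_available(), torch.backends.mps.is_available()


print(pick_training_mode(*detect_hardware()))
```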

The interface

llmtune has a full terminal UI (TUI) — it's not just text scrolling in a shell. It has proper screens, inputs, buttons, and navigation, all rendered inside your terminal.

Screen 1 — Login When you first run llmtune, it asks you to log in via your browser. This is a one-time step. The login is used only to let the developer know how many people are using the tool. Nothing about your models, datasets, or training is ever sent anywhere.

Screen 2 — Model selection Two options:

Paste a local folder path if you already have a HuggingFace model downloaded
Pick from a list of popular models (they will be downloaded from HuggingFace the first time, then cached on disk for later runs)

Supported model sources: HuggingFace download or a local HuggingFace folder. GGUF format (.gguf files, Ollama blobs) is not supported for training — GGUF is an inference-only format. If you have an Ollama model installed, the model selection screen will show its HuggingFace equivalent so you can use that instead.
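Before pointing llmtune at a local path, you can sanity-check it yourself. The sketch below is a hypothetical helper, not part of llmtune: it rejects .gguf files and checks for config.json, which every HuggingFace model folder contains. The demo runs against throwaway temporary folders.

```python
import tempfile
from pathlib import Path


def check_model_path(path_str):
    """Return (ok, message) for a candidate local model path."""
    path = Path(path_str).expanduser()
    if path.suffix.lower() == ".gguf":
        return False, "GGUF files are inference-only; use the HuggingFace version instead."
    if not path.is_dir():
        return False, "Expected a folder containing a HuggingFace model."
    if not (path / "config.json").is_file():
        return False, "No config.json found; this does not look like a HuggingFace model folder."
    return True, "Looks like a HuggingFace model folder."


# Demo on temporary folders so nothing on disk is touched.
with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp) / "my-model"
    model_dir.mkdir()
    (model_dir / "config.json").write_text("{}")
    folder_ok, folder_message = check_model_path(model_dir)

    empty_dir = Path(tmp) / "empty"
    empty_dir.mkdir()
    empty_ok, empty_message = check_model_path(empty_dir)

gguf_ok, gguf_message = check_model_path("model.gguf")
print(folder_message, empty_message, gguf_message, sep="\n")
```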

Screen 3 — Dataset Enter the path to your dataset file and choose the format. An "Advanced" section lets you configure the text field name and sequence length if needed.

Screen 4 — Training settings Three core settings are shown immediately: epochs, batch size, and output folder. Below them, three collapsible sections let advanced users configure LoRA parameters, quantization mode, and the learning rate scheduler.
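Epochs and batch size together determine how many training steps you will see on the progress screen. The sketch below shows the arithmetic, assuming no gradient accumulation and that the last partial batch is kept. The TrainingSettings class and its default values are hypothetical, not llmtune's real defaults.

```python
import math
from dataclasses import dataclass


@dataclass
class TrainingSettings:
    """The three core settings; default values here are placeholders."""
    epochs: int = 3
    batch_size: int = 4
    output_dir: str = "./llmtune-output"


def total_steps(num_examples, settings):
    """One step per batch, repeated for every epoch."""
    steps_per_epoch = math.ceil(num_examples / settings.batch_size)
    return steps_per_epoch * settings.epochs


settings = TrainingSettings()
steps = total_steps(100, settings)
print(steps)  # 25 batches per epoch * 3 epochs = 75
```

Doubling the batch size halves the number of steps but raises memory use per step, so on small machines it is usually the first setting to lower.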

Screen 5 — Training Live view of training. Shows current step, loss value, elapsed time, a progress bar, and a scrollable log of everything the trainer outputs. You can stop training early at any time.

What is a "loss"?

During training, the model repeatedly tries to predict the next token (a word or word fragment) in your dataset. The loss is a number that measures how far off those predictions are: the less probability the model assigned to the correct next token, the higher the loss. It starts high (bad) and should decrease over time as the model learns. A falling loss curve means training is working. A flat or rising loss means something is off (wrong learning rate, too few examples, etc.).
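For language models the loss is typically the cross-entropy: the negative logarithm of the probability the model gave to the correct next token. The sketch below computes it for a toy three-token vocabulary. The probabilities are invented for illustration.

```python
import math


def next_token_loss(probabilities, correct_index):
    """Cross-entropy for one prediction: -log(probability of the correct token)."""
    return -math.log(probabilities[correct_index])


# The correct token is at index 1 in each case.
confident_right = next_token_loss([0.05, 0.90, 0.05], 1)
unsure = next_token_loss([1 / 3, 1 / 3, 1 / 3], 1)
confident_wrong = next_token_loss([0.90, 0.05, 0.05], 1)

print(f"{confident_right:.2f} {unsure:.2f} {confident_wrong:.2f}")  # 0.11 1.10 3.00
```

The value shown during training is this quantity averaged over all tokens in a batch, so a loss drifting from around 3 toward 1 means the model is putting much more probability on the right tokens.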

What gets saved?

After training, two things are saved to your chosen output folder:

Adapter weights — the LoRA "diff" on top of the base model. A few files, usually under 100 MB.
Tokenizer config — the vocabulary settings needed to use the model correctly.

To use the fine-tuned model later, you load the base model and apply the adapter on top. No need to store a full copy of the base model for each fine-tune — adapters are tiny.
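If you want to inspect an output folder, a short script can summarize it. This sketch assumes llmtune writes the standard PEFT layout (adapter_config.json plus adapter_model.safetensors), which is an assumption based on the tool using PEFT. describe_adapter is a hypothetical helper, and the demo runs on a fake folder with placeholder contents.

```python
import json
import tempfile
from pathlib import Path


def describe_adapter(folder):
    """Summarize a PEFT-style adapter folder: base model, LoRA rank, files, size."""
    folder = Path(folder)
    config = json.loads((folder / "adapter_config.json").read_text())
    files = sorted(f for f in folder.iterdir() if f.is_file())
    return {
        "base_model": config.get("base_model_name_or_path"),
        "rank": config.get("r"),
        "files": [f.name for f in files],
        "size_mb": sum(f.stat().st_size for f in files) / 1e6,
    }


# Demo on a throwaway folder with placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    fake_config = {"base_model_name_or_path": "TinyLlama/TinyLlama-1.1B-Chat-v1.0", "r": 16}
    (Path(tmp) / "adapter_config.json").write_text(json.dumps(fake_config))
    (Path(tmp) / "adapter_model.safetensors").write_bytes(b"\0" * 1024)
    summary = describe_adapter(tmp)

print(summary)
```

The base_model field is useful later: it tells you which base model to load before applying the adapter.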

Authentication

llmtune signs you in with Auth0 using the OAuth 2.0 Device Authorization Flow. Here's exactly what happens and why:

On launch, the local llmtune server asks Auth0 for a device + user code and opens Auth0's hosted sign-in page in your browser.
You sign in there (Auth0 hosts the page — llmtune never sees your password).
The local server polls Auth0 until you've approved, then receives and verifies an Auth0 ID token (RS256, checked against Auth0's JWKS).
The token is stored in your system's secure keyring (macOS Keychain, etc.). A one-line record of which account signed in is kept locally at ~/.llmtune/logins.jsonl.
None of your training data, model choices, or results are ever sent anywhere. Sign-in only proves identity.
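The polling step of the device flow can be sketched as follows. This follows the OAuth 2.0 device grant (RFC 8628) in general terms and is not llmtune's actual code. The post callable, the tenant URL, the client ID, and the device code are placeholders, and the demo uses canned replies instead of a real network call.

```python
import time

DEVICE_GRANT = "urn:ietf:params:oauth:grant-type:device_code"


def poll_for_token(post, token_url, client_id, device_code,
                   interval=5, expires_in=900, sleep=time.sleep):
    """Poll the token endpoint until the user approves, denies, or the code expires.

    `post(url, data)` must return the parsed JSON response as a dict.
    """
    waited = 0
    while waited < expires_in:
        response = post(token_url, {
            "grant_type": DEVICE_GRANT,
            "device_code": device_code,
            "client_id": client_id,
        })
        error = response.get("error")
        if error is None:
            return response  # contains the tokens, e.g. "id_token"
        if error == "slow_down":
            interval += 5  # the server asked us to poll less often
        elif error != "authorization_pending":
            raise RuntimeError(f"sign-in failed: {error}")
        sleep(interval)
        waited += interval
    raise TimeoutError("device code expired before sign-in was approved")


# Demo with canned replies: pending, slow down, then success.
replies = iter([
    {"error": "authorization_pending"},
    {"error": "slow_down"},
    {"id_token": "fake.jwt.value"},
])
tokens = poll_for_token(
    post=lambda url, data: next(replies),
    token_url="https://YOUR_TENANT.auth0.com/oauth/token",  # placeholder tenant
    client_id="hypothetical-client-id",
    device_code="hypothetical-device-code",
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
print(tokens)
```

In a real client the returned ID token would then be verified against the provider's published signing keys before being trusted, as described in the steps above.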

The sole purpose of auth is for the developer to know who's using the tool. The tool is free, runs entirely on your machine, and has no usage limits tied to your account.

Hardware requirements

Scenario	Minimum RAM	Notes
1B parameter model (e.g. TinyLlama)	6 GB	Works on most laptops
3B parameter model	8 GB	M1 MacBook Air with 8 GB is borderline
7B parameter model	14–16 GB	Needs 16 GB unified memory (M1 Pro / M2 etc.)
13B+ parameter model	24 GB+	Desktop GPU recommended

Apple Silicon Macs use "unified memory" — GPU and CPU share the same pool, so a 16 GB M1 Pro can handle a 7B model that would need a dedicated 16 GB NVIDIA card on a Windows PC.
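The table's figures can be cross-checked with simple arithmetic on the weights alone. The sketch below is a rough estimate under stated assumptions: it counts only the base model's weights (1 GB = 10^9 bytes) and ignores activations, adapter gradients, optimizer state, and the operating system, all of which need extra headroom. weight_memory_gb is a hypothetical helper.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Memory needed to hold the model weights alone, in GB (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


for name, params in [("1B", 1e9), ("3B", 3e9), ("7B", 7e9), ("13B", 13e9)]:
    fp16 = weight_memory_gb(params, 16)
    int4 = weight_memory_gb(params, 4)
    print(f"{name}: {fp16:.1f} GB in float16, {int4:.1f} GB in 4-bit")
```

A 7B model in float16 needs about 14 GB for weights alone, which is why the table lists 14–16 GB for that row. In 4-bit the same weights take about 3.5 GB.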

Dataset format

Your dataset is a file on your computer. Supported formats:

JSONL (recommended) — one JSON object per line:

{"text": "Question: What is the capital of France? Answer: Paris."}
{"text": "Question: How do I reverse a list in Python? Answer: my_list[::-1]"}

Instruction/Response format (auto-detected):

{"instruction": "Summarize this article", "context": "...", "response": "..."}

JSON — an array of objects:

[{"text": "..."}, {"text": "..."}]

CSV — with a column called text (configurable).

Plain text — one training sample per line.

A minimum of ~50–100 examples is recommended. More examples generally help, but quality matters more than quantity: a small set of clean, representative examples beats a large noisy one.
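The formats above can all be reduced to a plain list of training texts. The sketch below shows one way to do that with the standard library. It is not llmtune's loader: load_examples and record_to_text are hypothetical helpers, and the "Instruction / Context / Response" template used for instruction records is an assumption about how such records might be flattened.

```python
import csv
import json
import tempfile
from pathlib import Path


def record_to_text(record, text_field="text"):
    """Turn one record into a training string."""
    if text_field in record:
        return record[text_field]
    if "instruction" in record and "response" in record:
        parts = [f"Instruction: {record['instruction']}"]
        if record.get("context"):
            parts.append(f"Context: {record['context']}")
        parts.append(f"Response: {record['response']}")
        return "\n".join(parts)
    raise ValueError(f"unrecognised record keys: {sorted(record)}")


def load_examples(path, text_field="text"):
    """Load JSONL, JSON, CSV, or plain text into a list of strings."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".jsonl":
        with path.open(encoding="utf-8") as f:
            records = [json.loads(line) for line in f if line.strip()]
    elif suffix == ".json":
        records = json.loads(path.read_text(encoding="utf-8"))
    elif suffix == ".csv":
        with path.open(newline="", encoding="utf-8") as f:
            records = list(csv.DictReader(f))
    else:  # plain text: one sample per non-empty line
        lines = path.read_text(encoding="utf-8").splitlines()
        return [line.strip() for line in lines if line.strip()]
    return [record_to_text(r, text_field) for r in records]


# Demo: a two-line JSONL file mixing both record styles.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "train.jsonl"
    sample.write_text(
        '{"text": "Question: What is the capital of France? Answer: Paris."}\n'
        '{"instruction": "Summarize this article", "context": "...", "response": "..."}\n',
        encoding="utf-8",
    )
    examples = load_examples(sample)

print(len(examples))
print(examples[1])
```

Running a check like this before training is a cheap way to catch malformed lines or a misnamed text column.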

Technology stack

Layer	Technology	Why
Terminal UI	Textual	Modern Python TUI framework, looks great
Fine-tuning	HuggingFace PEFT + TRL	Industry standard LoRA implementation
Model loading	Transformers	Supports every major open-source model
Auth	Auth0 (Device Authorization Flow)	Hosted sign-in, identity, ID tokens
Server	FastAPI	Serves the UI + brokers the device flow
Frontend	React + Vite + TypeScript	Browser/native-window UI
Token storage	OS keyring (macOS Keychain, etc.)	Secure, no plaintext secrets on disk
Packaging	PyPI via hatchling	Standard Python package distribution

Installing from PyPI

pip install llmtune-local
llmtune run

First run opens an Auth0 sign-in page in your browser once. After that, just run llmtune run and you're at the model selection screen.

Using the fine-tuned model

After training completes, load your adapter in Python:

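from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the same base model that was used for training.
base_model = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
# Apply the LoRA adapter saved by llmtune on top of the base model.
model = PeftModel.from_pretrained(base_model, "/path/to/llmtune-output")
# The tokenizer config was saved alongside the adapter.
tokenizer = AutoTokenizer.from_pretrained("/path/to/llmtune-output")
model.eval()  # inference mode: disables dropout

# Generate up to 50 new tokens for a prompt.
inputs = tokenizer("What is the capital of France?", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(output[0], skip_special_tokens=True))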

License

MIT — free to use, modify, and distribute.

Project details

Release history

0.1.2 (this version): Jun 24, 2026
0.1.1: Jun 24, 2026
0.1.0: Jun 24, 2026

Download files

Source Distribution

llmtune_local-0.1.2.tar.gz (114.8 kB), uploaded Jun 24, 2026

Built Distribution

llmtune_local-0.1.2-py3-none-any.whl (135.8 kB), uploaded Jun 24, 2026, Python 3

File details

Details for the file llmtune_local-0.1.2.tar.gz.

File metadata

Download URL: llmtune_local-0.1.2.tar.gz
Upload date: Jun 24, 2026
Size: 114.8 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for llmtune_local-0.1.2.tar.gz
Algorithm	Hash digest
SHA256	`98071ce66fa31ab45c44ff3278d77a2a05f99adeb18596c7ab2fd86763b37d37`
MD5	`877ade4508c51b98887345f7e15d334c`
BLAKE2b-256	`a8315f4789c546c2f29803e61d0e5b27358a56506843f7b8e0c2540fdc0425e8`


File details

Details for the file llmtune_local-0.1.2-py3-none-any.whl.

File metadata

Download URL: llmtune_local-0.1.2-py3-none-any.whl
Upload date: Jun 24, 2026
Size: 135.8 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.2.0 CPython/3.14.4

File hashes

Hashes for llmtune_local-0.1.2-py3-none-any.whl
Algorithm	Hash digest
SHA256	`2c77cd419cc17ec1104f21f42f72c8264855f50d703b030cf915b87fa2996190`
MD5	`929bcd83156a8b2bc1c34e546ce1b986`
BLAKE2b-256	`b7ff95e8ef901d68ac44976195f836a24e2c54828dfa5b00dffb79a247c5c6a5`

