Package for fine-tuning, running and exporting Large Language Models with Unsloth.
Project description
Sinapsis Unsloth
Templates for optimized LLM fine-tuning and deployment.
🐍 Installation • 🚀 Features • 📚 Usage example • 📙 Documentation • 🔍 License
The sinapsis-unsloth module provides ready-to-use templates for continued pretraining, instruct fine-tuning, conversational fine-tuning, inference and model export to GGUF, merged, and quantized formats using Unsloth.
🐍 Installation
Install using your package manager of choice. We recommend uv for faster installations.
Standard Installation (Pre-built Wheels)
This method automatically installs optimized pre-built wheels for flash-attn, skipping the long compilation times.
Supported: Linux (x86_64), Python 3.10 - 3.12, CUDA 12.x
Using uv:
# Install Flash Attention (Example for Python 3.10 + CUDA 12.4)
uv pip install https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/download/v0.5.4/flash_attn-2.8.3+cu124torch2.9-cp310-cp310-linux_x86_64.whl
# Install Sinapsis Unsloth
uv pip install sinapsis-unsloth[all] --extra-index-url https://pypi.sinapsis.tech
Using raw pip:
# Install Flash Attention (Example for Python 3.10 + CUDA 12.4)
pip install https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/download/v0.5.4/flash_attn-2.8.3+cu124torch2.9-cp310-cp310-linux_x86_64.whl
# Install Sinapsis Unsloth
pip install sinapsis-unsloth[all] --extra-index-url https://pypi.sinapsis.tech
Manual Build (From Source)
Use this if you are on an unsupported platform (e.g., Windows, non-standard CUDA versions) or need to compile flash-attn yourself.
Using uv:
export MAX_JOBS=4 # Adjust based on your RAM specs
uv pip install torch packaging ninja setuptools
uv pip install sinapsis-unsloth[all] --extra-index-url https://pypi.sinapsis.tech
Using raw pip:
export MAX_JOBS=4 # Adjust based on your RAM specs
pip install torch packaging ninja setuptools
pip install sinapsis-unsloth[all] --extra-index-url https://pypi.sinapsis.tech
🚀 Features
The templates support all capabilities from Unsloth for efficient LLM fine-tuning, inference, and model export, including:
- Optimized Training: 4-bit (QLoRA), 8-bit, 16-bit, and full precision fine-tuning
- Hardware Efficiency: Reduced GPU memory usage with Unsloth's optimized kernels
- Flexible Export: GGUF quantization and Merged model export options for deployment
- High-Performance Inference: Native 4-bit inference with dynamic chat templating and streaming
Templates Supported
Training
-
UnslothPretrainer: Designed for continued pretraining (domain adaptation) on raw text. Features efficient sequence packing and specific learning rate controls for embeddings.
-
UnslothInstructTrainer: Optimized for instruction fine-tuning. Processes standard instruction-input-response triplets with configurable preambles and dynamic formatting (handling optional inputs gracefully).
-
UnslothConversationTrainer: Specialized for conversational AI fine-tuning with chat datasets. Supports both ShareGPT and Alpaca formats (with auto-conversion), handles dynamic chat templating, and supports response-only loss masking.
Inference
-
UnslothInferenceCompletion: Raw text completion template for base models or custom formatting needs.
-
UnslothInferenceInstruct: Streamlined inference for instruction-tuned models using standard task preambles.
-
UnslothInferenceConversational: Manages multi-turn chat history, system prompts, and dynamic chat template application for conversational models.
-
UnslothInferenceReasoning: Extends conversational inference to support Chain-of-Thought (CoT) models (e.g., DeepSeek-R1), handling the extraction of internal reasoning traces.
Export
-
UnslothExportGGUF: Exports models to GGUF format for efficient CPU/Edge inference (e.g., Llama.cpp). Supports configurable quantization methods (q4_k_m, q8_0, etc.).
-
UnslothExportMerged: Merges LoRA adapters back into the base model (16-bit or 4-bit) for deployment on vLLM, or pushes directly to the Hugging Face Hub.
🌍 General Attributes
The model_args attribute controls how Unsloth loads and configures the model.
model_name(str, required): Model ID or local path.cache_dir(str): Cache directory. Default:SINAPSIS_CACHE_DIR.max_seq_length(int): Maximum sequence length. Default:2048.dtype("auto" | "bfloat16" | "float16"): Weight precision. Default:"auto".load_in_4bit(bool): Enable 4-bit quantization. Default:True.load_in_8bit(bool): Enable 8-bit quantization. Default:False.load_in_16bit(bool): Load weights in FP16. Default:False.full_finetuning(bool): Enable full fine-tuning. Default:False.device_map(str): Device placement strategy. Default:"sequential".use_gradient_checkpointing(str): Checkpointing mode. Default:"unsloth".fast_inference(bool): Enable optimized inference. Default:False.gpu_memory_utilization(float): Max GPU memory fraction. Default:0.5.random_state(int): Random seed. Default:3407.max_lora_rank(int): Maximum LoRA rank. Default:64.
⚙️ Fine-tuning Attributes
These attributes apply to all fine-tuning templates:
-
lora_args(UnslothLoraArgs)- LoRA configuration (rank, alpha, dropout, target modules, gradient checkpointing).
-
trainer_args(UnslothTrainerArgs)- Trainer options (text field, packing, sequence length, loss type).
-
training_args(UnslothTrainingArgs)- Hugging Face training parameters (batch size, learning rate, logging, saving).
-
train_dataset(DatasetConfig)- Dataset configuration, including:
loader_args(source and loading parameters)map_args(preprocessing)shuffle(shuffling behavior)pre_tokenize(tokenization options)
- Dataset configuration, including:
-
resume_from_checkpoint(bool)- Resume training from the last checkpoint.
-
save_path(str)- Directory where fine-tuned adapters or weights will be saved.
🧠 Inference Attributes
These attributes configure Unsloth-based inference templates.
-
rag_context_key(str | None)- Metadata key used to retrieve optional RAG context.
-
generate_args(UnslothGenerateArgs)- Token generation settings (sampling, length, stopping, temperature, penalties).
-
stream(bool)- Enables token-by-token console streaming during generation.
📦 Export Attributes
These attributes configure Unsloth-based model export templates.
-
export_args(UnslothExportBaseArgs)- Export parameters such as save path, shard size, and memory limits.
-
push_to_hub(bool)- Enables pushing the exported model to the Hugging Face Hub.
[!TIP] Use CLI command
sinapsis info --all-template-namesto show a list with all the available Template names installed with Sinapsis Unsloth.
[!TIP] Use CLI command
sinapsis info --example-template-config TEMPLATE_NAMEto produce an example Agent config for the Template specified in TEMPLATE_NAME.
For example, for UnslothPretrainer use sinapsis info --example-template-config UnslothPretrainer to produce the following example config:
agent:
name: my_test_agent
templates:
- template_name: InputTemplate
class_name: InputTemplate
attributes: {}
- template_name: UnslothPretrainer
class_name: UnslothPretrainer
template_input: InputTemplate
attributes:
model_args:
model_name: '`replace_me:<class ''str''>`'
cache_dir: /path/to/sinapsis/.cache
max_seq_length: 2048
dtype: auto
load_in_4bit: true
load_in_8bit: false
load_in_16bit: false
full_finetuning: false
device_map: sequential
use_gradient_checkpointing: unsloth
fast_inference: false
gpu_memory_utilization: 0.5
random_state: 3407
max_lora_rank: 64
lora_args:
r: 16
target_modules:
- q_proj
- k_proj
- v_proj
- o_proj
- gate_proj
- up_proj
- down_proj
lora_alpha: 16
lora_dropout: 0.0
bias: none
use_gradient_checkpointing: unsloth
random_state: 3407
use_rslora: false
modules_to_save: null
loftq_config: '`replace_me:<class ''dict''>`'
trainer_args:
dataset_text_field: text
dataset_num_proc: null
max_length: 1024
packing: false
packing_strategy: bfd
eval_packing: false
completion_only_loss: null
assistant_only_loss: false
loss_type: nll
activation_offloading: false
training_args:
output_dir: trainer_output
overwrite_output_dir: false
eval_strategy: 'no'
eval_steps: null
per_device_train_batch_size: 8
per_device_eval_batch_size: 8
gradient_accumulation_steps: 1
eval_accumulation_steps: null
torch_empty_cache_steps: null
learning_rate: 5.0e-05
weight_decay: 0.0
max_grad_norm: 1.0
num_train_epochs: 3.0
max_steps: null
lr_scheduler_type: linear
warmup_ratio: 0.0
warmup_steps: null
logging_strategy: steps
logging_first_step: false
logging_steps: 500
save_strategy: steps
save_steps: 500
save_only_model: false
use_cpu: false
seed: 3407
data_seed: null
bf16: false
fp16: false
dataloader_drop_last: false
dataloader_num_workers: 0
remove_unused_columns: true
load_best_model_at_end: false
metric_for_best_model: loss
optim: adamw_torch
report_to: none
push_to_hub: false
hub_model_id: null
embedding_learning_rate: 5.0e-05
train_dataset:
loader_args:
path: '`replace_me:<class ''str''>`'
name: null
data_dir: null
data_files: null
split: null
cache_dir: /path/to/sinapsis/.cache
features: null
num_proc: null
map_args:
desc: null
batched: false
batch_size: 1000
num_proc: 0
keep_in_memory: false
load_from_cache_file: true
shuffle:
enabled: false
args:
seed: null
keep_in_memory: false
load_from_cache_file: true
pre_tokenize:
enabled: false
args:
add_special_tokens: true
padding: do_not_pad
truncation: do_not_truncate
max_length: null
stride: 0
is_split_into_words: false
padding_side: null
verbose: true
map_args:
desc: null
batched: false
batch_size: 1000
num_proc: 0
keep_in_memory: false
load_from_cache_file: true
resume_from_checkpoint: false
save_path: '`replace_me:<class ''str''>`'
📚 Usage example
The following agent exports the unsloth/DeepSeek-R1-Distill-Qwen-1.5B model in GGUF format with no quantization at the artifacts/DeepSeek-R1-Distill-Qwen-1.5B-gguf path.
Config
agent:
name: model_export_agent
description: Agent to handle model export and conversion workflows
templates:
- template_name: InputTemplate
class_name: InputTemplate
attributes: {}
- template_name: UnslothExportGGUF
class_name: UnslothExportGGUF
template_input: InputTemplate
attributes:
model_args:
model_name: unsloth/DeepSeek-R1-Distill-Qwen-1.5B
dtype: "bfloat16"
load_in_4bit: false
gpu_memory_utilization: 1
export_args:
save_path : artifacts/DeepSeek-R1-Distill-Qwen-1.5B-gguf
maximum_memory_usage: 1
quantization_method: not_quantized
push_to_hub: false
You can see additional fine-tuning agent configurations at the configs directory.
📙 Documentation
Documentation for this and other sinapsis packages is available on the sinapsis website
Tutorials for different projects within sinapsis are available at sinapsis tutorials page
🔍 License
This project is licensed under the AGPLv3 license, which encourages open collaboration and sharing. For more details, please refer to the LICENSE file.
For commercial use, please refer to our official Sinapsis website for information on obtaining a commercial license.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file sinapsis_unsloth-0.1.0.tar.gz.
File metadata
- Download URL: sinapsis_unsloth-0.1.0.tar.gz
- Upload date:
- Size: 44.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.5.16
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
043d9713bc374bd9cfb6b80e33ae9d8bb2099311271250318a4cc3376ce0e8ea
|
|
| MD5 |
b864d30080f9a56318a1d0fc53e0bb76
|
|
| BLAKE2b-256 |
c9f3dcd379d4313556269dbed122208edc0e15c814550735e5a0bb1032be6a9b
|
File details
Details for the file sinapsis_unsloth-0.1.0-py3-none-any.whl.
File metadata
- Download URL: sinapsis_unsloth-0.1.0-py3-none-any.whl
- Upload date:
- Size: 55.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.5.16
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
a256762790f5a8b568bee4f63b140217269a9ee14ecc70d5147955b1c1674c8d
|
|
| MD5 |
7f8cb143ff4db8e0f3bd0bfe8e53dcdd
|
|
| BLAKE2b-256 |
2ed006f6a5a026c3b931be4a1a5a90ebbdb00a85c61f144d9243e0f46fe7b5d8
|