Package for fine-tuning, running and exporting Large Language Models with Unsloth.

Sinapsis Unsloth

Templates for optimized LLM fine-tuning and deployment.

🐍 Installation • 🚀 Features • 📚 Usage example • 📙 Documentation • 🔍 License

The sinapsis-unsloth module provides ready-to-use templates for continued pretraining, instruction fine-tuning, conversational fine-tuning, inference, and model export (GGUF, merged, and quantized formats) using Unsloth.

🐍 Installation

Install using your package manager of choice. We recommend uv for faster installations.

Standard Installation (Pre-built Wheels)

This method installs a pre-built flash-attn wheel, which avoids the long compilation step.

Supported: Linux (x86_64), Python 3.10 - 3.12, CUDA 12.x

Using uv:

# Install Flash Attention (Example for Python 3.10 + CUDA 12.4)
uv pip install https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/download/v0.5.4/flash_attn-2.8.3+cu124torch2.9-cp310-cp310-linux_x86_64.whl

# Install Sinapsis Unsloth
uv pip install "sinapsis-unsloth[all]" --extra-index-url https://pypi.sinapsis.tech

Using raw pip:

# Install Flash Attention (Example for Python 3.10 + CUDA 12.4)
pip install https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/download/v0.5.4/flash_attn-2.8.3+cu124torch2.9-cp310-cp310-linux_x86_64.whl

# Install Sinapsis Unsloth
pip install "sinapsis-unsloth[all]" --extra-index-url https://pypi.sinapsis.tech
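The wheel URL in the commands above encodes the flash-attn version, CUDA version, torch version, and Python tag. As a rough sketch, the helper below rebuilds that URL from its parts so the pattern is easier to adapt. The function names are hypothetical, and the naming pattern is inferred from the single example above; whether a wheel actually exists for a given combination must be checked on the release page.

```python
import sys

# Base of the release download URL used in the example command above.
BASE_URL = "https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/download"


def flash_attn_wheel_url(
    python_tag: str,              # e.g. "cp310" for Python 3.10
    cuda_tag: str = "cu124",      # CUDA 12.4
    torch_tag: str = "torch2.9",
    flash_attn_version: str = "2.8.3",
    release: str = "v0.5.4",
) -> str:
    """Assemble a wheel URL following the pattern of the example above."""
    filename = (
        f"flash_attn-{flash_attn_version}+{cuda_tag}{torch_tag}"
        f"-{python_tag}-{python_tag}-linux_x86_64.whl"
    )
    return f"{BASE_URL}/{release}/{filename}"


def current_python_tag() -> str:
    """Return the CPython tag of the running interpreter, e.g. 'cp310'."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"


print(flash_attn_wheel_url(current_python_tag()))
```

Calling `flash_attn_wheel_url("cp310")` with the default arguments reproduces the URL shown in the install commands.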

Manual Build (From Source)

Use this if you are on an unsupported platform (e.g., Windows, non-standard CUDA versions) or need to compile flash-attn yourself.

Using uv:

export MAX_JOBS=4 # Adjust based on your RAM specs
uv pip install torch packaging ninja setuptools
uv pip install "sinapsis-unsloth[all]" --extra-index-url https://pypi.sinapsis.tech

Using raw pip:

export MAX_JOBS=4 # Adjust based on your RAM specs
pip install torch packaging ninja setuptools
pip install "sinapsis-unsloth[all]" --extra-index-url https://pypi.sinapsis.tech

🚀 Features

The templates support all capabilities from Unsloth for efficient LLM fine-tuning, inference, and model export, including:

Optimized Training: 4-bit (QLoRA), 8-bit, 16-bit, and full precision fine-tuning
Hardware Efficiency: Reduced GPU memory usage with Unsloth's optimized kernels
Flexible Export: GGUF quantization and Merged model export options for deployment
High-Performance Inference: Native 4-bit inference with dynamic chat templating and streaming
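To give a feel for why the precision options above matter, here is a back-of-the-envelope estimate of the memory needed for the weights alone at each bit width. It is only a sketch: it ignores quantization metadata, activations, optimizer state, and the KV cache, and the 1.5 billion parameter count is a nominal figure taken from the model name used later in this document.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Nominal 1.5B-parameter model at the bit widths listed above.
for bits in (4, 8, 16, 32):
    print(f"{bits:>2}-bit: {weight_memory_gb(1.5e9, bits):.2f} GB")
```

Under these assumptions, 4-bit weights take roughly a quarter of the space of 16-bit weights (about 0.75 GB versus 3 GB here).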

Templates Supported

Training

UnslothPretrainer: Designed for continued pretraining (domain adaptation) on raw text. Features efficient sequence packing and specific learning rate controls for embeddings.
UnslothInstructTrainer: Optimized for instruction fine-tuning. Processes standard instruction-input-response triplets with configurable preambles and dynamic formatting (handling optional inputs gracefully).
UnslothConversationTrainer: Specialized for conversational AI fine-tuning with chat datasets. Supports both ShareGPT and Alpaca formats (with auto-conversion), handles dynamic chat templating, and supports response-only loss masking.
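The idea of formatting instruction-input-response triplets with an optional input can be illustrated with a small standalone function. This is not the code used by UnslothInstructTrainer; the preamble text, the section headers, and the function name are assumptions chosen to resemble the common Alpaca-style layout.

```python
from typing import Optional

# Hypothetical preamble text; the template's real default may differ.
DEFAULT_PREAMBLE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request."
)


def format_example(
    instruction: str,
    response: str,
    input_text: Optional[str] = None,
    preamble: str = DEFAULT_PREAMBLE,
) -> str:
    """Render one training example, omitting the Input section when it is empty."""
    sections = [preamble, f"### Instruction:\n{instruction}"]
    if input_text and input_text.strip():
        sections.append(f"### Input:\n{input_text}")
    sections.append(f"### Response:\n{response}")
    return "\n\n".join(sections)


print(format_example("Translate to French.", "Bonjour", input_text="Hello"))
print(format_example("Name a prime number.", "7"))
```

The second call has no input, so the rendered text skips the Input section instead of leaving an empty header behind.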

Inference

UnslothInferenceCompletion: Raw text completion template for base models or custom formatting needs.
UnslothInferenceInstruct: Streamlined inference for instruction-tuned models using standard task preambles.
UnslothInferenceConversational: Manages multi-turn chat history, system prompts, and dynamic chat template application for conversational models.
UnslothInferenceReasoning: Extends conversational inference to support Chain-of-Thought (CoT) models (e.g., DeepSeek-R1), handling the extraction of internal reasoning traces.
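As an illustration of what extracting a reasoning trace can look like, the sketch below separates a `<think>...</think>` block, the convention used by DeepSeek-R1 style models, from the final answer. It is not the implementation of UnslothInferenceReasoning, and the function name is hypothetical. It also assumes both tags appear in the generated text; some chat templates place the opening tag in the prompt, which this sketch does not handle.

```python
import re
from typing import Tuple

# DeepSeek-R1 style models wrap their reasoning in <think>...</think> tags.
THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> Tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no think block is found."""
    match = THINK_PATTERN.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # The answer is everything outside the first think block.
    answer = (text[: match.start()] + text[match.end():]).strip()
    return reasoning, answer


reasoning, answer = split_reasoning("<think>2 + 2 is 4.</think>The answer is 4.")
print(reasoning)  # 2 + 2 is 4.
print(answer)     # The answer is 4.
```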

Export

UnslothExportGGUF: Exports models to GGUF format for efficient CPU/Edge inference (e.g., llama.cpp). Supports configurable quantization methods (q4_k_m, q8_0, etc.).
UnslothExportMerged: Merges LoRA adapters back into the base model (16-bit or 4-bit) for deployment on vLLM, or pushes directly to the Hugging Face Hub.

🌍 General Attributes

The model_args attribute controls how Unsloth loads and configures the model.

model_name (str, required): Model ID or local path.
cache_dir (str): Cache directory. Default: SINAPSIS_CACHE_DIR.
max_seq_length (int): Maximum sequence length. Default: 2048.
dtype ("auto" | "bfloat16" | "float16"): Weight precision. Default: "auto".
load_in_4bit (bool): Enable 4-bit quantization. Default: True.
load_in_8bit (bool): Enable 8-bit quantization. Default: False.
load_in_16bit (bool): Load weights in FP16. Default: False.
full_finetuning (bool): Enable full fine-tuning. Default: False.
device_map (str): Device placement strategy. Default: "sequential".
use_gradient_checkpointing (str): Checkpointing mode. Default: "unsloth".
fast_inference (bool): Enable optimized inference. Default: False.
gpu_memory_utilization (float): Max GPU memory fraction. Default: 0.5.
random_state (int): Random seed. Default: 3407.
max_lora_rank (int): Maximum LoRA rank. Default: 64.
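The defaults listed above can be pictured as a plain dataclass in which only model_name is mandatory. The class below is an illustrative mirror of the documented values, not the class shipped in the package, and its name is made up. cache_dir is left out because its default depends on the SINAPSIS_CACHE_DIR setting.

```python
from dataclasses import asdict, dataclass


@dataclass
class ModelArgsSketch:
    """Illustrative mirror of the documented model_args defaults (not the package's class)."""

    model_name: str  # required: model ID or local path
    max_seq_length: int = 2048
    dtype: str = "auto"
    load_in_4bit: bool = True
    load_in_8bit: bool = False
    load_in_16bit: bool = False
    full_finetuning: bool = False
    device_map: str = "sequential"
    use_gradient_checkpointing: str = "unsloth"
    fast_inference: bool = False
    gpu_memory_utilization: float = 0.5
    random_state: int = 3407
    max_lora_rank: int = 64


# Override only the fields that differ from the defaults.
args = ModelArgsSketch(
    model_name="unsloth/DeepSeek-R1-Distill-Qwen-1.5B",
    dtype="bfloat16",
    load_in_4bit=False,
)
print(asdict(args))
```

In an agent config the same effect is achieved by listing only the keys to override under model_args.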

⚙️ Fine-tuning Attributes

These attributes apply to all fine-tuning templates:

lora_args (UnslothLoraArgs): LoRA configuration (rank, alpha, dropout, target modules, gradient checkpointing).
trainer_args (UnslothTrainerArgs): Trainer options (text field, packing, sequence length, loss type).
training_args (UnslothTrainingArgs): Hugging Face training parameters (batch size, learning rate, logging, saving).
train_dataset (DatasetConfig): Dataset configuration, including:
  - loader_args (source and loading parameters)
  - map_args (preprocessing)
  - shuffle (shuffling behavior)
  - pre_tokenize (tokenization options)
resume_from_checkpoint (bool): Resume training from the last checkpoint.
save_path (str): Directory where fine-tuned adapters or weights will be saved.
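When tuning the batch-related fields of training_args, it helps to know the effective batch size and roughly how many optimizer steps a run will take. The sketch below uses the usual Hugging Face relationship (per-device batch size times gradient accumulation steps times number of devices). The step count is an approximation, the dataset size is made up, and the function names are hypothetical.

```python
import math


def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Examples that contribute to a single optimizer step."""
    return per_device * grad_accum * num_devices


def approx_total_steps(num_examples: int, batch: int, epochs: float) -> int:
    """Rough optimizer-step count, assuming the final partial batch is kept."""
    steps_per_epoch = math.ceil(num_examples / batch)
    return math.ceil(steps_per_epoch * epochs)


# per_device_train_batch_size=8, gradient_accumulation_steps=1, one GPU.
batch = effective_batch_size(per_device=8, grad_accum=1)
print(batch)  # 8

# 10_000 examples is a made-up dataset size; num_train_epochs=3.0.
print(approx_total_steps(10_000, batch, epochs=3.0))  # 3750
```

Comparing the step count with logging_steps and save_steps gives a quick sanity check that a run will actually log and checkpoint before it finishes.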

🧠 Inference Attributes

These attributes configure Unsloth-based inference templates.

rag_context_key (str | None): Metadata key used to retrieve optional RAG context.
generate_args (UnslothGenerateArgs): Token generation settings (sampling, length, stopping, temperature, penalties).
stream (bool): Enables token-by-token console streaming during generation.

📦 Export Attributes

These attributes configure Unsloth-based model export templates.

export_args (UnslothExportBaseArgs): Export parameters such as save path, shard size, and memory limits.
push_to_hub (bool): Enables pushing the exported model to the Hugging Face Hub.

[!TIP] Use the CLI command `sinapsis info --all-template-names` to list the names of all Templates installed with Sinapsis Unsloth.

[!TIP] Use the CLI command `sinapsis info --example-template-config TEMPLATE_NAME` to produce an example Agent config for the Template specified in TEMPLATE_NAME.

For example, for UnslothPretrainer, run `sinapsis info --example-template-config UnslothPretrainer` to produce the following example config:

agent:
  name: my_test_agent
templates:
- template_name: InputTemplate
  class_name: InputTemplate
  attributes: {}
- template_name: UnslothPretrainer
  class_name: UnslothPretrainer
  template_input: InputTemplate
  attributes:
    model_args:
      model_name: '`replace_me:<class ''str''>`'
      cache_dir: /path/to/sinapsis/.cache
      max_seq_length: 2048
      dtype: auto
      load_in_4bit: true
      load_in_8bit: false
      load_in_16bit: false
      full_finetuning: false
      device_map: sequential
      use_gradient_checkpointing: unsloth
      fast_inference: false
      gpu_memory_utilization: 0.5
      random_state: 3407
      max_lora_rank: 64
    lora_args:
      r: 16
      target_modules:
      - q_proj
      - k_proj
      - v_proj
      - o_proj
      - gate_proj
      - up_proj
      - down_proj
      lora_alpha: 16
      lora_dropout: 0.0
      bias: none
      use_gradient_checkpointing: unsloth
      random_state: 3407
      use_rslora: false
      modules_to_save: null
      loftq_config: '`replace_me:<class ''dict''>`'
    trainer_args:
      dataset_text_field: text
      dataset_num_proc: null
      max_length: 1024
      packing: false
      packing_strategy: bfd
      eval_packing: false
      completion_only_loss: null
      assistant_only_loss: false
      loss_type: nll
      activation_offloading: false
    training_args:
      output_dir: trainer_output
      overwrite_output_dir: false
      eval_strategy: 'no'
      eval_steps: null
      per_device_train_batch_size: 8
      per_device_eval_batch_size: 8
      gradient_accumulation_steps: 1
      eval_accumulation_steps: null
      torch_empty_cache_steps: null
      learning_rate: 5.0e-05
      weight_decay: 0.0
      max_grad_norm: 1.0
      num_train_epochs: 3.0
      max_steps: null
      lr_scheduler_type: linear
      warmup_ratio: 0.0
      warmup_steps: null
      logging_strategy: steps
      logging_first_step: false
      logging_steps: 500
      save_strategy: steps
      save_steps: 500
      save_only_model: false
      use_cpu: false
      seed: 3407
      data_seed: null
      bf16: false
      fp16: false
      dataloader_drop_last: false
      dataloader_num_workers: 0
      remove_unused_columns: true
      load_best_model_at_end: false
      metric_for_best_model: loss
      optim: adamw_torch
      report_to: none
      push_to_hub: false
      hub_model_id: null
      embedding_learning_rate: 5.0e-05
    train_dataset:
      loader_args:
        path: '`replace_me:<class ''str''>`'
        name: null
        data_dir: null
        data_files: null
        split: null
        cache_dir: /path/to/sinapsis/.cache
        features: null
        num_proc: null
      map_args:
        desc: null
        batched: false
        batch_size: 1000
        num_proc: 0
        keep_in_memory: false
        load_from_cache_file: true
      shuffle:
        enabled: false
        args:
          seed: null
          keep_in_memory: false
          load_from_cache_file: true
      pre_tokenize:
        enabled: false
        args:
          add_special_tokens: true
          padding: do_not_pad
          truncation: do_not_truncate
          max_length: null
          stride: 0
          is_split_into_words: false
          padding_side: null
          verbose: true
        map_args:
          desc: null
          batched: false
          batch_size: 1000
          num_proc: 0
          keep_in_memory: false
          load_from_cache_file: true
    resume_from_checkpoint: false
    save_path: '`replace_me:<class ''str''>`'
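The generated config marks values that have no default with a `replace_me` placeholder. As a convenience, a small helper can walk a parsed config and report which keys still hold one. This is a sketch that is not part of Sinapsis; the function name is hypothetical, and the dict below is a trimmed excerpt of the attributes above written by hand.

```python
from typing import Any, Iterator

PLACEHOLDER_PREFIX = "`replace_me:"


def find_placeholders(node: Any, path: str = "") -> Iterator[str]:
    """Yield dotted paths of values that still hold a replace_me placeholder."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from find_placeholders(value, f"{path}.{key}" if path else str(key))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_placeholders(value, f"{path}[{index}]")
    elif isinstance(node, str) and node.startswith(PLACEHOLDER_PREFIX):
        yield path


# A trimmed-down excerpt of the attributes above, written as a plain dict.
attributes = {
    "model_args": {"model_name": "`replace_me:<class 'str'>`", "max_seq_length": 2048},
    "lora_args": {"r": 16, "loftq_config": "`replace_me:<class 'dict'>`"},
    "train_dataset": {"loader_args": {"path": "`replace_me:<class 'str'>`"}},
    "save_path": "`replace_me:<class 'str'>`",
}

print(list(find_placeholders(attributes)))
```

With PyYAML installed, the same function could be applied to the result of `yaml.safe_load` on the generated file.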

📚 Usage example

The following agent exports the unsloth/DeepSeek-R1-Distill-Qwen-1.5B model to GGUF format without quantization and saves it to the artifacts/DeepSeek-R1-Distill-Qwen-1.5B-gguf path.

Config

agent:
  name: model_export_agent
  description: Agent to handle model export and conversion workflows

templates:
- template_name: InputTemplate
  class_name: InputTemplate
  attributes: {}
- template_name: UnslothExportGGUF
  class_name: UnslothExportGGUF
  template_input: InputTemplate
  attributes:
    model_args:
      model_name: unsloth/DeepSeek-R1-Distill-Qwen-1.5B
      dtype: "bfloat16"
      load_in_4bit: false
      gpu_memory_utilization: 1
    export_args:
      save_path: artifacts/DeepSeek-R1-Distill-Qwen-1.5B-gguf
      maximum_memory_usage: 1
      quantization_method: not_quantized
    push_to_hub: false

You can find additional fine-tuning agent configurations in the configs directory.

📙 Documentation

Documentation for this and other sinapsis packages is available on the sinapsis website.

Tutorials for different projects within sinapsis are available on the sinapsis tutorials page.

🔍 License

This project is licensed under the AGPLv3 license, which encourages open collaboration and sharing. For more details, please refer to the LICENSE file.

For commercial use, please refer to our official Sinapsis website for information on obtaining a commercial license.


Version 0.1.0, uploaded Feb 26, 2026

Download files


Source Distribution

sinapsis_unsloth-0.1.0.tar.gz (44.1 kB view details)

Uploaded Feb 26, 2026 Source

Built Distribution


sinapsis_unsloth-0.1.0-py3-none-any.whl (55.2 kB view details)

Uploaded Feb 26, 2026 Python 3

File details

Details for the file sinapsis_unsloth-0.1.0.tar.gz.

File metadata

Download URL: sinapsis_unsloth-0.1.0.tar.gz
Upload date: Feb 26, 2026
Size: 44.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.5.16

File hashes

Hashes for sinapsis_unsloth-0.1.0.tar.gz
Algorithm	Hash digest
SHA256	`043d9713bc374bd9cfb6b80e33ae9d8bb2099311271250318a4cc3376ce0e8ea`
MD5	`b864d30080f9a56318a1d0fc53e0bb76`
BLAKE2b-256	`c9f3dcd379d4313556269dbed122208edc0e15c814550735e5a0bb1032be6a9b`


File details

Details for the file sinapsis_unsloth-0.1.0-py3-none-any.whl.

File metadata

Download URL: sinapsis_unsloth-0.1.0-py3-none-any.whl
Upload date: Feb 26, 2026
Size: 55.2 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.5.16

File hashes

Hashes for sinapsis_unsloth-0.1.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`a256762790f5a8b568bee4f63b140217269a9ee14ecc70d5147955b1c1674c8d`
MD5	`7f8cb143ff4db8e0f3bd0bfe8e53dcdd`
BLAKE2b-256	`2ed006f6a5a026c3b931be4a1a5a90ebbdb00a85c61f144d9243e0f46fe7b5d8`
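A downloaded file can be checked against the SHA256 values listed above with Python's standard library. The helper names below are hypothetical, and the snippet is a minimal sketch that assumes the file has already been downloaded to a local path.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with a published value, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()


# Example call, assuming the wheel sits in the current directory:
# matches(
#     Path("sinapsis_unsloth-0.1.0-py3-none-any.whl"),
#     "a256762790f5a8b568bee4f63b140217269a9ee14ecc70d5147955b1c1674c8d",
# )
```

Reading in chunks keeps memory use flat regardless of file size, which matters little for these small files but makes the helper reusable for model artifacts.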

