Fine-tuning toolkit for the `deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B` model using QLoRA on macOS systems with limited RAM (~8GB). Includes conversion to GGUF format for use with Ollama and LM Studio.

Project description

Mimo LLM Project - Fine-tuning and GGUF Export

This project provides the necessary scripts and instructions to fine-tune the deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B model using QLoRA on a macOS system with limited RAM (~8GB), convert it to the GGUF format, and make it ready for use with Ollama and LM Studio.

Objective

  • Base Model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B from Hugging Face.
  • Fine-tuning Method: QLoRA for efficient adaptation on low-resource hardware.
  • Dataset: yahma/alpaca-cleaned (public text dataset).
  • RAM Optimization: Tuned for ~8GB RAM.
  • Output Format: GGUF (quantized 4-bit or 8-bit).
  • Final Model Name: Mimo
  • Attribution: "Créé par ABDESSEMED Mohamed Redha"
  • Compatibility: Ollama and LM Studio.
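Records in yahma/alpaca-cleaned pair an instruction (plus an optional input) with a response, and fine-tuning scripts typically flatten each record into a single prompt string. The helper below is an illustrative sketch of the standard Alpaca template; the function name and wording are not taken from train_qlora.py.

```python
def format_alpaca(example: dict) -> str:
    """Build one training prompt from an alpaca-cleaned record.

    Each record has 'instruction', 'input' (possibly empty), and 'output'.
    """
    if example.get("input"):
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{example['instruction']}\n\n"
            f"### Input:\n{example['input']}\n\n"
            f"### Response:\n{example['output']}"
        )
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{example['instruction']}\n\n"
        f"### Response:\n{example['output']}"
    )
```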

Setup Instructions

  1. Create a Python virtual environment (recommended):

    python3 -m venv venv
    source venv/bin/activate
    
  2. Install dependencies:

    pip install -r requirements.txt
    

    Note: Ensure you have the correct PyTorch build installed for your macOS system (CPU, or Metal/MPS on Apple Silicon). Refer to the official PyTorch website for installation instructions.

    For CPU-only:

    pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu

    For Apple Silicon (MPS), the default wheels already include Metal support:

    pip install torch torchvision torchaudio

Fine-tuning (QLoRA)

The train_qlora.py script handles the fine-tuning process. It loads the base model, applies 4-bit quantization and QLoRA, and trains on the specified dataset.

  • To start training:
    python train_qlora.py
    
  • Output: The fine-tuned model adapters will be saved in the outputs/mimo-qlora directory.
  • RAM Optimization: The script is configured with per_device_train_batch_size=1 and gradient_accumulation_steps=8, giving an effective batch size of 8 while keeping peak memory low. max_steps is set to 100 for a quick example; increase it for longer training. gradient_checkpointing=True is also enabled for further memory savings.
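The batch settings above trade speed for memory: gradients from 8 micro-batches of size 1 are summed before a single optimizer step, so the update matches a batch of 8 without ever holding 8 samples in memory. The toy sketch below illustrates the mechanic in plain Python (no PyTorch); the Trainer handles this internally when gradient_accumulation_steps is set.

```python
def accumulate_gradients(samples, grad_fn, accumulation_steps=8):
    """Toy illustration of gradient accumulation: average per-sample
    gradients over `accumulation_steps` micro-batches, then apply one update.

    Returns the accumulated gradient for each optimizer step taken.
    """
    updates = []
    running = 0.0
    for i, sample in enumerate(samples, start=1):
        # Scale each micro-batch gradient, mirroring loss / accumulation_steps
        running += grad_fn(sample) / accumulation_steps
        if i % accumulation_steps == 0:
            updates.append(running)  # optimizer.step() would happen here
            running = 0.0            # optimizer.zero_grad()
    return updates
```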

Conversion to GGUF

The export_to_gguf.py script first merges the QLoRA adapters into the base model and saves it in Hugging Face format. It then provides instructions on how to convert this merged model into the GGUF format using the llama.cpp conversion tools.

  1. Run the export script:

    python export_to_gguf.py
    

    This will save the merged Hugging Face model in gguf_model/merged_hf_model/ and print instructions for the GGUF conversion.

  2. Convert to GGUF using llama.cpp: Follow these steps after running export_to_gguf.py:

    • Clone llama.cpp from GitHub if you don't already have it, and install its conversion dependencies:
      git clone https://github.com/ggerganov/llama.cpp
      pip install -r llama.cpp/requirements.txt
      
    • From the llama.cpp directory, run the conversion script, pointing it to your saved Hugging Face model directory. Recent llama.cpp versions name the script convert_hf_to_gguf.py (older versions shipped it as convert.py).

    Example commands:

    # Assuming you are in the llama.cpp directory and your merged model is at /Users/mohamed/Downloads/mac_ai_project/gguf_model/merged_hf_model
    # First convert to a float16 GGUF
    python convert_hf_to_gguf.py /Users/mohamed/Downloads/mac_ai_project/gguf_model/merged_hf_model --outfile /Users/mohamed/Downloads/mac_ai_project/gguf_model/Mimo-f16.gguf --outtype f16
    # Then quantize to 4-bit (q4_0) with the llama-quantize tool built from llama.cpp
    ./llama-quantize /Users/mohamed/Downloads/mac_ai_project/gguf_model/Mimo-f16.gguf /Users/mohamed/Downloads/mac_ai_project/gguf_model/Mimo.gguf q4_0
    
    • Other quantization types are available: q8_0 for 8-bit (which the converter can emit directly via --outtype q8_0), f16 for float16, etc. q4_0 is a good size/quality balance for 4-bit.
  • Output: The final GGUF model will be saved as gguf_model/Mimo.gguf.
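A quick sanity check on the output is to confirm the file starts with the GGUF magic bytes (ASCII "GGUF") followed by a little-endian uint32 format version, as the GGUF specification defines. A minimal stdlib-only checker (the function name is illustrative):

```python
import struct

def check_gguf_header(path: str) -> int:
    """Return the GGUF format version if `path` begins with a valid
    GGUF header, otherwise raise ValueError."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file: magic={magic!r}")
        # Version is a little-endian unsigned 32-bit integer after the magic
        (version,) = struct.unpack("<I", f.read(4))
        return version
```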

Usage

Ollama

  1. Create a Modelfile: Create a file named Modelfile (no extension) in the same directory as your Mimo.gguf file with the following content:

    FROM ./Mimo.gguf
    
    TEMPLATE """{{ .System }}
    {{- if .Prompt }}
    USER: {{ .Prompt }}
    ASSISTANT: {{ .Response }}
    {{- end }}"""
    
    PARAMETER stop "USER:"
    PARAMETER stop "ASSISTANT:"
    PARAMETER temperature 0.7
    PARAMETER top_k 40
    PARAMETER top_p 0.9
    PARAMETER num_ctx 2048
    PARAMETER repeat_penalty 1.1
    

    Adjust num_ctx and other parameters as needed.

  2. Import into Ollama: Navigate to the directory containing Mimo.gguf and your Modelfile in your terminal, then run:

    ollama create mimo -f ./Modelfile
    

    You can then interact with the model using ollama run mimo.
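Besides ollama run, the imported model is reachable over Ollama's local REST API (POST http://localhost:11434/api/generate). The helper below only builds the JSON request body, so it can be inspected without a running server; the option names mirror the Modelfile parameters, and sending the request with urllib or requests is left to the reader.

```python
import json

def build_generate_request(prompt: str, model: str = "mimo",
                           temperature: float = 0.7, num_ctx: int = 2048) -> bytes:
    """Build the JSON body for a POST to http://localhost:11434/api/generate."""
    body = {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one complete response instead of chunks
        "options": {"temperature": temperature, "num_ctx": num_ctx},
    }
    return json.dumps(body).encode("utf-8")
```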

LM Studio

  1. Open LM Studio.
  2. Go to the "Local Server" tab or the "AI Models" tab.
  3. Click the folder icon to browse for models.
  4. Navigate to the gguf_model/ directory and select Mimo.gguf.
  5. The model should load, and you can start chatting.

Attribution

This model, Mimo, was created by ABDESSEMED Mohamed Redha.



Project details


Download files


Source Distribution

Mimo-1B-0.1.2.tar.gz (4.4 kB)


Built Distribution


Mimo_1B-0.1.2-py3-none-any.whl (4.1 kB)


File details

Details for the file Mimo-1B-0.1.2.tar.gz.

File metadata

  • Download URL: Mimo-1B-0.1.2.tar.gz
  • Upload date:
  • Size: 4.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.8

File hashes

Hashes for Mimo-1B-0.1.2.tar.gz:

  • SHA256: a9832d7af112abce095145109688aed741e56042b45781163048afff8dd5f730
  • MD5: 73f0a438b3f92ac3d28899571e9f5b0f
  • BLAKE2b-256: acf2045234ef524ec71916979509617784196354964dfe23150d072c836e4325


File details

Details for the file Mimo_1B-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: Mimo_1B-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 4.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.11.8

File hashes

Hashes for Mimo_1B-0.1.2-py3-none-any.whl:

  • SHA256: 79ad5846f4483620d07e6c4b0968ba4fdeb3064219bb62e453504463aaadc2f9
  • MD5: 6b65a7fd5333bf192b327565e85cd28c
  • BLAKE2b-256: d9124de4c1344f0941b463844828f209a1d333fd127f129c19c65c98b82e9111

