Mimic-Kit
Mimic-Kit is a scaffolding toolkit for black-box knowledge distillation: it queries a powerful teacher model through its API, then fine-tunes a smaller, more efficient student model on the teacher's responses.
Features
- 🤖 Teacher Model Generation: Uses the OpenAI API (or a compatible API) to generate high-quality training data
- 🚀 Efficient Training: Built on ms-swift for streamlined model training
- 🧩 Flexible Tuning: Supports both LoRA (parameter-efficient) and full fine-tuning
- ⚡ Performance Optimized: Integrated with DeepSpeed and Liger Kernel for accelerated training
- 💾 Smart Caching: Automatically caches API responses to save costs
- 📊 Multiple Data Formats: Supports both chat and text completion formats
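To illustrate the idea behind response caching, here is a minimal sketch, not Mimic-Kit's actual implementation. It assumes a cache keyed by a SHA-256 hash of the request (model, messages, generation parameters) and stored as one JSON file per request. The `ResponseCache` class and the `call` callback are hypothetical names introduced only for this example.

```python
import hashlib
import json
from pathlib import Path


class ResponseCache:
    """Hypothetical on-disk cache: one JSON file per unique request."""

    def __init__(self, cache_dir):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def _path(self, model, messages, params):
        # Serialize the request deterministically so identical requests
        # always map to the same file name.
        payload = json.dumps(
            {"model": model, "messages": messages, "params": params},
            sort_keys=True,
            ensure_ascii=False,
        )
        digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        return self.cache_dir / f"{digest}.json"

    def get_or_call(self, model, messages, params, call):
        """Return the cached response, or invoke `call` and store its result."""
        path = self._path(model, messages, params)
        if path.exists():
            return json.loads(path.read_text(encoding="utf-8"))["response"]
        response = call(model, messages, params)
        path.write_text(
            json.dumps({"response": response}, ensure_ascii=False),
            encoding="utf-8",
        )
        return response
```

In this sketch, changing any generation parameter (for example the temperature) produces a different key, so the teacher is queried again rather than reusing a response generated under other settings.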
Quick Start
1. Installation
Option 1: Install from PyPI (Recommended)
pip install mimic-kit
Option 2: Install from source
# Clone the repository
git clone https://github.com/ECNU-innoSpark/EduRmDistill mimic-kit
cd mimic-kit
# Install dependencies using UV
uv sync
# Or with dev dependencies
uv sync --group dev
2. Initialize Configuration
uv run mimic init
This creates a config.yaml template. Edit it to configure your teacher model, student model, and training parameters.
3. Prepare Your Data
Create a JSONL file containing your prompts, one JSON object per line. Two formats are supported:
Chat Format:
{"messages": [{"role": "user", "content": "Explain Python decorators"}]}
{"messages": [{"role": "user", "content": "How to reverse a linked list?"}]}
Text Completion Format:
{"text": "Python decorators are a powerful feature..."}
{"text": "To reverse a linked list, you need to..."}
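If you build the prompt file programmatically, a small helper along these lines may be useful. It is a sketch using only the standard library; the function names (`write_jsonl`, `read_jsonl`, `detect_format`) are illustrative and not part of Mimic-Kit.

```python
import json
from pathlib import Path


def write_jsonl(path, records):
    """Write one JSON object per line, keeping non-ASCII text readable."""
    with Path(path).open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read a JSONL file into a list of dicts, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def detect_format(record):
    """Return "chat" or "text" for a record, or raise ValueError."""
    messages = record.get("messages")
    if isinstance(messages, list) and messages:
        for message in messages:
            # Every chat message needs at least a role and its content.
            if not isinstance(message, dict) or not {"role", "content"} <= set(message):
                raise ValueError(f"message missing role/content: {message!r}")
        return "chat"
    if isinstance(record.get("text"), str):
        return "text"
    raise ValueError(f"unrecognized record: {record!r}")
```

Running `detect_format` over every record before generation catches malformed lines early, before any API calls are made.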
4. Generate Training Data
uv run mimic generate
This sends your prompts to the teacher model and saves the generated responses as training data.
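Conceptually, this step pairs each prompt with the teacher's reply. The sketch below shows one plausible shape for chat-format data, assuming each training record is the original messages followed by an `assistant` turn. The exact output layout of `mimic generate` is not documented here, and `ask_teacher` is a hypothetical callback standing in for the real API call.

```python
def build_training_record(prompt_record, teacher_reply):
    """Append the teacher's reply as an assistant turn (assumed layout)."""
    # Copy each message so the input record is not mutated.
    messages = [dict(m) for m in prompt_record["messages"]]
    messages.append({"role": "assistant", "content": teacher_reply})
    return {"messages": messages}


def generate_dataset(prompt_records, ask_teacher):
    """Query the teacher once per prompt and collect the training records.

    `ask_teacher` is a hypothetical callable that takes a list of chat
    messages and returns the teacher's reply as a string.
    """
    dataset = []
    for record in prompt_records:
        reply = ask_teacher(record["messages"])
        dataset.append(build_training_record(record, reply))
    return dataset
```

Passing the teacher call in as a function keeps the sketch testable offline: a stub can be swapped in for the real client.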
5. Train Your Student Model
uv run mimic train
The student model will be fine-tuned on the generated data using the configured method (LoRA or full fine-tuning).
Configuration
See config.yaml for all available options. Key sections:
- data: Input/output paths, system prompts, templates
- teacher: API provider, model, generation parameters
- student: Base model, tuning method, hyperparameters
- training: Batch size, learning rate, saving strategy
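As a rough picture of how the four sections fit together, the sketch below models them with standard-library dataclasses. Every field name and default value here is an assumption made for illustration; the real schema is defined in mimic/config.py (Pydantic v2) and in the generated config.yaml, which remain the authoritative sources.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DataConfig:
    # Hypothetical field names and paths.
    input_path: str = "data/prompts.jsonl"
    output_path: str = "data/generated.jsonl"
    system_prompt: Optional[str] = None


@dataclass
class TeacherConfig:
    provider: str = "openai"
    model: str = "teacher-model-name"  # placeholder, not a real model id
    temperature: float = 0.7


@dataclass
class StudentConfig:
    base_model: str = "student-model-name"  # placeholder
    tuning_method: str = "lora"

    def __post_init__(self):
        # The README names exactly two tuning methods: LoRA and full.
        if self.tuning_method not in ("lora", "full"):
            raise ValueError(f"unknown tuning method: {self.tuning_method!r}")


@dataclass
class TrainingConfig:
    batch_size: int = 4
    learning_rate: float = 1e-4


@dataclass
class MimicConfig:
    data: DataConfig = field(default_factory=DataConfig)
    teacher: TeacherConfig = field(default_factory=TeacherConfig)
    student: StudentConfig = field(default_factory=StudentConfig)
    training: TrainingConfig = field(default_factory=TrainingConfig)


def config_from_dict(raw):
    """Build a MimicConfig from a parsed YAML mapping, filling in defaults."""
    return MimicConfig(
        data=DataConfig(**raw.get("data", {})),
        teacher=TeacherConfig(**raw.get("teacher", {})),
        student=StudentConfig(**raw.get("student", {})),
        training=TrainingConfig(**raw.get("training", {})),
    )
```

Validating the tuning method at load time means a typo in the config fails immediately instead of surfacing partway through a training run.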
Project Structure
mimic-kit/
├── mimic/
│ ├── cli.py # CLI entry point
│ ├── config.py # Configuration models (Pydantic v2)
│ ├── generator/ # Teacher model data generation
│ └── trainer/ # Student model training (ms-swift)
├── data/ # Training data directory
├── config.yaml # Your configuration file
└── output/ # Model outputs and checkpoints
Requirements
- Python 3.13+
- CUDA-capable GPU (for training)
- OpenAI API key (or compatible API)
Development
# Run tests
uv run pytest
# Run a single test
uv run pytest tests/test_cli.py::test_function
# Format code
uv run ruff format .
# Check linting
uv run ruff check .
License
MIT License - see LICENSE file for details.
Acknowledgments