Mimic-Kit

Python 3.13+ | License: MIT

Chinese documentation | English

Mimic-Kit is a scaffolding toolkit for black-box knowledge distillation: it uses only the outputs of a powerful teacher model to train a smaller, more efficient student model.

Features

  • 🤖 Teacher Model Generation: Uses the OpenAI API (or a compatible API) to generate high-quality training data
  • 🚀 Efficient Training: Built on ms-swift for streamlined model training
  • 🧩 Flexible Tuning: Supports both LoRA (parameter-efficient) and full fine-tuning
  • ⚡ Performance Optimized: Integrated with DeepSpeed and Liger Kernel for accelerated training
  • 💾 Smart Caching: Automatically caches API responses so repeated prompts are not billed twice
  • 📊 Multiple Data Formats: Supports both chat and text completion formats
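
To make the caching feature concrete, here is a minimal sketch of one common way to cache teacher responses on disk. It is not taken from Mimic-Kit's source: the function names (cache_key, cached_call), the cache layout, and the call_teacher callable are all assumptions made for illustration.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def cache_key(model: str, messages: list, params: dict) -> str:
    """Hash everything that influences the response into a stable key."""
    payload = json.dumps(
        {"model": model, "messages": messages, "params": params},
        sort_keys=True,  # key order must not change the hash
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_call(
    cache_dir: Path,
    model: str,
    messages: list,
    params: dict,
    call_teacher: Callable[[str, list, dict], str],  # hypothetical API wrapper
) -> str:
    """Return the cached response if present, otherwise call the teacher once."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    entry = cache_dir / f"{cache_key(model, messages, params)}.json"
    if entry.exists():
        return json.loads(entry.read_text(encoding="utf-8"))["response"]
    response = call_teacher(model, messages, params)
    entry.write_text(
        json.dumps({"response": response}, ensure_ascii=False),
        encoding="utf-8",
    )
    return response
```

Including the generation parameters in the key means that changing, say, the temperature triggers a fresh request instead of reusing a stale answer.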

Quick Start

1. Installation

Option 1: Install from PyPI (Recommended)

pip install mimic-kit

Option 2: Install from source

# Clone the repository into a directory named mimic-kit
git clone https://github.com/ECNU-innoSpark/EduRmDistill mimic-kit
cd mimic-kit

# Install dependencies using UV
uv sync

# Or with dev dependencies
uv sync --group dev

2. Initialize Configuration

uv run mimic init

This creates a config.yaml template. Edit it to configure your teacher model, student model, and training parameters.

3. Prepare Your Data

Create a JSONL file with your prompts, one JSON object per line. Two formats are supported:

Chat Format:

{"messages": [{"role": "user", "content": "Explain Python decorators"}]}
{"messages": [{"role": "user", "content": "How to reverse a linked list?"}]}

Text Completion Format:

{"text": "Python decorators are a powerful feature..."}
{"text": "To reverse a linked list, you need to..."}

4. Generate Training Data

uv run mimic generate

This sends your prompts to the teacher model and saves the generated responses as training data.
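
Conceptually, this step pairs each prompt with the teacher's answer. The sketch below illustrates that idea for chat-format prompts and is not Mimic-Kit's actual code. It assumes the output is a chat record with the teacher's reply appended as an assistant message, and it stands in for the real API call with a plain callable (teacher) so that it runs offline.

```python
import json
from pathlib import Path
from typing import Callable

# Hypothetical stand-in for an OpenAI-compatible chat completion call:
# takes the message list and returns the reply text.
Teacher = Callable[[list], str]


def distill_record(record: dict, teacher: Teacher) -> dict:
    """Append the teacher's reply to a chat-format prompt record."""
    messages = list(record["messages"])
    reply = teacher(messages)
    return {"messages": messages + [{"role": "assistant", "content": reply}]}


def generate(prompts_path: Path, output_path: Path, teacher: Teacher) -> int:
    """Write one training record per prompt line; return the number written."""
    written = 0
    with prompts_path.open(encoding="utf-8") as source, \
            output_path.open("w", encoding="utf-8") as sink:
        for line in source:
            if not line.strip():
                continue  # skip blank lines
            record = distill_record(json.loads(line), teacher)
            sink.write(json.dumps(record, ensure_ascii=False) + "\n")
            written += 1
    return written
```

Keeping the teacher behind a callable also makes it easy to dry-run the pipeline with a stub before spending API credits.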

5. Train Your Student Model

uv run mimic train

The student model will be fine-tuned on the generated data using the configured method (LoRA or full fine-tuning).
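
To give a sense of why LoRA is called parameter-efficient, the sketch below compares trainable parameter counts for a single weight matrix. LoRA freezes the original d_out × d_in matrix and trains two low-rank factors of shape d_out × r and r × d_in. The dimensions and rank used here are illustrative only and are not defaults of Mimic-Kit or ms-swift.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole weight matrix is updated."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for the two LoRA factors (d_out x r and r x d_in)."""
    return rank * (d_in + d_out)


# Illustrative sizes: one 4096 x 4096 projection with rank 8.
full = full_params(4096, 4096)
lora = lora_params(4096, 4096, 8)

print(full)                   # 16777216
print(lora)                   # 65536
print(f"{lora / full:.2%}")   # 0.39%
```

Full fine-tuning updates every weight and usually needs far more GPU memory, so LoRA is often the practical starting point on a single GPU.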

Configuration

See config.yaml for all available options. Key sections:

  • data: Input/output paths, system prompts, templates
  • teacher: API provider, model, generation parameters
  • student: Base model, tuning method, hyperparameters
  • training: Batch size, learning rate, saving strategy
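
As a quick sanity check after editing, the four sections listed above can be verified once the YAML has been parsed into a dictionary. This is a minimal sketch, not Mimic-Kit's own validation (which, per the project structure, lives in config.py and uses Pydantic). The keys inside each section below are hypothetical placeholders; the generated config.yaml template is the authoritative source for real field names.

```python
from collections.abc import Mapping

# The four top-level sections named in the list above.
EXPECTED_SECTIONS = ("data", "teacher", "student", "training")


def missing_sections(config: Mapping) -> list:
    """Return expected sections that are absent or are not mappings."""
    return [
        name
        for name in EXPECTED_SECTIONS
        if not isinstance(config.get(name), Mapping)
    ]


# Hypothetical parsed config; every key inside the sections is made up.
example_config = {
    "data": {"input_path": "data/prompts.jsonl"},
    "teacher": {"model": "teacher-model-name"},
    "student": {"tuning_method": "lora"},
}

print(missing_sections(example_config))  # ['training']
```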

Project Structure

mimic-kit/
├── mimic/
│   ├── cli.py              # CLI entry point
│   ├── config.py           # Configuration models (Pydantic v2)
│   ├── generator/          # Teacher model data generation
│   └── trainer/            # Student model training (ms-swift)
├── data/                   # Training data directory
├── config.yaml             # Your configuration file
└── output/                 # Model outputs and checkpoints

Requirements

  • Python 3.13+
  • CUDA-capable GPU (for training)
  • OpenAI API key (or compatible API)

Development

# Run tests
uv run pytest

# Run single test
uv run pytest tests/test_cli.py::test_function

# Format code
uv run ruff format .

# Check linting
uv run ruff check .

License

MIT License - see LICENSE file for details.

Acknowledgments

  • Built with ms-swift for model training
  • Uses Pydantic for configuration validation
  • Uses Click for the command-line interface
