Unified MoE+LoRA experimentation kit

EulerForge

🇰🇷 Korean (한국어): README.ko.md

EulerForge is a unified fine-tuning toolkit for HuggingFace-compatible LLMs. It brings together LoRA, Mixture-of-LoRAs, MoE-expert LoRA, and native-MoE fine-tuning under a single YAML-driven CLI, with built-in support for SFT, DPO, ORPO, RM, and PPO.

One preset, one command: from a base model to a deployable checkpoint.


Features

  • Five training paths in one pipeline: SFT → DPO / ORPO → RM → PPO
  • Four injection strategies: dense_lora, mixture_lora, moe_expert_lora, native_moe_expert_lora
  • Broad backbone support: Qwen2/3, Llama 2/3, Gemma 3, Gemma 4 (dense + MoE), Mixtral
  • 4-bit / 8-bit quantized training (nf4 / int4 / int8) via bitsandbytes
  • Pipeline-friendly: checkpoints from one stage flow into the next (SFT → DPO auto-detects base model and LoRA config)
  • Phase scheduling: progressively unfreeze router → LoRA → base FFN
  • Preflight + MoE stability validation: catch config errors before a single GPU cycle
  • Integrated benchmarking with Ollama / OpenAI / local HF judge models
  • Hyperparameter search (grid / random / bayes) via Optuna
  • HF export: produce a standard HuggingFace directory that can be loaded with from_pretrained()
  • Internationalized CLI: --lang ko/en/zh/ja/es
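
For readers new to the adapter idea behind the injection strategies listed above, here is a minimal pure-Python sketch of a dense LoRA forward pass. It is not EulerForge's implementation: the helper names `matvec` and `lora_forward`, the toy shapes, and the `alpha / r` scaling convention are assumptions chosen for illustration.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def lora_forward(x, W, A, B, alpha):
    """Return W x + (alpha / r) * B (A x), where r is the LoRA rank.

    W: frozen base weight, shape (d_out, d_in)
    A: trainable down-projection, shape (r, d_in)
    B: trainable up-projection, shape (d_out, r)
    """
    r = len(A)
    base = matvec(W, x)
    low_rank = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * delta for b, delta in zip(base, low_rank)]


# Toy example (illustrative values): d_in = d_out = 2, rank r = 1.
W = [[1.0, 0.0], [0.0, 1.0]]   # frozen identity weight
A = [[1.0, 1.0]]               # shape (r=1, d_in=2)
B_zero = [[0.0], [0.0]]        # B initialized to zero: the adapter is a no-op
B_trained = [[0.5], [-0.5]]    # hypothetical values after some training

x = [2.0, 3.0]
y_init = lora_forward(x, W, A, B_zero, alpha=2.0)
y_trained = lora_forward(x, W, A, B_trained, alpha=2.0)
```

With `B` at zero the output equals the base layer's output, which is why a freshly injected adapter does not change the model until training updates it. Only `A` and `B` receive gradients; `W` stays frozen.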

Installation

pip install -e .

Requirements: Python ≥ 3.9, PyTorch ≥ 2.1, Transformers ≥ 5.5.

Optional extras:

# Quote the extras so shells such as zsh do not expand the brackets
pip install -e ".[hpo]"  # Optuna for grid/bayesian search
pip install -e ".[tb]"   # TensorBoard logging

Quickstart

1. Train

# Simplest: dense LoRA SFT on Qwen3.5-0.8B with raw JSONL data
eulerforge train \
    --preset configs/presets/qwen3.5_0.8b_dense_lora_sft.yml \
    --set data.format=raw \
    --set data.task=sft \
    --set data.path=data/sft_10k_raw.jsonl \
    --set data.max_length=512
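
The `--set key=value` flags in the command above address nested preset fields with dotted keys. How EulerForge parses them internally is not shown in this README, so the sketch below only illustrates the general idea. The helpers `parse_value` and `apply_overrides`, the starting `preset` dict, and the literal-casting rule are all hypothetical.

```python
import ast


def parse_value(text):
    """Parse Python literals such as '512' or '0.1'; keep anything else as a string."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text


def apply_overrides(config, overrides):
    """Apply 'a.b.c=value' strings to a nested dict, creating levels as needed."""
    for item in overrides:
        dotted_key, _, raw_value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = config
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = parse_value(raw_value)
    return config


# Hypothetical preset contents; the real YAML schema may differ.
preset = {"data": {"format": "processed", "max_length": 1024}}
merged = apply_overrides(
    preset,
    [
        "data.format=raw",
        "data.task=sft",
        "data.path=data/sft_10k_raw.jsonl",
        "data.max_length=512",
    ],
)
```

After the call, `merged["data"]["max_length"]` is the integer `512` and `merged["data"]["format"]` is the string `"raw"`, so command-line values replace the preset defaults without editing the YAML file.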

2. Evaluate

# Benchmark against a baseline + judge model
eulerforge bench --preset configs/bench/sft_with_judge.yml \
    --target-output-dir outputs/run_YYYYMMDD_HHMMSS

3. Export

# Merge LoRA into a standard HF directory
eulerforge export-hf \
    --checkpoint outputs/run_YYYYMMDD_HHMMSS \
    --output ./exported

4. Load in Python

from eulerforge import load_model

result = load_model("outputs/run_YYYYMMDD_HHMMSS")
# result.model, result.tokenizer, result.metadata
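
Because the export step produces a standard HuggingFace directory, it should also be loadable without EulerForge. The sketch below uses the Transformers `from_pretrained()` API for this. It assumes that the export directory (`./exported` in step 3) contains tokenizer files alongside the weights and that the model is a causal LM; the helper names `load_exported` and `generate_text` are hypothetical.

```python
def load_exported(export_dir="./exported"):
    """Load an exported HF directory with plain Transformers.

    Assumes the directory holds both model weights and tokenizer files.
    """
    # Imported lazily so that defining this helper does not require Transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(export_dir)
    model = AutoModelForCausalLM.from_pretrained(export_dir)
    return model, tokenizer


def generate_text(model, tokenizer, prompt, max_new_tokens=32):
    """Run a single greedy generation and return the decoded text."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A typical call would be `model, tokenizer = load_exported("./exported")` followed by `generate_text(model, tokenizer, "Hello")`.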

CLI Commands

Command                What it does
eulerforge train       Fine-tune with LoRA / MoE strategies + phase scheduling
eulerforge convert     Convert arbitrary JSONL → EulerForge raw JSONL (map / recipe)
eulerforge preprocess  Tokenize raw JSONL → processed JSONL (cached)
eulerforge bench       Run target / baseline / judge inference benchmark
eulerforge export-hf   Export trained checkpoint as a standard HF model directory
eulerforge grid        Hyperparameter search (grid / random / bayesian)

Global options apply to all commands:

eulerforge --lang en train --preset ...   # English log output
eulerforge --lang ko train --preset ...   # Korean (default)

📖 Full CLI reference with every flag, YAML spec, training types, phase scheduling, load precision, and output structure: docs/cli_en.md


Included Presets

A curated set of ready-to-run training presets lives under configs/presets/:

Category         Example preset                                    Strategy                   Training
Qwen 3.5 small   qwen3.5_0.8b_dense_lora_sft.yml                   Dense LoRA                 SFT
Qwen 3.5 small   qwen3.5_0.8b_mixture_lora_sft.yml                 Mixture-of-LoRAs           SFT
Qwen 3.5 small   qwen3.5_0.8b_moe_expert_lora_sft.yml              MoE Expert LoRA            SFT
Qwen 3.5 small   qwen3.5_0.8b_moe_expert_lora_dpo.yml              MoE Expert LoRA            DPO
Qwen 3.5 small   qwen3.5_0.8b_dense_lora_orpo.yml                  Dense LoRA                 ORPO
Qwen 3.5 small   qwen3.5_0.8b_dense_lora_rm.yml                    Dense LoRA                 RM
Qwen 3.5 small   qwen3.5_0.8b_dense_lora_ppo.yml                   Dense LoRA                 PPO
Qwen 3.5         qwen3.5_4b_dense_lora_sft.yml                     Dense LoRA                 SFT
Llama 3.2        llama3_1b_dense_lora_sft.yml                      Dense LoRA                 SFT
Llama 3.2        llama3_1b_moe_expert_lora_sft_handoff.yml         MoE Expert LoRA + Handoff  SFT
Gemma 3          gemma3_1b_mixture_lora_dpo.yml                    Mixture-of-LoRAs           DPO
Gemma 3          gemma3_4b_moe_expert_lora_orpo_handoff.yml        MoE Expert LoRA + Handoff  ORPO
Gemma 4 (dense)  gemma4_e2b_dense_lora_sft.yml, gemma4_e4b_*.yml   All strategies             SFT / DPO
Gemma 4 (MoE)    gemma4_26b_a4b_native_expert_lora_sft.yml         Native MoE Expert LoRA     SFT
Mixtral          mixtral_native_expert_lora_sft.yml                Native MoE Expert LoRA     SFT
TinyLlama        tinyllama_1.1b_dense_lora_dpo.yml                 Dense LoRA                 DPO

Domain-specific preset groups:

  • configs/presets/math/: math SFT + DPO pipeline for Llama 3.2-1B
  • configs/presets/reasoning/: 2-stage Mixture-of-LoRAs for CoT reasoning

Benchmark presets live under configs/bench/.


Tutorials

EulerForge ships with a numbered tutorial series, available in both English and Korean.

Browse: docs/tutorials/en/ (English), docs/tutorials/ko/ (Korean)

Recommended reading order:

  1. Getting Started: install, strategy selection, CLI quickstart
  2. 00. Data Preprocessing
  3. 01. Dense LoRA
  4. 02. Mixture-of-LoRAs
  5. 03. MoE Expert LoRA
  6. 04. Native MoE Expert LoRA
  7. 05. DPO Training
  8. 06. ORPO Training
  9. 07. Reward Model Training
  10. 08. PPO (RLHF) Training
  11. 09. MoE Stability & Validation
  12. 10. Metrics Monitoring
  13. 11. Inference Benchmark
  14. 12. Grid / Random / Bayesian Search
  15. 13. LLaMA Fine-Tuning
  16. 14. LoRA Handoff
  17. 15. Loading Models
  18. 16. HuggingFace Export
  19. 17. Scratch Pretraining
  20. 18. Training Pipeline (SFT → PPO)
  21. 19. Data Collection for Labs
  22. 20. Lab: Math SFT + DPO Pipeline
  23. 21. Lab: Chain-of-Thought Reasoning Model
  24. 22. Lab: Korean Finance Copilot
  25. 23. Lab: Full MoE Pipeline (SFT → DPO → RM → PPO)

Architecture at a Glance

                YAML preset
                │
                ▼
   ┌──────────────────────────────┐
   │   Config resolve + validate  │  ← preflight + MoE stability checks
   └──────────┬───────────────────┘
              ▼
   ┌──────────────────────────────┐
   │   Base model load (bnb 4/8)  │  ← HuggingFace AutoModel
   └──────────┬───────────────────┘
              ▼
   ┌──────────────────────────────┐
   │   Injection (WHERE / WHAT)   │  ← dense_lora / mixture_lora / MoE expert LoRA
   └──────────┬───────────────────┘
              ▼
   ┌──────────────────────────────┐
   │   Phase scheduler (WHEN)     │  ← router → LoRA → base FFN
   └──────────┬───────────────────┘
              ▼
   ┌──────────────────────────────┐
   │   Training loop              │  ← AdamW + cosine LR + AMP + aux loss
   │   SFT / DPO / ORPO / RM / PPO│
   └──────────┬───────────────────┘
              ▼
   ┌──────────────────────────────┐
   │   Checkpoint + HF export     │  ← final / best / latest
   └──────────────────────────────┘

Key design points:

  • Backbone adapters locate the layers, FFN modules, and attention modules of each model family, so the same injection and training code works across Qwen, Llama, Gemma, Mixtral, …
  • Phase scheduling lets you stage which parameters are trainable over time (router warmup → LoRA → full FFN), making large-model fine-tuning stable and reproducible.
  • Pipeline checkpoints automatically detect a prior EulerForge run and reuse its base model + LoRA config, so SFT → DPO → RM → PPO is one sequence of commands.
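
The phase-scheduling idea described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not EulerForge's scheduler. The `Phase` class, the step boundaries, the substring rules in `group_of`, and the choice to keep earlier groups trainable in later phases are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    start_step: int        # first optimizer step at which the phase is active
    trainable: frozenset   # parameter groups unfrozen during the phase


# Hypothetical schedule: router warmup, then LoRA, then the base FFN as well.
SCHEDULE = [
    Phase(0, frozenset({"router"})),
    Phase(100, frozenset({"router", "lora"})),
    Phase(500, frozenset({"router", "lora", "base_ffn"})),
]


def trainable_groups(step, schedule=SCHEDULE):
    """Return the groups trainable at `step` (the latest phase already started)."""
    active = schedule[0]
    for phase in schedule:
        if step >= phase.start_step:
            active = phase
    return active.trainable


def group_of(param_name):
    """Map a parameter name to a group using simple substring rules (illustrative)."""
    if "router" in param_name:
        return "router"
    if "lora_" in param_name:
        return "lora"
    return "base_ffn"


def requires_grad_flags(param_names, step):
    """Compute the requires_grad flag each parameter would get at `step`."""
    active = trainable_groups(step)
    return {name: group_of(name) in active for name in param_names}


names = ["moe.router.weight", "ffn.up.lora_A", "ffn.up.weight"]
flags_early = requires_grad_flags(names, step=10)
flags_mid = requires_grad_flags(names, step=200)
flags_late = requires_grad_flags(names, step=600)
```

In a real training loop, the resulting flags would be written to each parameter's `requires_grad` attribute whenever a phase boundary is crossed, so that only the router trains at first and the base FFN weights are touched last.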

Testing

# Run the public test suite (no GPU required)
pytest tests/ -x

This covers unit tests, CLI surface tests, validators, backbone adapters, loss functions, schedulers, i18n, and the plugin system. Test fixtures and synthetic data are included.


Contributing

  1. Run the test suite: pytest tests/ -x.
  2. Open a PR with a clear summary of what changed and why.
  3. Please keep CLI output i18n-clean (no hard-coded user-facing strings).

License

Apache License 2.0. See LICENSE.


Contact

EulerWa Inc. 🌐 eulerwa.com 📧 tech@eulerwa.com
