YAML-driven modular LLM assembler with Hugging Face compatibility
EulerStack
A YAML-driven modular LLM assembler with Hugging Face compatibility.
Language: English · 한국어
EulerStack lets you describe a transformer-family architecture as a YAML spec, validate it against a strict schema, estimate parameters, and compile it to either a JSON runtime config or a standard Hugging Face model directory that you can immediately plug into transformers, PEFT, vLLM, or any downstream training framework.
It is an architecture assembly tool, not a training framework. EulerStack stops at a clean, randomly-initialised, structurally-valid model. From there you bring your own data and your favourite trainer.
Why EulerStack?
Want to try DeepSeek-V3's MLA attention on top of your Llama baseline?
- Code path: fork modeling_llama.py, rewrite LlamaAttention, patch the KV-cache, and re-map the state-dict. That is roughly 200-300 lines of diff, and the intent ("try MLA") is one line buried in hundreds.
- EulerStack path:
    - attention: { qkv_bias: false }
    + attention: { qkv_bias: false, latent_dim: 384 }
  One line.
That ratio of idea to mechanical plumbing is the whole pitch:
- Changes are tiny. Swap attention for Mamba, add MoE to every 4th layer, enable 2-phase reasoning: each is 1-5 YAML lines, not a refactor.
- The diff is the design decision. Two months later, you still know what you changed and why. modeling_custom.py diffs lose that intent inside the plumbing.
- You can discuss it like a blueprint. Reviewers read the spec instead of spelunking through PyTorch. Architecture debates happen on a document, not in code comments.
- Lintable before any GPU. Parameter counts, head-dim sanity, and KV-cache budgets are all checked before training starts.
- Output is vanilla Hugging Face. It plugs into transformers, PEFT, vLLM, etc. No lock-in, no custom runtime.
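To make the "lintable before any GPU" point concrete, here is a minimal sketch of two such checks: head-dimension sanity and a KV-cache budget. The function names are hypothetical and the formula assumes plain multi-head attention with one K and one V tensor per layer and no cache compression; it is not taken from EulerStack's validator.

```python
def head_dim(d_model: int, n_heads: int) -> int:
    """Return the per-head dimension, or raise if d_model does not split evenly."""
    if d_model % n_heads != 0:
        raise ValueError(f"d_model={d_model} is not divisible by n_heads={n_heads}")
    return d_model // n_heads


def kv_cache_bytes(n_layers: int, n_kv_heads: int, dim_per_head: int,
                   seq_len: int, bytes_per_elem: int = 2, batch: int = 1) -> int:
    """Bytes needed to cache K and V for every layer at full sequence length."""
    # The factor 2 accounts for one K tensor plus one V tensor per layer.
    return 2 * n_layers * n_kv_heads * dim_per_head * seq_len * bytes_per_elem * batch


if __name__ == "__main__":
    hd = head_dim(d_model=2048, n_heads=16)
    budget = kv_cache_bytes(n_layers=24, n_kv_heads=16, dim_per_head=hd, seq_len=4096)
    # bytes_per_elem=2 corresponds to a 16-bit dtype such as bfloat16.
    print(f"head_dim={hd}, KV cache per sequence = {budget / 2**20:.0f} MiB")
```

Both numbers depend only on values that already live in a spec, which is why this kind of check can run long before a training job is scheduled.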
Installation
Requires Python 3.10+.
From PyPI (recommended):
pip install eulerstack
From source (for development or the latest main):
git clone https://github.com/<your-org>/eulerstack.git
cd eulerstack
pip install -e .
Either way, the eulerstack CLI is installed on your PATH.
Core runtime dependencies: torch >= 2.1, transformers >= 4.40, pyyaml, click.
Quickstart
The CLI speaks five languages (ko / en / zh / ja / es). The default is Korean; pass --lang en or set EULERSTACK_LANG=en for English.
# See the bundled presets
eulerstack --lang en presets list
# Validate a spec (schema check only)
eulerstack --lang en validate --preset configs/presets/llm_2b_simple.yml
# Validate with a full realism report (param estimates, sanity checks, warnings)
eulerstack --lang en validate --preset my_model.yml --report
# Explain what the spec describes in human-readable form
eulerstack --lang en explain --preset configs/presets/arch_beginner_llama.yml
# Compile to a runtime JSON config
eulerstack --lang en compile --preset my_model.yml --output compiled.json
# Compile to a Hugging Face model directory (config.json + model.safetensors)
eulerstack --lang en compile --preset my_model.yml --output-dir ./my_model_hf
The --output-dir form writes a directory that loads directly with transformers:
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("./my_model_hf")
tokenizer = AutoTokenizer.from_pretrained("gpt2") # per the spec's tokenizer_contract
Weights are randomly initialised. Training is explicitly out of scope; see Where EulerStack Fits below.
What a Spec Looks Like
A minimal decoder-only model:
schema_version: 1
model:
  name: "my-llm"
  d_model: 2048
  vocab_size: 32000
  max_seq_len: 4096
  n_heads: 16
  mlp_ratio: 4
  dtype: bfloat16
tokenizer_contract:
  type: hf
  pretrained: gpt2
embedding:
  type: learned
  positional: rope
  rope_theta: 500000.0
  tie_word_embeddings: true
layer_templates:
  decoder:
    mixer:
      type: attention   # or: mamba, retnet, hyena, linear_attention, ...
      attention: {}
    ffn:
      type: gated_mlp   # or: moe, mlp
      activation: swiglu
    norm:
      type: rmsnorm
      position: pre
layer_schedule:
  - template: decoder
    repeat: 24
head:
  type: causal_lm
  tie_weights: true
Hybrid and MoE models are expressed the same way: you define multiple layer_templates and arrange them in layer_schedule. See the configs/presets/ directory for working examples, including attention-free models, MoE-every-N-layers, and mixed-mixer stacks.
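For a feel of what a parameter estimate involves, the sketch below counts parameters by hand for the minimal spec above. It is a back-of-the-envelope calculation, not EulerStack's estimator: the function name is hypothetical, and it assumes bias-free projections, full multi-head attention, two RMSNorm weight vectors per layer plus a final norm, and an FFN hidden width of mlp_ratio * d_model.

```python
def estimate_params(d_model: int, vocab_size: int, n_layers: int,
                    mlp_ratio: int, tie_embeddings: bool = True) -> int:
    """Rough parameter count for a bias-free decoder with gated MLP and RMSNorm."""
    embed = vocab_size * d_model
    attn = 4 * d_model * d_model      # Q, K, V and output projections
    d_ff = mlp_ratio * d_model        # assumption: hidden width = mlp_ratio * d_model
    ffn = 3 * d_model * d_ff          # gate, up and down projections of a gated MLP
    norms = 2 * d_model               # two RMSNorm weight vectors per layer
    per_layer = attn + ffn + norms
    final_norm = d_model
    # With tied embeddings the LM head reuses the embedding matrix.
    lm_head = 0 if tie_embeddings else vocab_size * d_model
    return embed + n_layers * per_layer + final_norm + lm_head


if __name__ == "__main__":
    total = estimate_params(d_model=2048, vocab_size=32000, n_layers=24, mlp_ratio=4)
    print(f"{total:,} parameters (~{total / 1e9:.2f}B)")
```

Under these assumptions the spec comes out at roughly 1.68B parameters. RoPE contributes nothing because it has no learned weights.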
Architecture
EulerStack is a five-layer pipeline; each layer has one job.
YAML spec (DSL)
     │
     ▼
┌──────────┐   validate: schema v1, cross-field checks, realism warnings
│  Schema  │
└──────────┘
     │
     ▼
┌──────────┐   normalize_to_ir: typed, canonical in-memory representation
│    IR    │
└──────────┘
     │
     ▼
┌──────────┐   compile_ir: materialise layer list, param count, runtime config
│ Compiler │
└──────────┘
     │
     ├──► JSON runtime config
     │
     └──► Hugging Face model directory (PreTrainedModel + safetensors)
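The "materialise layer list" step can be pictured as flattening layer_schedule into one template name per layer. The sketch below shows that idea on plain dictionaries; expand_schedule is a hypothetical helper, the template names in the usage example are invented, and defaulting repeat to 1 is an assumption rather than documented schema behaviour.

```python
def expand_schedule(layer_schedule: list[dict]) -> list[str]:
    """Flatten a list of {template, repeat} entries into one template name per layer."""
    layers: list[str] = []
    for entry in layer_schedule:
        repeat = entry.get("repeat", 1)  # assumption: a missing repeat means one layer
        if repeat < 1:
            raise ValueError(f"repeat must be >= 1, got {repeat}")
        layers.extend([entry["template"]] * repeat)
    return layers


if __name__ == "__main__":
    # Hypothetical hybrid stack: three Mamba layers followed by one attention layer, twice.
    group = [
        {"template": "mamba_block", "repeat": 3},
        {"template": "attention_block", "repeat": 1},
    ]
    for index, name in enumerate(expand_schedule(group * 2)):
        print(index, name)
```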
A few details worth knowing:
- Schema v1 is versioned and strict. Unknown keys are errors, with one exception: the reserved prefixes experimental.*, future.*, and vendor.*.* are accepted as warnings so plugins and in-progress research can coexist.
- Mixer types are pluggable: attention (with GQA / sliding-window / RoPE / ALiBi), Mamba / Mamba2, RetNet, Hyena, linear attention, and more. Adding a new mixer means implementing one block class and registering it, with no changes to the schema or compiler.
- FFN types include dense MLP, gated MLP (SwiGLU / GeGLU), and MoE (top-k routing, capacity factor, router z-loss).
- Outputs are vanilla Hugging Face. There is no EulerStack runtime: the exported directory is indistinguishable from any AutoModelForCausalLM.from_pretrained() target, so all the standard ecosystem tooling (PEFT, LoRA, bitsandbytes, accelerate, DeepSpeed, vLLM, SGLang, llama.cpp converters where applicable) just works.
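The strict-schema rule (unknown keys are errors, reserved prefixes are warnings) can be sketched as a small key check. This is an illustration only: check_keys is a hypothetical function, and representing keys as dotted strings is an assumption about how the reserved prefixes are matched, not a description of EulerStack's validator.

```python
# Prefixes named in the text; how they are matched internally is an assumption.
RESERVED_PREFIXES = ("experimental.", "future.", "vendor.")


def check_keys(spec_keys: list[str], known_keys: set[str]) -> tuple[list[str], list[str]]:
    """Split unrecognised keys into hard errors and reserved-prefix warnings."""
    errors: list[str] = []
    warnings: list[str] = []
    for key in spec_keys:
        if key in known_keys:
            continue
        if key.startswith(RESERVED_PREFIXES):
            warnings.append(key)   # tolerated so plugins and research keys can coexist
        else:
            errors.append(key)     # strict mode: anything else is rejected
    return errors, warnings


if __name__ == "__main__":
    known = {"schema_version", "model", "layer_templates", "layer_schedule", "head"}
    keys = ["schema_version", "model", "experimental.new_mixer", "layer_schedul"]
    errors, warnings = check_keys(keys, known)
    print("errors:", errors)
    print("warnings:", warnings)
```

Rejecting the misspelled layer_schedul outright, rather than silently ignoring it, is the practical benefit of a strict schema.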
Presets
configs/presets/ ships with 52 ready-to-compile specs, organised as a
three-tier progression from industry-standard canon to v1's new architecture features.
Tier 1: Validated industrial
Production-grade baselines. Training recipes are well-studied; failure modes are known.
- arch_beginner_gpt2, arch_beginner_llama: classic Transformer and Llama-2/3 style
- arch_intermediate_mistral, arch_intermediate_gemma2, arch_intermediate_qwen_longctx: modern attention patterns
- llm_0p1b_{simple,mistral}: Stage-1 / CPT warm-up (sovereign-foundation pilot)
- llm_*_simple, llm_*_mistral across 0.8B / 2B / 4B / 16B
Tier 2: Recent / complex (hybrid, MoE, KV-compressed)
Modern research consensus running in production systems.
- arch_advanced_{jamba, samba, retnet}: hybrid and attention-free lines
- arch_advanced_mla: MLA (DeepSeek-V3 2024, runtime Core)
- arch_advanced_mod: Mixture-of-Depths (Raposo ICML 2024, runtime Component)
- arch_expert_* (9 presets, some speculative): MoE × mixer × depth explorations
- arch_expert_*_mini (6 small-scale experts): ablation-ready for single-GPU
- llm_*_jamba, llm_*_moe, llm_*_mla across 0.1B / 0.8B / 2B / 4B / 16B (MoE skipped at 0.1B)
Tier 3: v1 experimental (new advanced architecture features at arch-scale)
Three arch_expert_* presets (~1.2-1.4B) that each showcase one of the advanced architecture features. Schema-complete; runtime partial: the full spec round-trips via config.v1_extensions.
| Preset | Feature | Research basis |
|---|---|---|
| arch_expert_reasoning_r1 | execution_modes + transition (2-phase think/answer) | DeepSeek-R1 (2025), Quiet-STaR (NeurIPS 2024) |
| arch_expert_titans_memory | template.memory (parametric + test-time update) | Titans (Google 2024-2025) |
| arch_expert_dual_stream | parallel: monoidal schedule (Mamba ∥ Attention) | Jamba × PaLM generalization |
Presets are starting points, not the ceiling. EulerStack can assemble models of essentially any size; the schema has no size cap.
Where EulerStack Fits (End-to-End Pipeline)
EulerStack is deliberately a narrow tool: it produces a well-formed, randomly-initialised Hugging Face model. A realistic LLM pipeline looks like this, and EulerStack owns only the first box.
┌──────────────┐    ┌──────────────┐    ┌───────────────┐    ┌────────────┐    ┌────────┐
│  EulerStack  │ -> │   Pretrain   │ -> │ Post-training │ -> │  Evaluate  │ -> │ Serve  │
│ (this tool)  │    │ your choice  │    │ SFT / DPO /   │    │ your suite │    │ your   │
│              │    │ of trainer   │    │ RLHF / etc.   │    │            │    │ stack  │
└──────────────┘    └──────────────┘    └───────────────┘    └────────────┘    └────────┘
        ^
   YAML spec in,
   HF model out
Because the output is a standard PreTrainedModel, you can pair EulerStack with any training stack you already trust:
- Pretraining / continued pretraining: Megatron-LM, NeMo, TorchTitan, Hugging Face Trainer, Composer, Levanter, GPT-NeoX.
- Fine-tuning (full / LoRA / QLoRA): PEFT, TRL, Axolotl, Unsloth, LLaMA-Factory.
- Alignment: TRL (DPO / PPO / KTO), OpenRLHF.
- Serving: vLLM, SGLang, TGI, TensorRT-LLM, or any engine that loads transformers checkpoints.
This scope separation is intentional. Training is a fast-moving space with strong, well-maintained tools; EulerStack does not try to re-implement any of them. What it does do is give you a stable, reviewable, reproducible starting point so that every downstream step operates on an architecture whose shape is explicit and auditable.
A typical workflow:
# 1. Design and validate an architecture
eulerstack --lang en validate --preset my_model.yml --report
# 2. Export a Hugging Face model directory (random weights)
eulerstack --lang en compile --preset my_model.yml --output-dir ./my_model_hf
# 3. Hand off to your training stack of choice, e.g. with transformers Trainer:
# model = AutoModelForCausalLM.from_pretrained("./my_model_hf")
# trainer = Trainer(model=model, train_dataset=..., ...)
# trainer.train()
Project Layout
eulerstack/
├── eulerstack/              # Python package
│   ├── spec/                # Schema, validation, parameter estimation, reports
│   ├── ir/                  # Typed intermediate representation + normalizer
│   ├── compiler/            # IR -> runtime config / HF model directory
│   ├── components/          # Attention, Mamba, RetNet, Hyena, MoE, norms, ...
│   ├── blocks/              # Layer templates composed from components
│   ├── assembler/           # Layer-schedule materialisation
│   ├── hf/                  # Hugging Face export (config.json, safetensors)
│   ├── cli/                 # `eulerstack` command
│   └── i18n/                # 5-language CLI message catalog
├── configs/presets/         # 52 ready-to-compile YAML specs
├── examples/                # Runnable scripts (compile → export → load → generate)
├── tests/                   # Unit + smoke tests
└── pyproject.toml
Tutorials
Full, searchable online tutorials are published at:
eulerwa.com/en/products/eulerstack/tutorials/
The offline copy under docs/tutorials/en/ mirrors the site and is the best place to start if you prefer to read locally. Key entry points:
- Tutorial 0 (Where EulerStack Fits) explains why EulerStack is an Architecture Description Language (ADL) for LLMs, not a training framework.
- Tutorial 2 (Use Presets) walks through the 52 shipped presets organised in three tiers.
- Tutorial 10 (Paper → YAML) ports DeepSeek-V3 / Jamba / DeepSeek-R1 / Titans into YAML through a professor/student dialogue.
Examples
See examples/ for runnable scripts:
- 01_compile_and_export.py: compile a preset and save as an HF model directory.
- 02_load_and_generate.py: load the exported model with transformers and generate.
- 03_architecture_evolution.py: compare several architectures side by side.
Testing
python -m pytest tests/ -v
The unit suite covers schema validation, IR normalisation, compilation, parameter estimation, report generation, and CLI behaviour for every bundled preset.
Contributing
Contributions are welcome. Please open an issue to discuss substantial changes (new mixer types, schema changes, new presets) before sending a PR. For small fixes or clarifications, a PR is fine on its own.
When adding a new component (e.g. a new mixer), the rough checklist is:
- Implement the block under eulerstack/components/ or eulerstack/blocks/.
- Register it so the schema accepts it.
- Add a minimal preset in configs/presets/ that exercises it.
- Add tests alongside the existing suite in tests/.
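The "implement, then register" steps in the checklist follow a common registry pattern, sketched below in plain Python. Every name here (MIXER_REGISTRY, register_mixer, build_mixer, IdentityMixer) is hypothetical and chosen for illustration; EulerStack's actual registration API may look different, so check the existing components before copying this shape.

```python
MIXER_REGISTRY: dict[str, type] = {}


def register_mixer(name: str):
    """Class decorator that records a mixer class under a spec-facing name."""
    def decorator(cls: type) -> type:
        if name in MIXER_REGISTRY:
            raise ValueError(f"mixer '{name}' is already registered")
        MIXER_REGISTRY[name] = cls
        return cls
    return decorator


def build_mixer(name: str, **kwargs):
    """Instantiate a registered mixer, failing loudly on unknown names."""
    if name not in MIXER_REGISTRY:
        raise KeyError(f"unknown mixer type '{name}'; known: {sorted(MIXER_REGISTRY)}")
    return MIXER_REGISTRY[name](**kwargs)


@register_mixer("identity")
class IdentityMixer:
    """Toy mixer that returns its input unchanged (a stand-in for a real block)."""

    def __init__(self, d_model: int):
        self.d_model = d_model

    def __call__(self, x):
        return x


if __name__ == "__main__":
    mixer = build_mixer("identity", d_model=2048)
    print(type(mixer).__name__, mixer([1, 2, 3]))
```

A lookup table like this is what lets a spec refer to a mixer by a short string while the class itself lives wherever the component code does.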
License
Licensed under the Apache License, Version 2.0. See LICENSE for the full text.
Copyright © 2026 Eulerwa Inc.
Contact
Eulerwa Inc.
- Website: eulerwa.com
- Tutorials: eulerwa.com/en/products/eulerstack/tutorials/
- Tech contact: tech@eulerwa.com