Directive-driven local LLM training, retraining, and export from .dlm documents, codebases, and multimodal sources.
DocumentLanguageModel
.dlm is a trainable local AI document format: typed sections, directives, replay-backed retraining, and export.
DocumentLanguageModel (DLM) is a local-first training, inference, and export toolchain built around authored documents instead of hosted dashboards.
A .dlm can be:
- a hand-written training document with prose, instruction, and preference data
- a directive-driven entrypoint into a codebase or notes tree
- a multi-adapter project with learned routing
- a selected multimodal or audio-language document
DLM trains LoRA / QLoRA / DoRA adapters on real pretrained bases, keeps a replay
history so retrains do not silently forget, and exports local runtimes such as
Ollama, llama-server, vllm, and mlx-serve.
Status: pre-v1.0, but far beyond the original MVP framing. The core
author/train/prompt/export/pack/share loop is real, and newer runtime-target
work is landing incrementally. Current export targets are ollama,
llama-server, vllm, and mlx-serve.
What A .dlm Actually Is
A .dlm is not just “a text file with a special extension.”
It is a trainable project surface with:
- frontmatter for base-model choice, training config, export defaults, sources, cache policy, and multi-adapter gate settings
- typed body sections such as prose, ::instruction::, ::preference::, ::image::, and ::audio::
- adapter routing via fences like ::instruction#knowledge::
- directive-driven ingestion from files and directories through training.sources
- repo-local subtree control through .dlm/training.yaml and .dlm/ignore
- a stable dlm_id that binds the document to a local store under ~/.dlm/store/<dlm_id>/
That combination is what makes DLM more like a local AI authoring format than a single prompt file.
Why DLM
Most “personal AI” tooling still pushes you toward one of two bad choices:
- upload your data to someone else’s cloud
- run an oversized model with weak authoring and retraining ergonomics
DLM sits in the gap:
- The document is the interface. You author the thing you care about instead of wiring together a hidden dataset pipeline.
- Training is real. LoRA / QLoRA / DoRA on pretrained bases, not a toy from-scratch transformer.
- Retraining is additive. Previous document versions flow into a replay corpus so the model does not forget last week’s state by default.
- Everything stays local. Training, inference, store state, exports, and packs all live on your machine unless you explicitly push them somewhere.
- Determinism is a contract. Locks, pinned versions, and golden checks are first-class design constraints, not “best effort.”
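One way to picture replay-backed retraining is a seeded mix of current rows with rows kept from earlier document versions. The sketch below is an assumption-heavy illustration: `build_training_mix`, `replay_ratio`, and the row representation are hypothetical and do not describe DLM's real sampler.

```python
import random


def build_training_mix(current, replay_corpus, replay_ratio=0.5, seed=0):
    """Combine current rows with a seeded sample of older rows (illustrative)."""
    rng = random.Random(seed)  # a fixed seed keeps the mix reproducible
    wanted = int(len(current) * replay_ratio)
    k = min(wanted, len(replay_corpus))
    replayed = rng.sample(replay_corpus, k)
    mix = list(current) + replayed
    rng.shuffle(mix)
    return mix
```

Every current row is always kept; older rows are only added on top, so a retrain sees last week's state alongside today's edits, and the same seed yields the same mix.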
Core Capabilities
- Author structured training data in one place. Mix prose, SFT examples, preferences, image sections, and audio sections in one document.
- Ingest whole trees, not just one file. training.sources can walk a repo, and subtree-local .dlm/training.yaml / .dlm/ignore let the corpus carry its own curation rules.
- Train on modern base families. Text, reasoning-tuned, sparse-MoE, vision-language, and audio-language registry rows ship today, plus hf:org/name escape hatches.
- Compose multiple adapters in one document. Named adapters, weighted export mixes, and learned adapter gates let one .dlm separate knowledge, tone, or persona lanes.
- Mine preference pairs from a live adapter. dlm preference mine can use sway, HF reward models, or external CLI judges to write auto-mined ::preference:: sections back into the document.
- Stay in a local iteration loop. dlm prompt, dlm repl, dlm train --watch, dlm metrics, and dlm doctor are all part of the normal workflow now.
- Export beyond the original Ollama-only story. DLM still does explicit Ollama exports with pinned templates, and now also emits llama-server, vllm, and mlx-serve launch artifacts for local runtime targets.
- Close the eval loop. dlm harvest can pull failing sway-style probe reports back into the document as new training examples.
- Pack and share reproducibly. .dlm.pack, verification, push/pull, and local serve flows are all built around the same store contracts.
Supported Platforms
| Tier | Training | Inference / export |
|---|---|---|
| NVIDIA CUDA (SM ≥ 8.0) | bf16 + QLoRA 4-bit + FlashAttention | Ollama, GGUF export, llama-server, vllm |
| NVIDIA CUDA (SM < 8.0) | fp16 LoRA | Ollama, GGUF export, llama-server, vllm |
| Apple Silicon (MPS) | fp16 or fp32 LoRA depending on doctor plan | Ollama, selected MLX inference paths, GGUF export, vllm (conservative Metal defaults), mlx-serve |
| CPU | inference-first; training refused above small bases unless forced | GGUF export, Ollama, llama-server |
| AMD ROCm | experimental | ROCm-oriented llama.cpp flows |
See docs/hardware and docs/hardware/vl-memory.md for the real support matrix and current caveats.
Install
From the Homebrew tap
brew tap tenseleyFlow/tap
brew install dlm
# Optional, only if you want `--target ollama` registration/smoke:
brew install ollama
brew install dlm pulls in the Python environment and the vendored
llama.cpp source tree DLM uses for GGUF conversion. CUDA users unlock QLoRA
after install:
$(brew --prefix dlm)/libexec/venv/bin/pip install 'dlm[cuda]'
From source
git clone https://github.com/tenseleyFlow/DocumentLanguageModel.git
cd DocumentLanguageModel
uv sync
# Build GGUF tooling:
scripts/bump-llama-cpp.sh build
# If you want the llama.cpp HTTP target too:
scripts/bump-llama-cpp.sh build --with-server
# If you want the Apple Silicon MLX HTTP target:
uv sync --extra mlx
# If you want the vLLM HTTP target:
# install a compatible vllm runtime separately; DLM writes launch artifacts
# but does not bundle the server runtime itself.
uv run dlm --help
We deliberately do not publish to PyPI yet. See CONTRIBUTING.md for the release flow.
30-Second Start
uv run dlm init tutor.dlm --base smollm2-135m
$EDITOR tutor.dlm
uv run dlm train tutor.dlm
uv run dlm prompt tutor.dlm "What is a Python decorator?"
uv run dlm export tutor.dlm --target ollama --name my-tutor
A minimal .dlm still works:
---
dlm_id: 01KPM5CXB51GRX86Q25AKERN6E
dlm_version: 1
base_model: smollm2-135m
---
# Your document title
Write prose here.
::instruction::
### Q
What is a decorator?
### A
A function that takes a function and returns a wrapped function.
That path is still important. It is just no longer the whole story.
Authoring Beyond The Toy Example
A more representative .dlm can mix directives, named adapters, and export
defaults in one place:
---
dlm_id: 01KTESTEXAMPLE000000000000
dlm_version: 1
base_model: qwen3-1.7b
system_prompt: |
  You are a concise engineering assistant.
training:
  adapter: lora
  sequence_len: 4096
  sources_policy: strict
  sources:
    - path: ./src
      include: ["**/*.py", "**/*.md"]
      exclude: ["tests/**", "**/__pycache__/**"]
  adapters:
    knowledge:
      adapter: lora
      lora_r: 8
    tone:
      adapter: lora
      lora_r: 4
  gate:
    enabled: true
export:
  default_quant: Q4_K_M
---
# Project notes
Shared prose trains all declared adapters by default.
::instruction#knowledge::
### Q
What does the cache layer do?
### A
It avoids re-tokenizing unchanged directive-sourced files.
::preference#tone::
### Prompt
Explain a failure mode.
### Chosen
Explain it directly, then give the fix.
### Rejected
Over-explain the background before naming the problem.
Two important upgrades over the older README story:
- training.sources can turn a repo or notes tree into synthetic training sections.
- training.adapters + training.gate let one document route prompts across multiple adapters instead of pretending one flat adapter is the only mode.
If you need deeper subtree-specific curation, drop .dlm/training.yaml and
.dlm/ignore into nested directories and let the corpus carry its own rules.
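The include / exclude lists under training.sources behave like a path filter. The sketch below shows one plausible reading of those globs, where `**/` spans zero or more directories and `*` stays within one path segment. These semantics, and the helpers `glob_to_regex` and `select`, are assumptions for illustration; DLM's own matching rules may differ.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern[str]:
    """Translate a small glob dialect into a regex (assumed semantics)."""
    out, i = [], 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append(r"(?:[^/]+/)*")  # zero or more directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(r".*")           # anything, including "/"
            i += 2
        elif pattern[i] == "*":
            out.append(r"[^/]*")        # anything within one segment
            i += 1
        elif pattern[i] == "?":
            out.append(r"[^/]")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def select(paths, include, exclude=()):
    """Keep paths matching any include glob and no exclude glob."""
    inc = [glob_to_regex(p) for p in include]
    exc = [glob_to_regex(p) for p in exclude]
    return [
        p for p in paths
        if any(r.fullmatch(p) for r in inc)
        and not any(r.fullmatch(p) for r in exc)
    ]
```

Paths are assumed to be POSIX-style and relative to the source root (./src in the example), so `tests/**` drops the whole tests tree while `**/*.py` still picks up top-level files.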
Common Workflows
1. Hand-authored document
uv run dlm init tutor.dlm --base smollm2-135m
uv run dlm train tutor.dlm
uv run dlm prompt tutor.dlm "Explain decorators"
2. Train across a codebase
uv run dlm train ./my-repo --base qwen3-1.7b --include '**/*.py' --name corpus
That auto-scaffolds a .dlm under ./my-repo/.dlm/ and lets the repo become
its own training surface.
3. Multi-adapter composition
uv run dlm prompt mydoc.dlm "Explain the runbook" --adapter knowledge
uv run dlm export mydoc.dlm --adapter-mix knowledge:1.0,tone:0.5
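The --adapter-mix value is a comma-separated list of name:weight pairs. A minimal sketch of reading that shape follows; `parse_adapter_mix` is a hypothetical helper, and the validation rules (no duplicates, weight required) are assumptions rather than documented CLI behavior.

```python
def parse_adapter_mix(spec: str) -> dict[str, float]:
    """Parse 'knowledge:1.0,tone:0.5' into {'knowledge': 1.0, 'tone': 0.5}."""
    mix: dict[str, float] = {}
    for item in spec.split(","):
        name, sep, weight = item.strip().partition(":")
        if not name or not sep:
            raise ValueError(f"expected name:weight, got {item!r}")
        if name in mix:
            raise ValueError(f"adapter listed twice: {name!r}")
        mix[name] = float(weight)  # raises ValueError on a non-numeric weight
    return mix
```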
4. Local iteration loop
uv run dlm train mydoc.dlm --watch
uv run dlm repl mydoc.dlm
uv run dlm metrics mydoc.dlm
5. Export and ship
uv run dlm export mydoc.dlm --target ollama --name mydoc
uv run dlm export mydoc.dlm --target llama-server
uv run dlm export mydoc.dlm --target vllm
uv run dlm export mydoc.dlm --target mlx-serve
uv run dlm pack mydoc.dlm --include-exports
uv run dlm verify mydoc.dlm.pack
On Apple Silicon, --target vllm now emits conservative vllm-metal
defaults in the launch script: it pins the server to the MLX KV path
(VLLM_METAL_USE_PAGED_ATTENTION=0, VLLM_METAL_MEMORY_FRACTION=auto)
and caps --max-model-len to the document's training.sequence_len
instead of blindly asking vllm for the base model's full context.
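The capping rule can be sketched as a small function that produces the environment and arguments a launch script might carry. The environment variable names and the --max-model-len flag come from the description above; `vllm_metal_launch` itself, the `min` against the base context, and the 32768 figure in the usage note are illustrative assumptions.

```python
def vllm_metal_launch(sequence_len: int,
                      base_context_len: int) -> tuple[dict[str, str], list[str]]:
    """Return (env, args) for a conservative vllm-metal launch (illustrative)."""
    env = {
        "VLLM_METAL_USE_PAGED_ATTENTION": "0",
        "VLLM_METAL_MEMORY_FRACTION": "auto",
    }
    # Assumption: never ask for more context than the document trained on,
    # and never more than the base model supports.
    max_len = min(sequence_len, base_context_len)
    args = ["--max-model-len", str(max_len)]
    return env, args
```

For a document with training.sequence_len of 4096 on a base with a hypothetical 32768-token context, this yields `--max-model-len 4096`.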
6. Mine preference pairs and retrain
uv run dlm preference mine mydoc.dlm --samples 4 --max-pairs 8
uv run dlm preference list mydoc.dlm
uv run dlm preference apply mydoc.dlm
uv run dlm train mydoc.dlm --phase preference
# A/B check against hand-authored pairs only:
uv run dlm train mydoc.dlm --phase preference --no-mined
# Use a different judge when bootstrap self-judging is not enough:
uv run dlm preference mine mydoc.dlm --judge hf:YourOrg/reward-model --apply
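Conceptually, mining a preference pair means sampling several answers, scoring them with a judge, and keeping the best and worst as chosen and rejected. The sketch below shows that generic best-versus-worst technique with a toy judge; `mine_pair`, `render_preference`, and `min_margin` are hypothetical names, and DLM's actual mining and staging logic is not shown here.

```python
from typing import Callable, Optional


def mine_pair(prompt: str, candidates: list[str],
              judge: Callable[[str, str], float],
              min_margin: float = 0.0) -> Optional[dict[str, str]]:
    """Pick the best and worst sampled answers as a chosen/rejected pair."""
    if len(candidates) < 2:
        return None
    scored = sorted((judge(prompt, c), c) for c in candidates)
    (low, rejected), (high, chosen) = scored[0], scored[-1]
    if high - low <= min_margin:
        return None  # the judge cannot separate the samples; skip this prompt
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}


def render_preference(pair: dict[str, str], adapter: Optional[str] = None) -> str:
    """Render a pair in the ::preference:: layout used in this README."""
    fence = f"::preference#{adapter}::" if adapter else "::preference::"
    return "\n".join([
        fence,
        "### Prompt", pair["prompt"],
        "### Chosen", pair["chosen"],
        "### Rejected", pair["rejected"],
    ])
```

A margin threshold is one simple guard against writing pairs where the judge barely distinguishes the answers; whether DLM applies such a guard is not stated in this README.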
7. Scaffold multimodal or audio docs
uv run dlm init diagrams.dlm --multimodal --base qwen2-vl-2b-instruct
uv run dlm train diagrams.dlm
uv run dlm prompt diagrams.dlm --image figures/system.png "What is happening here?"
uv run dlm init calls.dlm --audio
uv run dlm train calls.dlm
uv run dlm prompt calls.dlm --audio clips/example.wav "Summarize the clip"
8. Pull eval failures back into training
uv run dlm harvest mydoc.dlm --sway-json sway-report.json --apply
That is the probe-driven loop: evaluation finds a miss, DLM turns it into document-level training data, and the next train closes the gap.
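As a rough sketch of that loop, the function below turns failing probes into ::instruction:: sections using the Q / A layout shown earlier. The report schema (`probes`, `prompt`, `expected`, `passed`) is entirely hypothetical, since the sway report format is not described here, and `harvest_sections` is not DLM's implementation.

```python
import json


def harvest_sections(report_json: str) -> list[str]:
    """Turn failing probes into instruction sections (hypothetical schema)."""
    report = json.loads(report_json)
    sections = []
    for probe in report["probes"]:
        if probe["passed"]:
            continue  # only misses become new training examples
        sections.append("\n".join([
            "::instruction::",
            "### Q", probe["prompt"],
            "### A", probe["expected"],
        ]))
    return sections
```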
9. Inspect store state and reproducibility
uv run dlm doctor
uv run dlm show mydoc.dlm --json
uv run dlm metrics mydoc.dlm --run-id 7 --json
uv run dlm pack mydoc.dlm --include-exports
uv run dlm verify mydoc.dlm.pack
Command Surface
The CLI is broader than the original MVP now. A useful mental map:
| Area | Commands | What they cover |
|---|---|---|
| Author | init, templates, show, migrate, cache | Create docs, inspect them, migrate schema, manage cache state |
| Train | train, doctor, metrics, harvest | Run training, inspect plans, observe runs, pull eval misses back in |
| Align | preference | Mine, stage, apply, revert, and inspect auto-mined preference sections |
| Infer | prompt, repl | Local interactive and one-shot inference |
| Ship | export, pack, unpack, verify, push, pull, serve | Export to runtimes, bundle, verify, and move artifacts |
See the CLI reference for the full flag surface.
Documentation
- Getting started
- Frontmatter reference
- Section grammar
- Preference section reference
- Training across codebases
- Train from a folder
- Multi-source training
- Tokenized-section cache
- Multi-adapter composition
- Learned adapter gate
- Self-improving loop / preference mining
- Reward-model integration
- Multimodal training
- Audio training
- Probe-driven training / sway harvest
- Multi-target export
- Sharing adapters and packs
- CLI reference
- Architecture
- Determinism
Principles
- The document is the interface. But the document is structured: frontmatter, typed sections, directives, and store contracts all matter.
- Training is real. LoRA / QLoRA / DoRA on pretrained bases, not a toy transformer.
- Retraining should not silently forget. Replay-backed accumulation is part of the product.
- Local-first is load-bearing. Your training data, adapters, exports, and packs stay on your machine unless you explicitly move them.
- Determinism is a contract. If a change breaks the reproducibility story, that is a product regression.
Tech Stack
Python 3.11+ · PyTorch · HuggingFace transformers / peft / trl /
accelerate / datasets · watchfiles · prompt-toolkit · safetensors ·
vendored llama.cpp for GGUF export · Ollama (optional runtime target) ·
Typer · Pydantic · uv
Contributing
See CONTRIBUTING.md. Testing conventions live in docs-internal/README-testing.md.
uv run pre-commit install
License
MIT. Base-model licenses are separate and enforced where DLM needs them:
dlm init, dlm train, dlm export, and dlm pack all keep the gated-base
acceptance path explicit.