DAG-guided discrete diffusion language models for reasoning

Maintainers

BDeMo

Project description

dLLM-Reason

DAG-Guided Discrete Diffusion Language Models for Reasoning

Overview

dLLM-Reason is a research framework that enhances reasoning in discrete diffusion language models (dLLMs) by controlling the token unmasking order via DAG (Directed Acyclic Graph) topological structures.

Core idea: dLLMs generate text by iteratively unmasking tokens. We impose a DAG on unmasking order — edges encode reasoning dependencies — so prerequisite steps are generated before downstream conclusions.

Model Layer          Scheduler Layer          DAG Layer
(what to predict) <-> (where to unmask)  <-> (dependency structure)
MDLM|SEDD|D3PM|LLaDA  13 schedulers          TokenDAG / SpanDAG
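
To make the core idea concrete, the sketch below orders a toy set of reasoning steps with the standard-library `graphlib` module. The step names and the `deps` mapping are hypothetical and are not part of dLLM-Reason; the only point is that any topological order of the DAG places prerequisites before the conclusions that depend on them.

```python
from graphlib import TopologicalSorter

# Hypothetical reasoning steps; each key maps to the steps it depends on.
deps = {
    "premise_a": set(),
    "premise_b": set(),
    "intermediate": {"premise_a", "premise_b"},
    "conclusion": {"intermediate"},
}

# static_order() yields nodes so that every dependency precedes its dependents.
unmask_order = list(TopologicalSorter(deps).static_order())
print(unmask_order)
```

A real scheduler works on token positions rather than named steps, but the ordering constraint is the same.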

Installation

# From PyPI
pip install dllm-reason

# From GitHub (latest dev)
pip install "git+https://github.com/BDeMo/dLLM_Reason.git"

# With all extras
pip install "dllm-reason[dev,library,serve]"

# Editable (development)
git clone https://github.com/BDeMo/dLLM_Reason.git
cd dLLM_Reason
pip install -e ".[dev,library,serve]"

Optional Extras

Extra	Packages	Purpose
`dev`	pytest, pytest-cov, ruff	Testing and linting
`library`	faiss-cpu, sentence-transformers, scikit-learn	DAG Library retrieval
`serve`	fastapi, uvicorn, bitsandbytes	REST API serving + quantization

CLI Commands

After installation, the following commands are available globally:

Command	Equivalent	Description
`dllm-eval-dags`	`python scripts/eval_dags.py`	Multi-strategy x multi-benchmark evaluation
`dllm-serve`	`python scripts/serve.py`	REST API server with hot-switching strategies
`dllm-train`	`python scripts/train.py`	Model training
`dllm-eval`	`python scripts/evaluate.py`	Single-model evaluation
`dllm-search`	`python scripts/search_dag.py`	DAG structure search
`dllm-viz`	`python scripts/visualize_dag.py`	DAG visualization
`dllm-webui`	`python scripts/webui.py`	Interactive Web UI dashboard

Quick Start

# 1. Download model & datasets
python scripts/download_models.py              # -> checkpoints/llada-instruct/
python scripts/download_datasets.py            # -> datasets/

# China HuggingFace mirror
python scripts/download_models.py --mirror https://hf-mirror.com
python scripts/download_datasets.py --mirror https://hf-mirror.com

# 2. Smoke test (5 samples, confidence strategy)
dllm-eval-dags --dags confidence --benchmarks mbpp --num_samples 5

# 3. Full comparison (all 13 strategies x all 10 benchmarks)
bash scripts/runs/full_comparison.sh

# 4. Start REST API server
dllm-serve --model_id checkpoints/llada-instruct --quantize 4bit

Usage

All parameters live in configs/eval_default.yaml. CLI flags always override the config.
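
The precedence rule can be pictured with a minimal sketch like the one below. It is not the project's own loader: the `merge_overrides` helper is hypothetical, and the dictionary simply mirrors a few values from the `inference` section of the config file shown later in this README.

```python
import argparse

# Values mirroring part of the inference section of configs/eval_default.yaml.
config = {"num_steps": 128, "block_length": 32, "temperature": 0.0}


def merge_overrides(config, argv):
    """Return a copy of config in which explicitly passed CLI flags win."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--num_steps", type=int, default=None)
    parser.add_argument("--block_length", type=int, default=None)
    parser.add_argument("--temperature", type=float, default=None)
    args = parser.parse_args(argv)
    merged = dict(config)
    for key, value in vars(args).items():
        if value is not None:  # None means the flag was not given
            merged[key] = value
    return merged


merged = merge_overrides(config, ["--num_steps", "64"])
print(merged)
```

Only `num_steps` changes here; the other keys keep their config-file values.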

Evaluation

# Default run (LLaDA + confidence)
bash scripts/run_eval.sh

# Direct CLI — pick strategies and benchmarks
dllm-eval-dags \
    --dags confidence entropy adaptive_dynamic cot \
    --benchmarks gsm8k math mbpp humaneval \
    --num_steps 128 --num_samples 100 \
    --save_outputs \
    --output_dir results/my_run

# Per-strategy convenience scripts
bash scripts/runs/confidence.sh       # highest-confidence first (LLaDA default)
bash scripts/runs/random.sh           # uniform random
bash scripts/runs/entropy.sh          # lowest-entropy first
bash scripts/runs/semi_ar.sh          # block-by-block L->R
bash scripts/runs/linear.sh           # strict left-to-right
bash scripts/runs/cot.sh              # Chain-of-Thought DAG
bash scripts/runs/skeleton.sh         # skeleton-then-detail DAG
bash scripts/runs/bidirectional.sh    # both ends toward center
bash scripts/runs/answer_first.sh     # answer region first
bash scripts/runs/all_strategies.sh   # all 13 strategies in one run
bash scripts/runs/full_comparison.sh  # 13 strategies x 10 benchmarks

# All scripts pass extra args through:
bash scripts/runs/cot.sh --benchmarks mbpp humaneval --num_samples 100 --cot_steps 6

Single-Prompt Inference

python scripts/infer_llada.py \
    --model_id checkpoints/llada-instruct \
    --prompt "What is 7 * 8?" \
    --num_steps 128 --block_length 32 --temperature 0.0

REST API Serving

# Install serving extras
pip install "dllm-reason[serve]"

# Start server (bfloat16 / port 8000)
dllm-serve --model_id checkpoints/llada-instruct

# With 4-bit quantization (~5 GB VRAM)
dllm-serve --model_id checkpoints/llada-instruct --quantize 4bit

# Generate with any strategy — hot-switchable, no model reload
curl -X POST http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "What is 7*8?", "strategy": "adaptive_dynamic", "max_new_tokens": 256}'

# List strategies
curl http://localhost:8000/strategies

# Health check
curl http://localhost:8000/health
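
The same `/generate` call can be made from Python with only the standard library. The sketch below builds the request shown in the `curl` example; the `build_generate_request` helper is hypothetical, and the response schema is not documented here, so the sending step is left commented out.

```python
import json
import urllib.request


def build_generate_request(prompt, strategy, max_new_tokens=256,
                           base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for the /generate endpoint."""
    payload = {
        "prompt": prompt,
        "strategy": strategy,
        "max_new_tokens": max_new_tokens,
    }
    return urllib.request.Request(
        url=f"{base_url}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_generate_request("What is 7*8?", "adaptive_dynamic")

# With a server running, the request could be sent like this:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```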

Save Per-Sample Outputs

dllm-eval-dags --dags confidence cot adaptive_dynamic \
    --save_outputs \
    --output_dir results/detailed

# Output files per (benchmark, strategy):
#   {bench}_{dag}_samples.json   — prompt, generated, ground truth, pass/fail
#   {bench}_{dag}_samples.xlsx   — same in spreadsheet format
#   {bench}_{dag}_trajectory.json — per-step unmasking states (with --record_trajectory)

Web UI (Interactive Dashboard)

# Install serving extras
pip install "dllm-reason[serve]"

# Launch Web UI (loads model + opens browser dashboard)
dllm-webui --model_id checkpoints/llada-instruct --port 7860

# With quantization
dllm-webui --model_id checkpoints/llada-instruct --quantize 4bit

# Results viewer only (no model, no GPU needed)
dllm-webui --no_model --port 7860

Features:

Generate: Interactive text generation with strategy selector
Compare: Side-by-side comparison of multiple strategies on the same prompt
Trajectory: Visualize step-by-step unmasking progression
Results: Browse and compare benchmark results (reads results/ directory)

LaTeX Table Generation

# Generate publication-ready comparison table from results
python scripts/generate_latex_table.py results/summary.json --output paper_table.tex

Config File

# configs/eval_default.yaml
model:
  model_id: "checkpoints/llada-instruct"
  torch_dtype: "bfloat16"

inference:
  num_steps: 128
  block_length: 32
  temperature: 0.0
  cfg_scale: 0.0
  remasking: "low_confidence"
  max_new_tokens: 128

benchmarks:
  benchmarks: ["mbpp", "humaneval"]
  num_samples: null       # null = full dataset

dags:
  dags: ["confidence"]
  # choices: confidence | random | entropy | semi_ar | maskgit_cosine
  #          | critical_token_first | curriculum | linear | cot
  #          | skeleton | bidirectional | answer_first | adaptive_dynamic

13 Unmasking Strategies

Strategy	Type	Description
`confidence`	Flat	Unmask highest model confidence first (LLaDA default)
`random`	Flat	Uniform random unmasking
`entropy`	Flat	Lowest-entropy (most certain by distribution) first
`semi_ar`	Flat	Block-by-block left-to-right, confidence within block
`maskgit_cosine`	Flat	MaskGIT cosine schedule: more tokens early, fewer later
`critical_token_first`	Flat	Highest KL divergence from uniform (most influential) first
`curriculum`	Flat	Easy tokens first (high confidence + low entropy)
`linear`	Flat	Strict left-to-right sequential
`cot`	DAG	Chain-of-Thought: reasoning segments before answer
`skeleton`	DAG	Structural tokens first, then fill details
`bidirectional`	DAG	Both ends toward center
`answer_first`	DAG	Answer region unmasked before reasoning
`adaptive_dynamic`	Dynamic	Dynamic soft DAG — constructs pairwise influence graph at runtime (ours)
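
The difference between `confidence` and `entropy` is easiest to see on a toy example. The sketch below is an illustration only, not the library's scheduler code: the `probs` table is made up, confidence is taken to be the maximum probability at a position, and entropy is the Shannon entropy of the predicted distribution.

```python
import math

# Hypothetical predicted distributions at three masked positions
# (toy vocabulary of three tokens).
probs = {
    0: [0.70, 0.15, 0.15],
    1: [0.60, 0.40, 0.00],
    2: [0.34, 0.33, 0.33],
}


def confidence(p):
    """Probability of the most likely token."""
    return max(p)


def entropy(p):
    """Shannon entropy in nats; zero-probability tokens contribute nothing."""
    return -sum(x * math.log(x) for x in p if x > 0)


# Highest confidence first versus lowest entropy first.
by_confidence = sorted(probs, key=lambda i: confidence(probs[i]), reverse=True)
by_entropy = sorted(probs, key=lambda i: entropy(probs[i]))
print(by_confidence, by_entropy)
```

Position 0 has the largest top-1 probability, but position 1 has the more concentrated distribution overall, so the two strategies pick different positions first.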

10 Benchmarks

Benchmark	Type	Metric	Dataset
`mbpp`	Code generation	pass@1	Google MBPP (Python)
`humaneval`	Code generation	pass@1	OpenAI HumanEval (Python)
`gsm8k`	Math reasoning	exact match	Grade school math
`math`	Competition math	exact match	MATH (extracts `\boxed{}`)
`mmlu`	Knowledge	accuracy	57-subject multitask
`hotpotqa`	Multi-hop QA	EM / F1	Multi-hop reasoning
`arc`	Science reasoning	accuracy	ARC-Challenge
`prontoqa`	Logic reasoning	accuracy	Formal logic
`gpqa`	PhD-level science	accuracy	GPQA Diamond subset
`aime`	Competition math	accuracy	AMC/AIME (integer 000-999)
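
For the `math` benchmark, the answer is read from a `\boxed{}` expression. A minimal sketch of such an extraction step, assuming the last `\boxed{...}` in the output is the final answer, might look as follows. The `extract_boxed` helper is hypothetical, and the library's own extraction and answer normalization may differ.

```python
def extract_boxed(text):
    """Return the contents of the last \\boxed{...} in text, or None."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    i = start + len(marker)
    depth = 1  # we are inside the opening brace of \boxed{
    out = []
    while i < len(text):
        ch = text[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(out)
        out.append(ch)
        i += 1
    return None  # unbalanced braces


generated = r"So the probability is \boxed{\frac{1}{2}}."
answer = extract_boxed(generated)
print(answer)
```

Counting brace depth, rather than using a simple regular expression, keeps nested expressions such as `\frac{1}{2}` intact.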

Project Structure

src/dllm_reason/
  models/          MDLM, SEDD, D3PM, LLaDA (4 dLLMs)
  graph/           TokenDAG, SpanDAG, 6 templates, constraints, visualization
  scheduler/       13 unmasking strategies (8 flat + 4 DAG + 1 adaptive dynamic)
  search/          Evolutionary, Greedy, RL Policy, NOTEARS, E2E DAG, NAS (6 search methods)
  inference/       DiffusionSampler (auto-pad, early-stop), DAGSampler
  training/        Pretrain, DAG-aware, Fine-tune, Diffu-GRPO
  eval/            10 benchmark evaluators, metrics, DAG analysis
  library/         DAG Library (store, retrieval, fusion, feedback, merge)
  data/            Dataset loaders (GSM8K, MATH, ARC, ProntoQA, ...)
  utils/           Registry, logging, distributed

configs/           31 YAML configs (model, graph, search, task, eval, experiment, library)
scripts/           8 Python scripts + 16 shell run scripts
tests/             DAG, schedulers, models, library (4 test suites)
notebooks/         DAG exploration, results analysis
docs/              Version history, API reference, deployment guide, tutorial, references

Key Components

TokenDAG

A TokenDAG stores a Boolean adjacency matrix on the GPU. An edge (i, j) means "position i must be unmasked before position j".

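from dllm_reason.graph.dag import TokenDAG

# Build a chain-shaped DAG over 256 token positions.
dag = TokenDAG.linear_chain(seq_len=256)
# is_unmasked: Boolean tensor marking positions that are already unmasked
# (defined by the caller).
ready = dag.ready_positions(is_unmasked)  # one batched GPU op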

6 templates: Chain-of-Thought, Answer-First, Skeleton-Detail, Bidirectional, Interleaved, Random.
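
The readiness rule behind `ready_positions` can be illustrated in plain Python. The sketch below is not the library's implementation (which runs as a tensor operation): the two helper functions are stand-ins, and the chain is assumed to link each position i to position i + 1.

```python
def linear_chain(seq_len):
    """adj[i][j] is True when position i must be unmasked before position j."""
    return [[j == i + 1 for j in range(seq_len)] for i in range(seq_len)]


def ready_positions(adj, is_unmasked):
    """Masked positions whose predecessors are all unmasked."""
    n = len(adj)
    ready = []
    for j in range(n):
        if is_unmasked[j]:
            continue  # already unmasked, nothing left to schedule
        if all(is_unmasked[i] for i in range(n) if adj[i][j]):
            ready.append(j)
    return ready


adj = linear_chain(4)
print(ready_positions(adj, [False, False, False, False]))
print(ready_positions(adj, [True, True, False, False]))
```

With nothing unmasked, only position 0 is ready; once positions 0 and 1 are unmasked, position 2 becomes ready.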

SpanDAG

Coarse-grained DAG over token spans. Working at span level reduces the number of candidate edges by a factor of span_size^2.

from dllm_reason.graph.span_dag import SpanDAG

sdag = SpanDAG.cot(num_spans=8, span_size=32, num_reasoning_steps=4)
token_dag = sdag.to_token_dag()  # expand for scheduler
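
The sketch below shows one plausible way a span-level edge could expand to token-level edges, and where the span_size^2 factor comes from. It assumes that a span edge (a, b) means every token in span a precedes every token in span b; `expand_span_edges` is a hypothetical helper, not the behavior of `to_token_dag()` as documented.

```python
def expand_span_edges(span_edges, span_size):
    """Expand span-level edges (a, b) into token-level edges (i, j)."""
    token_edges = set()
    for a, b in span_edges:
        for i in range(a * span_size, (a + 1) * span_size):
            for j in range(b * span_size, (b + 1) * span_size):
                token_edges.add((i, j))
    return token_edges


# One span edge between two spans of two tokens each.
token_edges = expand_span_edges({(0, 1)}, span_size=2)
print(sorted(token_edges))

# Candidate ordered pairs at token level versus span level.
num_spans, span_size = 8, 32
seq_len = num_spans * span_size
reduction = (seq_len * seq_len) // (num_spans * num_spans)
print(reduction)
```

For 8 spans of 32 tokens, the adjacency matrix shrinks from 256 x 256 entries to 8 x 8, a factor of 32^2 = 1024.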

DAGScheduler

DAG constraints are injected at the scheduler layer, so the models themselves need no modification.

from dllm_reason.scheduler.dag_scheduler import DAGScheduler
scheduler = DAGScheduler(dag, sub_strategy="confidence_topk")
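
As a rough picture of what a `confidence_topk` sub-strategy could do, the sketch below restricts the choice to DAG-ready positions and then takes the k most confident ones. The helper function and the toy values are assumptions for illustration; the scheduler's actual selection logic is not shown in this README.

```python
def confidence_topk(confidences, ready, k):
    """Pick up to k ready positions with the highest confidence."""
    ranked = sorted(ready, key=lambda pos: confidences[pos], reverse=True)
    return ranked[:k]


confidences = [0.90, 0.20, 0.80, 0.95]
ready = [0, 1, 2]  # position 3 is blocked by the DAG despite its high confidence
chosen = confidence_topk(confidences, ready, k=2)
print(chosen)
```

Because the DAG filter is applied before ranking, the model only supplies confidences and never needs to know about the graph.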

Adaptive Dynamic DAG (Novel)

Constructs a soft dependency graph at runtime from the pairwise influence between masked positions.

from dllm_reason.scheduler.adaptive_dynamic_scheduler import AdaptiveDynamicScheduler
scheduler = AdaptiveDynamicScheduler(influence_threshold=0.3, momentum=0.5)
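
How influence is measured and how the two parameters are used internally is not described here. Purely as an assumption, the sketch below treats `momentum` as the weight of an exponential moving average over per-step influence matrices and `influence_threshold` as the cutoff above which a soft edge is kept. Both helper functions and the toy matrices are hypothetical.

```python
def update_influence(prev, current, momentum):
    """Exponential moving average of two equally sized square matrices."""
    n = len(prev)
    return [
        [momentum * prev[i][j] + (1.0 - momentum) * current[i][j] for j in range(n)]
        for i in range(n)
    ]


def soft_edges(influence, threshold):
    """Keep edge (i, j) when the smoothed influence of i on j exceeds threshold."""
    n = len(influence)
    return {
        (i, j)
        for i in range(n)
        for j in range(n)
        if i != j and influence[i][j] > threshold
    }


prev = [[0.0, 0.2], [0.0, 0.0]]      # assumed influence estimate from earlier steps
current = [[0.0, 0.6], [0.1, 0.0]]   # assumed influence measured at this step
smoothed = update_influence(prev, current, momentum=0.5)
edges = soft_edges(smoothed, threshold=0.3)
print(edges)
```

Under these assumptions, only the influence of position 0 on position 1 survives the threshold, so position 0 would be unmasked first.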

DAG Search (6 Methods)

Automatically search for high-performing DAG structures.

# model, eval_fn, and dag_store are assumed to be defined by the caller.

# Evolutionary search
from dllm_reason.search.evolutionary import EvolutionarySearch
searcher = EvolutionarySearch(population_size=20, library=dag_store)
result = searcher.search(model, eval_fn, seq_len=256, budget=200)

# End-to-end DAG learning (differentiable, jointly with task loss)
from dllm_reason.search.e2e_dag_learner import E2EDAGLearner, E2EConfig
learner = E2EDAGLearner(config=E2EConfig(lr_dag=3e-3))
result = learner.search(model, eval_fn, seq_len=256, budget=200)

# NAS-style search (DARTS supernet or ENAS controller)
from dllm_reason.search.nas_search import NASDAGSearch, NASConfig
searcher = NASDAGSearch(config=NASConfig(mode="supernet", span_size=16))
result = searcher.search(model, eval_fn, seq_len=256, budget=200)

Method	Type	Description
Greedy	Black-box	Add/remove edges iteratively
Evolutionary	Black-box	Population-based with tournament selection
RL Policy	Black-box	GRU + REINFORCE for edge construction
Differentiable	Gradient	NOTEARS continuous relaxation
E2E DAG Learning	Gradient	Joint DAG + task loss optimization
NAS SuperNet/Controller	Gradient/RL	DARTS or ENAS-style architecture search
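
To illustrate the simplest entry in the table, here is a minimal sketch of a greedy add/remove edge search with an acyclicity check. It is a generic hill-climbing loop, not the library's Greedy implementation: `greedy_search`, `is_acyclic`, and the toy `score_fn` are all hypothetical, and a real `score_fn` would run the model on a benchmark.

```python
from graphlib import CycleError, TopologicalSorter
from itertools import permutations


def is_acyclic(num_nodes, edges):
    """Check that the directed graph given by (src, dst) edges has no cycle."""
    preds = {node: set() for node in range(num_nodes)}
    for src, dst in edges:
        preds[dst].add(src)
    try:
        TopologicalSorter(preds).prepare()
    except CycleError:
        return False
    return True


def greedy_search(num_nodes, score_fn, budget):
    """Toggle one edge at a time; keep a change only if the score improves."""
    edges = set()
    best = score_fn(edges)
    evaluations = 1
    improved = True
    while improved and evaluations < budget:
        improved = False
        for edge in permutations(range(num_nodes), 2):
            if evaluations >= budget:
                break
            candidate = edges ^ {edge}  # add the edge if absent, remove it if present
            if not is_acyclic(num_nodes, candidate):
                continue
            score = score_fn(candidate)
            evaluations += 1
            if score > best:
                edges, best, improved = candidate, score, True
    return edges, best


# Toy objective: negative distance to a known target graph.
target = {(0, 1), (1, 2)}
found, best_score = greedy_search(3, lambda e: -len(e ^ target), budget=50)
print(found, best_score)
```

The `budget` argument caps the number of score evaluations, which is the expensive part when each evaluation means running a benchmark.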

DAG Library

Persistent storage + retrieval + feedback for DAG structures.

Store: SQLite + FAISS vector index
Retrieval: 3 channels (semantic, structural, performance)
Fusion: 4 strategies (weighted, RRF, max, voting)
Feedback: 3 sources (auto benchmark, human rating, Elo tournament)
Merge: 3 strategies (union, intersection, weighted)

All components can be toggled independently for ablation. 7 preset configs are provided in configs/library/.
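
Of the fusion strategies listed, RRF (reciprocal rank fusion) has a compact standard form: each item scores the sum of 1 / (k + rank) over the ranked lists it appears in. The sketch below applies it to three retrieval channels; the function name, the DAG identifiers, and the conventional default k = 60 are assumptions, since the library's own parameters are not documented here.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: score(item) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda item: scores[item], reverse=True)


# Hypothetical results from the three retrieval channels, best match first.
semantic = ["dag_a", "dag_b", "dag_c"]
structural = ["dag_b", "dag_a", "dag_d"]
performance = ["dag_b", "dag_c", "dag_a"]

fused = reciprocal_rank_fusion([semantic, structural, performance])
print(fused)
```

RRF only needs ranks, so channels with incomparable score scales (embedding similarity, graph distance, benchmark accuracy) can be combined without normalization.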

Models

Model	Type	Reference
LLaDA	LLaMA-3 masked diffusion (8B)	Nie et al., 2025
MDLM	Absorbing-state continuous-time	Sahoo et al., 2024
SEDD	Score-entropy discrete diffusion	Lou et al., 2024
D3PM	Discrete-time structured transitions	Austin et al., 2021

Configuration

All configs use YAML + Hydra/OmegaConf.

Directory	Contents
`configs/model/`	Model hyperparameters (mdlm, sedd, d3pm, llada)
`configs/graph/`	DAG template parameters
`configs/search/`	Search algorithm settings
`configs/task/`	Dataset configs
`configs/eval/`	Benchmark settings
`configs/experiment/`	End-to-end experiment combinations
`configs/library/`	DAG Library ablation variants
`configs/eval_default.yaml`	Default evaluation config

Documentation

License

MIT License. See LICENSE for details.

Release history

Version	Release date
1.6.0	Apr 20, 2026
1.5.3	Apr 10, 2026
1.5.2	Apr 10, 2026
1.5.1	Apr 10, 2026
1.5.0	Apr 10, 2026
1.4.3	Apr 10, 2026
1.4.2	Apr 9, 2026
1.4.1 (this version)	Apr 9, 2026
1.4.0	Apr 8, 2026
1.2.4	Apr 8, 2026
1.2.3	Apr 7, 2026

Download files

Source Distribution

dllm_reason-1.4.1.tar.gz (106.0 kB)

Uploaded Apr 9, 2026 Source

Built Distribution

dllm_reason-1.4.1-py3-none-any.whl (128.1 kB)

Uploaded Apr 9, 2026 Python 3

File details

Details for the file dllm_reason-1.4.1.tar.gz.

File metadata

Download URL: dllm_reason-1.4.1.tar.gz
Upload date: Apr 9, 2026
Size: 106.0 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dllm_reason-1.4.1.tar.gz
Algorithm	Hash digest
SHA256	`838f208dd3ec1f130114783e4bdd0e25c1d0363e861555e09e216a59440ffcb5`
MD5	`8512b8e05b0df6d40a40ec0369c42d74`
BLAKE2b-256	`2aa8b663e08be7b609ce292371171c2dc25236b55359e5a80b7f684fe348f290`

Provenance

The following attestation bundles were made for dllm_reason-1.4.1.tar.gz:

Publisher: publish.yml on BDeMo/dLLM_Reason

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: dllm_reason-1.4.1.tar.gz
- Subject digest: 838f208dd3ec1f130114783e4bdd0e25c1d0363e861555e09e216a59440ffcb5
- Sigstore transparency entry: 1260627929
- Sigstore integration time: Apr 9, 2026
Source repository:
- Permalink: BDeMo/dLLM_Reason@2cf7a8d7dc4a79a8b2f8087d52a98b47599ead10
- Branch / Tag: refs/tags/v1.4.1
- Owner: https://github.com/BDeMo
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@2cf7a8d7dc4a79a8b2f8087d52a98b47599ead10
- Trigger Event: push

File details

Details for the file dllm_reason-1.4.1-py3-none-any.whl.

File metadata

Download URL: dllm_reason-1.4.1-py3-none-any.whl
Upload date: Apr 9, 2026
Size: 128.1 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for dllm_reason-1.4.1-py3-none-any.whl
Algorithm	Hash digest
SHA256	`5588fcc813e7ed55bd9b445e567a0113c221f86a33404e134c5093fe8e9cd538`
MD5	`f87bd3af2430fa4a89682f7cf358126b`
BLAKE2b-256	`b291a75b5112e284217f0a0c8ea81e604c76d89c2459a974be4b8df83a3c394f`

Provenance

The following attestation bundles were made for dllm_reason-1.4.1-py3-none-any.whl:

Publisher: publish.yml on BDeMo/dLLM_Reason

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: dllm_reason-1.4.1-py3-none-any.whl
- Subject digest: 5588fcc813e7ed55bd9b445e567a0113c221f86a33404e134c5093fe8e9cd538
- Sigstore transparency entry: 1260627933
- Sigstore integration time: Apr 9, 2026
Source repository:
- Permalink: BDeMo/dLLM_Reason@2cf7a8d7dc4a79a8b2f8087d52a98b47599ead10
- Branch / Tag: refs/tags/v1.4.1
- Owner: https://github.com/BDeMo
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@2cf7a8d7dc4a79a8b2f8087d52a98b47599ead10
- Trigger Event: push

dllm-reason 1.4.1
