
OMGFormers

Parallel Diffusion Language Model — 66 features


GitHub: https://github.com/fastloraoffical/OMGformers


Installation

pip install omgformers

What is OMGFormers?

OMGFormers is a research-grade PyTorch library for building Parallel Diffusion Language Models. It provides a modular, composable set of building blocks covering:

  • Attention mechanisms — GQA, MLA, Sliding Window, Linear, Block-Sparse, Flash Attention 2, RoPE variants (YaRN, NTK, LongRoPE), ALiBi, T5 relative bias
  • Feed-forward layers — SwiGLU, GeGLU, ReGLU, Standard FFN
  • Mixture of Experts — Dense MoE, Soft MoE, load-balancing loss
  • Diffusion — Mask scheduler, Parallel decoder for masked diffusion LM training (a minimal training-step sketch follows this list)
  • LoRA / DoRA — Parameter-efficient fine-tuning adapters with merge/save/load
  • Training utilities — EMA, Lion optimizer, warm-up cosine schedules, FSDP, gradient checkpointing, checkpoint manager
  • Tokenizer — HuggingFace-compatible tokenizer with char-level fallback, special token management, encode/decode batch, mask_tokens for diffusion
  • Advanced — KV cache, multi-token prediction, model merging (SLERP, DARE, TIES), reward model, PPO, int8/int4 quantization, GGUF export, RAG context injection, dynamic batching, chunked long-doc attention
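
The diffusion building blocks are the library's namesake, so a minimal sketch of one masked-diffusion training step is shown below. The masking logic is plain PyTorch; the only assumption is that the model maps (batch, seq) token ids to (batch, seq, vocab) logits.

import torch
import torch.nn.functional as F

def masked_diffusion_step(model, ids, mask_id, vocab_size):
    # Sample a masking ratio per sequence, then corrupt that fraction of tokens.
    ratio = torch.rand(ids.size(0), 1, device=ids.device)
    is_masked = torch.rand(ids.shape, device=ids.device) < ratio
    noisy = torch.where(is_masked, torch.full_like(ids, mask_id), ids)
    logits = model(noisy)  # assumed: (batch, seq, vocab) logits
    # Train only on the masked positions, as in masked diffusion LMs.
    flat_mask = is_masked.reshape(-1)
    return F.cross_entropy(
        logits.reshape(-1, vocab_size)[flat_mask],
        ids.reshape(-1)[flat_mask],
    )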

Quick Start

import torch
from omgformers import OMGConfig, OMGModel, create_base_model, OMGTokenizer

# Build a small model
cfg = OMGConfig(
    vocab_size=32000,
    hidden_size=512,
    num_layers=6,
    num_heads=8,
)
model = OMGModel(cfg)

# Or use the fast initializer
model, cfg = create_base_model(hidden_size=512, num_layers=6)

# Tokenizer
tok = OMGTokenizer.from_pretrained("gpt2")  # or char-level fallback
ids = tok.encode("Hello, world!")
print(tok.decode(ids))
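
A quick smoke test; the (batch, seq) token-id input convention is assumed here, and the return may be raw logits or a richer output object:

dummy = torch.randint(0, cfg.vocab_size, (1, 16))
with torch.no_grad():
    out = model(dummy)
print(type(out))  # inspect: may be a (1, 16, vocab) logits tensor or an output container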

Fine-tuning

from omgformers import FineTuneConfig, FineTuner

ft_cfg = FineTuneConfig(method="lora", lora_rank=16, steps=1000)
tuner  = FineTuner(model, tokenizer=tok, config=ft_cfg)
tuner.train(train_dataloader)  # train_dataloader: a standard PyTorch DataLoader (sketch below)
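
FineTuner.train takes a dataloader, whose exact batch format is not documented in this README; the sketch below is one plausible construction in plain PyTorch (padding with id 0 is an assumption):

import torch
from torch.utils.data import DataLoader, TensorDataset

texts   = ["Hello, world!", "Parallel decoders fill in many masked tokens at once."]
encoded = [torch.tensor(tok.encode(t)) for t in texts]
padded  = torch.nn.utils.rnn.pad_sequence(encoded, batch_first=True, padding_value=0)
train_dataloader = DataLoader(TensorDataset(padded), batch_size=2, shuffle=True)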

LoRA

from omgformers import add_lora, merge_lora, save_lora, LoRAConfig

lora_cfg = LoRAConfig(rank=16, alpha=32, target_modules=["q_proj", "v_proj"])
model    = add_lora(model, lora_cfg)
# ... train ...
save_lora(model, "my_lora_weights/")  # save the adapter weights first
model    = merge_lora(model)          # then fold them into the base weights for inference
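
For intuition, the LoRA update itself is small: each adapted projection computes W x + (alpha / rank) * B A x, with only A and B trainable. The module below is an illustrative standalone version, not the library's implementation:

import torch
import torch.nn as nn

class TinyLoRALinear(nn.Module):
    # Frozen base projection plus a rank-r trainable update (illustrative only).
    def __init__(self, base: nn.Linear, rank: int = 16, alpha: int = 32):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # freeze the pretrained weight
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: no-op at start
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)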

Mixture of Experts

from omgformers import MoEConfig, OMGConfig

cfg = OMGConfig(
    hidden_size=1024,
    num_layers=12,
    moe=MoEConfig(num_experts=8, top_k=2, aux_loss_coeff=0.01),
)
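
aux_loss_coeff weights the router's load-balancing term. OMGFormers' exact variant isn't spelled out here, but the standard Switch-style loss is N * sum_e f_e * p_e, where f_e is the fraction of tokens assigned to expert e and p_e the mean router probability; a reference implementation:

import torch
import torch.nn.functional as F

def load_balancing_loss(router_logits, expert_indices, num_experts):
    # router_logits: (tokens, experts); expert_indices: (tokens, top_k)
    p = torch.softmax(router_logits, dim=-1).mean(0)          # mean router prob per expert
    assigned = F.one_hot(expert_indices, num_experts).sum(1).float()
    f = assigned.mean(0)                                      # mean assignment count per expert
    return num_experts * torch.sum(f * p)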

What's New in v2.0.6-preview

Feature  Description
-------  -----------
#53      Fast base model initialization (create_base_model)
#54      Fine-tuning engine (FineTuner)
#55      Resume from checkpoint (Trainer.resume_from_checkpoint)
#56      Checkpoint manager (CheckpointManager)
#57      Real Flash Attention 2 implementation (flash_attention_forward)
#58      Full MoE integration into OMGConfig (MoEConfig)
#61      OMGTokenizer (HF + char-level fallback)
#62      Special-token management
#63      Tokenizer save/load
#64      Tokenizer from_pretrained
#65      encode_batch / decode_batch
#66      mask_tokens for diffusion training

Bug fixes: #T5, #T6, #T7, #Mo1–Mo4, #A1–A5, #M1–M3, #C1–C3
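
The tokenizer entries (#61–#66) cover batched encoding and diffusion masking. The calls below show plausible usage; the argument names and return shapes are guesses, not documented signatures:

# Hypothetical usage of the v2.0.6 tokenizer additions; signatures are assumed.
batch_ids = tok.encode_batch(["Hello, world!", "Masked tokens get denoised in parallel."])
texts     = tok.decode_batch(batch_ids)
masked    = tok.mask_tokens(batch_ids[0])  # presumably replaces a random subset with the mask id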


Requirements

  • Python ≥ 3.9
  • PyTorch ≥ 2.0
  • Optional: transformers, safetensors, flash-attn, bitsandbytes
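
Each optional package unlocks a feature listed above (transformers for the HF tokenizer path, flash-attn for Flash Attention 2, bitsandbytes for int8/int4 quantization). For example:

pip install omgformers transformers safetensors
pip install flash-attn bitsandbytes  # GPU-only; flash-attn needs a CUDA toolchain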

License

Apache License 2.0 — see LICENSE.

