MorphFormer: multilingual morphological reinflection with character-level Transformer (GQA, RoPE, SwiGLU, language adapters)
# MorphFormer
Character-level Transformer for multilingual morphological reinflection.
Version 3.1 — modular multi-package architecture by Voluntas Progressus.
## Installation

```
pip install morphoformer
```

Requires Python >= 3.14 and PyTorch >= 2.0.

Dependencies are installed automatically:

- `chartoken-vp` — character-level tokenizer
- `torchblocks-vp` — pluggable Transformer blocks
- `sigmorphon-vp` — SigMorphon dataset tools
- `trainkit-vp` — training utilities
## Quick Start

```bash
# Download SigMorphon data
morphoformer download --lang rus,deu,fra --merge

# Train with preset
morphoformer train --preset medium --data "data/collections/*_train.tsv" --device cuda

# Single-word inference
morphoformer infer --checkpoint checkpoints/morph_v3.pt --word "laufen" --morph "V;IND;PST;3;SG" --lang deu

# Interactive REPL
morphoformer serve --checkpoint checkpoints/morph_v3.pt
```
## Architecture
| Component | Implementation |
|---|---|
| Attention | Grouped Query Attention (GQA) with KV cache |
| Positions | Rotary Position Embeddings (RoPE) |
| Feed-forward | SwiGLU |
| Normalization | RMSNorm (pre-norm) |
| Encoder conv | Conformer-style depthwise separable conv1d |
| Adapters | Language-conditioned bottleneck adapters |
| Morph features | Pooled or attention-based structured encoder |
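As a concrete illustration of the positional scheme in the table above: RoPE rotates consecutive dimension pairs of each query and key vector by a position-dependent angle, so attention scores end up depending only on relative offsets. A minimal pure-Python sketch, not morphoformer's actual code (real implementations vectorize this over batches and heads in PyTorch):

```python
import math

def rope(vec, pos, base=10000.0):
    """Rotate consecutive (even, odd) dimension pairs of `vec` by an
    angle that grows with `pos` and shrinks with pair index."""
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)          # per-pair rotation angle
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.append(x * c - y * s)               # standard 2D rotation
        out.append(x * s + y * c)
    return out
```

Because each pair is rotated by an angle proportional to position, the dot product of a rotated query at position m with a rotated key at position n depends only on n - m, which is what makes the scheme a relative position encoding.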
## Presets
| Preset | d_model | Encoder | Decoder | ~Params | VRAM |
|---|---|---|---|---|---|
| small | 384 | 4 layers | 3 layers | ~7M | < 4 GB |
| medium | 512 | 8 layers | 6 layers | ~45M | 4-8 GB |
| large | 768 | 10 layers | 8 layers | ~120M | >= 8 GB |
## CLI Commands

| Command | Description |
|---|---|
| `train` | Train model from TSV data |
| `infer` | Single-word inference |
| `serve` | Interactive REPL |
| `download` | Download SigMorphon 2021 datasets |
| `modules` | List registered NN building blocks |
| `init-config` | Generate TOML config template |
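`init-config` writes a TOML template. A hypothetical sketch of the kind of fields such a template might hold, with values taken from the Quick Start above (the section and key names here are guesses, not the tool's actual schema; run `init-config` for the real template):

```toml
# Illustrative only — actual keys come from `morphoformer init-config`.
[model]
preset = "medium"

[data]
train = "data/collections/*_train.tsv"

[training]
device = "cuda"
```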
## Data Format

TSV with columns: `lemma`, `features`, `surface_form`, `language`.

```
laufen	V;IND;PST;3;SG	lief	deu
```
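Reading this format needs nothing beyond the standard library. A minimal sketch, independent of morphoformer's own loaders:

```python
import csv
import io

SAMPLE = "laufen\tV;IND;PST;3;SG\tlief\tdeu\n"

def read_tsv(text):
    """Parse reinflection rows: lemma, semicolon-joined morphological
    features, target surface form, and language code."""
    rows = []
    for lemma, feats, surface, lang in csv.reader(io.StringIO(text), delimiter="\t"):
        rows.append({
            "lemma": lemma,
            "features": feats.split(";"),
            "surface": surface,
            "lang": lang,
        })
    return rows
```

For real files, replace the `io.StringIO` wrapper with an open file handle; `csv.reader` streams either one row by row.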
## Python API

```python
import torch
from chartoken import CharVocab, FeatureVocab
from morphoformer.model import MorphFormer
from morphoformer.inference import greedy_decode

checkpoint = torch.load("checkpoints/morph_v3.pt", map_location="cpu", weights_only=False)
char_vocab = CharVocab.from_dict(checkpoint["char_vocab"])
feature_vocab = FeatureVocab.from_dict(checkpoint["feature_vocab"])
lang_to_id = checkpoint["lang_to_id"]

# Build model, load state_dict, call greedy_decode()
```
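The decoding step elided in the snippet above amounts to repeatedly feeding the decoder its own argmax prediction until an end-of-sequence symbol appears. A toy stand-alone sketch of that loop (the `step_fn` stub model and the token names are hypothetical illustrations, not morphoformer's API):

```python
def greedy_decode(step_fn, bos, eos, max_len=32):
    """Pick the highest-scoring next character at each step until
    `eos` is produced or `max_len` characters have been emitted."""
    out = [bos]
    for _ in range(max_len):
        scores = step_fn(out)                 # scores over the character vocabulary
        nxt = max(scores, key=scores.get)     # greedy argmax choice
        if nxt == eos:
            break
        out.append(nxt)
    return "".join(out[1:])                   # drop the BOS marker

def make_step_fn(target, eos="</s>"):
    """Stand-in for a trained model: deterministically spells `target`."""
    def step(prefix):
        i = len(prefix) - 1                   # characters emitted so far
        nxt = target[i] if i < len(target) else eos
        return {c: (1.0 if c == nxt else 0.0) for c in set(target) | {eos}}
    return step
```

The real `greedy_decode` takes the loaded model and vocabularies instead of a stub, but the control flow is the same shape.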
## Supported Devices
| Device | Flag |
|---|---|
| Auto-detect | --device auto |
| NVIDIA GPU | --device cuda |
| AMD GPU | --device rocm |
| Intel Arc | --device xpu |
| Apple Silicon | --device mps |
| CPU | --device cpu |
## License
MIT
File details
Details for the file morphoformer-3.1.0.tar.gz.
File metadata
- Download URL: morphoformer-3.1.0.tar.gz
- Upload date:
- Size: 19.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `6502e2fbc367036e13fc8de7fd76b37e4a1aed13d57c5456c41808e67d667824` |
| MD5 | `04d8243f63e5a92d76ae5e0c87f4617b` |
| BLAKE2b-256 | `312d1607b8a9e69acdd233c92691fec3c929cbee22a0cf7868189217ff1a34f1` |
File details
Details for the file morphoformer-3.1.0-py3-none-any.whl.
File metadata
- Download URL: morphoformer-3.1.0-py3-none-any.whl
- Upload date:
- Size: 23.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.14.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `daba1bc168a4ee086da1d0b55d8d9dc63a2c34519044faea0f0c0e1251ded567` |
| MD5 | `8389ba5d0de75a665127950858f61ae1` |
| BLAKE2b-256 | `9773a75fe2f41bbc352c1b67a9b67051f79ebfe888751d20e5a46725305792db` |