
MorphFormer: multilingual morphological reinflection with character-level Transformer (GQA, RoPE, SwiGLU, language adapters)

Project description

MorphFormer

Character-level Transformer for multilingual morphological reinflection.

Version 3.2: modular multi-package architecture by Voluntas Progressus.

Installation

pip install morphoformer

Requires Python >= 3.14 and PyTorch >= 2.0.

Dependencies are installed automatically:

  • chartoken-vp — character-level tokenizer
  • torchblocks-vp — pluggable Transformer blocks
  • sigmorphon-vp — SIGMORPHON dataset tools
  • trainkit-vp — training utilities

Quick Start

# Download SIGMORPHON data
morphoformer download --lang rus,deu,fra --merge

# Train with preset
morphoformer train --preset medium --data "data/collections/*_train.tsv" --device cuda

# Single-word inference
morphoformer infer --checkpoint checkpoints/morph_v3.pt --word "laufen" --morph "V;IND;PST;3;SG" --lang deu

# Interactive REPL
morphoformer serve --checkpoint checkpoints/morph_v3.pt

Architecture

Component        Implementation
Attention        Grouped Query Attention (GQA) with KV cache
Positions        Rotary Position Embeddings (RoPE)
Feed-forward     SwiGLU
Normalization    RMSNorm (pre-norm)
Encoder conv     Conformer-style depthwise separable conv1d
Adapters         Language-conditioned bottleneck adapters
Morph features   Pooled or attention-based structured encoder
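As an illustration of one block from the table above, here is a minimal SwiGLU feed-forward layer in plain PyTorch. This is a sketch of the standard SwiGLU formulation, not the package's own code (which lives in torchblocks-vp and may differ in names and details); the hidden size of 1024 is an arbitrary example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLU(nn.Module):
    """SwiGLU feed-forward block: y = W_down(SiLU(W_gate x) * W_up x)."""

    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.w_gate = nn.Linear(d_model, d_hidden, bias=False)
        self.w_up = nn.Linear(d_model, d_hidden, bias=False)
        self.w_down = nn.Linear(d_hidden, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Gated activation: SiLU-transformed gate multiplies the up-projection.
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))

x = torch.randn(2, 16, 384)              # (batch, seq, d_model) as in the "small" preset
ffn = SwiGLU(d_model=384, d_hidden=1024)
print(ffn(x).shape)                      # torch.Size([2, 16, 384])
```

The gating is why SwiGLU uses three weight matrices where a vanilla FFN uses two.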

Presets

Preset   d_model   Encoder layers   Decoder layers   ~Params   VRAM
small    384       4                3                ~7M       < 4 GB
medium   512       8                6                ~45M      4-8 GB
large    768       10               8                ~120M     >= 8 GB

CLI Commands

Command      Description
train        Train a model from TSV data
infer        Single-word inference
serve        Interactive REPL
download     Download SIGMORPHON 2021 datasets
modules      List registered NN building blocks
init-config  Generate a TOML config template

Data Format

TSV with columns: lemma, features, surface_form, language

laufen	V;IND;PST;3;SG	lief	deu
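A minimal reader for this four-column format could look like the following. This is a hypothetical helper for illustration, not part of the morphoformer package; the function name and returned dict keys are my own.

```python
import csv
from io import StringIO

def read_reinflection_tsv(text: str) -> list[dict]:
    """Parse lemma / features / surface_form / language TSV rows."""
    rows = []
    for lemma, feats, surface, lang in csv.reader(StringIO(text), delimiter="\t"):
        rows.append({
            "lemma": lemma,
            "features": feats.split(";"),   # e.g. ["V", "IND", "PST", "3", "SG"]
            "surface_form": surface,
            "language": lang,
        })
    return rows

sample = "laufen\tV;IND;PST;3;SG\tlief\tdeu\n"
print(read_reinflection_tsv(sample))
```

Splitting the feature bundle on ";" mirrors the UniMorph-style tags shown in the example row.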

Python API

import torch
from chartoken import CharVocab, FeatureVocab
from morphoformer.model import MorphFormer
from morphoformer.inference import greedy_decode

checkpoint = torch.load("checkpoints/morph_v3.pt", map_location="cpu", weights_only=False)
char_vocab = CharVocab.from_dict(checkpoint["char_vocab"])
feature_vocab = FeatureVocab.from_dict(checkpoint["feature_vocab"])
lang_to_id = checkpoint["lang_to_id"]

# Build model, load state_dict, call greedy_decode()
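The final steps depend on checkpoint internals, so as an illustration of the decoding half, here is a generic greedy loop with a toy scoring function. The real greedy_decode in morphoformer.inference takes the model and vocabularies and its exact signature may differ; everything below is a self-contained sketch.

```python
def greedy_decode(step_fn, bos_id: int, eos_id: int, max_len: int = 40) -> list[int]:
    """Greedy character decoding: repeatedly append the argmax next symbol."""
    out = [bos_id]
    for _ in range(max_len):
        logits = step_fn(out)              # scores over the character vocabulary
        next_id = max(range(len(logits)), key=logits.__getitem__)
        if next_id == eos_id:
            break
        out.append(next_id)
    return out[1:]                         # strip BOS

# Toy step function: always scores symbol 3 highest, then EOS (id 1)
# once the prefix is longer than 4 symbols.
def toy_step(prefix):
    if len(prefix) > 4:
        return [0.0, 9.0, 0.0, 1.0]
    return [0.0, 0.0, 0.0, 1.0]

print(greedy_decode(toy_step, bos_id=0, eos_id=1))  # [3, 3, 3, 3]
```

In the real model, step_fn would run the decoder with the KV cache over the prefix and return logits for the next character.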

Supported Devices

Device          Flag
Auto-detect     --device auto
NVIDIA GPU      --device cuda
AMD GPU         --device rocm
Intel Arc       --device xpu
Apple Silicon   --device mps
CPU             --device cpu
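A sketch of how a --device auto flag might resolve a backend in PyTorch. This is a hypothetical function, not morphoformer's actual logic; note that ROCm builds of PyTorch expose AMD GPUs under the "cuda" device string, so the "rocm" flag would be the CLI's own mapping.

```python
import torch

def auto_device() -> str:
    """Pick the best available PyTorch backend (hypothetical sketch)."""
    if torch.cuda.is_available():          # NVIDIA; ROCm builds also report "cuda"
        return "cuda"
    xpu = getattr(torch, "xpu", None)
    if xpu is not None and xpu.is_available():
        return "xpu"                       # Intel Arc
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"                       # Apple Silicon
    return "cpu"

print(auto_device())
```

The getattr guards keep the probe working on PyTorch builds compiled without a given backend.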

License

MIT

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

morphoformer-3.2.0.tar.gz (20.0 kB)

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

morphoformer-3.2.0-py3-none-any.whl (23.9 kB)

File details

Details for the file morphoformer-3.2.0.tar.gz.

File metadata

  • Download URL: morphoformer-3.2.0.tar.gz
  • Upload date:
  • Size: 20.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for morphoformer-3.2.0.tar.gz
Algorithm     Hash digest
SHA256        b2d8dd0a0fac1458898cc31856683b099d8f4f30ca98cae4b20882ce0b925ced
MD5           41b0f26b740eb8a2e0ac2d6899ad2d81
BLAKE2b-256   4d9f5e459201a9cd7b0611bc374d6dc6a9a272c0348d47967b237e91e14120b3

See more details on using hashes here.
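To check a downloaded archive against the digests above, a short standard-library snippet suffices; this is generic hashlib usage, not a morphoformer utility.

```python
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# After downloading, compare against the published digest:
# sha256_of("morphoformer-3.2.0.tar.gz") == "b2d8dd0a..."
```

Reading in chunks keeps memory flat even for large archives.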

File details

Details for the file morphoformer-3.2.0-py3-none-any.whl.

File metadata

  • Download URL: morphoformer-3.2.0-py3-none-any.whl
  • Upload date:
  • Size: 23.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.14.2

File hashes

Hashes for morphoformer-3.2.0-py3-none-any.whl
Algorithm     Hash digest
SHA256        82628acd23b857d32208728f8e4342b0130d19d75c7f9e47c88a19f1e03623dd
MD5           77c32ff5f7012d08a5fdfeb61c9bd446
BLAKE2b-256   a9451ec37838c2da8d9957325ca083ac594aa7dfd6b5af809e9bca52c447028f

