
MorphFormer: multilingual morphological reinflection with character-level Transformer (GQA, RoPE, SwiGLU, language adapters)


MorphFormer

Character-level Transformer for multilingual morphological reinflection.

Version 3.1 — modular multi-package architecture by Voluntas Progressus.

Installation

pip install morphoformer

Requires Python >= 3.14 and PyTorch >= 2.0.

Dependencies are installed automatically:

  • chartoken-vp — character-level tokenizer
  • torchblocks-vp — pluggable Transformer blocks
  • sigmorphon-vp — SigMorphon dataset tools
  • trainkit-vp — training utilities

Quick Start

# Download SigMorphon data
morphoformer download --lang rus,deu,fra --merge

# Train with preset
morphoformer train --preset medium --data "data/collections/*_train.tsv" --device cuda

# Single-word inference
morphoformer infer --checkpoint checkpoints/morph_v3.pt --word "laufen" --morph "V;IND;PST;3;SG" --lang deu

# Interactive REPL
morphoformer serve --checkpoint checkpoints/morph_v3.pt

Architecture

Component       Implementation
--------------  --------------------------------------------
Attention       Grouped Query Attention (GQA) with KV cache
Positions       Rotary Position Embeddings (RoPE)
Feed-forward    SwiGLU
Normalization   RMSNorm (pre-norm)
Encoder conv    Conformer-style depthwise separable conv1d
Adapters        Language-conditioned bottleneck adapters
Morph features  Pooled or attention-based structured encoder
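The table lists RoPE among the building blocks. As a quick illustration of what the rotary embedding does (a framework-free numpy sketch, not torchblocks-vp's implementation):

```python
import numpy as np

def rope(x, pos, base=10000.0):
    """Apply a rotary position embedding to vector x at position pos.

    Each dimension pair (2i, 2i+1) is rotated by angle pos * base**(-2i/d).
    The rotation preserves norms, and the dot product of two rotated
    vectors depends only on the difference of their positions.
    """
    d = x.shape[-1]
    assert d % 2 == 0, "RoPE needs an even dimension"
    freqs = base ** (-np.arange(0, d, 2) / d)   # (d/2,) per-pair frequencies
    angles = pos * freqs
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out
```

The relative-position property is what makes RoPE attractive for attention: shifting query and key positions by the same offset leaves their dot product unchanged.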

Presets

Preset  d_model  Encoder    Decoder   ~Params  VRAM
------  -------  ---------  --------  -------  -------
small   384      4 layers   3 layers  ~7M      < 4 GB
medium  512      8 layers   6 layers  ~45M     4-8 GB
large   768      10 layers  8 layers  ~120M    >= 8 GB
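Treated as data, the preset table boils down to picking the largest configuration that fits your VRAM budget. A hypothetical helper (names and thresholds taken from the table above, not from the package API):

```python
# Illustrative mirror of the README's preset table; morphoformer's real
# preset definitions live in its own config, not in this dict.
PRESETS = {
    "small":  {"d_model": 384, "enc_layers": 4,  "dec_layers": 3, "min_vram_gb": 0},
    "medium": {"d_model": 512, "enc_layers": 8,  "dec_layers": 6, "min_vram_gb": 4},
    "large":  {"d_model": 768, "enc_layers": 10, "dec_layers": 8, "min_vram_gb": 8},
}

def pick_preset(vram_gb: float) -> str:
    """Return the largest preset whose VRAM requirement fits the budget."""
    fitting = [name for name, p in PRESETS.items() if p["min_vram_gb"] <= vram_gb]
    return max(fitting, key=lambda name: PRESETS[name]["d_model"])
```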

CLI Commands

Command      Description
-----------  ----------------------------------
train        Train model from TSV data
infer        Single-word inference
serve        Interactive REPL
download     Download SigMorphon 2021 datasets
modules      List registered NN building blocks
init-config  Generate TOML config template

Data Format

TSV with columns: lemma, features, surface_form, language

laufen	V;IND;PST;3;SG	lief	deu
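A minimal parser for this row format (sigmorphon-vp presumably ships its own loader; this sketch only pins down the column layout and the `;`-separated feature bundle):

```python
from dataclasses import dataclass

@dataclass
class Example:
    lemma: str
    features: list[str]   # UniMorph-style tags, split on ";"
    surface_form: str
    language: str

def parse_line(line: str) -> Example:
    """Parse one TSV row: lemma, features, surface_form, language."""
    lemma, feats, surface, lang = line.rstrip("\n").split("\t")
    return Example(lemma, feats.split(";"), surface, lang)
```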

Python API

import torch
from chartoken import CharVocab, FeatureVocab
from morphoformer.model import MorphFormer
from morphoformer.inference import greedy_decode

# weights_only=False: the checkpoint bundles vocab dicts alongside tensors
checkpoint = torch.load("checkpoints/morph_v3.pt", map_location="cpu", weights_only=False)
char_vocab = CharVocab.from_dict(checkpoint["char_vocab"])
feature_vocab = FeatureVocab.from_dict(checkpoint["feature_vocab"])
lang_to_id = checkpoint["lang_to_id"]

# Build model, load state_dict, call greedy_decode()
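The elided last step builds the model and calls greedy_decode(). The package's exact signatures are not shown above, but greedy decoding itself is standard; a self-contained sketch of the loop, with `step_fn` standing in for a model forward pass:

```python
def greedy_decode(step_fn, bos, eos, max_len=32):
    """Generic greedy decoding loop (illustrative, not morphoformer's API).

    step_fn(prefix) returns a score per vocabulary id for the next token;
    the argmax is appended until EOS is produced or max_len is reached.
    """
    out = [bos]
    for _ in range(max_len):
        scores = step_fn(out)
        nxt = max(range(len(scores)), key=scores.__getitem__)
        out.append(nxt)
        if nxt == eos:
            break
    return out
```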

Supported Devices

Device         Flag
-------------  -------------
Auto-detect    --device auto
NVIDIA GPU     --device cuda
AMD GPU        --device rocm
Intel Arc      --device xpu
Apple Silicon  --device mps
CPU            --device cpu
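A sketch of how an `--device auto` preference order can work, with backend availability passed in as plain booleans (the fallback order here is an assumption for illustration, not morphoformer's documented behavior):

```python
def resolve_device(flag="auto", *, cuda=False, xpu=False, mps=False):
    """Resolve a --device flag to a concrete device string.

    An explicit flag is returned unchanged; "auto" falls through
    cuda -> xpu -> mps -> cpu, taking the first available backend.
    """
    if flag != "auto":
        return flag
    for name, available in (("cuda", cuda), ("xpu", xpu), ("mps", mps)):
        if available:
            return name
    return "cpu"
```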

License

MIT
