Transformer implementations for language modeling and ASR, with NumPy-based attention primitives.

custom-transformer

A PyTorch-based transformer library for language modeling and automatic speech recognition (ASR). It includes both a NumPy-based multi-head attention implementation (mytorch) and a full PyTorch transformer toolkit (transformerlib) with training, decoding, and data loading utilities.

Installation

pip install custom-transformer

Features

mytorch — NumPy Attention Primitives

Pure NumPy implementations intended for education and prototyping:

  • Linear — fully-connected layer with forward and backward passes
  • Softmax — numerically stable softmax activation
  • ScaledDotProductAttention — attention mechanism with optional masking
  • MultiHeadAttention — multi-head attention with split/concat head logic

from mytorch.nn import MultiHeadAttention
mha = MultiHeadAttention(d_model=512, num_heads=8)
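
To make the primitives listed above concrete, here is a minimal NumPy sketch of a numerically stable softmax, scaled dot-product attention with an optional mask, and the split/concat head reshapes. It is an illustration written for this description, not the mytorch source: the function names, the mask convention (True marks positions to hide), and the omission of the learned Q/K/V/output projections are all assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max along the axis before exponentiating to avoid overflow.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (..., seq_len, d_k). mask (assumed convention): boolean,
    # True marks positions that must not be attended to.
    d_k = q.shape[-1]
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(d_k)
    if mask is not None:
        scores = np.where(mask, -1e9, scores)
    weights = softmax(scores, axis=-1)
    return weights @ v, weights

def split_heads(x, num_heads):
    # (batch, seq, d_model) -> (batch, heads, seq, d_model // heads)
    batch, seq, d_model = x.shape
    return x.reshape(batch, seq, num_heads, d_model // num_heads).transpose(0, 2, 1, 3)

def concat_heads(x):
    # (batch, heads, seq, d_head) -> (batch, seq, heads * d_head)
    batch, heads, seq, d_head = x.shape
    return x.transpose(0, 2, 1, 3).reshape(batch, seq, heads * d_head)

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 5, 512))
heads = split_heads(x, num_heads=8)
causal = np.triu(np.ones((5, 5), dtype=bool), k=1)  # hide future positions
out, weights = scaled_dot_product_attention(heads, heads, heads, mask=causal)
merged = concat_heads(out)  # back to (2, 5, 512)
```

The causal mask here is only one example of a mask; a padding mask of the same boolean form would broadcast the same way.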

transformerlib — PyTorch Transformer Toolkit

Full-featured transformer models and training infrastructure:

Models

  • DecoderOnlyTransformer — GPT-style causal language model
  • EncoderDecoderTransformer — encoder-decoder for sequence-to-sequence tasks (e.g., ASR)
  • Pre-LN architecture with sinusoidal positional encoding
  • Weight tying, layer dropout, and mixed precision support
  • from_pretrained_decoder for initializing encoder-decoder models from pretrained decoder weights
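
As a rough illustration of two items in the list above, the sketch below builds the standard sinusoidal positional encoding table and shows the Pre-LN residual pattern, where normalization is applied before the sublayer rather than after it. It uses NumPy for brevity and is not taken from transformerlib: the function names are hypothetical, `d_model` is assumed to be even, and the layer norm omits the learnable scale and shift.

```python
import numpy as np

def sinusoidal_positional_encoding(max_len, d_model):
    # Even columns hold sin, odd columns hold cos, with wavelengths forming
    # a geometric progression controlled by the constant 10000.
    positions = np.arange(max_len)[:, None]
    div_term = np.exp(np.arange(0, d_model, 2) * (-np.log(10000.0) / d_model))
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(positions * div_term)
    pe[:, 1::2] = np.cos(positions * div_term)
    return pe

def layer_norm(x, eps=1e-5):
    # Normalize over the feature axis (no learnable parameters in this sketch).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def pre_ln_residual(x, sublayer):
    # Pre-LN: normalize first, apply the sublayer, then add the residual.
    return x + sublayer(layer_norm(x))

pe = sinusoidal_positional_encoding(max_len=512, d_model=512)
y = pre_ln_residual(pe, lambda h: 0.5 * h)  # stand-in for attention or feed-forward
```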

Data

  • LMDataset — text dataset with tokenization, SOS/EOS framing, and collation
  • ASRDataset — speech dataset with filterbank features, global MVN normalization, and SpecAugment
  • H4Tokenizer — BPE tokenizer wrapper (char, 1k, 5k, 10k vocab sizes included)
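
The SOS/EOS framing and collation mentioned for LMDataset can be pictured with the plain-Python sketch below. It is not the library's code: the token ids, the `frame` and `collate` names, and the choice to return unpadded lengths are assumptions made for illustration.

```python
# Hypothetical special-token ids; the real tokenizer defines its own.
PAD_ID, SOS_ID, EOS_ID = 0, 1, 2

def frame(token_ids):
    # The decoder input starts with SOS; the target is the same sequence
    # shifted left by one position and terminated with EOS.
    return [SOS_ID] + token_ids, token_ids + [EOS_ID]

def collate(batch):
    # Frame every sequence, then right-pad to the longest one in the batch.
    framed = [frame(ids) for ids in batch]
    max_len = max(len(inp) for inp, _ in framed)
    inputs = [inp + [PAD_ID] * (max_len - len(inp)) for inp, _ in framed]
    targets = [tgt + [PAD_ID] * (max_len - len(tgt)) for _, tgt in framed]
    lengths = [len(inp) for inp, _ in framed]
    return inputs, targets, lengths

inputs, targets, lengths = collate([[5, 6, 7], [8]])
```

The lengths are returned so that a padding mask can be built later and padded positions can be excluded from the loss.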

Training

  • LMTrainer — language model training with gradient accumulation, mixed precision, and WandB logging
  • ASRTrainer — ASR training with CTC + cross-entropy joint loss
  • ProgressiveTrainer — curriculum learning with gradual layer unfreezing and data subsetting
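
Gradient accumulation, listed under LMTrainer, can be illustrated without any framework. The toy example below (a one-parameter least-squares model, chosen purely for illustration) shows that summing micro-batch gradients, each scaled by 1 / accum_steps, reproduces the full-batch gradient when the micro-batches have equal size. How the trainers actually schedule optimizer steps is not shown here.

```python
def grad_mse(w, xs, ys):
    # Gradient of the mean squared error of y ~ w * x with respect to w.
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.1, 5.9, 8.2]
w = 0.5

# Reference: gradient computed over the whole batch at once.
full_grad = grad_mse(w, xs, ys)

# Accumulated: process equal-sized micro-batches and scale each contribution.
accum_steps = 2
micro = len(xs) // accum_steps
accumulated = 0.0
for step in range(accum_steps):
    sl = slice(step * micro, (step + 1) * micro)
    accumulated += grad_mse(w, xs[sl], ys[sl]) / accum_steps

# A single optimizer step would be taken here using `accumulated`.
```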

Decoding

  • SequenceGenerator — greedy search, beam search, and nucleus sampling
  • Language model shallow fusion for ASR decoding
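
Of the decoding strategies above, nucleus (top-p) sampling is the least obvious, so a small standard-library sketch follows. It is not SequenceGenerator's implementation; the function names and the default `top_p` are assumptions. The idea is to keep the smallest set of highest-probability tokens whose cumulative mass reaches `top_p`, renormalize, and sample from that set.

```python
import math
import random

def softmax(logits):
    # Stable softmax over a plain list of logits.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def nucleus_filter(probs, top_p):
    # Walk tokens from most to least probable until the cumulative
    # probability reaches top_p, then renormalize the kept tokens.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}

def sample_nucleus(logits, top_p=0.9, rng=random):
    # Draw one token id from the renormalized nucleus.
    filtered = nucleus_filter(softmax(logits), top_p)
    tokens = list(filtered)
    return rng.choices(tokens, weights=[filtered[t] for t in tokens], k=1)[0]

next_token = sample_nucleus([2.0, 1.0, 0.1, -1.0], top_p=0.9)
```

With `top_p` close to 0 the nucleus shrinks to the single most probable token, which makes the behavior match greedy search; with `top_p=1.0` it samples from the full distribution.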

Quick Start

from transformerlib.model import DecoderOnlyTransformer

model = DecoderOnlyTransformer(
    num_layers=6,
    d_model=512,
    num_heads=8,
    d_ff=2048,
    dropout=0.1,
    max_len=512,
    num_classes=10000
)

from transformerlib.model import EncoderDecoderTransformer

model = EncoderDecoderTransformer(
    num_encoder_layers=6,
    num_decoder_layers=6,
    d_model=512,
    num_heads=8,
    d_ff=2048,
    dropout=0.1,
    max_len=512,
    num_classes=1000,
    feat_dim=80
)

Architecture

mytorch/
  nn/
    linear.py
    activation.py
    scaled_dot_product_attention.py
    multi_head_attention.py

transformerlib/
  model/        — masks, positional encoding, sublayers, encoder/decoder layers, transformers
  data/         — tokenizer, LM dataset, ASR dataset
  trainers/     — base trainer, LM trainer, ASR trainer, progressive trainer
  decoding/     — sequence generator (greedy, beam, sampling)
  utils/        — optimizer and LR scheduler factories

Requirements

  • Python >= 3.9
  • PyTorch >= 2.0
  • torchaudio
  • numpy, tqdm, wandb, torchmetrics, tokenizers, pandas, matplotlib

License

MIT

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

  • custom_transformer-0.1.0-cp313-cp313-win_amd64.whl (1.0 MB): CPython 3.13, Windows x86-64
  • custom_transformer-0.1.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.4 MB): CPython 3.13, manylinux (glibc 2.17+), x86-64
  • custom_transformer-0.1.0-cp313-cp313-macosx_10_13_universal2.whl (2.1 MB): CPython 3.13, macOS 10.13+, universal2 (ARM64, x86-64)
  • custom_transformer-0.1.0-cp312-cp312-win_amd64.whl (1.0 MB): CPython 3.12, Windows x86-64
  • custom_transformer-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.6 MB): CPython 3.12, manylinux (glibc 2.17+), x86-64
  • custom_transformer-0.1.0-cp312-cp312-macosx_10_13_universal2.whl (2.1 MB): CPython 3.12, macOS 10.13+, universal2 (ARM64, x86-64)
  • custom_transformer-0.1.0-cp311-cp311-win_amd64.whl (1.1 MB): CPython 3.11, Windows x86-64
  • custom_transformer-0.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.4 MB): CPython 3.11, manylinux (glibc 2.17+), x86-64
  • custom_transformer-0.1.0-cp311-cp311-macosx_10_9_universal2.whl (2.1 MB): CPython 3.11, macOS 10.9+, universal2 (ARM64, x86-64)
  • custom_transformer-0.1.0-cp310-cp310-win_amd64.whl (1.1 MB): CPython 3.10, Windows x86-64
  • custom_transformer-0.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.0 MB): CPython 3.10, manylinux (glibc 2.17+), x86-64
  • custom_transformer-0.1.0-cp310-cp310-macosx_10_9_universal2.whl (2.1 MB): CPython 3.10, macOS 10.9+, universal2 (ARM64, x86-64)

File details

Details for the file custom_transformer-0.1.0-cp313-cp313-win_amd64.whl.

File hashes

Hashes for custom_transformer-0.1.0-cp313-cp313-win_amd64.whl
Algorithm Hash digest
SHA256 e7be9e7b93604f7e07bc1ab60c5c29854bf6c3c54c6d5210b1a19bad0cea22d4
MD5 73a59c5d5d2c578da28b6ff2067e48dd
BLAKE2b-256 55b72a9dfa7df64db5f40b2453a77eb30f73a8acbc74347b197688053db6ecb2

Provenance

The following attestation bundles were made for custom_transformer-0.1.0-cp313-cp313-win_amd64.whl:

Publisher: publish.yml on kkipngenokoech/custom-transformer

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file custom_transformer-0.1.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for custom_transformer-0.1.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 ac2c498ede2db9e6c55be60e6017a60b094f060508bf9b0b96cae62abde6e51d
MD5 22215cbf1bc4137d17a3a01322d79849
BLAKE2b-256 4f1e829fc134260a87f4c2c3b69da67991b65baed83f5ca0bf6c09fa41441755

Provenance

The following attestation bundles were made for custom_transformer-0.1.0-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: publish.yml on kkipngenokoech/custom-transformer

File details

Details for the file custom_transformer-0.1.0-cp313-cp313-macosx_10_13_universal2.whl.

File hashes

Hashes for custom_transformer-0.1.0-cp313-cp313-macosx_10_13_universal2.whl
Algorithm Hash digest
SHA256 e78e8ebc41d253b8f996a0d0fabae6e4069478866a25ede6aa24df58058f860c
MD5 b0fd844f646e00b8a70c31870c0958b8
BLAKE2b-256 692054d2f006759a15041b7b8a31be88d92b2ddec5f439fbda1d365184f1d557

Provenance

The following attestation bundles were made for custom_transformer-0.1.0-cp313-cp313-macosx_10_13_universal2.whl:

Publisher: publish.yml on kkipngenokoech/custom-transformer

File details

Details for the file custom_transformer-0.1.0-cp312-cp312-win_amd64.whl.

File hashes

Hashes for custom_transformer-0.1.0-cp312-cp312-win_amd64.whl
Algorithm Hash digest
SHA256 65c6b0f33c039ea53cc1070e874b6e67a5772f631acd7a295c1979837722f09b
MD5 01bb071c587808a2febf03de2c5efe92
BLAKE2b-256 2e546ffa6eb8589ce15cbe8279d196149044cf563b179afe44faf77b92c1aace

Provenance

The following attestation bundles were made for custom_transformer-0.1.0-cp312-cp312-win_amd64.whl:

Publisher: publish.yml on kkipngenokoech/custom-transformer

File details

Details for the file custom_transformer-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for custom_transformer-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 d456f7ac73cec1584ccceb746a93d6d75aad77dcc7b6c0e4e682640c8bb7242b
MD5 eb84f1d8963720064e62290bf2694a94
BLAKE2b-256 c5e43e1d22164783e7f4dee4d25e85a139d672293bee68605835eef806accf5f

Provenance

The following attestation bundles were made for custom_transformer-0.1.0-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: publish.yml on kkipngenokoech/custom-transformer

File details

Details for the file custom_transformer-0.1.0-cp312-cp312-macosx_10_13_universal2.whl.

File hashes

Hashes for custom_transformer-0.1.0-cp312-cp312-macosx_10_13_universal2.whl
Algorithm Hash digest
SHA256 5a94ed782428d43e65ba1624042ff955eeb34b4a13c684b7d85d43baf179110c
MD5 db05b058c57272a1d77e2d7819dac24e
BLAKE2b-256 8c045c366669128b8cd5ea47527cc3f785f592f336150a1852ca1b5c4c0321f8

Provenance

The following attestation bundles were made for custom_transformer-0.1.0-cp312-cp312-macosx_10_13_universal2.whl:

Publisher: publish.yml on kkipngenokoech/custom-transformer

File details

Details for the file custom_transformer-0.1.0-cp311-cp311-win_amd64.whl.

File hashes

Hashes for custom_transformer-0.1.0-cp311-cp311-win_amd64.whl
Algorithm Hash digest
SHA256 028ebbf76e09c349252185523e93e46b6c1d64e323b948bcc1321ad91d332c7c
MD5 7a39cf8ba6def42459a1cfd3601582bd
BLAKE2b-256 208ed16619b522b47f31e4582bad1fcb2121fc0edecf891c0768a056973903e6

Provenance

The following attestation bundles were made for custom_transformer-0.1.0-cp311-cp311-win_amd64.whl:

Publisher: publish.yml on kkipngenokoech/custom-transformer

File details

Details for the file custom_transformer-0.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for custom_transformer-0.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 023c465c9847e010e85a306fe6a2eb96c1bcc3dbc07aaee75e8b0786c1c1e98f
MD5 44887ae35e90e8393cfe88cc953cfa77
BLAKE2b-256 da46b446b7a300fb13c88b2bd20f954b8fb150d3ca7f9e45328171fb227afe39

Provenance

The following attestation bundles were made for custom_transformer-0.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: publish.yml on kkipngenokoech/custom-transformer

File details

Details for the file custom_transformer-0.1.0-cp311-cp311-macosx_10_9_universal2.whl.

File hashes

Hashes for custom_transformer-0.1.0-cp311-cp311-macosx_10_9_universal2.whl
Algorithm Hash digest
SHA256 dcec103ce64af1ec90cf2a5eeb0eac9f4de9ae8edf6ea2995f662286d7f28332
MD5 81a74c1a852312e9b66b8dae2c9d8b90
BLAKE2b-256 bc234576257bce46a04d8415c4026950897a4fbd48fb06b9af4c7e3e6ae29128

Provenance

The following attestation bundles were made for custom_transformer-0.1.0-cp311-cp311-macosx_10_9_universal2.whl:

Publisher: publish.yml on kkipngenokoech/custom-transformer

File details

Details for the file custom_transformer-0.1.0-cp310-cp310-win_amd64.whl.

File hashes

Hashes for custom_transformer-0.1.0-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 7429d7d1b051e84f7bef807b600534359e3991319b0e0f998400af149af60cf9
MD5 c8d1cf23cae4c6079705af530736b98b
BLAKE2b-256 66e8d37e48022222fda59a29933c9849dcd37d32b4a2c9587c9392a3843259bd

Provenance

The following attestation bundles were made for custom_transformer-0.1.0-cp310-cp310-win_amd64.whl:

Publisher: publish.yml on kkipngenokoech/custom-transformer

File details

Details for the file custom_transformer-0.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for custom_transformer-0.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 a2721f066ff65e25d23b6b15c608be500ba3242ff47dfc7de2cbb4c0361d73e7
MD5 262236bcfca4e8aa2187bbe18b1ef266
BLAKE2b-256 18595ecf580c4c8b351d6954a2b621a7bbed24962786d90ef8f2c5388eb23743

Provenance

The following attestation bundles were made for custom_transformer-0.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: publish.yml on kkipngenokoech/custom-transformer

File details

Details for the file custom_transformer-0.1.0-cp310-cp310-macosx_10_9_universal2.whl.

File hashes

Hashes for custom_transformer-0.1.0-cp310-cp310-macosx_10_9_universal2.whl
Algorithm Hash digest
SHA256 bbfd620e1d098e905715cbca765983eba485d59e03938db89f7f8d80d31ac59a
MD5 5c9d73719bafaf3127b8b68aa47c1747
BLAKE2b-256 a30e220b11c357c302733f81486ea545528e0be34f37f3ad4bf74ef063e86cc2

Provenance

The following attestation bundles were made for custom_transformer-0.1.0-cp310-cp310-macosx_10_9_universal2.whl:

Publisher: publish.yml on kkipngenokoech/custom-transformer
