Modular transformer blocks built in PyTorch

Project description

🧱 Stackformer

Stackformer is a modular framework for building transformers, written entirely in PyTorch. It is designed primarily for experimentation and provides transformer building blocks (attention mechanisms, normalization layers, and feed-forward networks) along with a simple model architecture. The project is a work in progress, with further enhancements and expansions planned.


📖 About Me

My name is Gurumurthy, and I am a final-year Bachelor of Engineering student from India. I created this library as a side project to showcase my skills and knowledge in deep learning and transformer architectures.

I am also open to collaborating with others on other projects to share knowledge and build connections.


🌟 Features

  • Multiple attention mechanisms, including multi-head, grouped-query, linear, local, and KV-cache variants
  • Token embedding via tiktoken
  • Absolute and sinusoidal positional embeddings
  • Normalization layers like LayerNorm and RMSNorm
  • Several feed-forward network variants with activations such as ReLU, GELU, SiLU, LeakyReLU, and Sigmoid
  • A simple GPT-style transformer model implementation
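To make the sinusoidal positional embeddings in the list above concrete, here is a minimal sketch of the standard fixed sine/cosine table. It is written in dependency-free Python for illustration and is not Stackformer's own PyTorch code; the function name `sinusoidal_positions` and its parameters are assumptions made for this example.

```python
import math


def sinusoidal_positions(seq_len: int, d_model: int) -> list[list[float]]:
    """Return a seq_len x d_model table of fixed sinusoidal position encodings.

    Hypothetical helper for illustration; not part of the Stackformer API.
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(0, d_model, 2):
            # Each sin/cos pair shares one frequency; frequencies shrink
            # geometrically as the dimension index i grows.
            angle = pos / (10000 ** (i / d_model))
            row.append(math.sin(angle))  # even dimension
            row.append(math.cos(angle))  # odd dimension
        table.append(row)
    return table


pe = sinusoidal_positions(seq_len=4, d_model=8)
```

Each row of `pe` would be added to the token embedding at the same position. Because the table depends only on the position and the model width, it needs no training.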

📁 Project Structure

stackformer/
|-- modules/
|   |-- tokenizer.py            # Token embedding using tiktoken
|   |-- position_embedding.py   # Absolute and sinusoidal embeddings
|   |-- Attention.py            # Attention mechanisms
|   |-- Normalization.py        # LayerNorm and RMSNorm
|   `-- Feed_forward.py         # Feed-forward layers with various activations
|-- models/
|   `-- GPT_2.py                # GPT-style transformer stack model
`-- trainer.py                  # Training loop and utilities


💻 Installation

✅ Method 1: Install from PyPI:

pip install Stackformer

Then import the package in Python to confirm the installation:

import stackformer

🔧 Method 2: Clone the repository:

git clone https://github.com/Gurumurthy30/Stackformer
cd Stackformer
pip install -e .
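After either installation method, one way to confirm that the package is visible to the current interpreter is to query its metadata with the standard library. This is a small sketch; the helper name `installed_version` is an assumption for illustration, and the distribution name is taken from the `pip install` command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "Stackformer") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for illustration; not part of the Stackformer API.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


print(installed_version() or "Stackformer is not installed")
```

With an editable install (Method 2), the reported version comes from the cloned repository's metadata, so it may differ from the latest release on PyPI.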

🚀 Future Plans

Currently, I am improving and optimizing the existing components while fixing known bugs and issues. Once the current modules are stable, I plan to add more advanced components such as Mixture of Experts (MoE) and mask handling, along with other essential transformer components. Eventually, I will expand the library with more comprehensive model architectures.

Download files

Download the file for your platform.

Source Distribution

stackformer-0.1.2.tar.gz (16.4 kB)

Uploaded Source

Built Distribution

stackformer-0.1.2-py3-none-any.whl (21.6 kB)

Uploaded Python 3

File details

Details for the file stackformer-0.1.2.tar.gz.

File metadata

  • Download URL: stackformer-0.1.2.tar.gz
  • Size: 16.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.10

File hashes

Hashes for stackformer-0.1.2.tar.gz

Algorithm    Hash digest
SHA256       7f37b11c5e6bf1c80920f77aa192f69f2cdd36063742baf5bdeae17a70182d33
MD5          8d4181286afeb82ec11bc23d3108e0fb
BLAKE2b-256  273fb1de85430490d62f004254757c4453e4beb1b219fe18cb13951e306924a2
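
The SHA256 digest listed for the source distribution can be checked against a downloaded copy with the standard library. This is a minimal sketch: the helper names `sha256_of` and `matches_expected` are assumptions for illustration, and the local file path in the usage line is hypothetical.

```python
import hashlib

# SHA256 digest published for stackformer-0.1.2.tar.gz (copied from this page).
EXPECTED_SHA256 = "7f37b11c5e6bf1c80920f77aa192f69f2cdd36063742baf5bdeae17a70182d33"


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Compare a file's SHA256 digest with the expected hex string."""
    return sha256_of(path) == expected.lower()


# Usage, assuming the archive was downloaded to the current directory:
# print(matches_expected("stackformer-0.1.2.tar.gz"))
```

Reading in chunks keeps memory use constant regardless of file size; for an archive this small, reading it in one call would work just as well.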

File details

Details for the file stackformer-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: stackformer-0.1.2-py3-none-any.whl
  • Size: 21.6 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.10

File hashes

Hashes for stackformer-0.1.2-py3-none-any.whl

Algorithm    Hash digest
SHA256       6e3c747b52bbce1cd037c7cdea3c9f21e67919517080874f3448011f7798e1ef
MD5          3c5c446de2b63232dec8fe9a4fe9d0a5
BLAKE2b-256  ce39ef87f982677c66a91eaff793d6c5ee05d013d348ec16485a4c66560d7b37
