This package provides a builder-like API for creating flexible transformer models with PyTorch.
This project is incomplete!
Transformer Builder - Create Custom Transformer Models with Ease
Transformers have become a popular choice for a wide range of Natural Language Processing (NLP) and deep learning tasks. The Transformer Builder package allows you to create custom transformer models with ease, providing flexibility and modularity for your deep learning projects.
Features
- Build custom transformer models with a user-friendly and flexible interface.
- Configurable encoder and decoder blocks with support for custom self-attention mechanisms.
- Encapsulated self-attention blocks that adapt to your specific use case.
- Create encoder and decoder blocks for a wide range of NLP tasks.
- Open-source and customizable to fit your project's requirements.
Installation
You can install Transformer Builder using pip:
pip install transformer-builder
Usage
Here's an example of how to use Transformer Builder to create a custom model:
import torch
from torch import nn

from transformer_builder import TransformerBuilder, SelfAttentionBlock

vocabulary_size = 64_000
max_sequence_length = 1024
embedding_dimension = 100

builder = TransformerBuilder(
    vocabulary_size=vocabulary_size,
    max_sequence_length=max_sequence_length,
    embedding_dimension=embedding_dimension,
    default_decoder_block=None,
    default_encoder_block=None,
    default_positional_encoding=PositionalEncoding,  # your positional-encoding class (not defined in this snippet)
)

gpt_model = (
    builder
    .add_embedding(nn.Embedding(vocabulary_size, embedding_dimension))
    .add_positional_encoding(
        nn.Parameter(torch.zeros(1, max_sequence_length, embedding_dimension))
    )
    .add_decoder_block(
        nn.Linear(embedding_dimension, embedding_dimension),
        # You can use a specific implementation of the self-attention block
        # via the EncoderSelfAttentionBlock and DecoderSelfAttentionBlock classes.
        SelfAttentionBlock(  # Uses polymorphism to create different implementations of the self-attention block.
            before=nn.Sequential(
                nn.Linear(embedding_dimension, embedding_dimension * 4),
                nn.Linear(embedding_dimension * 4, embedding_dimension),
            ),
            k=nn.Sequential(
                nn.Linear(embedding_dimension, embedding_dimension * 2),
                nn.Linear(embedding_dimension * 2, embedding_dimension),
            ),
            q=builder.decoder_block(),  # You can even pass a whole decoder block into the self-attention block!
            # The standard decoder_block implementation is copied from GPT-1.
            # You can change that to your needs by monkey-patching TransformerBuilder or subclassing it.
            # The default value for each of q, k, and v is nn.Linear(embedding_dimension, embedding_dimension).
            # The default value for `before` and `after` is None, which leaves the architecture unaffected.
            count=10,  # Note that the number of heads should be a divisor of the embedding dimension.
        ),
        count=10,
    )
)
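Once the chain of add_* calls is complete, the resulting gpt_model can be used for inference or training. The snippet below is a minimal sketch that assumes the built model behaves like a standard nn.Module mapping token indices to per-token outputs; the exact call signature and output shape are assumptions and may differ in this early release.

import torch

# Hypothetical forward pass, assuming gpt_model is a regular nn.Module
# that accepts a batch of token indices.
tokens = torch.randint(0, vocabulary_size, (2, max_sequence_length))  # (batch, sequence)
outputs = gpt_model(tokens)
print(outputs.shape)  # e.g. (2, max_sequence_length, embedding_dimension), depending on the final block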
Customization
With Transformer Builder, you can customize each aspect of your blocks individually, allowing fine-grained control over your model's architecture. The example above demonstrates how to configure the self-attention block and the linear layers around it.
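As a sketch of what deeper customization could look like, the decoder_block method used in the example above could be overridden in a subclass to replace the default block. The method name is taken from the usage example; the attribute self.embedding_dimension and the expectation that the method returns an nn.Module are assumptions about this early release.

from torch import nn

from transformer_builder import TransformerBuilder


class MyTransformerBuilder(TransformerBuilder):
    # Hypothetical override: swap the default (GPT-1-style) decoder block
    # for a single linear layer. `self.embedding_dimension` is assumed to be
    # stored on the builder from its constructor arguments.
    def decoder_block(self):
        return nn.Linear(self.embedding_dimension, self.embedding_dimension)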
Contributing
If you would like to contribute to this project, please follow our contribution guidelines.
Support and Feedback
If you have questions, encounter issues, or have feedback, please open an issue on our GitHub repository.
Acknowledgments
This project was inspired by the need for a flexible and customizable API for creating decoder blocks in deep learning models.
Author
License
This project is licensed under the MIT License. See the LICENSE file for details.