SwitchTransformers - PyTorch
Switch Transformers
Implementation of Switch Transformers from the paper "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity" in PyTorch, Einops, and Zeta. Paper: https://arxiv.org/abs/2101.03961
Installation
pip install switch-transformers
Usage
import torch
from switch_transformers import SwitchTransformer

# Random batch of token IDs with shape (1, 10) and values in [0, 100)
x = torch.randint(0, 100, (1, 10))

# Create an instance of the SwitchTransformer model
# num_tokens: vocabulary size (must cover every token ID in the input)
# dim: model (embedding) dimensionality
# heads: number of attention heads
# dim_head: dimensionality of each attention head
model = SwitchTransformer(
    num_tokens=100, dim=512, heads=8, dim_head=64
)

# Forward pass: one score per vocabulary token at each position
out = model(x)
print(out.shape)  # (batch, seq_len, num_tokens)
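The core idea of the paper is switch routing: each token is sent to exactly one feed-forward "expert", chosen by a learned gate, so parameter count grows with the number of experts while per-token compute stays roughly constant. The sketch below illustrates that top-1 routing in plain PyTorch; the class and argument names are illustrative and are not the switch_transformers package API.

import torch
import torch.nn as nn

class SwitchFeedForward(nn.Module):
    # Top-1 (switch) routing over a set of expert FFNs.
    # Illustrative sketch only; not the switch_transformers internals.
    def __init__(self, dim, num_experts, mult=4):
        super().__init__()
        self.gate = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(dim, dim * mult),
                nn.GELU(),
                nn.Linear(dim * mult, dim),
            )
            for _ in range(num_experts)
        )

    def forward(self, x):
        # x: (batch, seq, dim); route each token independently
        b, s, d = x.shape
        tokens = x.reshape(-1, d)
        probs = self.gate(tokens).softmax(dim=-1)   # router probabilities
        gate_vals, expert_idx = probs.max(dim=-1)   # top-1 expert per token
        out = torch.zeros_like(tokens)
        for i, expert in enumerate(self.experts):
            mask = expert_idx == i
            if mask.any():
                # each expert processes only its own tokens; the output is
                # scaled by the router probability, as described in the paper
                out[mask] = gate_vals[mask].unsqueeze(-1) * expert(tokens[mask])
        return out.reshape(b, s, d)

ff = SwitchFeedForward(dim=512, num_experts=8)
print(ff(torch.randn(1, 10, 512)).shape)  # torch.Size([1, 10, 512])

The paper additionally trains with an auxiliary load-balancing loss that encourages tokens to spread evenly across experts; that is omitted here for brevity.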
Citation
@misc{fedus2022switch,
    title={Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity},
    author={William Fedus and Barret Zoph and Noam Shazeer},
    year={2022},
    eprint={2101.03961},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
License
MIT
Hashes for switch_transformers-0.0.4.tar.gz (source distribution)
Algorithm | Hash digest
---|---
SHA256 | e80972012db0ac1f73d922b31f5d0a05af4402a0859ca6bf03279ecf64f536fc
MD5 | 7f3ef0705da38cbc8dd24822f4ae6e94
BLAKE2b-256 | 93549467b85567de5db2a86f0e2884361a8bc5acf6763873354c2b5398c55847
Hashes for switch_transformers-0.0.4-py3-none-any.whl (built distribution)
Algorithm | Hash digest
---|---
SHA256 | e57f21e358197e6e8347fb59703104b7a1c42bf2ccded4d95475614e145f7d1b
MD5 | 213580cb65ce06649b86f7c091c1e972
BLAKE2b-256 | b8c6ea0db98bfc80cc07851dea77af3124a28e0cbede49375129362e8a5d714d