differential-transformer - PyTorch
Differential Transformer
An open-source community implementation of the model from the paper "Differential Transformer" by Microsoft Research (paper: https://arxiv.org/abs/2410.05258).
Install
$ pip3 install differential-transformer
Usage
import torch
from loguru import logger

from differential_transformer.main import DifferentialTransformer

# Example dimensions
batch_size = 32
seq_len = 128
embedding_dim = 64
heads = 8
λ = 0.1
λinit = 0.05

# Random batch of token indices
x = torch.randint(0, 256, (batch_size, seq_len))

# Instantiate the model and run a forward pass
model = DifferentialTransformer(heads=heads, dim=embedding_dim, λinit=λinit)
output = model(x, λ=λ)
logger.info(f"Output shape: {output.shape}")
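For orientation, the core operation behind the model is the differential attention described in the paper: two softmax attention maps are computed from two sets of queries and keys, and the second map, scaled by λ, is subtracted from the first before being applied to the values. The snippet below is a minimal single-head sketch of that computation, not this package's internal implementation; all names and shapes are illustrative.

import torch
import torch.nn.functional as F

def diff_attention(q1, k1, q2, k2, v, lam):
    # (softmax(q1 k1ᵀ / √d) - lam · softmax(q2 k2ᵀ / √d)) v
    d = q1.size(-1)
    a1 = F.softmax(q1 @ k1.transpose(-2, -1) / d**0.5, dim=-1)
    a2 = F.softmax(q2 @ k2.transpose(-2, -1) / d**0.5, dim=-1)
    return (a1 - lam * a2) @ v

# Toy shapes: batch 2, sequence 16, per-head dim 32 (illustrative only)
b, n, d = 2, 16, 32
q1, k1, q2, k2 = (torch.randn(b, n, d) for _ in range(4))
v = torch.randn(b, n, 2 * d)
out = diff_attention(q1, k1, q2, k2, v, lam=0.1)
print(out.shape)  # torch.Size([2, 16, 64])

In the paper, λ is a learnable scalar reparameterized around a constant λinit; in the usage example above it is simply supplied as a number at call time.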
License
MIT
Citation
@misc{ye2024differentialtransformer,
title={Differential Transformer},
author={Tianzhu Ye and Li Dong and Yuqing Xia and Yutao Sun and Yi Zhu and Gao Huang and Furu Wei},
year={2024},
eprint={2410.05258},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2410.05258},
}
Download files
Source Distribution
Hashes for differential_transformer-0.0.2.tar.gz
Algorithm | Hash digest
---|---
SHA256 | 4db1c21f9552b144e7ebd6e6a1e6333d16b50206c8c34e7ced7d883f137ef642
MD5 | 43f568ccb6cca1e860dcd5331d08e847
BLAKE2b-256 | ea8ea5cd95fff78a6ef194b984f76f33fbc25a4ac4f770fe481e305cf117e717
Built Distribution
Hashes for differential_transformer-0.0.2-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | e4735f04cbce43fc787f1ac6922c63c17cc97239ac1849f01a14c17288ffdff1
MD5 | 024df32061e3b5044d36048f47d89bde
BLAKE2b-256 | fa3570242dc164c238c9cbdbf219780bbb590b1d1d073c8155a78c745e7cb846
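To check a downloaded archive against the digests above, you can recompute the hash locally with Python's standard hashlib module. A minimal sketch, assuming the source distribution has been downloaded to the current directory:

import hashlib

# SHA256 digest for differential_transformer-0.0.2.tar.gz from the table above
expected = "4db1c21f9552b144e7ebd6e6a1e6333d16b50206c8c34e7ced7d883f137ef642"
with open("differential_transformer-0.0.2.tar.gz", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()
print("OK" if digest == expected else "hash mismatch")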