fastertransformer: FasterTransformer TensorFlow op
Project description
https://github.com/NVIDIA/FasterTransformer
libtf_bert.so built for Linux
In NLP, the encoder and decoder are two important components, and the transformer layer has become a popular architecture for both.
FasterTransformer implements a highly optimized transformer layer for both the encoder and decoder for inference. On Volta, Turing, and Ampere GPUs,
the computing power of Tensor Cores is used automatically when the precision of the data and weights is FP16.
FasterTransformer is built on top of CUDA, cuBLAS, cuBLASLt, and C++. We provide APIs for the following frameworks: TensorFlow, PyTorch, and the Triton backend.
Users can integrate FasterTransformer into these frameworks directly.
For supported frameworks, we also provide example code that demonstrates usage and shows performance.
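This wheel ships the TensorFlow op library (libtf_bert.so). Below is a minimal sketch, assuming the shared object is installed inside the `fastertransformer` package directory; the exact package layout and the op names exposed by the library are assumptions for illustration, not the package's documented API.

```python
# Minimal sketch: loading the bundled FasterTransformer TF op library.
# Assumptions (not confirmed by this page): the installed package is importable
# as `fastertransformer` and places `libtf_bert.so` next to its __init__.py.
import os

import tensorflow as tf
import fastertransformer  # the installed wheel

# Locate the shared object inside the installed package (assumed layout).
lib_path = os.path.join(os.path.dirname(fastertransformer.__file__), "libtf_bert.so")

# tf.load_op_library is the standard TensorFlow API for loading custom ops;
# it returns a module exposing whatever ops the .so registers.
ft_ops = tf.load_op_library(lib_path)

# Inspect the op names the library actually provides (names are defined by the
# .so itself and are not documented here).
print([name for name in dir(ft_ops) if not name.startswith("_")])
```

Once loaded, the ops can be called like any other TensorFlow op; see the example code in the FasterTransformer repository for the full encoder usage.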
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distributions
No source distribution files available for this release. See the tutorial on generating distribution archives.
Built Distribution
File details
Details for the file fastertransformer-5.0.0.116-py3-none-any.whl.
File metadata
- Download URL: fastertransformer-5.0.0.116-py3-none-any.whl
- Upload date:
- Size: 16.9 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.0 CPython/3.8.13
File hashes
Algorithm | Hash digest
---|---
SHA256 | 79f640d3fb87afdd5302a8bd8c56e48b25b761a780dc70df5eff1a37a92ec5c8
MD5 | 7dcf3ec5702947db3d0e02442751bece
BLAKE2b-256 | 88132dff7edc74527b5cf897ee251a3d8fce27477ac9f2b6fa605d842cdb8962