Last released Aug 22, 2024
Accelerate transformer inference by 7.6-9x. Native to Huggingface and PyTorch.
Last released Jan 5, 2024
Deploy, scale, and monitor your ML models all with one click. Native to AWS.
Supported by