A fast, yet specialized, RMSNorm/LayerNorm implementation

Reason this release was yanked:

Replaced by faster-norm.

Project description

# fast-norm-cuda

This library is under development. Currently, only some special cases are supported, and the performance is not yet fully optimized.

- [x] RMSNorm
- [ ] LayerNorm
- [ ] More shapes
- [ ] Performance tuning
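
For readers unfamiliar with the two operations named in the checklist, here is a minimal pure-Python sketch of the standard RMSNorm and LayerNorm definitions over a single row. It is not this library's API and says nothing about how its CUDA kernel is written; the function names `rms_norm_reference` and `layer_norm_reference`, the `eps` defaults, and the list-based interface are all assumptions made for illustration.

```python
import math


def rms_norm_reference(x, weight, eps=1e-6):
    """Textbook RMSNorm for one row (hypothetical helper, not the library API).

    y_i = x_i / sqrt(mean(x^2) + eps) * weight_i
    """
    if len(x) != len(weight):
        raise ValueError("x and weight must have the same length")
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def layer_norm_reference(x, weight, bias, eps=1e-5):
    """Textbook LayerNorm for one row (hypothetical helper, not the library API).

    y_i = (x_i - mean) / sqrt(var + eps) * weight_i + bias_i,
    where var is the biased (population) variance.
    """
    if not (len(x) == len(weight) == len(bias)):
        raise ValueError("x, weight and bias must have the same length")
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std * w + b for v, w, b in zip(x, weight, bias)]
```

RMSNorm skips the mean subtraction and the bias term, so it needs only one reduction per row. A slow reference like this can serve as a correctness baseline when comparing against a fused kernel's output.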

## Statement

This work was independently completed by me at home using my personal RTX 3080.

Download files

Source Distribution

fast_norm_cuda-0.1.0.tar.gz (5.2 kB)

Uploaded Source

File details

Details for the file fast_norm_cuda-0.1.0.tar.gz.

File metadata

  • Download URL: fast_norm_cuda-0.1.0.tar.gz
  • Size: 5.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.11.8

File hashes

Hashes for fast_norm_cuda-0.1.0.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | `17fbbf4fcc5b2a462b109021bd68321d23203588319b120e72e32d7fa53c4ee8` |
| MD5 | `ba9578b95fedd0061ce0ab59543d31e4` |
| BLAKE2b-256 | `a873fcd82ceed511f286099736c0be5d56944b33cb684917b8686a49b44b23aa` |
