A fast, yet specialized, RMSNorm/LayerNorm implementation

# faster-norm

This library is under development. Currently, only some special cases are supported, and the performance is not yet fully optimized.

  • [x] RMSNorm
  • [ ] LayerNorm
  • [x] Float16 and BFloat16
  • [ ] More data types
  • [x] More shapes
  • [x] Optimize for no wgrad (the case where the weight gradient is not needed)
  • [ ] Performance tuning
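
For readers who want a baseline to compare against, here is a minimal plain-Python sketch of what the two operations in the checklist compute on a single vector. It is only a reference definition, not this library's API (which is not described here). The names `rms_norm` and `layer_norm`, the `eps` default, and the list-based inputs are all assumptions made for illustration.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Reference RMSNorm: y_i = x_i / sqrt(mean(x^2) + eps) * weight_i.

    Hypothetical helper for illustration; not part of faster-norm.
    """
    if len(x) != len(weight):
        raise ValueError("x and weight must have the same length")
    # Mean of squares over the normalized (last) dimension.
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def layer_norm(x, weight, bias, eps=1e-6):
    """Reference LayerNorm: y_i = (x_i - mean) / sqrt(var + eps) * weight_i + bias_i.

    Hypothetical helper for illustration; not part of faster-norm.
    """
    if not (len(x) == len(weight) == len(bias)):
        raise ValueError("x, weight and bias must have the same length")
    mean = sum(x) / len(x)
    # Biased (population) variance, as is conventional for LayerNorm.
    var = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std * w + b for v, w, b in zip(x, weight, bias)]


if __name__ == "__main__":
    print(rms_norm([3.0, 4.0], [1.0, 1.0]))
    print(layer_norm([1.0, 2.0, 3.0], [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]))
```

RMSNorm skips the mean subtraction and the bias term, which is why it is the cheaper of the two. A half-precision kernel would typically accumulate the sum of squares in float32, but that detail is an assumption about common practice, not a statement about how this library works.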

## Statement

This work was independently completed by me at home using my personal RTX 3080.

Source distribution: faster_norm-0.2.2.tar.gz

  • Size: 6.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.11.8

Hashes for faster_norm-0.2.2.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | 2f7fb82d00037f987a3e8408ecadef5a87a6c28ac3ff24c89b4d0ae5add9c00e |
| MD5 | d75b7646281b1360bf91fb6572b99f49 |
| BLAKE2b-256 | a9197800ade5c03b435a43401f51624b74640038e4b4faef4b89e421107514b9 |
