# faster-norm

A fast, yet specialized, RMSNorm/LayerNorm implementation

This library is under development. Currently, only some special cases are supported, and the performance is not yet fully optimized.

- [x] RMSNorm
- [x] LayerNorm
- [x] Float16 and BFloat16
- [ ] More data types
- [x] More shapes
- [x] Optimize for no wgrad
- [ ] Performance tuning
- [ ] Optimize compilation time
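For readers unfamiliar with the two operations named in the list above, here is a minimal plain-Python sketch of what RMSNorm and LayerNorm compute on a single row. It is only a reference for the math: the function names `rms_norm` and `layer_norm`, their parameters, and the `eps` defaults are assumptions made for illustration, not the interface of faster-norm, and the code says nothing about how the library implements or optimizes these operations.

```python
import math
from typing import List, Optional


def rms_norm(x: List[float],
             weight: Optional[List[float]] = None,
             eps: float = 1e-6) -> List[float]:
    """Reference RMSNorm for one row: x / sqrt(mean(x^2) + eps) * weight.

    Hypothetical helper for illustration; not faster-norm's API.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    if weight is None:
        weight = [1.0] * len(x)  # identity scale when no weight is given
    return [v * inv_rms * w for v, w in zip(x, weight)]


def layer_norm(x: List[float],
               weight: Optional[List[float]] = None,
               bias: Optional[List[float]] = None,
               eps: float = 1e-5) -> List[float]:
    """Reference LayerNorm for one row: (x - mean) / sqrt(var + eps) * weight + bias.

    Hypothetical helper for illustration; not faster-norm's API.
    """
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n  # biased (population) variance
    inv_std = 1.0 / math.sqrt(var + eps)
    if weight is None:
        weight = [1.0] * n  # identity scale
    if bias is None:
        bias = [0.0] * n  # zero shift
    return [(v - mean) * inv_std * w + b for v, w, b in zip(x, weight, bias)]


row = [1.0, 2.0, 3.0, 4.0]
rms_out = rms_norm(row)
ln_out = layer_norm(row)
```

With the default identity weight, `rms_out` has a mean square close to 1, and `ln_out` has a mean close to 0 and a variance close to 1. RMSNorm skips the mean subtraction and the bias term, which is why it is the cheaper of the two.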

## Statement

This work was independently completed by me at home using my personal RTX 3080.
