MSLK Library

MSLK (Meta Superintelligence Labs Kernels, formerly known as FBGEMM GenAI) is a collection of high-performance kernels and optimizations built on top of PyTorch primitives for GenAI training and inference.

Installation

# Install MSLK for CUDA
pip install mslk-cuda==1.0.0
# Install MSLK for ROCm
pip install mslk-rocm==1.0.0
# Install a nightly CUDA version
pip install --pre mslk --index-url https://download.pytorch.org/whl/nightly/cu128
# Install a nightly ROCm version
pip install --pre mslk --index-url https://download.pytorch.org/whl/nightly/rocm7.1/
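
After installing, it can be useful to confirm which MSLK distribution ended up in the active environment. The sketch below uses only the standard library and the distribution names from the pip commands above (`mslk-cuda`, `mslk-rocm`, and `mslk` for nightlies). The helper name `installed_mslk` is hypothetical and not part of MSLK.

```python
from importlib import metadata

# Distribution names as they appear in the pip commands above.
CANDIDATES = ("mslk-cuda", "mslk-rocm", "mslk")


def installed_mslk(candidates=CANDIDATES):
    """Return {distribution name: version} for each candidate that is installed."""
    found = {}
    for name in candidates:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # nothing installed under this name
    return found


if __name__ == "__main__":
    found = installed_mslk()
    if found:
        for name, version in found.items():
            print(f"{name} {version}")
    else:
        print("No MSLK distribution found in this environment")
```

An empty result usually means the install targeted a different Python environment than the one running the script.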

Release Compatibility Table

MSLK is released in accordance with the PyTorch release schedule. An MSLK release is not guaranteed to work with PyTorch releases older than the one it corresponds to.

| MSLK Release | Corresponding PyTorch Release | Supported Python Versions | Supported CUDA Versions | Supported CUDA Architectures | Supported ROCm Versions | Supported ROCm Architectures |
| --- | --- | --- | --- | --- | --- | --- |
| 1.1.0 | 2.11.x | 3.10, 3.11, 3.12, 3.13, 3.14 | 12.6, 12.8, 12.9, 13.0 | 8.0, 9.0a, 10.0a, 12.0a | 7.0, 7.1 | gfx908, gfx90a, gfx942, gfx950 |
| 1.0.0 | 2.10.x | 3.10, 3.11, 3.12, 3.13, 3.14 | 12.6, 12.8, 12.9, 13.0 | 8.0, 9.0a, 10.0a, 12.0a | 7.1, 7.2 | gfx908, gfx90a, gfx942, gfx950 |

Note that the supported CUDA/ROCm architectures refer to the compiled C++ kernels. In addition, some kernels (e.g. CUTLASS/CK) are specific to certain architectures. Kernels written in Python JIT DSLs (e.g. Triton) may work on a wider variety of architectures.
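
The table can also be checked programmatically before installing. The following is a minimal sketch that copies the table rows into a dictionary. The names `COMPAT`, `torch_is_new_enough`, `python_is_listed`, and `cuda_arch_is_listed` are hypothetical and not part of MSLK. It does an exact lookup against the listed architectures, ignoring the `a` suffix, and does not model binary compatibility between compute capabilities.

```python
# Rows copied from the release compatibility table above.
_PYTHONS = {(3, 10), (3, 11), (3, 12), (3, 13), (3, 14)}
_CUDA_ARCHS = ("8.0", "9.0a", "10.0a", "12.0a")
_ROCM_ARCHS = ("gfx908", "gfx90a", "gfx942", "gfx950")

COMPAT = {
    "1.1.0": {"torch": "2.11", "python": _PYTHONS,
              "cuda_arch": _CUDA_ARCHS, "rocm_arch": _ROCM_ARCHS},
    "1.0.0": {"torch": "2.10", "python": _PYTHONS,
              "cuda_arch": _CUDA_ARCHS, "rocm_arch": _ROCM_ARCHS},
}


def _release_series(version):
    """Reduce a version string such as '2.10.1+cu128' to (2, 10)."""
    major, minor = version.split("+")[0].split(".")[:2]
    return int(major), int(minor)


def torch_is_new_enough(mslk_release, torch_version):
    """True if torch_version is not older than the corresponding PyTorch series."""
    required = _release_series(COMPAT[mslk_release]["torch"])
    return _release_series(torch_version) >= required


def python_is_listed(mslk_release, python_version):
    """True if a (major, minor) Python version appears in the table row."""
    return tuple(python_version[:2]) in COMPAT[mslk_release]["python"]


def cuda_arch_is_listed(mslk_release, capability):
    """True if a (major, minor) compute capability appears in the table row."""
    wanted = f"{capability[0]}.{capability[1]}"
    # Drop the architecture-specific 'a' suffix before comparing.
    listed = {arch.rstrip("a") for arch in COMPAT[mslk_release]["cuda_arch"]}
    return wanted in listed
```

With PyTorch installed, `torch.__version__` and `torch.cuda.get_device_capability()` can supply the inputs, and `sys.version_info` can be passed to `python_is_listed`.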

Running Benchmarks

python bench/gemm/gemm_bench.py --M 4096 --N 4096 --K 4096
python bench/quantize/quantize_bench.py --M 4096 --K 4096
python bench/conv/conv_bench.py
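
To benchmark more than one shape, the GEMM command above can be generated for a sweep of sizes. This is a sketch that assumes only the script path and the `--M`, `--N`, and `--K` flags shown above. The helper names `gemm_bench_commands` and `gemm_flops` are hypothetical.

```python
import itertools
import sys


def gemm_bench_commands(sizes, script="bench/gemm/gemm_bench.py"):
    """Build one argv list per (M, N, K) combination drawn from sizes."""
    commands = []
    for m, n, k in itertools.product(sizes, repeat=3):
        commands.append(
            [sys.executable, script, "--M", str(m), "--N", str(n), "--K", str(k)]
        )
    return commands


def gemm_flops(m, n, k):
    """Floating-point operations in one (M x K) by (K x N) matmul.

    Each of the M * N outputs needs K multiplies and K adds.
    """
    return 2 * m * n * k
```

Each command can be passed to `subprocess.run(cmd, check=True)` from the repository root. Dividing `gemm_flops(M, N, K)` by a measured runtime in seconds gives achieved FLOP/s for that shape.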

Running Tests

pytest test/gemm/gemm_test.py
pytest test/quantize/fp8_quantize_correctness_test.py
pytest test/conv/conv_test.py

Build From Source

We only support building on Linux. See the release compatibility table above for supported versions of Python, CUDA, and ROCm.

# Clone repo
git clone https://github.com/meta-pytorch/MSLK
cd MSLK
git submodule sync
git submodule update --init --recursive
# Build and install
# The script will create a conda environment and install the required dependencies.
# The conda environment will look something like: build-py3.14-torchnightly-cuda12.9.1
./ci/integration/mslk_oss_build.bash
# After the initial environment setup, you can activate the environment and iterate faster:
conda activate build-py3.14-torchnightly-cuda12.9.1
python setup.py install

Python-Only Build

If you don't need the C++/CUDA kernels (e.g. for testing Python-only changes), you can install MSLK in Python-only mode by setting the MSLK_PYTHON_ONLY environment variable. This skips the C++/CUDA compilation entirely.

MSLK_PYTHON_ONLY=1 pip install -e .
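
When the install is driven from a script (for example in CI), the same environment variable can be set on the child process only. This is a minimal sketch of the command above. The function names are hypothetical, and it assumes it is run against a checkout of the repository.

```python
import os
import subprocess
import sys


def python_only_install_env(base_env=None):
    """Return a copy of the environment with MSLK_PYTHON_ONLY=1 set."""
    env = dict(os.environ if base_env is None else base_env)
    env["MSLK_PYTHON_ONLY"] = "1"
    return env


def python_only_install(repo_dir="."):
    """Run `pip install -e .` in repo_dir with the Python-only flag set."""
    cmd = [sys.executable, "-m", "pip", "install", "-e", "."]
    return subprocess.run(
        cmd, cwd=repo_dir, env=python_only_install_env(), check=True
    )
```

Copying the environment keeps the flag scoped to the pip process, so later commands in the same script are unaffected.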

Join the MSLK community

For questions, support, news updates, or feature requests, please feel free to:

For contributions, please see the CONTRIBUTING file for ways to help out.

License

MSLK is BSD licensed, as found in the LICENSE file.
