MSLK Library

MSLK (Meta Superintelligence Labs Kernels, formerly known as FBGEMM GenAI) is a collection of high-performance kernels and optimizations built on top of PyTorch primitives for GenAI training and inference.

Installation

# Install MSLK for CUDA
pip install mslk-cuda==1.0.0
# Install MSLK for ROCm
pip install mslk-rocm==1.0.0
# Install a nightly CUDA version
pip install --pre mslk --index-url https://download.pytorch.org/whl/nightly/cu128
# Install a nightly ROCm version
pip install --pre mslk --index-url https://download.pytorch.org/whl/nightly/rocm7.1/
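After installing, it can be useful to confirm which MSLK distribution actually landed in the environment. The sketch below uses only the standard library. The names `mslk-cuda` and `mslk-rocm` come from the commands above; `mslk-cuda-nightly` is an assumed name for the nightly distribution, and `installed_mslk` is a hypothetical helper, not part of MSLK.

```python
from importlib import metadata

# "mslk-cuda" and "mslk-rocm" appear in the install commands above.
# "mslk-cuda-nightly" is an assumption about how the nightly build is named.
CANDIDATES = ("mslk-cuda", "mslk-rocm", "mslk-cuda-nightly")


def installed_mslk(candidates=CANDIDATES):
    """Return {distribution name: version} for each candidate that is installed."""
    found = {}
    for name in candidates:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Not installed under this name; try the next candidate.
            continue
    return found


if __name__ == "__main__":
    print(installed_mslk() or "no MSLK distribution found")
```

An empty result means none of the candidate names is installed in the active environment.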

Release Compatibility Table

MSLK is released in accordance with the PyTorch release schedule. An MSLK release is not guaranteed to work with PyTorch releases older than the one it corresponds to.

| MSLK Release | Corresponding PyTorch Release | Supported Python Versions | Supported CUDA Versions | Supported CUDA Architectures | Supported ROCm Versions | Supported ROCm Architectures |
|---|---|---|---|---|---|---|
| 1.1.0 | 2.11.x | 3.10, 3.11, 3.12, 3.13, 3.14 | 12.6, 12.8, 12.9, 13.0 | 8.0, 9.0a, 10.0a, 12.0a | 7.0, 7.1 | gfx908, gfx90a, gfx942, gfx950 |
| 1.0.0 | 2.10.x | 3.10, 3.11, 3.12, 3.13, 3.14 | 12.6, 12.8, 12.9, 13.0 | 8.0, 9.0a, 10.0a, 12.0a | 7.1, 7.2 | gfx908, gfx90a, gfx942, gfx950 |
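The pairing rule can be turned into a quick pre-flight check. This is a minimal sketch: the release pairs are copied from the table, while `PYTORCH_FOR_MSLK` and `torch_is_older` are hypothetical names introduced here. It only flags the case the text warns about (a PyTorch older than the paired series) and makes no claim about newer PyTorch releases.

```python
# Release pairs copied from the compatibility table: MSLK release -> PyTorch (major, minor).
PYTORCH_FOR_MSLK = {
    "1.1.0": (2, 11),
    "1.0.0": (2, 10),
}


def torch_is_older(mslk_release, torch_version):
    """Return True if torch_version predates the series paired with mslk_release.

    torch_version is a string such as "2.10.1" or "2.10.1+cu128";
    any local suffix after "+" is ignored.
    """
    required = PYTORCH_FOR_MSLK[mslk_release]
    public = torch_version.split("+")[0]
    major, minor = (int(part) for part in public.split(".")[:2])
    return (major, minor) < required
```

With PyTorch installed, the version string would typically come from `torch.__version__`.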

Note that the supported CUDA/ROCm architectures refer to the compiled C++ kernels. In addition, some kernels (e.g. CUTLASS/CK) are specific to certain architectures. Python JIT DSL-based kernels (e.g. Triton) may work on a wider variety of architectures.
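To see whether a given GPU has an exactly matching compiled target, the architecture strings from the table can be parsed and compared against the device's compute capability. The sketch below is deliberately conservative and rests on assumptions: `parse_arch` and `has_exact_build` are hypothetical helpers, and only exact matches count. It does not model CUDA's binary compatibility rules, under which a device may still run kernels built for an earlier minor version of the same major architecture.

```python
# CUDA architectures listed in the compatibility table.
CUDA_ARCHS = ["8.0", "9.0a", "10.0a", "12.0a"]


def parse_arch(arch):
    """Turn "9.0a" into ((9, 0), True); the bool marks the trailing "a" suffix."""
    specific = arch.endswith("a")
    major, minor = arch.rstrip("a").split(".")
    return (int(major), int(minor)), specific


def has_exact_build(capability, archs=CUDA_ARCHS):
    """Return True if (major, minor) exactly matches one of the listed architectures."""
    targets = {parse_arch(a)[0] for a in archs}
    return tuple(capability) in targets
```

With PyTorch and a CUDA device available, the capability tuple can be obtained from `torch.cuda.get_device_capability()` and passed straight to `has_exact_build`.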

Running Benchmarks

python bench/gemm/gemm_bench.py --M 4096 --N 4096 --K 4096
python bench/quantize/quantize_bench.py --M 4096 --K 4096
python bench/conv/conv_bench.py
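When reading GEMM timings for shapes like the one above, it helps to convert a measured time into throughput. The output format of the benchmark scripts is not described here, so this sketch only covers the arithmetic: an (M x K) by (K x N) matrix multiply performs one multiply and one add per term, giving 2·M·N·K floating-point operations. The function names are illustrative.

```python
def gemm_flops(m, n, k):
    """Floating-point operations for one (m x k) @ (k x n) matrix multiply."""
    # Each of the m * n outputs needs k multiplies and k adds.
    return 2 * m * n * k


def tflops(m, n, k, seconds):
    """Throughput in TFLOP/s for a GEMM that took `seconds` to run."""
    return gemm_flops(m, n, k) / seconds / 1e12


if __name__ == "__main__":
    # Shape from the gemm_bench.py command; the 1 ms timing is a made-up example.
    print(gemm_flops(4096, 4096, 4096))      # 137438953472
    print(tflops(4096, 4096, 4096, 1e-3))
```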

Running Tests

pytest test/gemm/gemm_test.py
pytest test/quantize/fp8_quantize_correctness_test.py
pytest test/conv/conv_test.py

Build From Source

We only support building on Linux. See the release compatibility table above for supported versions of Python, CUDA, and ROCm.

# Clone repo
git clone https://github.com/meta-pytorch/MSLK
cd MSLK
git submodule sync
git submodule update --init --recursive
# Build and install
# The script will create a conda environment and install the required dependencies.
# The conda environment will look something like: build-py3.14-torchnightly-cuda12.9.1
./ci/integration/mslk_oss_build.bash
# After the initial environment setup, you can activate the environment and iterate faster:
conda activate build-py3.14-torchnightly-cuda12.9.1
python setup.py install

Python-Only Build

If you don't need the C++/CUDA kernels (e.g. for testing Python-only changes), you can install MSLK in Python-only mode by setting the MSLK_PYTHON_ONLY environment variable. This skips the C++/CUDA compilation entirely.

MSLK_PYTHON_ONLY=1 pip install -e .
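For illustration, a build script could gate native compilation on this variable roughly as follows. This is a hypothetical sketch, not MSLK's actual build logic: only the value `1` is shown in this document, and the other accepted spellings below, along with the helper name `python_only`, are assumptions.

```python
import os

# Assumed set of "enabled" spellings; only "1" is documented above.
_TRUTHY = {"1", "true", "yes", "on"}


def python_only(environ=os.environ):
    """Return True if MSLK_PYTHON_ONLY is set to a truthy value."""
    value = environ.get("MSLK_PYTHON_ONLY", "")
    return value.strip().lower() in _TRUTHY


if __name__ == "__main__":
    if python_only():
        print("Python-only mode: skipping C++/CUDA compilation")
    else:
        print("Full build: C++/CUDA kernels would be compiled")
```

Passing the environment as a parameter keeps the check easy to test without modifying the real process environment.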

Join the MSLK community

For questions, support, news updates, or feature requests, please feel free to:

For contributions, please see the CONTRIBUTING file for ways to help out.

License

MSLK is BSD licensed, as found in the LICENSE file.
