MSLK Library

MSLK (Meta Superintelligence Labs Kernels, formerly known as FBGEMM GenAI) is a collection of high-performance kernels and optimizations built on top of PyTorch primitives for GenAI training and inference.

Installation

# Install MSLK for CUDA
pip install mslk-cuda==1.0.0
# Install MSLK for ROCm
pip install mslk-rocm==1.0.0
# Install a nightly CUDA version
pip install --pre mslk --index-url https://download.pytorch.org/whl/nightly/cu128
# Install a nightly ROCm version
pip install --pre mslk --index-url https://download.pytorch.org/whl/nightly/rocm7.1/
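After installing, it can be useful to confirm which MSLK distribution actually landed in the environment. The sketch below uses only the Python standard library. The candidate names are taken from the install commands above and from the nightly wheel file names; which of them a given index serves is an assumption, and the helper name `installed_mslk` is hypothetical.

```python
from importlib import metadata

# Distribution names seen in the install commands and nightly wheel names.
# Which of them a given package index actually serves is an assumption.
CANDIDATES = ("mslk-cuda", "mslk-rocm", "mslk", "mslk-cuda-nightly")


def installed_mslk(candidates=CANDIDATES):
    """Return {distribution name: version} for every candidate that is installed."""
    found = {}
    for name in candidates:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Not installed under this name; try the next one.
            continue
    return found


if __name__ == "__main__":
    print(installed_mslk() or "no MSLK distribution found")
```

An empty result usually means the install went into a different environment than the one running the check.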

Release Compatibility Table

MSLK releases follow the PyTorch release schedule. A given MSLK release is not guaranteed to work with PyTorch releases older than the one it corresponds to.

| MSLK Release | Corresponding PyTorch Release | Supported Python Versions | Supported CUDA Versions | Supported CUDA Architectures | Supported ROCm Versions | Supported ROCm Architectures |
| --- | --- | --- | --- | --- | --- | --- |
| 1.1.0 | 2.11.x | 3.10, 3.11, 3.12, 3.13, 3.14 | 12.6, 12.8, 12.9, 13.0 | 8.0, 9.0a, 10.0a, 12.0a | 7.0, 7.1 | gfx908, gfx90a, gfx942, gfx950 |
| 1.0.0 | 2.10.x | 3.10, 3.11, 3.12, 3.13, 3.14 | 12.6, 12.8, 12.9, 13.0 | 8.0, 9.0a, 10.0a, 12.0a | 7.1, 7.2 | gfx908, gfx90a, gfx942, gfx950 |

Note that the supported CUDA/ROCm architectures listed above apply to the compiled C++ kernels. Some kernels (e.g. CUTLASS/CK) are specific to certain architectures. Kernels written in Python JIT DSLs (e.g. Triton) may work on a wider variety of architectures.
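The pairing between MSLK and PyTorch releases can be checked programmatically before installing. The following is a minimal sketch, not part of MSLK: the `COMPAT` mapping is copied from the table above, and the function name `torch_matches` is hypothetical. It treats an exact major.minor match as compatible, which is the conservative reading of the table.

```python
# MSLK release -> PyTorch release series, copied from the compatibility table.
COMPAT = {
    "1.1.0": "2.11",
    "1.0.0": "2.10",
}


def torch_matches(mslk_release, torch_version):
    """True if torch_version falls in the series listed for mslk_release."""
    series = COMPAT.get(mslk_release)
    if series is None:
        raise KeyError(f"unknown MSLK release: {mslk_release}")
    # Compare only major.minor, ignoring the patch level and any local
    # version tag such as '+cu128'.
    public = torch_version.split("+")[0]
    major_minor = ".".join(public.split(".")[:2])
    return major_minor == series
```

For example, `torch_matches("1.0.0", "2.10.0+cu128")` is true, while `torch_matches("1.0.0", "2.1.0")` is false because 2.1 and 2.10 are different series.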

Running Benchmarks

python bench/gemm/gemm_bench.py --M 4096 --N 4096 --K 4096
python bench/quantize/quantize_bench.py --M 4096 --K 4096
python bench/conv/conv_bench.py
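The output format of these scripts is not shown here. When comparing GEMM timings by hand, a common convention is to count 2*M*N*K floating-point operations per matrix multiply. The helper below is a sketch under that convention; the name `gemm_tflops` is hypothetical and is not an MSLK API.

```python
def gemm_tflops(m, n, k, seconds):
    """Throughput of one (m x k) @ (k x n) matmul, counting 2*m*n*k FLOPs."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # One multiply and one add per term of each inner product.
    flops = 2 * m * n * k
    return flops / seconds / 1e12
```

With the shape used above (M = N = K = 4096), one multiply is 137,438,953,472 FLOPs, so a measured time of 1 ms corresponds to roughly 137.4 TFLOP/s.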

Running Tests

pytest test/gemm/gemm_test.py
pytest test/quantize/fp8_quantize_correctness_test.py
pytest test/conv/conv_test.py

Build From Source

We only support building on Linux. See the release compatibility table above for supported versions of Python, CUDA, and ROCm.

# Clone repo
git clone https://github.com/meta-pytorch/MSLK
cd MSLK
git submodule sync
git submodule update --init --recursive
# Build and install
# The script will create a conda environment and install the required dependencies.
# The conda environment will look something like: build-py3.14-torchnightly-cuda12.9.1
./ci/integration/mslk_oss_build.bash
# After the initial environment setup, you can activate the environment and iterate faster:
conda activate build-py3.14-torchnightly-cuda12.9.1
python setup.py install

Python-Only Build

If you don't need the C++/CUDA kernels (e.g. for testing Python-only changes), you can install MSLK in Python-only mode by setting the MSLK_PYTHON_ONLY environment variable. This skips the C++/CUDA compilation entirely.

MSLK_PYTHON_ONLY=1 pip install -e .

Join the MSLK community

For questions, support, news updates, or feature requests, please feel free to:

For contributions, please see the CONTRIBUTING file for ways to help out.

License

MSLK is BSD licensed, as found in the LICENSE file.

Download files

Download the file for your platform.

Source Distributions

No source distribution files available for this release.

Built Distributions

mslk_cuda_nightly-2026.4.16-cp314-cp314-manylinux_2_28_x86_64.whl (46.5 MB)

CPython 3.14, manylinux: glibc 2.28+ x86-64

mslk_cuda_nightly-2026.4.16-cp313-cp313-manylinux_2_28_x86_64.whl (46.5 MB)

CPython 3.13, manylinux: glibc 2.28+ x86-64

mslk_cuda_nightly-2026.4.16-cp312-cp312-manylinux_2_28_x86_64.whl (48.0 MB)

CPython 3.12, manylinux: glibc 2.28+ x86-64

mslk_cuda_nightly-2026.4.16-cp311-cp311-manylinux_2_28_x86_64.whl (48.0 MB)

CPython 3.11, manylinux: glibc 2.28+ x86-64

mslk_cuda_nightly-2026.4.16-cp310-cp310-manylinux_2_28_x86_64.whl (48.0 MB)

CPython 3.10, manylinux: glibc 2.28+ x86-64
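To pick the right file, it helps to read the tags encoded in each wheel name. Wheel file names follow the layout `{distribution}-{version}(-{build})?-{python tag}-{abi tag}-{platform tag}.whl`. The parser below is a minimal sketch of that layout; the names `WheelName` and `parse_wheel_name` are hypothetical, and it does not validate the individual tags.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_name(filename):
    """Split a wheel file name into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        # No optional build tag present.
        name, version, py, abi, plat = parts
        build = None
    elif len(parts) == 6:
        name, version, build, py, abi, plat = parts
    else:
        raise ValueError(f"unexpected wheel name layout: {filename}")
    return WheelName(name, version, build, py, abi, plat)
```

For the first file listed above, this yields the distribution `mslk_cuda_nightly`, version `2026.4.16`, Python and ABI tags `cp314`, and platform tag `manylinux_2_28_x86_64`.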

File details

Details for the file mslk_cuda_nightly-2026.4.16-cp314-cp314-manylinux_2_28_x86_64.whl.

File hashes

Hashes for mslk_cuda_nightly-2026.4.16-cp314-cp314-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 0d3f99c142feff921109a2117f46e45f991e3aa5c1c8b6dedb69afc4d1d59766
MD5 e061b925c760d75241b6a944142f0851
BLAKE2b-256 49a75e51ea4347aef94d19e54b4d0dc7f85da3aedcff0a58adb3fd3072269223
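
A downloaded wheel can be checked against the SHA256 digest listed for it. The following is a minimal sketch using Python's standard `hashlib`; the function names `sha256_of` and `matches` are hypothetical, and the path passed in is assumed to be wherever the wheel was saved locally.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hex SHA256 of a file, read in chunks so large wheels need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected_hex):
    """Compare the file's digest with a published one, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Call `matches(path_to_wheel, digest)` with the SHA256 value listed for the same file name; a `False` result means the download is corrupt or is not the file that was published.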

File details

Details for the file mslk_cuda_nightly-2026.4.16-cp313-cp313-manylinux_2_28_x86_64.whl.

File hashes

Hashes for mslk_cuda_nightly-2026.4.16-cp313-cp313-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 92b31ce6a523f61074f0dbff44cae68f61fb6b3a575ded195100b656db4c352c
MD5 b2cfb8546e8ec60cc7d65b30ad290cf9
BLAKE2b-256 b59fdfc047a26fe02bdcc71daeb12b6c9f165df1a29651653abf337c02895994

File details

Details for the file mslk_cuda_nightly-2026.4.16-cp312-cp312-manylinux_2_28_x86_64.whl.

File hashes

Hashes for mslk_cuda_nightly-2026.4.16-cp312-cp312-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 04285fa2468d509dcd7e3bc808c27c05ff89b784b1fadef0e351b99277461f36
MD5 fd16d38654fd6e6b6a26f9587352bd14
BLAKE2b-256 e2e183544af1574780823215a86d55319f01b90039524becf5c2bb424f7d547a

File details

Details for the file mslk_cuda_nightly-2026.4.16-cp311-cp311-manylinux_2_28_x86_64.whl.

File hashes

Hashes for mslk_cuda_nightly-2026.4.16-cp311-cp311-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 cf0f344bd1118af0159c4a92e40a8dd36f1bc46e10da1fe4da57654887aa0ca5
MD5 c9933ebc0c67ec324ad82a8d7375b217
BLAKE2b-256 3cbed8016775c2955086e51ef9f947df0492ab891c2bf46b059df87cd734eb4e

File details

Details for the file mslk_cuda_nightly-2026.4.16-cp310-cp310-manylinux_2_28_x86_64.whl.

File hashes

Hashes for mslk_cuda_nightly-2026.4.16-cp310-cp310-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 0c99b1d02459dc7bc667d679ea4d00a50dc3a2c0fbeb3e56692ffe9a024aa916
MD5 2a8f01526ed8babf4561e77d16048e70
BLAKE2b-256 d11864b159d330bfc2924ed44e6e4cfc5ac06b3eb9532c24170775c74a94c852
