MSLK Library

MSLK (Meta Superintelligence Labs Kernels, formerly known as FBGEMM GenAI) is a collection of high-performance kernels and optimizations built on top of PyTorch primitives for GenAI training and inference.

Installation

# Install MSLK for CUDA
pip install mslk-cuda==1.0.0
# Install MSLK for ROCm
pip install mslk-rocm==1.0.0
# Install a nightly CUDA version
pip install --pre mslk --index-url https://download.pytorch.org/whl/nightly/cu128
# Install a nightly ROCm version
pip install --pre mslk --index-url https://download.pytorch.org/whl/nightly/rocm7.1/
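After installing, it can be useful to confirm which MSLK distribution actually landed in the environment. The following is a minimal sketch using only the Python standard library. The distribution names are the ones used in the pip commands above, and `find_mslk_distribution` is a hypothetical helper for illustration, not part of MSLK.

```python
from importlib import metadata

# Distribution names taken from the pip commands above; adjust if yours differs.
CANDIDATES = ("mslk-cuda", "mslk-rocm", "mslk")


def find_mslk_distribution(candidates=CANDIDATES):
    """Return (name, version) of the first installed candidate, or None."""
    for name in candidates:
        try:
            return name, metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
    return None


if __name__ == "__main__":
    found = find_mslk_distribution()
    if found:
        print(f"Found {found[0]} {found[1]}")
    else:
        print("No MSLK distribution found in this environment")
```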

Release Compatibility Table

MSLK releases follow the PyTorch release schedule. An MSLK release is not guaranteed to work with PyTorch releases older than the one it corresponds to.

| MSLK Release | Corresponding PyTorch Release | Supported Python Versions | Supported CUDA Versions | Supported CUDA Architectures | Supported ROCm Versions | Supported ROCm Architectures |
|---|---|---|---|---|---|---|
| 1.1.0 | 2.11.x | 3.10, 3.11, 3.12, 3.13, 3.14 | 12.6, 12.8, 12.9, 13.0 | 8.0, 9.0a, 10.0a, 12.0a | 7.0, 7.1 | gfx908, gfx90a, gfx942, gfx950 |
| 1.0.0 | 2.10.x | 3.10, 3.11, 3.12, 3.13, 3.14 | 12.6, 12.8, 12.9, 13.0 | 8.0, 9.0a, 10.0a, 12.0a | 7.1, 7.2 | gfx908, gfx90a, gfx942, gfx950 |

Note that the supported CUDA/ROCm architectures refer to the compiled C++ kernels. In addition, some kernels (e.g. CUTLASS/CK) are specific to certain architectures. Kernels written in Python JIT DSLs (e.g. Triton) may work on a wider variety of architectures.
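The compatibility table can also be encoded as data and checked programmatically before installing. The sketch below copies the table rows into a dictionary and reports mismatches for a given combination. The names `COMPAT`, `major_minor`, and `check_combination` are hypothetical and not part of MSLK, and the check is a plain lookup against the table: it treats any PyTorch version other than the corresponding one as a mismatch, which is stricter than the guarantee stated above.

```python
# Release data copied from the compatibility table above.
COMPAT = {
    "1.1.0": {
        "pytorch": "2.11",
        "python": {"3.10", "3.11", "3.12", "3.13", "3.14"},
        "cuda": {"12.6", "12.8", "12.9", "13.0"},
        "rocm": {"7.0", "7.1"},
    },
    "1.0.0": {
        "pytorch": "2.10",
        "python": {"3.10", "3.11", "3.12", "3.13", "3.14"},
        "cuda": {"12.6", "12.8", "12.9", "13.0"},
        "rocm": {"7.1", "7.2"},
    },
}


def major_minor(version):
    """Reduce a version string such as '2.10.1+cu128' to '2.10'."""
    core = version.split("+", 1)[0]
    return ".".join(core.split(".")[:2])


def check_combination(mslk_release, pytorch_version, python_version,
                      toolkit, toolkit_version):
    """Return a list of mismatches against the table; empty means listed."""
    row = COMPAT[mslk_release]
    problems = []
    if major_minor(pytorch_version) != row["pytorch"]:
        problems.append(
            f"PyTorch {pytorch_version} is not the corresponding "
            f"{row['pytorch']}.x release"
        )
    if major_minor(python_version) not in row["python"]:
        problems.append(f"Python {python_version} is not listed")
    if major_minor(toolkit_version) not in row[toolkit]:
        problems.append(f"{toolkit} {toolkit_version} is not listed")
    return problems


if __name__ == "__main__":
    for problem in check_combination("1.1.0", "2.10.0", "3.12", "rocm", "7.2"):
        print(problem)
```

The `toolkit` argument is either `"cuda"` or `"rocm"`, matching the dictionary keys.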

Running Benchmarks

python bench/gemm/gemm_bench.py --M 4096 --N 4096 --K 4096
python bench/quantize/quantize_bench.py --M 4096 --K 4096
python bench/conv/conv_bench.py
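When interpreting GEMM timings for a shape such as the one above, a common convention is to count each multiply-add as two floating-point operations, giving 2·M·N·K operations per matrix multiply. The sketch below applies that convention. It is not taken from the MSLK benchmark scripts, whose output format is not described here, and the 1 ms runtime is a made-up value for illustration only.

```python
def gemm_flops(m, n, k):
    """Operations for C[m, n] = A[m, k] @ B[k, n], counting a multiply-add as 2."""
    return 2 * m * n * k


def tflops(m, n, k, seconds):
    """Achieved throughput in TFLOP/s for one GEMM that took `seconds`."""
    return gemm_flops(m, n, k) / seconds / 1e12


if __name__ == "__main__":
    m = n = k = 4096
    print(f"{gemm_flops(m, n, k):,} operations per GEMM")
    # Hypothetical runtime, not a measured MSLK result.
    print(f"{tflops(m, n, k, 1e-3):.1f} TFLOP/s at an assumed 1 ms")
```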

Running Tests

pytest test/gemm/gemm_test.py
pytest test/quantize/fp8_quantize_correctness_test.py
pytest test/conv/conv_test.py

Build From Source

Building is supported only on Linux. See the release compatibility table above for supported versions of Python, CUDA, and ROCm.

# Clone repo
git clone https://github.com/meta-pytorch/MSLK
cd MSLK
git submodule sync
git submodule update --init --recursive
# Build and install
# The script will create a conda environment and install the required dependencies.
# The conda environment will look something like: build-py3.14-torchnightly-cuda12.9.1
./ci/integration/mslk_oss_build.bash
# After the initial environment setup, you can activate the environment and iterate faster:
conda activate build-py3.14-torchnightly-cuda12.9.1
python setup.py install
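The conda environment name in the comments above appears to encode the Python version, the PyTorch channel, and the GPU toolkit version. As a rough sketch, a name of that shape can be split into its parts as shown below. The pattern is inferred from the single example name, the `rocm` alternative is an assumption, and `parse_build_env_name` is a hypothetical helper; the build script may use other name formats.

```python
import re

# Pattern inferred from the one example name; other formats may exist.
ENV_NAME = re.compile(
    r"^build-py(?P<python>\d+\.\d+)"
    r"-torch(?P<torch>[^-]+)"
    r"-(?P<toolkit>cuda|rocm)(?P<toolkit_version>[\d.]+)$"
)


def parse_build_env_name(name):
    """Return the name's components as a dict, or None if it does not match."""
    match = ENV_NAME.match(name)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_build_env_name("build-py3.14-torchnightly-cuda12.9.1"))
```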

Python-Only Build

If you don't need the C++/CUDA kernels (e.g. for testing Python-only changes), you can install MSLK in Python-only mode by setting the MSLK_PYTHON_ONLY environment variable. This skips the C++/CUDA compilation entirely.

MSLK_PYTHON_ONLY=1 pip install -e .
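The same install can be driven from a Python script, for example in a CI job, by passing the environment variable to a pip subprocess. This is a minimal sketch under the assumption that it is run against a checkout of the repository. The helper names are hypothetical and not part of MSLK; only the `MSLK_PYTHON_ONLY` variable and the `pip install -e .` command come from the text above.

```python
import os
import subprocess
import sys


def python_only_install_command():
    """pip command for the editable install, using the current interpreter."""
    return [sys.executable, "-m", "pip", "install", "-e", "."]


def python_only_env(base_env=None):
    """Copy of the environment with MSLK_PYTHON_ONLY set to 1."""
    env = dict(os.environ if base_env is None else base_env)
    env["MSLK_PYTHON_ONLY"] = "1"
    return env


def install_python_only(repo_dir):
    """Run the install inside repo_dir; raises CalledProcessError on failure."""
    subprocess.run(
        python_only_install_command(),
        cwd=repo_dir,
        env=python_only_env(),
        check=True,
    )
```

Calling `install_python_only("path/to/MSLK")` would run the install; the path is a placeholder for wherever the repository was cloned.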

Join the MSLK community

For questions, support, news updates, or feature requests, please feel free to:

For contributions, please see the CONTRIBUTING file for ways to help out.

License

MSLK is BSD licensed, as found in the LICENSE file.
