MSLK Library

MSLK (Meta Superintelligence Labs Kernels, formerly known as FBGEMM GenAI) is a collection of high-performance kernels and optimizations built on top of PyTorch primitives for GenAI training and inference.

Installation

# Install MSLK for CUDA
pip install mslk-cuda==1.0.0
# Install MSLK for ROCm
pip install mslk-rocm==1.0.0
# Install a nightly CUDA version
pip install --pre mslk --index-url https://download.pytorch.org/whl/nightly/cu128
# Install a nightly ROCm version
pip install --pre mslk --index-url https://download.pytorch.org/whl/nightly/rocm7.1/
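
After installing, it can be useful to confirm which MSLK distribution pip actually registered. The sketch below uses only the standard library. The distribution names are assumptions copied from the pip commands above, and the helper name `installed_mslk_versions` is hypothetical, not part of MSLK.

```python
from importlib import metadata

# Assumed distribution names, taken from the pip commands above.
CANDIDATES = ("mslk-cuda", "mslk-rocm", "mslk")


def installed_mslk_versions(names=CANDIDATES):
    """Map each candidate distribution name to its installed version, or None."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, version in installed_mslk_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

If a nightly build registers under a different distribution name, pass that name explicitly, e.g. `installed_mslk_versions(("some-other-name",))`.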

Release Compatibility Table

MSLK is released in step with the PyTorch release schedule. An MSLK release is not guaranteed to work with PyTorch releases older than the one it corresponds to.

| MSLK Release | Corresponding PyTorch Release | Supported Python Versions | Supported CUDA Versions | Supported CUDA Architectures | Supported ROCm Versions | Supported ROCm Architectures |
|---|---|---|---|---|---|---|
| 1.1.0 | 2.11.x | 3.10, 3.11, 3.12, 3.13, 3.14 | 12.6, 12.8, 12.9, 13.0 | 8.0, 9.0a, 10.0a, 12.0a | 7.0, 7.1 | gfx908, gfx90a, gfx942, gfx950 |
| 1.0.0 | 2.10.x | 3.10, 3.11, 3.12, 3.13, 3.14 | 12.6, 12.8, 12.9, 13.0 | 8.0, 9.0a, 10.0a, 12.0a | 7.1, 7.2 | gfx908, gfx90a, gfx942, gfx950 |

Note that the supported CUDA/ROCm architectures refer to the compiled C++ kernels. In addition, some kernels (e.g. CUTLASS/CK) are specific to certain architectures. Kernels written in Python JIT DSLs (e.g. Triton) may work on a wider variety of architectures.
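
The table can also be checked programmatically before installing. The following is a minimal sketch, not an MSLK API: the `COMPAT` dictionary is transcribed by hand from the table above, and `check_compat` is a hypothetical helper. It treats a PyTorch version as a problem only when it is older than the corresponding release, matching the guarantee stated above.

```python
# Rows transcribed from the release compatibility table above.
COMPAT = {
    "1.1.0": {
        "torch": (2, 11),
        "python": {"3.10", "3.11", "3.12", "3.13", "3.14"},
        "cuda": {"12.6", "12.8", "12.9", "13.0"},
        "rocm": {"7.0", "7.1"},
    },
    "1.0.0": {
        "torch": (2, 10),
        "python": {"3.10", "3.11", "3.12", "3.13", "3.14"},
        "cuda": {"12.6", "12.8", "12.9", "13.0"},
        "rocm": {"7.1", "7.2"},
    },
}


def check_compat(mslk, torch_version, python_version, cuda=None, rocm=None):
    """Return a list of mismatches against the table (empty if none found)."""
    row = COMPAT[mslk]
    problems = []

    # Keep major.minor only and drop local tags such as "+cu128".
    torch_mm = tuple(int(p) for p in torch_version.split("+")[0].split(".")[:2])
    if torch_mm < row["torch"]:
        problems.append(f"PyTorch {torch_version} is older than the corresponding release")

    py_mm = ".".join(python_version.split(".")[:2])
    if py_mm not in row["python"]:
        problems.append(f"Python {py_mm} is not listed")

    if cuda is not None and cuda not in row["cuda"]:
        problems.append(f"CUDA {cuda} is not listed")
    if rocm is not None and rocm not in row["rocm"]:
        problems.append(f"ROCm {rocm} is not listed")
    return problems


if __name__ == "__main__":
    print(check_compat("1.0.0", "2.10.0+cu128", "3.12.3", cuda="12.8"))
```

In a real environment the inputs could come from `torch.__version__`, `torch.version.cuda`, and `platform.python_version()`. Architecture support is left out of the sketch because, as noted above, it varies per kernel.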

Running Benchmarks

python bench/gemm/gemm_bench.py --M 4096 --N 4096 --K 4096
python bench/quantize/quantize_bench.py --M 4096 --K 4096
python bench/conv/conv_bench.py

Running Tests

pytest test/gemm/gemm_test.py
pytest test/quantize/fp8_quantize_correctness_test.py
pytest test/conv/conv_test.py

Build From Source

We only support building on Linux. See the release compatibility table above for supported versions of Python, CUDA, and ROCm.

# Clone repo
git clone https://github.com/meta-pytorch/MSLK
cd MSLK
git submodule sync
git submodule update --init --recursive
# Build and install
# The script will create a conda environment and install the required dependencies.
# The conda environment will look something like: build-py3.14-torchnightly-cuda12.9.1
./ci/integration/mslk_oss_build.bash
# After the initial environment setup, you can activate the environment and iterate faster:
conda activate build-py3.14-torchnightly-cuda12.9.1
python setup.py install

Python-Only Build

If you don't need the C++/CUDA kernels (e.g. for testing Python-only changes), you can install MSLK in Python-only mode by setting the MSLK_PYTHON_ONLY environment variable. This skips the C++/CUDA compilation entirely.

MSLK_PYTHON_ONLY=1 pip install -e .
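
For readers unfamiliar with environment-variable build switches, the sketch below shows one common way a build script could read such a flag. It is an illustration only: the helper `python_only_requested` and the set of accepted values are assumptions, and MSLK's own setup script may interpret the variable differently.

```python
import os


def python_only_requested(environ=os.environ):
    """Hypothetical helper: True when MSLK_PYTHON_ONLY holds a truthy value."""
    value = environ.get("MSLK_PYTHON_ONLY", "")
    # Accepted spellings are an assumption made for this sketch.
    return value.strip().lower() in {"1", "true", "yes", "on"}


if __name__ == "__main__":
    if python_only_requested():
        print("Python-only mode: native extensions would be skipped")
    else:
        print("Full build: native extensions would be compiled")
```

Passing the environment as a parameter keeps the helper easy to test without modifying the real process environment.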

Join the MSLK community

For questions, support, news updates, or feature requests, please feel free to:

For contributions, please see the CONTRIBUTING file for ways to help out.

License

MSLK is BSD licensed, as found in the LICENSE file.
