MSLK Library

MSLK (Meta Superintelligence Labs Kernels, formerly known as FBGEMM GenAI) is a collection of high-performance kernels and optimizations built on top of PyTorch primitives for GenAI training and inference.

Installation

# Install MSLK for CUDA
pip install mslk-cuda==1.0.0
# Install MSLK for ROCm
pip install mslk-rocm==1.0.0
# Install a nightly CUDA version
pip install --pre mslk --index-url https://download.pytorch.org/whl/nightly/cu128
# Install a nightly ROCm version
pip install --pre mslk --index-url https://download.pytorch.org/whl/nightly/rocm7.1/

Release Compatibility Table

MSLK is released in accordance with the PyTorch release schedule. An MSLK release is not guaranteed to work with PyTorch releases older than the one it corresponds to.

| MSLK Release | Corresponding PyTorch Release | Supported Python Versions | Supported CUDA Versions | Supported CUDA Architectures | Supported ROCm Versions | Supported ROCm Architectures |
| --- | --- | --- | --- | --- | --- | --- |
| 1.1.0 | 2.11.x | 3.10, 3.11, 3.12, 3.13, 3.14 | 12.6, 12.8, 12.9, 13.0 | 8.0, 9.0a, 10.0a, 12.0a | 7.0, 7.1 | gfx908, gfx90a, gfx942, gfx950 |
| 1.0.0 | 2.10.x | 3.10, 3.11, 3.12, 3.13, 3.14 | 12.6, 12.8, 12.9, 13.0 | 8.0, 9.0a, 10.0a, 12.0a | 7.1, 7.2 | gfx908, gfx90a, gfx942, gfx950 |

Note that the supported CUDA/ROCm architectures refer to the compiled C++ kernels. In addition, some kernels (e.g. CUTLASS/CK) are specific to certain architectures. Python JIT DSL-based kernels (e.g. Triton) may work on a wider variety of architectures.
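
As a rough illustration of how the compatibility table could be consulted from a script, the sketch below transcribes a few of its columns into a dictionary and checks a PyTorch version string and a CUDA compute capability against a given MSLK release. The names `COMPAT`, `torch_matches`, and `arch_listed` are hypothetical helpers written for this example and are not part of MSLK. The architecture check is a plain exact match on the listed values; it ignores the trailing `a` suffix and does not model any compatibility rules between architectures.

```python
# Columns transcribed from the release compatibility table above.
COMPAT = {
    "1.1.0": {
        "torch": "2.11",
        "python": ("3.10", "3.11", "3.12", "3.13", "3.14"),
        "cuda_archs": ("8.0", "9.0a", "10.0a", "12.0a"),
    },
    "1.0.0": {
        "torch": "2.10",
        "python": ("3.10", "3.11", "3.12", "3.13", "3.14"),
        "cuda_archs": ("8.0", "9.0a", "10.0a", "12.0a"),
    },
}


def torch_matches(mslk_release: str, torch_version: str) -> bool:
    """True if torch_version (e.g. '2.10.0+cu128') is in the release's series."""
    # Drop a local suffix such as '+cu128', then keep only 'major.minor'.
    major_minor = ".".join(torch_version.split("+")[0].split(".")[:2])
    return major_minor == COMPAT[mslk_release]["torch"]


def arch_listed(mslk_release: str, capability: tuple[int, int]) -> bool:
    """Exact-match check of a (major, minor) compute capability.

    Assumption: the trailing 'a' on table entries is ignored for matching.
    """
    wanted = f"{capability[0]}.{capability[1]}"
    archs = COMPAT[mslk_release]["cuda_archs"]
    return any(arch.rstrip("a") == wanted for arch in archs)
```

In an environment with PyTorch installed, the inputs could come from `torch.__version__` and `torch.cuda.get_device_capability()`, which returns a `(major, minor)` tuple.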

Running Benchmarks

python bench/gemm/gemm_bench.py --M 4096 --N 4096 --K 4096
python bench/quantize/quantize_bench.py --M 4096 --K 4096
python bench/conv/conv_bench.py
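
To run the GEMM benchmark over several problem sizes, one option is to generate the command lines from Python. The sketch below reuses only the `--M`, `--N`, and `--K` flags shown above; the helper name `gemm_bench_commands` and the list of shapes are illustrative assumptions, and the commands are assumed to be run from the repository root.

```python
import shlex
import sys


def gemm_bench_commands(shapes):
    """Build one gemm_bench.py command line per (M, N, K) shape."""
    return [
        [sys.executable, "bench/gemm/gemm_bench.py",
         "--M", str(m), "--N", str(n), "--K", str(k)]
        for m, n, k in shapes
    ]


# Hypothetical sweep; adjust the shapes to the workload of interest.
for cmd in gemm_bench_commands([(1024, 1024, 1024), (4096, 4096, 4096)]):
    # Printed rather than executed; pass cmd to subprocess.run(cmd, check=True)
    # to actually launch the benchmark.
    print(shlex.join(cmd))
```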

Running Tests

pytest test/gemm/gemm_test.py
pytest test/quantize/fp8_quantize_correctness_test.py
pytest test/conv/conv_test.py

Build From Source

We only support building on Linux. See the release compatibility table above for supported versions of Python, CUDA, and ROCm.

# Clone repo
git clone https://github.com/meta-pytorch/MSLK
cd MSLK
git submodule sync
git submodule update --init --recursive
# Build and install
# The script will create a conda environment and install the required dependencies.
# The conda environment will look something like: build-py3.14-torchnightly-cuda12.9.1
./ci/integration/mslk_oss_build.bash
# After the initial environment setup, you can activate the environment and iterate faster:
conda activate build-py3.14-torchnightly-cuda12.9.1
python setup.py install

Python-Only Build

If you don't need the C++/CUDA kernels (e.g. for testing Python-only changes), you can install MSLK in Python-only mode by setting the MSLK_PYTHON_ONLY environment variable. This skips the C++/CUDA compilation entirely.

MSLK_PYTHON_ONLY=1 pip install -e .
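
When the install is driven from a Python script rather than a shell, the same environment variable can be set on the child process. This is a minimal sketch under that assumption; `python_only_install_command` is a hypothetical helper, and it only prepares the command and environment without running anything.

```python
import os
import sys


def python_only_install_command(base_env=None):
    """Return (command, environment) for an editable Python-only install."""
    env = dict(os.environ if base_env is None else base_env)
    env["MSLK_PYTHON_ONLY"] = "1"  # the variable named in the text above
    cmd = [sys.executable, "-m", "pip", "install", "-e", "."]
    return cmd, env
```

The pair can then be passed to `subprocess.run(cmd, env=env, check=True)` from the repository root.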

Join the MSLK community

For questions, support, news updates, or feature requests, please feel free to:

For contributions, please see the CONTRIBUTING file for ways to help out.

License

MSLK is BSD licensed, as found in the LICENSE file.

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

mslk_cuda_nightly-2026.4.19-cp314-cp314-manylinux_2_28_x86_64.whl (46.5 MB)
CPython 3.14, manylinux: glibc 2.28+, x86-64

mslk_cuda_nightly-2026.4.19-cp313-cp313-manylinux_2_28_x86_64.whl (46.5 MB)
CPython 3.13, manylinux: glibc 2.28+, x86-64

mslk_cuda_nightly-2026.4.19-cp312-cp312-manylinux_2_28_x86_64.whl (46.5 MB)
CPython 3.12, manylinux: glibc 2.28+, x86-64

mslk_cuda_nightly-2026.4.19-cp311-cp311-manylinux_2_28_x86_64.whl (46.5 MB)
CPython 3.11, manylinux: glibc 2.28+, x86-64

mslk_cuda_nightly-2026.4.19-cp310-cp310-manylinux_2_28_x86_64.whl (46.5 MB)
CPython 3.10, manylinux: glibc 2.28+, x86-64
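
The wheel file names above follow the standard wheel naming convention, `{distribution}-{version}-{python tag}-{abi tag}-{platform tag}.whl`, where an optional build tag may appear after the version. The sketch below splits a name into those fields using only the standard library; `WheelName` and `parse_wheel_name` are hypothetical helpers written for this illustration.

```python
from typing import NamedTuple


class WheelName(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_name(filename: str) -> WheelName:
    """Split a wheel file name into its dash-separated fields."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 6:
        # Drop the optional build tag that sits after the version.
        del parts[2]
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    return WheelName(*parts)


info = parse_wheel_name(
    "mslk_cuda_nightly-2026.4.19-cp314-cp314-manylinux_2_28_x86_64.whl"
)
print(info.python_tag, info.platform_tag)  # cp314 manylinux_2_28_x86_64
```

Here `cp314` identifies CPython 3.14 and `manylinux_2_28_x86_64` identifies Linux on x86-64 with glibc 2.28 or newer, matching the descriptions listed with each file.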

File hashes

Hashes for mslk_cuda_nightly-2026.4.19-cp314-cp314-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 c45e29d56f40b5a4f1f40b8e9faecdd18c1b69201a85d1205185c166fa00a9d0
MD5 fb7eef281832761225ede7c8321ce202
BLAKE2b-256 643887cdfcc595888e2a3a9e3618d4d7b5576bf0328a4ff6ffa10967db977922

Hashes for mslk_cuda_nightly-2026.4.19-cp313-cp313-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 7d1f4c9d75cb0463f480b853a77878539e56988598ff7ef9a22e2905a9a4a268
MD5 86bf5530f864691dba9152cd8b15247b
BLAKE2b-256 518bdafe7b71f07ab776fcf256dde4c8d95d0af9303bb65ea2a5e6721260f489

Hashes for mslk_cuda_nightly-2026.4.19-cp312-cp312-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 ddf885bc7403c1800c7b518e6eb9d8168d4b10657403372f8d18e3e403830ce2
MD5 9b9a83ebe7f3220509575fbb3cb9580b
BLAKE2b-256 a432ebb76b7c30fb4bb521bcc8887ca800a47de87bc9979559d5449a335101fa

Hashes for mslk_cuda_nightly-2026.4.19-cp311-cp311-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 3ddd78111c5f4d8b3621d59cfc8841be78417c0ccc882c533539b287a1eb1f8d
MD5 6a38e8f65b2a04d482903333e4e7cf88
BLAKE2b-256 74a8c188d1b20eb3d23cb9744bfb338e03c0fcf3fe3215eb647a82b9dc96cc5e

Hashes for mslk_cuda_nightly-2026.4.19-cp310-cp310-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 d8de650961ff47e740a9874ccf3923afefa687da623ec1ac088a872831c3eb4d
MD5 2fe374e0e0075508493c48ad21d9f644
BLAKE2b-256 d84de615978c355ed96e81fd60ffc053f90f3ecc5aabed8cb37b75e240b49137
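
A downloaded wheel can be checked against the SHA256 digests listed above before it is installed. The following is a minimal sketch using Python's standard `hashlib`; the helper names `sha256_of` and `sha256_matches` are assumptions made for this illustration, as is the location of the downloaded file.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large wheels are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sha256_matches(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, calling `sha256_matches` with the path to a local copy of the cp314 wheel and the SHA256 value listed for that file should return `True` for an intact download.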
