MSLK Library

MSLK (Meta Superintelligence Labs Kernels, formerly known as FBGEMM GenAI) is a collection of high-performance kernels and optimizations built on top of PyTorch primitives for GenAI training and inference.

Installation

# Install MSLK for CUDA
pip install mslk-cuda==1.0.0
# Install MSLK for ROCm
pip install mslk-rocm==1.0.0
# Install a nightly CUDA version
pip install --pre mslk --index-url https://download.pytorch.org/whl/nightly/cu128
# Install a nightly ROCm version
pip install --pre mslk --index-url https://download.pytorch.org/whl/nightly/rocm7.1/
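After installing, it can be useful to confirm which MSLK distribution actually landed in the environment. The sketch below uses only the standard library; the candidate names are taken from the install commands above and are an assumption, since nightly wheels may be published under a different distribution name.

```python
from importlib import metadata

# Distribution names taken from the install commands above. Treat the list as
# an assumption: nightly wheels may be published under other names.
CANDIDATE_DISTRIBUTIONS = ("mslk", "mslk-cuda", "mslk-rocm")


def installed_mslk_distributions(candidates=CANDIDATE_DISTRIBUTIONS):
    """Return {distribution name: version} for every candidate that is installed."""
    found = {}
    for name in candidates:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Not installed in this environment; try the next candidate.
            continue
    return found


if __name__ == "__main__":
    found = installed_mslk_distributions()
    if found:
        for name, version in found.items():
            print(f"{name} {version}")
    else:
        print("No MSLK distribution found in this environment")
```

If more than one candidate is reported, the environment probably mixes a stable and a nightly install and is worth recreating.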

Release Compatibility Table

MSLK is released in accordance with the PyTorch release schedule. Each MSLK release is not guaranteed to work with PyTorch releases older than the one it corresponds to.

| MSLK Release | Corresponding PyTorch Release | Supported Python Versions | Supported CUDA Versions | Supported CUDA Architectures | Supported ROCm Versions | Supported ROCm Architectures |
|---|---|---|---|---|---|---|
| 1.1.0 | 2.11.x | 3.10, 3.11, 3.12, 3.13, 3.14 | 12.6, 12.8, 12.9, 13.0 | 8.0, 9.0a, 10.0a, 12.0a | 7.0, 7.1 | gfx908, gfx90a, gfx942, gfx950 |
| 1.0.0 | 2.10.x | 3.10, 3.11, 3.12, 3.13, 3.14 | 12.6, 12.8, 12.9, 13.0 | 8.0, 9.0a, 10.0a, 12.0a | 7.1, 7.2 | gfx908, gfx90a, gfx942, gfx950 |

Note that the supported CUDA/ROCm architectures refer to the compiled C++ kernels. In addition, some kernels (e.g. CUTLASS/CK) are specific to certain architectures. Kernels written in Python JIT DSLs (e.g. Triton) may work on a wider variety of architectures.
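The table can also be checked programmatically before installing. The following is a minimal sketch, not part of MSLK: the `COMPATIBILITY` dictionary is transcribed by hand from the rows above, and `compatibility_problems` is a hypothetical helper name. It flags only PyTorch versions older than the corresponding release, matching the guarantee stated above.

```python
# Rows transcribed by hand from the release compatibility table.
COMPATIBILITY = {
    "1.1.0": {
        "pytorch": (2, 11),
        "python": {"3.10", "3.11", "3.12", "3.13", "3.14"},
        "cuda": {"12.6", "12.8", "12.9", "13.0"},
        "rocm": {"7.0", "7.1"},
    },
    "1.0.0": {
        "pytorch": (2, 10),
        "python": {"3.10", "3.11", "3.12", "3.13", "3.14"},
        "cuda": {"12.6", "12.8", "12.9", "13.0"},
        "rocm": {"7.1", "7.2"},
    },
}


def compatibility_problems(release, python, pytorch, cuda=None, rocm=None):
    """Return a list of mismatches against the table (empty if none were found)."""
    row = COMPATIBILITY.get(release)
    if row is None:
        return [f"unknown MSLK release {release!r}"]

    problems = []
    if python not in row["python"]:
        problems.append(f"Python {python} is not listed for MSLK {release}")

    # Compare major.minor only; the table lists PyTorch as e.g. "2.10.x".
    major, minor = (int(part) for part in pytorch.split(".")[:2])
    if (major, minor) < row["pytorch"]:
        problems.append(f"PyTorch {pytorch} is older than the corresponding release")

    if cuda is not None and cuda not in row["cuda"]:
        problems.append(f"CUDA {cuda} is not listed for MSLK {release}")
    if rocm is not None and rocm not in row["rocm"]:
        problems.append(f"ROCm {rocm} is not listed for MSLK {release}")
    return problems


if __name__ == "__main__":
    for problem in compatibility_problems("1.0.0", "3.12", "2.10.0", cuda="12.8"):
        print(problem)
```

An empty result means only that nothing contradicts the table; it does not check GPU architectures, which depend on the individual kernel as noted above.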

Running Benchmarks

python bench/gemm/gemm_bench.py --M 4096 --N 4096 --K 4096
python bench/quantize/quantize_bench.py --M 4096 --K 4096
python bench/conv/conv_bench.py
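To benchmark several problem sizes in one go, the GEMM command above can be generated for each shape. This is a rough sketch that assumes it is run from the repository root and uses only the `--M`, `--N`, and `--K` flags shown above; `gemm_bench_commands` and `run_all` are hypothetical helper names, not part of MSLK.

```python
import itertools
import subprocess
import sys

GEMM_BENCH = "bench/gemm/gemm_bench.py"  # path from the commands above


def gemm_bench_commands(ms, ns, ks, python=sys.executable):
    """Build one gemm_bench.py command line per (M, N, K) combination."""
    return [
        [python, GEMM_BENCH, "--M", str(m), "--N", str(n), "--K", str(k)]
        for m, n, k in itertools.product(ms, ns, ks)
    ]


def run_all(commands):
    """Run each command in turn, stopping at the first one that fails."""
    for command in commands:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Print the sweep instead of running it, so the plan can be reviewed first.
    for command in gemm_bench_commands([1024, 4096], [4096], [4096]):
        print(" ".join(command))
```

Passing the printed list to `run_all` executes the sweep sequentially, which avoids having concurrent runs compete for the same GPU.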

Running Tests

pytest test/gemm/gemm_test.py
pytest test/quantize/fp8_quantize_correctness_test.py
pytest test/conv/conv_test.py

Build From Source

Building from source is supported only on Linux. See the release compatibility table above for the supported versions of Python, CUDA, and ROCm.

# Clone repo
git clone https://github.com/meta-pytorch/MSLK
cd MSLK
git submodule sync
git submodule update --init --recursive
# Build and install
# The script will create a conda environment and install the required dependencies.
# The conda environment will look something like: build-py3.14-torchnightly-cuda12.9.1
./ci/integration/mslk_oss_build.bash
# After the initial environment setup, you can activate the environment and iterate faster:
conda activate build-py3.14-torchnightly-cuda12.9.1
python setup.py install

Python-Only Build

If you don't need the C++/CUDA kernels (e.g. for testing Python-only changes), you can install MSLK in Python-only mode by setting the MSLK_PYTHON_ONLY environment variable. This skips C++/CUDA compilation entirely.

MSLK_PYTHON_ONLY=1 pip install -e .
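When the install is driven from a script rather than a shell, the same variable can be passed through the subprocess environment. The sketch below mirrors the command above; `python_only_install_plan` and `python_only_install` are hypothetical helper names, and `repo_dir` is assumed to be the root of an MSLK checkout.

```python
import os
import subprocess
import sys


def python_only_install_plan(base_env=None):
    """Return (command, env) equivalent to `MSLK_PYTHON_ONLY=1 pip install -e .`."""
    # Copy the environment so the caller's (or the process's) copy is untouched.
    env = dict(os.environ if base_env is None else base_env)
    env["MSLK_PYTHON_ONLY"] = "1"
    # Use the current interpreter's pip so the install targets this environment.
    command = [sys.executable, "-m", "pip", "install", "-e", "."]
    return command, env


def python_only_install(repo_dir):
    """Run the editable, Python-only install inside repo_dir."""
    command, env = python_only_install_plan()
    subprocess.run(command, cwd=repo_dir, env=env, check=True)
```

Building the command and environment separately from running them keeps the plan easy to inspect or log before anything is installed.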

Join the MSLK community

For questions, support, news updates, or feature requests, please feel free to:

For contributions, please see the CONTRIBUTING file for ways to help out.

License

MSLK is BSD licensed, as found in the LICENSE file.
