MSLK Library

MSLK (Meta Superintelligence Labs Kernels, formerly known as FBGEMM GenAI) is a collection of high-performance kernels and optimizations built on top of PyTorch primitives for GenAI training and inference.

Installation

# Install MSLK for CUDA
pip install mslk-cuda==1.0.0
# Install MSLK for ROCm
pip install mslk-rocm==1.0.0
# Install a nightly CUDA version
pip install --pre mslk --index-url https://download.pytorch.org/whl/nightly/cu128
# Install a nightly ROCm version
pip install --pre mslk --index-url https://download.pytorch.org/whl/nightly/rocm7.1/
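
As a rough illustration of choosing between the two release packages, the sketch below prints the matching pip command for a backend. The helper names mslk_install_cmd and detect_backend are hypothetical, and using nvidia-smi or rocminfo on PATH as a detection heuristic is an assumption, not something MSLK prescribes.

```shell
# Hypothetical helper: print the release install command for a backend.
# Package names and the 1.0.0 pin are copied from the commands above.
mslk_install_cmd() {
    case "$1" in
        cuda) echo "pip install mslk-cuda==1.0.0" ;;
        rocm) echo "pip install mslk-rocm==1.0.0" ;;
        *)    echo "unsupported backend: $1" >&2; return 1 ;;
    esac
}

# Heuristic (assumption): guess the backend from vendor tools on PATH.
detect_backend() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        echo cuda
    elif command -v rocminfo >/dev/null 2>&1; then
        echo rocm
    else
        echo none
    fi
}

backend=$(detect_backend)
if [ "$backend" = none ]; then
    echo "No CUDA or ROCm tooling found; pick a backend manually." >&2
else
    mslk_install_cmd "$backend"
fi
```

The script only prints the command; it does not run pip, so the output can be reviewed before anything is installed.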

Release Compatibility Table

MSLK is released in accordance with the PyTorch release schedule. An MSLK release is not guaranteed to work with PyTorch releases older than the one it corresponds to.

| MSLK Release | Corresponding PyTorch Release | Supported Python Versions | Supported CUDA Versions | Supported CUDA Architectures | Supported ROCm Versions | Supported ROCm Architectures |
|---|---|---|---|---|---|---|
| 1.1.0 | 2.11.x | 3.10, 3.11, 3.12, 3.13, 3.14 | 12.6, 12.8, 12.9, 13.0 | 8.0, 9.0a, 10.0a, 12.0a | 7.0, 7.1 | gfx908, gfx90a, gfx942, gfx950 |
| 1.0.0 | 2.10.x | 3.10, 3.11, 3.12, 3.13, 3.14 | 12.6, 12.8, 12.9, 13.0 | 8.0, 9.0a, 10.0a, 12.0a | 7.1, 7.2 | gfx908, gfx90a, gfx942, gfx950 |

Note that the supported CUDA/ROCm architectures refer to the compiled C++ kernels. Some kernels (e.g. CUTLASS/CK) are additionally specific to certain architectures, while Python JIT DSL-based kernels (e.g. Triton) may work on a wider variety of architectures.
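
For scripted environment checks, the rows of the compatibility table can be encoded as small lookups. This is only a sketch: the function names are hypothetical, and the values are copied from the table above rather than queried from MSLK.

```shell
# Hypothetical lookup: map an MSLK release to its PyTorch release series.
mslk_torch_series() {
    case "$1" in
        1.1.0) echo "2.11.x" ;;
        1.0.0) echo "2.10.x" ;;
        *)     return 1 ;;
    esac
}

# Succeeds if the Python minor version is listed (same set for both releases).
mslk_python_supported() {
    case "$1" in
        3.10|3.11|3.12|3.13|3.14) return 0 ;;
        *) return 1 ;;
    esac
}

# Succeeds if the ROCm version ($2) is listed for the MSLK release ($1).
mslk_rocm_supported() {
    case "$1:$2" in
        1.1.0:7.0|1.1.0:7.1) return 0 ;;
        1.0.0:7.1|1.0.0:7.2) return 0 ;;
        *) return 1 ;;
    esac
}

mslk_torch_series 1.1.0
if mslk_rocm_supported 1.0.0 7.2; then
    echo "ROCm 7.2 is listed for MSLK 1.0.0"
fi
```

A release missing from the table makes the lookup return a non-zero status, so callers can treat unknown combinations as unsupported.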

Running Benchmarks

python bench/gemm/gemm_bench.py --M 4096 --N 4096 --K 4096
python bench/quantize/quantize_bench.py --M 4096 --K 4096
python bench/conv/conv_bench.py
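
To benchmark several problem sizes in one go, the GEMM command above can be wrapped in a loop. The sketch below is a dry run by default: the RUN variable and the gemm_sweep function are hypothetical names, and the size list is an arbitrary example. Only the --M, --N and --K flags shown above are used.

```shell
# RUN defaults to "echo", so commands are printed rather than executed.
# Set RUN to an empty string to run them from the repository root.
RUN=${RUN-echo}

# Run (or print) the GEMM benchmark once per square size given as argument.
gemm_sweep() {
    for size in "$@"; do
        $RUN python bench/gemm/gemm_bench.py --M "$size" --N "$size" --K "$size"
    done
}

gemm_sweep 1024 2048 4096
```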

Running Tests

pytest test/gemm/gemm_test.py
pytest test/quantize/fp8_quantize_correctness_test.py
pytest test/conv/conv_test.py

Build From Source

We only support building on Linux. See the release compatibility table above for the supported versions of Python, CUDA, and ROCm.

# Clone repo
git clone https://github.com/meta-pytorch/MSLK
cd MSLK
git submodule sync
git submodule update --init --recursive
# Build and install
# The script will create a conda environment and install the required dependencies.
# The conda environment will look something like: build-py3.14-torchnightly-cuda12.9.1
./ci/integration/mslk_oss_build.bash
# After the initial environment setup, you can activate the environment and iterate faster:
conda activate build-py3.14-torchnightly-cuda12.9.1
python setup.py install

Python-Only Build

If you don't need the C++/CUDA kernels (e.g. for testing Python-only changes), you can install MSLK in Python-only mode by setting the MSLK_PYTHON_ONLY environment variable. This skips the C++/CUDA compilation entirely.

MSLK_PYTHON_ONLY=1 pip install -e .
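
One shell detail worth keeping in mind: the VAR=value prefix form used above sets the variable for that single command only, whereas export makes it visible to every later command in the session. The sketch below demonstrates the difference, using sh -c as a stand-in for pip so that it runs anywhere. It shows shell behaviour only and makes no claim about how MSLK reads the variable.

```shell
unset MSLK_PYTHON_ONLY

# Prefix form: the variable is set for this one child process only.
MSLK_PYTHON_ONLY=1 sh -c 'echo "during: ${MSLK_PYTHON_ONLY:-unset}"'

# Afterwards the current shell still does not have it.
echo "after: ${MSLK_PYTHON_ONLY:-unset}"

# Exported form: every later command in this shell inherits it.
export MSLK_PYTHON_ONLY=1
sh -c 'echo "exported: ${MSLK_PYTHON_ONLY:-unset}"'
```

If a later full build unexpectedly skips compilation, a leftover export from an earlier session is a plausible thing to check with `env | grep MSLK_PYTHON_ONLY`.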

Join the MSLK community

For questions, support, news updates, or feature requests, please feel free to:

For contributions, please see the CONTRIBUTING file for ways to help out.

License

MSLK is BSD licensed, as found in the LICENSE file.
