MSLK Library

MSLK (Meta Superintelligence Labs Kernels, formerly known as FBGEMM GenAI) is a collection of high-performance kernels and optimizations built on top of PyTorch primitives for GenAI training and inference.

Installation

# Install MSLK for CUDA
pip install mslk-cuda==1.0.0
# Install MSLK for ROCm
pip install mslk-rocm==1.0.0
# Install a nightly CUDA version
pip install --pre mslk --index-url https://download.pytorch.org/whl/nightly/cu128
# Install a nightly ROCm version
pip install --pre mslk --index-url https://download.pytorch.org/whl/nightly/rocm7.1/
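After installing, it can help to confirm which MSLK distribution actually landed in the environment. The sketch below is one way to do that with the standard library only. It assumes the distribution names are exactly those used in the commands above (`mslk-cuda`, `mslk-rocm`, `mslk`); the helper name `find_installed_mslk` is hypothetical and not part of MSLK.

```python
from importlib import metadata

# Distribution names copied from the install commands above (an assumption:
# a given environment may use a different name).
CANDIDATE_DISTRIBUTIONS = ("mslk-cuda", "mslk-rocm", "mslk")


def find_installed_mslk(candidates=CANDIDATE_DISTRIBUTIONS):
    """Return {distribution name: version} for each candidate that is installed."""
    found = {}
    for name in candidates:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # This candidate is not installed; try the next one.
            continue
    return found


if __name__ == "__main__":
    installed = find_installed_mslk()
    if installed:
        for name, version in installed.items():
            print(f"{name}=={version}")
    else:
        print("No MSLK distribution found in this environment.")
```

The check only reads package metadata, so it does not import MSLK or touch the GPU.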

Release Compatibility Table

MSLK is released in accordance with the PyTorch release schedule. An MSLK release is not guaranteed to work with PyTorch releases older than the one it corresponds to.

| MSLK Release | Corresponding PyTorch Release | Supported Python Versions | Supported CUDA Versions | Supported CUDA Architectures | Supported ROCm Versions | Supported ROCm Architectures |
| --- | --- | --- | --- | --- | --- | --- |
| 1.1.0 | 2.11.x | 3.10, 3.11, 3.12, 3.13, 3.14 | 12.6, 12.8, 12.9, 13.0 | 8.0, 9.0a, 10.0a, 12.0a | 7.0, 7.1 | gfx908, gfx90a, gfx942, gfx950 |
| 1.0.0 | 2.10.x | 3.10, 3.11, 3.12, 3.13, 3.14 | 12.6, 12.8, 12.9, 13.0 | 8.0, 9.0a, 10.0a, 12.0a | 7.1, 7.2 | gfx908, gfx90a, gfx942, gfx950 |
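The table can also be checked programmatically before installing. The following is a minimal sketch, not an official MSLK tool: the `COMPATIBILITY` dictionary is transcribed by hand from the rows above, and `major_minor` and `find_problems` are hypothetical helper names. Following the statement above, it flags a PyTorch version only when it is older than the corresponding release.

```python
# Rows transcribed from the compatibility table above.
COMPATIBILITY = {
    "1.1.0": {
        "torch": "2.11",
        "python": ("3.10", "3.11", "3.12", "3.13", "3.14"),
        "cuda": ("12.6", "12.8", "12.9", "13.0"),
        "rocm": ("7.0", "7.1"),
    },
    "1.0.0": {
        "torch": "2.10",
        "python": ("3.10", "3.11", "3.12", "3.13", "3.14"),
        "cuda": ("12.6", "12.8", "12.9", "13.0"),
        "rocm": ("7.1", "7.2"),
    },
}


def major_minor(version):
    """Reduce '2.10.0+cu128' or '3.12.4' to a comparable (major, minor) tuple."""
    release = version.split("+")[0].split(".")
    return int(release[0]), int(release[1])


def find_problems(mslk_release, torch_version, python_version,
                  cuda_version=None, rocm_version=None):
    """Return a list of mismatches; an empty list means no mismatch was found."""
    row = COMPATIBILITY[mslk_release]
    problems = []
    if major_minor(torch_version) < major_minor(row["torch"]):
        problems.append(f"PyTorch {torch_version} is older than {row['torch']}.x")
    if major_minor(python_version) not in {major_minor(v) for v in row["python"]}:
        problems.append(f"Python {python_version} is not in the supported list")
    toolkits = (("CUDA", "cuda", cuda_version), ("ROCm", "rocm", rocm_version))
    for label, key, value in toolkits:
        if value is None:
            continue  # toolkit not in use, nothing to check
        if major_minor(value) not in {major_minor(v) for v in row[key]}:
            problems.append(f"{label} {value} is not in the supported list")
    return problems


if __name__ == "__main__":
    print(find_problems("1.0.0", "2.10.0+cu128", "3.12.4", cuda_version="12.8"))
```

An empty result means only that nothing in the table rules the combination out; it is not a guarantee that the combination works.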

Note that the supported CUDA/ROCm architectures apply to the compiled C++ kernels. In addition, some kernels (e.g. CUTLASS/CK) are specific to certain architectures. Kernels written in Python JIT DSLs (e.g. Triton) may work on a wider variety of architectures.
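To see whether a local NVIDIA GPU appears in the architecture list, the device's compute capability can be compared against it. This is a rough sketch under stated assumptions: the tuple below is copied from the 1.0.0/1.1.0 rows, `is_listed` is a hypothetical helper, and the comparison is an exact match that ignores the `a` suffix. A device that is not listed exactly may still run some kernels, so treat a negative result as a prompt to investigate rather than a verdict.

```python
# CUDA architectures copied from the compatibility table (assumed current).
SUPPORTED_CUDA_ARCHS = ("8.0", "9.0a", "10.0a", "12.0a")


def is_listed(capability, supported=SUPPORTED_CUDA_ARCHS):
    """Return True if a (major, minor) capability matches a listed architecture."""
    major, minor = capability
    device_arch = f"{major}.{minor}"
    # Drop the architecture-specific "a" suffix before comparing.
    return device_arch in {arch.rstrip("a") for arch in supported}


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed; cannot query the device.")
    else:
        if torch.version.cuda is None or not torch.cuda.is_available():
            print("No CUDA device visible to this PyTorch build.")
        else:
            capability = torch.cuda.get_device_capability()
            status = "listed" if is_listed(capability) else "not listed"
            print(f"compute capability {capability}: {status}")
```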

Running Benchmarks

python bench/gemm/gemm_bench.py --M 4096 --N 4096 --K 4096
python bench/quantize/quantize_bench.py --M 4096 --K 4096
python bench/conv/conv_bench.py
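To benchmark several GEMM shapes in one go, the command above can be wrapped in a small driver. This sketch assumes only what the command line above shows: the script path and the `--M`, `--N`, `--K` flags. The helper names `gemm_bench_command` and `sweep` are hypothetical, and the driver defaults to a dry run that just builds the commands.

```python
import subprocess
import sys

# Script path taken from the benchmark command above.
GEMM_BENCH = "bench/gemm/gemm_bench.py"


def gemm_bench_command(m, n, k, python=sys.executable):
    """Build the argument list for one GEMM benchmark run."""
    return [python, GEMM_BENCH, "--M", str(m), "--N", str(n), "--K", str(k)]


def sweep(shapes, dry_run=True):
    """Build, and optionally run, one benchmark command per (M, N, K) shape."""
    commands = [gemm_bench_command(*shape) for shape in shapes]
    if not dry_run:
        for command in commands:
            # check=True stops the sweep at the first failing benchmark.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    for command in sweep([(1024, 1024, 1024), (4096, 4096, 4096)]):
        print(" ".join(command))
```

Call `sweep(shapes, dry_run=False)` from the repository root to execute the runs.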

Running Tests

pytest test/gemm/gemm_test.py
pytest test/quantize/fp8_quantize_correctness_test.py
pytest test/conv/conv_test.py

Build From Source

Building from source is supported only on Linux. See the release compatibility table above for supported versions of Python, CUDA, and ROCm.

# Clone repo
git clone https://github.com/meta-pytorch/MSLK
cd MSLK
git submodule sync
git submodule update --init --recursive
# Build and install
# The script will create a conda environment and install the required dependencies.
# The conda environment will look something like: build-py3.14-torchnightly-cuda12.9.1
./ci/integration/mslk_oss_build.bash
# After the initial environment setup, you can activate the environment and iterate faster:
conda activate build-py3.14-torchnightly-cuda12.9.1
python setup.py install

Python-Only Build

If you don't need the C++/CUDA kernels (e.g. for testing Python-only changes), you can install MSLK in Python-only mode by setting the MSLK_PYTHON_ONLY environment variable. This skips the C++/CUDA compilation entirely.

MSLK_PYTHON_ONLY=1 pip install -e .
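The same install can be driven from a script, for example in a CI job, by passing the variable through the subprocess environment rather than the shell. This is a minimal sketch: `python_only_install` is a hypothetical helper, and the only MSLK-specific detail it relies on is the `MSLK_PYTHON_ONLY=1` setting shown above.

```python
import os
import subprocess
import sys


def python_only_install(repo_dir=".", dry_run=True):
    """Build the pip command and environment for a Python-only editable install."""
    # Copy the current environment and add the flag from the command above.
    env = dict(os.environ, MSLK_PYTHON_ONLY="1")
    command = [sys.executable, "-m", "pip", "install", "-e", "."]
    if not dry_run:
        # Run pip from the repository root so "." resolves to the MSLK checkout.
        subprocess.run(command, cwd=repo_dir, env=env, check=True)
    return command, env


if __name__ == "__main__":
    command, env = python_only_install()
    print("MSLK_PYTHON_ONLY=" + env["MSLK_PYTHON_ONLY"], " ".join(command))
```

Setting the variable on a copy of the environment keeps it scoped to the pip process, so later commands in the same script are unaffected.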

Join the MSLK community

For questions, support, news updates, or feature requests, please feel free to:

For contributions, please see the CONTRIBUTING file for ways to help out.

License

MSLK is BSD licensed, as found in the LICENSE file.
