Skip to main content

No project description provided

Project description

MSLK Logo

MSLK Library

MSLK (Meta Superintelligence Labs Kernels, formerly known as FBGEMM GenAI) is a collection of high-performance kernels and optimizations built on top of PyTorch primitives for GenAI training and inference.

Installation

# Install MSLK for CUDA
pip install mslk-cuda==1.0.0
# Install MSLK for ROCm
pip install mslk-rocm==1.0.0
# Install a nightly CUDA version
pip install --pre mslk --index-url https://download.pytorch.org/whl/nightly/cu128
# Install a nightly ROCm version
pip install --pre mslk --index-url https://download.pytorch.org/whl/nightly/rocm7.1/

Release Compatibility Table

MSLK is released in accordance to the PyTorch release schedule, and each release has no guarantee to work in conjunction with PyTorch releases that are older than the one that the MSLK release corresponds to.

MSLK Release Corresponding PyTorch Release Supported Python Versions Supported CUDA Versions Supported CUDA Architectures Supported ROCm Versions Supported ROCm Architectures
1.1.0 2.11.x 3.10, 3.11, 3.12, 3.13, 3.14 12.6, 12.8, 12.9, 13.0 8.0, 9.0a, 10.0a, 12.0a 7.0, 7.1 gfx908, gfx90a, gfx942, gfx950
1.0.0 2.10.x 3.10, 3.11, 3.12, 3.13, 3.14 12.6, 12.8, 12.9, 13.0 8.0, 9.0a, 10.0a, 12.0a 7.1, 7.2 gfx908, gfx90a, gfx942, gfx950

Note that the supported CUDA/ROCm Architectures refer to compiled C++ kernels. In addition, some kernels (e.g. CUTLASS/CK) would be specific to certain architectures. Python JIT DSL based kernels (e.g. Triton) would potentially work on wider variety of architectures.

Running Benchmarks

python bench/gemm/gemm_bench.py --M 4096 --N 4096 --K 4096
python bench/quantize/quantize_bench.py --M 4096 --K 4096
python bench/conv/conv_bench.py

Running Tests

pytest test/gemm/gemm_test.py
pytest test/quantize/fp8_quantize_correctness_test.py
pytest test/conv/conv_test.py

Build From Source

We only support building on Linux. See the release compatibility table above for supported versions of Python, CUDA, ROCm.

# Clone repo
git clone https://github.com/meta-pytorch/MSLK
cd MSLK
git submodule sync
git submodule update --init --recursive
# Build and install
# The script will create a conda environment and install the required dependencies.
# The conda environment will look something like: build-py3.14-torchnightly-cuda12.9.1
./ci/integration/mslk_oss_build.bash
# After the initial environment setup, you can activate the environment and iterate faster:
conda activate build-py3.14-torchnightly-cuda12.9.1
python setup.py install

Python-Only Build

If you don't need the C++/CUDA kernels (e.g. for testing Python only changes), you can install MSLK in Python-only mode by setting the MSLK_PYTHON_ONLY environment variable. This skips the C++/CUDA compilation entirely.

MSLK_PYTHON_ONLY=1 pip install -e .

Join the MSLK community

For questions, support, news updates, or feature requests, please feel free to:

For contributions, please see the CONTRIBUTING file for ways to help out.

License

MSLK is BSD licensed, as found in the LICENSE file.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.See tutorial on generating distribution archives.

Built Distributions

If you're not sure about the file name format, learn more about wheel file names.

mslk_cuda_nightly-2026.4.3-cp314-cp314-manylinux_2_28_x86_64.whl (47.6 MB view details)

Uploaded CPython 3.14manylinux: glibc 2.28+ x86-64

mslk_cuda_nightly-2026.4.3-cp313-cp313-manylinux_2_28_x86_64.whl (47.6 MB view details)

Uploaded CPython 3.13manylinux: glibc 2.28+ x86-64

mslk_cuda_nightly-2026.4.3-cp312-cp312-manylinux_2_28_x86_64.whl (46.0 MB view details)

Uploaded CPython 3.12manylinux: glibc 2.28+ x86-64

mslk_cuda_nightly-2026.4.3-cp311-cp311-manylinux_2_28_x86_64.whl (46.0 MB view details)

Uploaded CPython 3.11manylinux: glibc 2.28+ x86-64

mslk_cuda_nightly-2026.4.3-cp310-cp310-manylinux_2_28_x86_64.whl (46.0 MB view details)

Uploaded CPython 3.10manylinux: glibc 2.28+ x86-64

File details

Details for the file mslk_cuda_nightly-2026.4.3-cp314-cp314-manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for mslk_cuda_nightly-2026.4.3-cp314-cp314-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 a55236f27d3aa31b54cce0a73016423acff4133544ed67b7ffee95da1415be8c
MD5 53aea6ce549ad199eb7b28ff5bfcfbcd
BLAKE2b-256 ab831cefb1c65cd2c9143984d986c69eba15c0d9d2cd076a2ce91f498c7ed3c2

See more details on using hashes here.

File details

Details for the file mslk_cuda_nightly-2026.4.3-cp313-cp313-manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for mslk_cuda_nightly-2026.4.3-cp313-cp313-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 6b8e2fa721f8f4394d2ef57c6f3e40946037ccb5b9afb3a93aad0eecb5f7c01c
MD5 75b9eadb884094ead16c8b445de28788
BLAKE2b-256 41d037c0ed42dd83bcdb5669c50e73eef621605d13cc5644b222b0b557a75967

See more details on using hashes here.

File details

Details for the file mslk_cuda_nightly-2026.4.3-cp312-cp312-manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for mslk_cuda_nightly-2026.4.3-cp312-cp312-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 4ed6eddfac8938addc90d436062d363041d20ead76561ef4b453143e0f754b50
MD5 da10494a41a3fd3374fe90804c4e17fc
BLAKE2b-256 5e31f43171a3acfa096742e0dcaace67e1599451063cd3d5af4065b47b98f452

See more details on using hashes here.

File details

Details for the file mslk_cuda_nightly-2026.4.3-cp311-cp311-manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for mslk_cuda_nightly-2026.4.3-cp311-cp311-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 1582be7d1766b2594dee6fa3ffa76eaf9e0af429977eff4719984bb512c5396d
MD5 437ef8589dc7cc596cc15760eb2474e3
BLAKE2b-256 16916b9930448eea86a4ca45987b96399d9bcc5770e30a72834286718d592e9c

See more details on using hashes here.

File details

Details for the file mslk_cuda_nightly-2026.4.3-cp310-cp310-manylinux_2_28_x86_64.whl.

File metadata

File hashes

Hashes for mslk_cuda_nightly-2026.4.3-cp310-cp310-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 ab2969c8b12e130661b5c3d65cbebdc3d371a7154edffe8328d400f0a262a0e1
MD5 4b7c5ff881ef1753e2e318aa2c16ca05
BLAKE2b-256 8a0063e7620d6e9beac50369717680ad051a08d7a7c93b5badd062befe6c9f6a

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page