MSLK Library

MSLK (Meta Superintelligence Labs Kernels, formerly known as FBGEMM GenAI) is a collection of high-performance kernels and optimizations built on top of PyTorch primitives for GenAI training and inference.

Installation

# Install MSLK for CUDA
pip install mslk-cuda==1.0.0
# Install MSLK for ROCm
pip install mslk-rocm==1.0.0
# Install a nightly CUDA version
pip install --pre mslk --index-url https://download.pytorch.org/whl/nightly/cu128
# Install a nightly ROCm version
pip install --pre mslk --index-url https://download.pytorch.org/whl/nightly/rocm7.1/
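
To confirm which variant ended up in the active environment, one option is to query the installed distribution metadata from Python. The following is a minimal sketch using only the standard library. The distribution names it checks (`mslk-cuda`, `mslk-rocm`, `mslk`) are taken from the commands above; whether nightly wheels register under one of these names is an assumption, and `installed_mslk` is a hypothetical helper, not part of MSLK.

```python
from importlib import metadata

# Distribution names taken from the pip commands above. Nightly builds may
# register under a different name (assumption), so extend this tuple as needed.
CANDIDATES = ("mslk-cuda", "mslk-rocm", "mslk")


def installed_mslk(candidates=CANDIDATES):
    """Return {distribution name: version} for each candidate that is installed."""
    found = {}
    for name in candidates:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # This candidate is not installed; try the next one.
            continue
    return found


if __name__ == "__main__":
    found = installed_mslk()
    if found:
        for name, version in found.items():
            print(f"{name}=={version}")
    else:
        print("No MSLK distribution found in this environment")
```

If more than one name is reported, the environment probably contains both a CUDA and a ROCm build, which is worth cleaning up before debugging anything else.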

Release Compatibility Table

MSLK is released in accordance with the PyTorch release schedule. An MSLK release is not guaranteed to work with PyTorch releases older than the one it corresponds to.

| MSLK Release | Corresponding PyTorch Release | Supported Python Versions | Supported CUDA Versions | Supported CUDA Architectures | Supported ROCm Versions | Supported ROCm Architectures |
| --- | --- | --- | --- | --- | --- | --- |
| 1.1.0 | 2.11.x | 3.10, 3.11, 3.12, 3.13, 3.14 | 12.6, 12.8, 12.9, 13.0 | 8.0, 9.0a, 10.0a, 12.0a | 7.0, 7.1 | gfx908, gfx90a, gfx942, gfx950 |
| 1.0.0 | 2.10.x | 3.10, 3.11, 3.12, 3.13, 3.14 | 12.6, 12.8, 12.9, 13.0 | 8.0, 9.0a, 10.0a, 12.0a | 7.1, 7.2 | gfx908, gfx90a, gfx942, gfx950 |

Note that the supported CUDA/ROCm architectures refer to the compiled C++ kernels. In addition, some kernels (e.g. CUTLASS/CK) are specific to certain architectures. Kernels written in Python JIT DSLs (e.g. Triton) may work on a wider variety of architectures.
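
The table can also be expressed as data and checked programmatically before installing. The sketch below is illustrative only: `COMPAT` transcribes the version columns of the table above (architectures are omitted for brevity), and `compatibility_problems` is a hypothetical helper, not an MSLK API. It treats only an older PyTorch as a problem, following the statement above; behavior with newer PyTorch releases is not covered by the table.

```python
# Version columns transcribed from the release compatibility table above.
COMPAT = {
    "1.1.0": {
        "pytorch": "2.11",
        "python": {"3.10", "3.11", "3.12", "3.13", "3.14"},
        "cuda": {"12.6", "12.8", "12.9", "13.0"},
        "rocm": {"7.0", "7.1"},
    },
    "1.0.0": {
        "pytorch": "2.10",
        "python": {"3.10", "3.11", "3.12", "3.13", "3.14"},
        "cuda": {"12.6", "12.8", "12.9", "13.0"},
        "rocm": {"7.1", "7.2"},
    },
}


def _major_minor(version):
    """Parse '2.10.0' or '2.10.0+cu128' into the tuple (2, 10)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def compatibility_problems(release, pytorch, python, cuda=None, rocm=None):
    """Return a list of human-readable mismatches against the table row."""
    row = COMPAT[release]
    problems = []
    if _major_minor(pytorch) < _major_minor(row["pytorch"]):
        problems.append(f"PyTorch {pytorch} is older than {row['pytorch']}.x")
    if python not in row["python"]:
        problems.append(f"Python {python} is not listed")
    if cuda is not None and cuda not in row["cuda"]:
        problems.append(f"CUDA {cuda} is not listed")
    if rocm is not None and rocm not in row["rocm"]:
        problems.append(f"ROCm {rocm} is not listed")
    return problems


if __name__ == "__main__":
    # A combination that matches the 1.0.0 row, so no problems are reported.
    print(compatibility_problems("1.0.0", pytorch="2.10.0", python="3.12", cuda="12.8"))  # []
```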

Running Benchmarks

python bench/gemm/gemm_bench.py --M 4096 --N 4096 --K 4096
python bench/quantize/quantize_bench.py --M 4096 --K 4096
python bench/conv/conv_bench.py
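
The `--M`, `--N`, and `--K` flags describe the GEMM problem shape: an M x K matrix multiplied by a K x N matrix. A common way to interpret a timing for such a problem is to convert it to throughput using 2 * M * N * K floating-point operations. The sketch below shows that arithmetic; whether `gemm_bench.py` reports this metric is an assumption, and the helper names are hypothetical.

```python
def gemm_flops(m, n, k):
    """FLOPs for an (m x k) @ (k x n) matmul: one multiply and one add per inner-product term."""
    return 2 * m * n * k


def tflops(m, n, k, seconds):
    """Convert a measured runtime in seconds to TFLOP/s for the given shape."""
    return gemm_flops(m, n, k) / seconds / 1e12


if __name__ == "__main__":
    # Shape from the command above. The 1 ms runtime is a made-up
    # illustration, not a measurement of any MSLK kernel.
    print(f"{tflops(4096, 4096, 4096, 1e-3):.1f} TFLOP/s")
```

Comparing the resulting figure against the hardware's peak throughput for the relevant data type gives a rough utilization estimate.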

Running Tests

pytest test/gemm/gemm_test.py
pytest test/quantize/fp8_quantize_correctness_test.py
pytest test/conv/conv_test.py

Build From Source

We only support building on Linux. See the release compatibility table above for the supported versions of Python, CUDA, and ROCm.

# Clone repo
git clone https://github.com/meta-pytorch/MSLK
cd MSLK
git submodule sync
git submodule update --init --recursive
# Build and install
# The script will create a conda environment and install the required dependencies.
# The conda environment will look something like: build-py3.14-torchnightly-cuda12.9.1
./ci/integration/mslk_oss_build.bash
# After the initial environment setup, you can activate the environment and iterate faster:
conda activate build-py3.14-torchnightly-cuda12.9.1
python setup.py install

Python-Only Build

If you don't need the C++/CUDA kernels (e.g. for testing Python-only changes), you can install MSLK in Python-only mode by setting the MSLK_PYTHON_ONLY environment variable. This skips the C++/CUDA compilation entirely.

MSLK_PYTHON_ONLY=1 pip install -e .
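
For readers curious how a build script can honor a flag like this, the sketch below shows one common pattern: read the environment variable and drop the native compilation step when it is set. This is not MSLK's actual `setup.py`. The function names, the step names, and the set of values treated as "enabled" are all assumptions made for illustration.

```python
import os

# Assumption: which values count as "enabled" is not documented above;
# this sketch accepts a few common spellings.
_TRUTHY = {"1", "true", "yes", "on"}


def python_only_requested(environ=os.environ):
    """Return True when MSLK_PYTHON_ONLY is set to a truthy value."""
    return environ.get("MSLK_PYTHON_ONLY", "").strip().lower() in _TRUTHY


def select_build_steps(environ=os.environ):
    """Pick the (hypothetical) build steps a setup script would run."""
    steps = ["install_python_sources"]
    if not python_only_requested(environ):
        # Native kernels are compiled first unless Python-only mode is on.
        steps.insert(0, "compile_native_kernels")
    return steps


if __name__ == "__main__":
    print(select_build_steps({"MSLK_PYTHON_ONLY": "1"}))
    print(select_build_steps({}))
```

Passing the environment as a parameter keeps the decision easy to test without modifying the real process environment.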

Join the MSLK community

For questions, support, news updates, or feature requests, please reach out through the project's GitHub repository: https://github.com/meta-pytorch/MSLK

For contributions, please see the CONTRIBUTING file for ways to help out.

License

MSLK is BSD licensed, as found in the LICENSE file.
