MSLK Library

MSLK (Meta Superintelligence Labs Kernels, formerly known as FBGEMM GenAI) is a collection of high-performance kernels and optimizations built on top of PyTorch primitives for GenAI training and inference.

Installation

# Install MSLK for CUDA
pip install mslk-cuda==1.0.0
# Install MSLK for ROCm
pip install mslk-rocm==1.0.0
# Install a nightly CUDA version
pip install --pre mslk --index-url https://download.pytorch.org/whl/nightly/cu128
# Install a nightly ROCm version
pip install --pre mslk --index-url https://download.pytorch.org/whl/nightly/rocm7.1/

Release Compatibility Table

MSLK releases follow the PyTorch release schedule. An MSLK release is not guaranteed to work with PyTorch releases older than the one it corresponds to.

| MSLK Release | Corresponding PyTorch Release | Supported Python Versions | Supported CUDA Versions | Supported CUDA Architectures | Supported ROCm Versions | Supported ROCm Architectures |
|---|---|---|---|---|---|---|
| 1.1.0 | 2.11.x | 3.10, 3.11, 3.12, 3.13, 3.14 | 12.6, 12.8, 12.9, 13.0 | 8.0, 9.0a, 10.0a, 12.0a | 7.0, 7.1 | gfx908, gfx90a, gfx942, gfx950 |
| 1.0.0 | 2.10.x | 3.10, 3.11, 3.12, 3.13, 3.14 | 12.6, 12.8, 12.9, 13.0 | 8.0, 9.0a, 10.0a, 12.0a | 7.1, 7.2 | gfx908, gfx90a, gfx942, gfx950 |
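
The version rule above can be checked before installing. The following is a minimal sketch, not part of MSLK: the `MIN_TORCH` mapping is copied by hand from the table, and the helper names `torch_series` and `torch_is_new_enough` are hypothetical. It only tests whether a PyTorch version is at least as new as the release an MSLK version corresponds to; it says nothing about whether a newer PyTorch actually works.

```python
# Hand-copied from the compatibility table: MSLK release -> corresponding
# PyTorch release series as (major, minor). Not an MSLK API.
MIN_TORCH = {
    "1.1.0": (2, 11),
    "1.0.0": (2, 10),
}


def torch_series(version: str) -> tuple:
    """Return (major, minor) from a version string such as '2.10.0+cu128'."""
    public = version.split("+", 1)[0]  # drop a local tag such as '+cu128'
    major, minor = public.split(".")[:2]
    return int(major), int(minor)


def torch_is_new_enough(mslk_version: str, torch_version: str) -> bool:
    """True if torch_version is not older than the release MSLK corresponds to."""
    if mslk_version not in MIN_TORCH:
        raise KeyError(f"MSLK release {mslk_version!r} is not in the table")
    return torch_series(torch_version) >= MIN_TORCH[mslk_version]


if __name__ == "__main__":
    print(torch_is_new_enough("1.0.0", "2.10.0+cu128"))
    print(torch_is_new_enough("1.1.0", "2.10.1"))
```

In a live environment the second argument could come from `torch.__version__`.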

Note that the supported CUDA/ROCm architectures refer to the compiled C++ kernels. In addition, some kernels (e.g. CUTLASS/CK) are specific to certain architectures. Kernels written in Python JIT DSLs (e.g. Triton) may work on a wider variety of architectures.
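
To see how a device relates to the compiled CUDA architectures listed above, a small check can help. This is a sketch under stated assumptions, not MSLK's own logic: the helper `has_compiled_kernels` is hypothetical, and it applies the general CUDA rule that a plain entry such as `8.0` also covers later minor versions of the same major version, while an entry with an `a` suffix (such as `9.0a`) is architecture-specific and matches only that exact compute capability. Individual kernels may still be restricted further.

```python
# Compiled-kernel CUDA architectures, copied from the compatibility table.
SUPPORTED_CUDA_ARCHS = ["8.0", "9.0a", "10.0a", "12.0a"]


def _parse(entry: str):
    """Split '9.0a' into (9, 0, True); the flag marks an arch-specific build."""
    specific = entry.endswith("a")
    major, minor = entry.rstrip("a").split(".")
    return int(major), int(minor), specific


def has_compiled_kernels(major: int, minor: int,
                         supported=SUPPORTED_CUDA_ARCHS) -> bool:
    """Hypothetical helper: does a device's compute capability match the list?"""
    for entry in supported:
        e_major, e_minor, specific = _parse(entry)
        if specific:
            # 'a' builds are not forward compatible: exact match only.
            if (major, minor) == (e_major, e_minor):
                return True
        elif major == e_major and minor >= e_minor:
            # Plain builds run on later minor versions of the same major.
            return True
    return False


if __name__ == "__main__":
    for cap in [(8, 0), (8, 6), (9, 0), (7, 5)]:
        print(cap, has_compiled_kernels(*cap))
```

On a machine with PyTorch and a GPU, the `(major, minor)` pair can be obtained from `torch.cuda.get_device_capability()`.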

Running Benchmarks

python bench/gemm/gemm_bench.py --M 4096 --N 4096 --K 4096
python bench/quantize/quantize_bench.py --M 4096 --K 4096
python bench/conv/conv_bench.py
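
When comparing GEMM timings across shapes, it can help to convert a measured time into throughput. The sketch below is not taken from the MSLK benchmark scripts, whose output format is not described here; `gemm_tflops` is a hypothetical helper that uses the standard count of 2·M·N·K floating-point operations for an M×K by K×N matrix multiply.

```python
def gemm_tflops(m: int, n: int, k: int, seconds: float) -> float:
    """Throughput in TFLOP/s for one (m x k) @ (k x n) multiply.

    Each output element needs k multiplies and k adds, hence 2*m*n*k FLOPs.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return (2 * m * n * k) / seconds / 1e12


if __name__ == "__main__":
    # Shape from the benchmark command above; the 1 ms timing is made up.
    print(f"{gemm_tflops(4096, 4096, 4096, 1e-3):.1f} TFLOP/s")
```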

Running Tests

pytest test/gemm/gemm_test.py
pytest test/quantize/fp8_quantize_correctness_test.py
pytest test/conv/conv_test.py

Build From Source

We only support building on Linux. See the release compatibility table above for supported versions of Python, CUDA, and ROCm.

# Clone repo
git clone https://github.com/meta-pytorch/MSLK
cd MSLK
git submodule sync
git submodule update --init --recursive
# Build and install
# The script will create a conda environment and install the required dependencies.
# The conda environment will look something like: build-py3.14-torchnightly-cuda12.9.1
./ci/integration/mslk_oss_build.bash
# After the initial environment setup, you can activate the environment and iterate faster:
conda activate build-py3.14-torchnightly-cuda12.9.1
python setup.py install

Python-Only Build

If you don't need the C++/CUDA kernels (e.g. for testing Python-only changes), you can install MSLK in Python-only mode by setting the MSLK_PYTHON_ONLY environment variable. This skips the C++/CUDA compilation entirely.

MSLK_PYTHON_ONLY=1 pip install -e .

Join the MSLK community

For questions, support, news updates, or feature requests, please feel free to:

For contributions, please see the CONTRIBUTING file for ways to help out.

License

MSLK is BSD licensed, as found in the LICENSE file.


Download files

Source Distributions

No source distribution files available for this release.

Built Distributions

mslk_cuda_nightly-2026.4.4-cp314-cp314-manylinux_2_28_x86_64.whl (46.0 MB)
CPython 3.14, manylinux: glibc 2.28+ x86-64

mslk_cuda_nightly-2026.4.4-cp313-cp313-manylinux_2_28_x86_64.whl (46.0 MB)
CPython 3.13, manylinux: glibc 2.28+ x86-64

mslk_cuda_nightly-2026.4.4-cp312-cp312-manylinux_2_28_x86_64.whl (46.0 MB)
CPython 3.12, manylinux: glibc 2.28+ x86-64

mslk_cuda_nightly-2026.4.4-cp311-cp311-manylinux_2_28_x86_64.whl (46.0 MB)
CPython 3.11, manylinux: glibc 2.28+ x86-64

mslk_cuda_nightly-2026.4.4-cp310-cp310-manylinux_2_28_x86_64.whl (46.0 MB)
CPython 3.10, manylinux: glibc 2.28+ x86-64

File details

Details for the file mslk_cuda_nightly-2026.4.4-cp314-cp314-manylinux_2_28_x86_64.whl.

File hashes

Hashes for mslk_cuda_nightly-2026.4.4-cp314-cp314-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 5750688f9e15049ae19c04257d659350339cc1b8a430d86a2d9f6b53792e8e0e
MD5 9f20b36a3e40fe75a3e80e713b2825f4
BLAKE2b-256 f52e63027ea327adc6a53ebf29e3ee1d0cc1e87b9c9d8b2a1284fe74277b2625

File details

Details for the file mslk_cuda_nightly-2026.4.4-cp313-cp313-manylinux_2_28_x86_64.whl.

File hashes

Hashes for mslk_cuda_nightly-2026.4.4-cp313-cp313-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 50508eea9c9c322125b94e6068f13172982fdc507ec488500ce51308626719b2
MD5 57495fa649ab9003fd1afdfec8b57f05
BLAKE2b-256 f9ea1d87157fcd7e0fc1ef50bec7d7eb9d6838a86a2694c69ee120d3cf0b05cf

File details

Details for the file mslk_cuda_nightly-2026.4.4-cp312-cp312-manylinux_2_28_x86_64.whl.

File hashes

Hashes for mslk_cuda_nightly-2026.4.4-cp312-cp312-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 dd63e2ab88fd50e8935e810d2827e47889b49dda7c5f8f34da7b93a9abde783b
MD5 9ffb11edd5941170971559a43703b335
BLAKE2b-256 49d911ba7913c7d3486c8fb8941396906087abab53953c9e5e2cfd43838a56e5

File details

Details for the file mslk_cuda_nightly-2026.4.4-cp311-cp311-manylinux_2_28_x86_64.whl.

File hashes

Hashes for mslk_cuda_nightly-2026.4.4-cp311-cp311-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 ebe207a314a39cd09a9c7c4d35135ba3462fe825c45106f44c0fcd8a63c172ec
MD5 242a73bbafdac881759a678c67d5a38c
BLAKE2b-256 2ba4762cac55fe671be5c1475621e162e398799a10f24ea372803d7f9e3f7704

File details

Details for the file mslk_cuda_nightly-2026.4.4-cp310-cp310-manylinux_2_28_x86_64.whl.

File hashes

Hashes for mslk_cuda_nightly-2026.4.4-cp310-cp310-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 81decd70c70a0a48bf44dd1c797611c29acc20faa40f42d87ce42fd3da36d758
MD5 36b4544359f603a2b42c4b2675a23458
BLAKE2b-256 3220c40249a19438f1be22146d9e2722d5f781efafce92a12df4c9701afb18d3
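
A downloaded wheel can be checked against the SHA256 digests listed above. The following is a minimal sketch using only Python's standard `hashlib`; the helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the file path is whatever location the wheel was saved to.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks to limit memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 with a published digest (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, `matches_published_hash("mslk_cuda_nightly-2026.4.4-cp310-cp310-manylinux_2_28_x86_64.whl", "81decd70c70a0a48bf44dd1c797611c29acc20faa40f42d87ce42fd3da36d758")` should return `True` for an intact download of the CPython 3.10 wheel. pip can also enforce hashes itself through its `--require-hashes` mode.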
