Megatron-Core - a library for efficient and scalable training of transformer-based models

Project description

Megatron-Core

Megatron-Core is an open-source PyTorch-based library that contains GPU-optimized techniques and cutting-edge system-level optimizations. It abstracts them into composable and modular APIs, allowing full flexibility for developers and model researchers to train custom transformers at scale on NVIDIA accelerated computing infrastructure. This library is compatible with all NVIDIA Tensor Core GPUs, including FP8 acceleration support for NVIDIA Hopper architectures.

Megatron-Core offers core building blocks such as attention mechanisms, transformer blocks and layers, normalization layers, and embedding techniques. Additional functionality, such as activation recomputation and distributed checkpointing, is also natively built into the library. The building blocks and functionality are all GPU-optimized and can be combined with advanced parallelization strategies for optimal training speed and stability on NVIDIA accelerated computing infrastructure. Another key part of the library is its set of advanced model-parallelism techniques (tensor, sequence, pipeline, context, and MoE expert parallelism).

Megatron-Core can be used with NVIDIA NeMo, an enterprise-grade AI platform. Alternatively, you can explore Megatron-Core with a native PyTorch training loop in the NVIDIA/Megatron-LM repository. Visit the Megatron-Core documentation to learn more.
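To make the description above concrete, here is a minimal sketch of assembling a small GPT model from these building blocks, following the pattern in the Megatron-Core quickstart (parallel_state.initialize_model_parallel, TransformerConfig, GPTModel, get_gpt_layer_local_spec). The model sizes, the single-GPU parallel settings, and the torchrun-style environment handling are illustrative assumptions, not a recommended configuration.

```python
# Minimal sketch: build a tiny GPT model from Megatron-Core building blocks.
# Assumes a torchrun-style launch (LOCAL_RANK set); all sizes are illustrative.
import os
import torch

from megatron.core import parallel_state
from megatron.core.transformer.transformer_config import TransformerConfig
from megatron.core.models.gpt.gpt_model import GPTModel
from megatron.core.models.gpt.gpt_layer_specs import get_gpt_layer_local_spec

# Set up torch.distributed and Megatron-Core's model-parallel groups
# (tensor/pipeline parallel size of 1 here, i.e. plain data parallelism).
rank = int(os.environ.get("LOCAL_RANK", 0))
torch.cuda.set_device(rank)
torch.distributed.init_process_group(backend="nccl")
parallel_state.initialize_model_parallel(
    tensor_model_parallel_size=1,
    pipeline_model_parallel_size=1,
)

# Describe the transformer: layer count, hidden size, attention heads, dtype.
config = TransformerConfig(
    num_layers=2,
    hidden_size=128,
    num_attention_heads=4,
    use_cpu_initialization=True,
    pipeline_dtype=torch.float32,
)

# Assemble a GPT model from Megatron-Core's GPU-optimized transformer layers.
model = GPTModel(
    config=config,
    transformer_layer_spec=get_gpt_layer_local_spec(),
    vocab_size=256,
    max_sequence_length=64,
).cuda()
```

Launched with torchrun, the same skeleton scales out by raising tensor_model_parallel_size and pipeline_model_parallel_size; broadly speaking, the other parallelism modes listed above are configured through additional arguments to initialize_model_parallel and TransformerConfig.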

Download files

Download the file for your platform. If you're not sure which to choose, see the Python Packaging User Guide for guidance on installing packages.

Source Distributions

No source distribution files are available for this release. See the Python Packaging User Guide tutorial on generating distribution archives.

Built Distributions

megatron_core-0.9.0-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (1.6 MB)

Uploaded for: CPython 3.11, manylinux (glibc 2.24+ / glibc 2.28+), x86-64

megatron_core-0.9.0-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (1.6 MB)

Uploaded for: CPython 3.10, manylinux (glibc 2.24+ / glibc 2.28+), x86-64

File details

Details for the file megatron_core-0.9.0-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.

File hashes

Hashes for megatron_core-0.9.0-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl:

SHA256: b2c73c9e6fa58c93f3b1833ffd32bc08dc29b5d28fda7375c5a5e3a8aaeb3db8
MD5: 88fbaaa28dd341803bf1cb0c2fb48d32
BLAKE2b-256: d7bf4eb7772f2a5830dd11d0590fd56ce1dff8c82a398d893bfb583116d423ed

See the pip documentation for more details on using hashes.

File details

Details for the file megatron_core-0.9.0-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.

File hashes

Hashes for megatron_core-0.9.0-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl:

SHA256: c0d929cf92f0aee68b18916b0191beec917e63a7766b4834852bef664202cc76
MD5: 6aa725f3ae67304e679d1242417a8567
BLAKE2b-256: ff17cf9ab8e7aec4ab89e697e43e52d9801c8b788b79de5c0c810f154d7c0a2f

See the pip documentation for more details on using hashes.
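As a concrete illustration of using these hashes, the sketch below verifies a downloaded wheel against its published SHA256 digest with only the Python standard library; the local file path is an illustrative assumption.

```python
# Verify a downloaded wheel against the SHA256 digest published on PyPI.
# The local path below is an illustrative assumption.
import hashlib

wheel_path = "megatron_core-0.9.0-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl"
expected_sha256 = "b2c73c9e6fa58c93f3b1833ffd32bc08dc29b5d28fda7375c5a5e3a8aaeb3db8"

sha256 = hashlib.sha256()
with open(wheel_path, "rb") as f:
    # Read in chunks so large wheels do not have to fit in memory at once.
    for chunk in iter(lambda: f.read(8192), b""):
        sha256.update(chunk)

if sha256.hexdigest() == expected_sha256:
    print("OK: digest matches the published SHA256")
else:
    print("MISMATCH: do not install this file")
```

pip can enforce the same check at install time through its hash-checking mode, e.g. a requirements entry of the form megatron-core==0.9.0 --hash=sha256:<digest>.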
