MBridge: Bridge Megatron-Core to Hugging Face/Reinforcement Learning

Important

Megatron-Bridge is the official NVIDIA-maintained version; this project will migrate to Megatron-Bridge soon.

MBridge provides a seamless bridge between Hugging Face models and Megatron-Core's optimized implementation for efficient distributed training and inference. It also offers necessary tools and processes for integrating Reinforcement Learning (RL) with Megatron.

MBridge is a prototype project whose design has been adopted by Megatron-Bridge. For more advanced features such as a training loop, mixed precision (FP8, BF16, FP4, etc.), and PEFT, please refer to Megatron-Bridge.


2025-08 Update

  • Support for loading FP8 HF weights directly when training DeepSeekV3 models in bfloat16, without saving extra Megatron-Core format weights (MTP included; based on the dequantization kernel provided by DeepSeek). See example/4
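DeepSeek's FP8 checkpoints store each 2-D weight together with one inverse scale per 128×128 block, and dequantizing to bfloat16 amounts to multiplying each block by its scale. Below is a numpy sketch of that blockwise dequantization; the real path uses DeepSeek's CUDA kernel, and the shapes and scale layout here are illustrative assumptions, not mbridge internals:

```python
import numpy as np

def dequantize_blockwise(w_fp8, scale_inv, block=128):
    """Dequantize a 2-D weight stored with one inverse scale per
    (block x block) tile.

    w_fp8:     (M, N) array of quantized values (plain float here for
               illustration; the real checkpoint stores float8_e4m3).
    scale_inv: (ceil(M/block), ceil(N/block)) array of inverse scales.
    """
    M, N = w_fp8.shape
    out = np.empty((M, N), dtype=np.float32)
    for bi in range(0, M, block):
        for bj in range(0, N, block):
            s = scale_inv[bi // block, bj // block]
            out[bi:bi + block, bj:bj + block] = (
                w_fp8[bi:bi + block, bj:bj + block].astype(np.float32) * s
            )
    return out

# Toy usage: a 256x256 weight of ones, with a distinct scale per tile
w = np.ones((256, 256), dtype=np.float32)
s = np.array([[0.5, 2.0], [4.0, 8.0]], dtype=np.float32)
deq = dequantize_blockwise(w, s)  # top-left tile becomes 0.5, bottom-right 8.0
```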

Overview

MBridge allows you to convert popular Hugging Face models to Megatron-Core format, enabling you to leverage advanced parallelism strategies for large-scale training and inference. The library supports various model architectures and simplifies the process of transitioning between these frameworks. For Reinforcement Learning workflows, MBridge provides interfaces and tools needed to connect RL algorithms with Megatron-optimized models.

Feature Highlights

  • Comprehensive Model Support: Supports various model architectures, including MoE (Mixture of Experts) models
  • Online Weight Import: Loads HF weights online under various parallelism strategies, automatically sharding them; no extra Megatron-Core format weights need to be saved
  • Online Weight Export: Exports weights online to HF format for inference engines, with support for TP/PP/CP/VPP/EP/ETP parallelism strategies
  • Memory Friendly: Uses a per-tensor strategy to minimize peak memory when loading or exporting HF-format weights
  • Simple API: Intuitive interfaces for model conversion and weight management
  • Transformer Engine Support: Uses the powerful Transformer Engine to accelerate Megatron-Core models (use_te=False is not currently supported)
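The per-tensor strategy behind online import/export can be pictured as a generator that assembles and yields one full weight at a time, so peak memory stays near the size of the largest single tensor rather than the whole state dict. A library-free sketch of the pattern (the names here are illustrative stand-ins, not mbridge APIs):

```python
def gather(shards):
    # Stand-in for an all-gather of one weight's shards across
    # tensor-parallel ranks.
    return [x for part in shards for x in part]

def export_per_tensor(sharded):
    """Yield one fully assembled (name, weight) pair at a time, so only a
    single full tensor is ever alive (schematic version of what a
    generator like bridge.export_weights does)."""
    for name, shards in sharded.items():
        yield name, gather(shards)

# Usage: stream each weight to disk (or an inference engine), then drop it
sharded = {"layer0.weight": [[1, 2], [3, 4]], "layer1.weight": [[5], [6]]}
for name, full in export_per_tensor(sharded):
    print(name, full)  # layer0.weight [1, 2, 3, 4] / layer1.weight [5, 6]
```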

Installation

pip install mbridge

Quick Start

from megatron.core import parallel_state as mpu
from mbridge import AutoBridge

# Initialize the distributed environment (tp, pp, vpp, cp, ep are your parallelism degrees)
mpu.initialize_model_parallel(
    tensor_model_parallel_size=tp,
    pipeline_model_parallel_size=pp,
    virtual_pipeline_model_parallel_size=vpp,
    context_parallel_size=cp,
    expert_model_parallel_size=ep,
)

# Load a model from Hugging Face
HF_MODEL_PATH = "/path/to/Qwen/Qwen2.5-7B-Instruct"
# or a Llama model:
HF_MODEL_PATH = "/path/to/llama/llama3-8b-instruct"
bridge = AutoBridge.from_pretrained(HF_MODEL_PATH)

# Get a Megatron-Core model and load weights from Hugging Face
model = bridge.get_model(weight_path=HF_MODEL_PATH)

# Export weights back to Hugging Face format for inference engine
for key, weight in bridge.export_weights(model):
    # Process or save the exported weights
    print(f"Exported: {key}")

# Save the model in HF format
bridge.save_weights(model, "path/to/save/model", memory_efficient=False)  # set memory_efficient=True if the model is very large

Supported Models

Currently supported models:

  • Qwen2
  • Qwen2-MoE
  • Qwen3
  • Qwen3-MoE
  • LLaMA
  • DeepseekV3
  • Mixtral
  • Qwen2.5-VL
  • Mimo

Examples

The example directory contains scripts demonstrating common use cases:

  • 0.load_model_and_generate_single_gpu.py: Loading a model and generating text on a single GPU
  • 1.load_model_and_export_single_gpu.py: Loading a model and exporting weights on a single GPU
  • 2.load_model_and_export_multiple_gpus.py: Loading a model and exporting weights using multiple GPUs with TP/PP/CP/VPP parallelism

Post Model Creation Callbacks

MBridge provides a set of post model creation callbacks to customize the model after it is created.

  • make_value_model: Convert the model into a value model by adding a value head
  • freeze_moe_router: Freeze the MoE router parameters of the model

from mbridge.utils.post_creation_callbacks import make_value_model, freeze_moe_router

bridge = AutoBridge.from_pretrained(HF_MODEL_PATH)
model = bridge.get_model(weight_path=HF_MODEL_PATH, post_model_creation_callbacks=[make_value_model, freeze_moe_router])
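Schematically, get_model applies each callback to the freshly created model, in order. A library-free sketch of that composition pattern (the ToyModel class and callback bodies are illustrative stand-ins, not mbridge internals; check mbridge.utils.post_creation_callbacks for the real signatures):

```python
def apply_post_creation_callbacks(model, callbacks):
    """Run each callback on the model, in order (schematic version of how
    post_model_creation_callbacks are consumed)."""
    for cb in callbacks:
        cb(model)
    return model

class ToyModel:
    def __init__(self):
        self.router_frozen = False
        self.value_head = None

# Hypothetical callbacks mirroring freeze_moe_router / make_value_model
def freeze_router(model):
    model.router_frozen = True

def add_value_head(model):
    model.value_head = "linear(hidden -> 1)"

m = apply_post_creation_callbacks(ToyModel(), [freeze_router, add_value_head])
print(m.router_frozen, m.value_head)  # True linear(hidden -> 1)
```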

Development Roadmap

MBridge will continue to maintain support for popular models but will not develop more advanced features; see Megatron-Bridge for those.

Acknowledgements

  • veRL has adopted MBridge as a connector to Megatron-Core.
  • slime has adopted MBridge as a Megatron-Core checkpoint converter.
  • Nemo-RL has adopted Megatron-Bridge as its Megatron-Core connector.
  • Community contributions: Special thanks to @Thaurun, @liuzhenhai93, and @jeffhong1997 from the WeChat team for contributing support for many VLM models.

License

Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.
