
msadapter project

Project description

MSAdapter


MSAdapter is an ecosystem-adaptation tool for MindSpore. Without changing users' existing habits, it quickly migrates code written for third-party frameworks such as PyTorch and JAX onto the MindSpore ecosystem, helping users make efficient use of Ascend compute power.

Introduction

msadapter is a tool that efficiently migrates PyTorch training scripts to run on the MindSpore framework. Its goal is to let PyTorch code achieve good performance on Ascend hardware without changing how existing PyTorch users work.

  • PyTorch API support: msadapter currently adapts most commonly used PyTorch APIs. The API usage stays unchanged, while execution runs on the Ascend compute platform under MindSpore's dynamic-graph or static-graph mode. See the torch API support list for coverage details.

Documentation

For more details on installation, tutorials, and the API, please refer to the tutorial documentation.

Getting started with msadapter

Installation

First check the release notes to choose the msadapter and MindSpore versions you need.

Install MindSpore

Please follow the installation guide on the MindSpore website.

Install msadapter

① Step 1: download the source
 git clone https://gitee.com/mindspore/msadapter.git
② Step 2: build
 cd msadapter
 bash scripts/build.sh

After the build completes, a build folder and a dist folder are created under the msadapter directory.

③ Step 3: install

Option 1: switch via environment variables

  1. From source
export PYTHONPATH=${MindSpeed_Core_MS_PATH}/msadapter/:$PYTHONPATH
export PYTHONPATH=${MindSpeed_Core_MS_PATH}/msadapter/msa_thirdparty:$PYTHONPATH
  2. From the wheel package
pip install ${MindSpeed_Core_MS_PATH}/msadapter/dist/*.whl
export PYTHONPATH=/*/site-packages/msa_thirdparty:$PYTHONPATH
# /*/site-packages is the package installation path of your Python environment; you can obtain it with pip show msadapter.
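
The site-packages path mentioned in the comment above can also be resolved at runtime instead of being typed by hand. A minimal sketch, assuming the wheel has been installed and `pip show msadapter` succeeds:

```shell
# Query pip for msadapter's install location (the "Location:" field),
# then extend PYTHONPATH with the bundled msa_thirdparty directory.
SITE=$(pip show msadapter | awk -F': ' '/^Location/{print $2}')
export PYTHONPATH="${SITE}/msa_thirdparty:${PYTHONPATH}"
```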

Option 2: add one line at the top of the script to switch the backend

import msadapter # switch execution to the MindSpore backend
import torch
from torch.nn import functional as F

Code in the script to control whether msadapter is used:

msadapter.enable_torch_proxy(True)
msadapter.enable_torch_proxy(False)

Using msadapter

Enable msadapter via the environment variables; the script uses PyTorch unchanged:

import torch
from torch import nn
from torch.nn import functional as F

net = nn.Linear(10, 1)

The script uses PyTorch, importing msadapter at the very beginning:

import msadapter # switch execution to the MindSpore backend

import torch
from torch import nn
from torch.nn import functional as F

net = nn.Linear(10, 1)

The script uses msadapter directly:

import msadapter
from msadapter import nn
from msadapter.nn import functional as F

net = nn.Linear(10, 1)

Once msadapter is installed, you can use it as follows:

Simply import torch to adapt PyTorch code to msadapter and run it on NPU devices:
 import msadapter
 import torch
 from torch import nn
 from torch.utils.data import DataLoader
 from torchvision import datasets
 from torchvision.transforms import ToTensor

 # 1.Working with data
 # Download training data from open datasets.
 training_data = datasets.FashionMNIST(root="data", train=True, download=True, transform=ToTensor())
 # Download test data from open datasets.
 test_data = datasets.FashionMNIST(root="data", train=False, download=True, transform=ToTensor())

 # 2.Creating Models
 class NeuralNetwork(nn.Module):
     def __init__(self):
         super().__init__()
         self.flatten = nn.Flatten()
         self.linear_relu_stack = nn.Sequential(
             nn.Linear(28*28, 512),
             nn.ReLU(),
             nn.Linear(512, 512),
             nn.ReLU(),
             nn.Linear(512, 10)
         )

     def forward(self, x):
         x = self.flatten(x)
         logits = self.linear_relu_stack(x)
         return logits

 if __name__ == '__main__':
     train_dataloader = DataLoader(training_data, batch_size=64)
     test_dataloader = DataLoader(test_data, batch_size=64)

     # 3.Instantiate the model
     model = NeuralNetwork()

     classes = [
         "T-shirt/top",
         "Trouser",
         "Pullover",
         "Dress",
         "Coat",
         "Sandal",
         "Shirt",
         "Sneaker",
         "Bag",
         "Ankle boot",
     ]
     # 4.Predict
     model.eval()
     x, y = test_data[0][0], test_data[0][1]
     with torch.no_grad():
         pred = model(x)
         predicted, actual = classes[pred[0].argmax(0)], classes[y]
         print(f'Predicted: "{predicted}", Actual: "{actual}"')


After msadapter is installed, modules imported under torch names are automatically converted at execution time to the corresponding msadapter modules (automatic conversion currently covers torch, torchvision, torch_npu, torchair, and related modules); then simply run the main entry .py file. For more usage patterns, see the user guide.
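
The automatic conversion relies on Python's import machinery. The sketch below is not msadapter's actual implementation, just a minimal illustration of the underlying mechanism: a meta path finder that redirects imports of a given name (here a made-up `fake_torch`) to a replacement module, similar in spirit to how an adapter can intercept `import torch`.

```python
import importlib.abc
import importlib.util
import sys
import types

# A stand-in backend module that will be served instead of a real one.
backend = types.ModuleType("fake_torch")
backend.backend_name = "replacement"

class RedirectFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Serve `backend` whenever someone imports 'fake_torch'."""

    def find_spec(self, name, path=None, target=None):
        if name == "fake_torch":
            return importlib.util.spec_from_loader(name, self)
        return None  # let the normal import machinery handle other names

    def create_module(self, spec):
        return backend  # hand the import system our replacement module

    def exec_module(self, module):
        pass  # module object is already populated

sys.meta_path.insert(0, RedirectFinder())

import fake_torch  # resolved by RedirectFinder, not by the filesystem
print(fake_torch is backend)    # True: the import was redirected
print(fake_torch.backend_name)  # replacement
```

A real adapter additionally has to map submodules (torch.nn, torchvision, ...) and support being toggled, as the `enable_torch_proxy` calls above suggest.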

Release notes

Branch | Release | Release date | Compatible MindSpore version
master | -       | -            | MindSpore master

Limitations

Using MSAdapter currently has the following limitations:

Complex64/Complex128 are not yet supported

Dynamic profiling is not yet supported

Example code:

 import torch
 from torch.profiler import profile, record_function, ProfilerActivity

 with profile(activities=[ProfilerActivity.CPU], record_shapes=True) as prof:
     # training code
     for i in range(10):
         # simulate one training step
         pass

The error message is as follows:

 Traceback (most recent call last):
     File "/path/to/your/demo.py", line 102, in <module>
         with profile(activities=[ProfilerActivity.CPU], record_shapes=True) as prof:
     File "/path/to/your/torch/profiler/profiler.py", line 54, in __init__
         profiler_level = experimental_config._profiler_level,
 AttributeError: 'NoneType' object has no attribute '_profiler_level'
The pin_memory parameter of DataLoader only supports being set to False

Example code:

  from torch.utils.data import DataLoader
  from torchvision import datasets
  from torchvision.transforms import ToTensor

  training_data = datasets.FashionMNIST(root="data", train=True, download=True, transform=ToTensor())
  train_dataloader = DataLoader(training_data, batch_size=64, pin_memory=True)
  for batch, (X, y) in enumerate(train_dataloader):
      X, y = X.cuda(), y.cuda()

The error message is as follows:

 Traceback (most recent call last):
     File "/path/to/your/torch/utils/data/_utils/pin_memory.py", line 98, in pin_memory
         clone[i] = pin_memory(item, device)
     File "/path/to/your/torch/utils/data/_utils/pin_memory.py", line 64, in pin_memory
         return data.pin_memory(device)
 TypeError: pin_memory() takes 1 positional argument but 2 were given
The tensor.backward() operation is not supported

Example code:

  import torch
  x = torch.randn(2,)
  x.backward()

The error message is as follows:

 Traceback (most recent call last):
     File "/path/to/your/demo.py", line XX, in <module>
         x.backward()
     File "/path/to/your/torch/_tensor.py", line 325, in backward
         raise ValueError('not support Tensor.backward yet.')
 ValueError: not support Tensor.backward yet.
The to(device) operation is not supported

Example code:

  import torch
  x = torch.randn(2,)
  device = "cuda"
  x.to(device)

The error message is as follows:

 Traceback (most recent call last):
     File "/path/to/your/demo.py", line XX, in <module>
         x.to(device)
     File "/path/to/your/mindspore/common/tensor.py", line 3018, in to
         return self if self.dtype == dtype else self._to(dtype)
 TypeError: _to(): argument 'dtype' (position 1) must be mstype, not str.

 ----------------------------------------------------
 - C++ Call Stack: (For framework developers)
 ----------------------------------------------------
 mindspore/ccsrc/pynative/op_function/converter.cc:657 Parse
ckpt files exported by MindSpore cannot be loaded directly into a PyTorch model

Example code:

 import torch
 from torch import nn
 import mindspore as ms

 class NeuralNetwork(nn.Module):
     def __init__(self):
         super().__init__()
         self.linear = nn.Linear(28*28, 512)

     def forward(self, x):
         logits = self.linear(x)
         return logits

 class myNN(ms.nn.Cell):
     def __init__(self):
         super().__init__()
         self.linear = nn.Linear(28*28, 512)

     def construct(self, x):
         logits = self.linear(x)
         return logits

 model = myNN()
 ms.save_checkpoint(model, "./net.ckpt")
 model2 = NeuralNetwork()
 model2.load_state_dict(torch.load("./net.ckpt"))

The error message is as follows:

 Traceback (most recent call last):
     File "/path/to/your/demo.py", line 99, in <module>
         model2.load_state_dict(torch.load("./net.ckpt"))
     File "/path/to/your/torch/serialization.py", line 1020, in load
         return _legacy_load(opened_file, pickle_module, **pickle_load_args)
     File "/path/to/your/torch/serialization.py", line 1118, in _legacy_load
         magic_number = pickle_module.load(f, **pickle_load_args)
 EOFError: Ran out of input
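
The EOFError above is a file-format mismatch rather than a shape or parameter problem: torch.load expects a pickle-based stream, while a MindSpore ckpt uses a different serialization format, so the unpickler runs out of valid input immediately. A framework-free sketch of the same failure mode using plain pickle (the byte strings here are made up for illustration):

```python
import pickle

# A valid pickle stream round-trips fine.
good = pickle.dumps({"weight": [1.0, 2.0]})
print(pickle.loads(good))  # {'weight': [1.0, 2.0]}

# Bytes produced by a different serializer are not a pickle stream,
# so unpickling fails at once, much like torch.load on a .ckpt file.
not_pickle = b"\x08\x01\x12\x05junk"
try:
    pickle.loads(not_pickle)
except pickle.UnpicklingError as exc:
    print("failed to unpickle:", exc)
```

To move weights across frameworks, export them in a neutral form (for example NumPy arrays) and rebuild a state_dict on the other side, rather than loading one framework's file with the other's loader.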
Mixed execution of MindSpore and msadapter is not supported: after import torch, some MindSpore behaviors change to torch behaviors, which can cause unexpected errors.

Example code:

 from mindspore import Tensor

 a = Tensor([2, 2])
 print(f'before import torch: a.shape={a.shape}')

 import torch
 print(f'after import torch: a.shape={a.shape}')

The output is as follows; note that after import torch, the original behavior of mindspore.Tensor.shape has changed.

 before import torch: a.shape=(2,)
 after import torch: a.shape=torch.Size([2])
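
This behavior change is ordinary attribute patching: import-time code in one module can replace attributes on classes defined in another, which is how an `import torch` under msadapter can alter mindspore.Tensor. A framework-free sketch of the mechanism (all names here are made up):

```python
class Tensor:
    """Stand-in for a framework tensor class."""
    @property
    def shape(self):
        return (2,)

a = Tensor()
print(a.shape)  # (2,)

# What an adapter's import-time hook effectively does: replace the
# property on the class, which changes every existing instance too.
Tensor.shape = property(lambda self: "torch.Size([2])")
print(a.shape)  # torch.Size([2])
```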

The MindSpore interfaces affected by mixed execution are listed in the table below:

Module | Affected interfaces
mindspore.Tensor / mindspore.StubTensor | is_shared, softmax, type_, retain_grad, shape, to_dense, _base, data, numel, nelement, repeat, cuda, npu, cpu, size, dim, clone, log_softmax, narrow, view, __or__, device, __and__, __xor__, __iter__, __reduce_ex__, expand, detach, T, transpose, mean, clamp, is_cuda, is_cpu, repeat_interleave, is_sparse, requires_grad, requires_grad_, unsqueeze, __pow__, float, backward, split, norm, record_stream, data_ptr, pin_memory, grad, __imul__, reshape, squeeze, element_size, exponential_

License

Apache License 2.0
