
mock cuda runtime api


The PLT hooking technique used here is based on plthook.

mock pytorch cuda runtime interface

  • update submodule
    git submodule update --init --recursive

  • build wheel package
    python setup.py sdist bdist_wheel

  • install the built wheel directly
    pip install dist/*.whl
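
A quick way to confirm that the wheel landed in the active environment is to ask the import system whether it can locate the package. This is only a minimal sketch: the helper name `is_installed` is hypothetical, and it checks for the module without importing it, so no hooks are installed by running it.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the import system can locate `module_name`."""
    return importlib.util.find_spec(module_name) is not None


# Reports whether the package built above is visible to this interpreter.
print("cuda_mock installed:", is_installed("cuda_mock"))
```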

collect cuda operator call stack

  • find the nvcc install path
    which nvcc
  • replace the system nvcc with the bundled nvcc (it only adds -g to the compile options); adjust the paths below to match the output of which nvcc
    mv /usr/local/bin/nvcc /usr/local/bin/nvcc_b
    chmod 777 tools/nvcc
    cp tools/nvcc /usr/local/bin/nvcc
  • build and install pytorch
  • build and install cuda_mock
  • import cuda_mock after import torch
  • run your torch training script
  • the call stacks are dumped to the console
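
To illustrate what "a wrapper that only adds -g" could mean, here is a minimal sketch of the argument handling such a wrapper might perform. It is not the contents of tools/nvcc. The function name `build_command` is hypothetical, and the only detail taken from the steps above is that the original compiler was moved to /usr/local/bin/nvcc_b.

```python
REAL_NVCC = "/usr/local/bin/nvcc_b"  # where the mv step above put the original nvcc


def build_command(args, real_nvcc=REAL_NVCC):
    """Return the argv for the real nvcc, with -g added if it is not already present."""
    cmd = [real_nvcc, *args]
    if "-g" not in args:
        cmd.append("-g")  # host-side debug info, so backtraces can resolve symbols
    return cmd


# A real wrapper would finish by exec'ing the returned command; here it is only printed.
print(build_command(["-c", "kernel.cu", "-o", "kernel.o"]))
```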

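collect xpu runtime memory allocation statistics / xpu_wait call stacks

  • print the xpu_malloc call sequence and track live memory usage and the historical peak, to help diagnose memory fragmentation

  • print xpu_wait call stacks, to help locate where the pipeline stalls

  • note: import cuda_mock and call cuda_mock.xpu_initialize() only after import torch / import paddle

  • usage:

    import paddle
    import cuda_mock; cuda_mock.xpu_initialize() # add this line
    
  • disable backtrace printing (capturing backtraces slows execution down considerably)

    export HOOK_DISABLE_TRACE='xpuMemcpy=0,xpuSetDevice=0'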
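
The live and peak memory figures mentioned above boil down to simple bookkeeping over the malloc/free sequence. The following is a minimal sketch of that bookkeeping, not the library's implementation: the class `MemoryTracker` and its method names are hypothetical, and the pointers and sizes are made-up sample values.

```python
class MemoryTracker:
    """Track live and peak bytes from a sequence of malloc/free events."""

    def __init__(self):
        self.live = {}    # pointer -> allocation size in bytes
        self.current = 0  # bytes currently allocated
        self.peak = 0     # highest value `current` has reached

    def on_malloc(self, ptr, size):
        self.live[ptr] = size
        self.current += size
        self.peak = max(self.peak, self.current)

    def on_free(self, ptr):
        # Unknown pointers are ignored rather than treated as errors.
        self.current -= self.live.pop(ptr, 0)


tracker = MemoryTracker()
tracker.on_malloc(0x1000, 256)
tracker.on_malloc(0x2000, 512)
tracker.on_free(0x1000)
print(tracker.current, tracker.peak)  # 512 768
```

A large gap between the peak and what the workload actually needs at any moment is one hint that fragmentation is worth investigating.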
    

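implement custom hook functions

  • example of a custom hook installer:

    import cuda_mock

    class PythonHookInstaller(cuda_mock.HookInstaller):
        # return True for libraries whose calls to the target function should be hooked
        def is_target_lib(self, name):
            return name.find("libcuda_mock_impl.so") != -1
        # return True for symbols that should be replaced
        def is_target_symbol(self, name):
            return name.find("malloc") != -1
    # cpp_code holds the C++ source of the replacement functions (defined elsewhere)
    lib = cuda_mock.dynamic_obj(cpp_code, True).appen_compile_opts('-g').compile().get_lib()
    installer = PythonHookInstaller(lib)
    
  • implement the hook callback interface in PythonHookInstaller

  • the constructor takes the path (absolute) of the library that contains the custom hook functions. That library must contain a function whose name and type match the function being replaced. During hooking, the address of the original function is written to a variable whose name is __origin_ followed by the target symbol name, so that user code can obtain the original function address. See the definition at test/py_test/test_import_mock.py:15.

  • is_target_lib: whether the given library is one in which calls to the target function should be hooked

  • is_target_symbol: whether the given name is a function to hook (only called when is_target_lib returned True)

  • new_symbol_name: the name of the replacement function in the shared library passed to the constructor; the parameter name is the name of the function about to be replaced

  • dynamic_obj: compiles C++ code at runtime and supports referencing all modules: logger, statistics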
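
The order in which the three callbacks are consulted can be pictured with a pure-Python sketch that does not touch cuda_mock at all. Everything here is illustrative: `SketchInstaller`, `plan_hooks`, and the sample library and symbol names are hypothetical, and returning the same name from `new_symbol_name` is an assumption that follows the "same name and type" rule described above.

```python
class SketchInstaller:
    """Stand-in with the same three callbacks as the installer described above."""

    def is_target_lib(self, name):
        return "libcuda_mock_impl.so" in name

    def is_target_symbol(self, name):
        return "malloc" in name

    def new_symbol_name(self, name):
        return name  # assumption: the replacement keeps the original name


def plan_hooks(installer, libs):
    """Return (library, symbol, replacement) triples for every symbol to hook.

    `libs` maps a library path to the symbol names it imports.
    """
    plan = []
    for lib, symbols in libs.items():
        if not installer.is_target_lib(lib):
            continue  # symbols of non-target libraries are never inspected
        for sym in symbols:
            if installer.is_target_symbol(sym):
                plan.append((lib, sym, installer.new_symbol_name(sym)))
    return plan


sample = {
    "/opt/lib/libcuda_mock_impl.so": ["malloc", "free", "xpu_malloc"],
    "/lib/libc.so.6": ["malloc"],
}
for entry in plan_hooks(SketchInstaller(), sample):
    print(entry)
```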

example

  • python test/test_import_mock.py

debug

  • export LOG_LEVEL=WARN,TRACE=INFO
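
The LOG_LEVEL value mixes one global level with optional per-module overrides. A minimal sketch of how such a string could be split is shown below. The function `parse_log_level` is hypothetical and not part of cuda_mock; the WARN default for the global level is taken from the environment variable table.

```python
def parse_log_level(value, default_global="WARN"):
    """Split 'WARN,TRACE=INFO' into a global level and per-module levels."""
    global_level = default_global
    modules = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        if "=" in item:
            module, level = item.split("=", 1)
            modules[module.strip()] = level.strip()
        else:
            global_level = item  # a bare level sets the global log level
    return global_level, modules


print(parse_log_level("WARN,TRACE=INFO"))
```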

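environment variables

| Variable | Example | Allowed values | Default | Description |
| --- | --- | --- | --- | --- |
| LOG_LEVEL | export LOG_LEVEL=WARN,TRACE=INFO | Log levels: INFO, WARN, ERROR, FATAL. Log modules: PROFILE, TRACE, HOOK, PYTHON, LAST | The global log level defaults to WARN; each log module defaults to INFO | Global log level and per-module log levels |
| HOOK_DISABLE_TRACE | export HOOK_DISABLE_TRACE='xpuMemcpy=0,xpuSetDevice=0' | xpuMalloc, xpuFree, xpuWait, xpuMemcpy, xpuSetDevice, xpuCurrentDeviceId | 1 for every interface, i.e. backtrace is disabled for all interfaces by default | Whether to disable backtrace |
| LOG_OUTPUT_PATH | export LOG_OUTPUT_PATH='cuda_mock.log' | A file path | - | Whether to redirect log output to a file |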
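
As a reading aid for the HOOK_DISABLE_TRACE format, the sketch below turns the example value into a per-interface flag map, starting from the documented default of 1 for every listed interface. The function `parse_hook_disable_trace` is hypothetical; how cuda_mock itself parses and interprets the flags is not shown in this document.

```python
INTERFACES = (
    "xpuMalloc", "xpuFree", "xpuWait",
    "xpuMemcpy", "xpuSetDevice", "xpuCurrentDeviceId",
)


def parse_hook_disable_trace(value):
    """Return {interface: flag}, applying overrides on top of the default of 1."""
    flags = {name: 1 for name in INTERFACES}
    for item in value.split(","):
        if "=" not in item:
            continue  # skip empty or malformed entries
        name, flag = item.split("=", 1)
        flags[name.strip()] = int(flag)
    return flags


print(parse_hook_disable_trace("xpuMemcpy=0,xpuSetDevice=0"))
```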


