mock cuda runtime api
Project description
The PLT hook technique used is based on plthook.
Mocks the PyTorch CUDA runtime interface.
- update submodule
  git submodule update --init --recursive
- build wheel package
  python setup.py sdist bdist_wheel
- direct install
  pip install dist/*.whl
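After installing, a quick import check confirms the wheel is usable. This is just a generic smoke test, not a cuda_mock API call; the only project-specific name assumed is the cuda_mock import used throughout this README:

import cuda_mock  # should succeed silently if the wheel installed correctly
print("loaded cuda_mock from:", cuda_mock.__file__)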
collect cuda operator call stack
- find where nvcc is installed
  which nvcc
- replace the system nvcc with our wrapper nvcc (it only adds -g to the compile options; a hypothetical sketch of such a wrapper follows this list)
  mv /usr/local/bin/nvcc /usr/local/bin/nvcc_b
  chmod 777 tools/nvcc
  cp tools/nvcc /usr/local/bin/nvcc
- build and install pytorch
- build and install cuda_mock
- import cuda_mock after import torch (see the script sketch after this list)
- run your torch training script
- the operator call stacks will be dumped to the console
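The contents of tools/nvcc are not reproduced on this page; as described above, it simply forwards every call to the renamed real compiler and adds -g so the generated code keeps debug information. A hypothetical Python equivalent of such a wrapper, assuming the real nvcc was moved to /usr/local/bin/nvcc_b as in the commands above:

#!/usr/bin/env python3
# Hypothetical stand-in for tools/nvcc: pass all arguments through to the
# renamed real compiler and append -g so compiled code keeps debug info.
import os
import sys

REAL_NVCC = "/usr/local/bin/nvcc_b"  # where the mv command above moved the real nvcc

args = [REAL_NVCC] + sys.argv[1:]
if "-g" not in args:
    args.append("-g")
os.execv(REAL_NVCC, args)  # replace this process with the real nvcc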
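A minimal sketch of how the import order looks in a training script, following the note above (import cuda_mock only after import torch, presumably so the hooks can find the CUDA runtime that torch loads); the tensor code below is a placeholder workload, not part of cuda_mock:

import torch       # import torch first so the CUDA runtime libraries are loaded
import cuda_mock   # import cuda_mock only after torch, as required above

# placeholder workload: any torch computation will do
device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(64, 64, device=device)
print((x @ x).sum().item())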
collect xpu runtime memory allocation statistics / xpu_wait call stacks
- print the xpu_malloc call sequence and track live memory usage as well as the historical peak memory, to diagnose memory fragmentation problems
- print the xpu_wait call stack, to locate where the pipeline stalls
- note: import cuda_mock; cuda_mock.xpu_initialize() must come after import torch / import paddle
- usage:
  import paddle
  import cuda_mock; cuda_mock.xpu_initialize()  # add this line
- disable backtrace printing (capturing backtraces degrades performance significantly; see the sketch below):
  export HOOK_DISABLE_TRACE='xpuMemcpy=0,xpuSetDevice=0'
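The export line above is the documented way to disable tracing for specific symbols. If it is more convenient to set it from inside the script, the variable presumably has to be set before cuda_mock installs its hooks; a sketch under that assumption:

import os

# Assumption: HOOK_DISABLE_TRACE is read when the hooks are installed, so it is
# set before cuda_mock is imported and xpu_initialize() is called.
os.environ["HOOK_DISABLE_TRACE"] = "xpuMemcpy=0,xpuSetDevice=0"

import paddle
import cuda_mock

cuda_mock.xpu_initialize()  # add this line (after import paddle / import torch)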
example
python test/test_import_mock.py
debug
export LOG_LEVEL=WARN,TRACE=INFO