TorchWrapper is a deep learning helper.
Project description
TensorWrapper
TensorWrapper is a extension library for PyTorch framework. It aims to supplement a few of common components: newest optimizer, opeartors, utils, drawer, common structure and etc.
Installation
# install 3rd pip depedency.
pip install cython matplotlib opencv-python numpy tensorboard future memory_profiler profilehooks tqdm scipy scikit-image
HOROVOD_GPU_OPERATIONS=NCCL pip install horovod
Distributed Train/Val
Install openmpi
# install openmpi 4.0 version
curl -O -L https://download.open-mpi.org/release/open-mpi/v4.0/openmpi-4.0.1.tar.gz
tar xvzf openmpi-4.0.1.tar.gz
./configure --prefix=/usr/local
make all && sudo make install
export PATH=/usr/local/bin:$PATH
# or via conda
conda install openmpi
Install NCCL
# download nccl library: https://developer.nvidia.com/nccl/nccl-legacy-downloads
# O/S agnostic local installer
# e.g. nccl_2.6.4-1+cuda10.0_x86_64.txz
# or using deb fashion
# https://developer.nvidia.com/compute/machine-learning/nccl/secure/v2.6/prod/nccl-repo-ubuntu1604-2.6.4-ga-cuda10.0_1-1_amd64.deb
sudo apt install libnccl2=2.6.4-1+cuda10.0 libnccl-dev=2.6.4-1+cuda10.0
sudo apt install libnccl2=2.6.4-1+cuda10.1 libnccl-dev=2.6.4-1+cuda10.1
export LD_LIBRARY_PATH=`pwd`/nccl_2.6.4-1+cuda10.0_x86_64/lib:$LD_LIBRARY_PATH
Install Horovod
HOROVOD_GPU_OPERATIONS=NCCL pip install horovod --no-cache-dir
git config --global user.email "atranitell@gmail.com" && git config --global user.name "jk"
Install CMake
# install cmake
# https://cmake.org/files/v3.14/
Train
# demo for verificaiton distributed traning
cd research/Classifier
# execute single node for mnist, note that batch size is set to 128
python Classifier.py --config configs/Classifier_Mnist_LeNet.py
# execute 4 node with 4 gpu, note that batch size should be set to 32
python -m tw.api.launch --np 4 --device cuda python Classifier.py --config configs/Classifier_Mnist_LeNet.py
# monitor the validation result, the test error should be similiar.
Usage
# dist train
python -m tw.api.launch --np 2 --dev cuda python research/classification/Classifier.py --config research/classification/configs/Classifier_ImageNet_AlexNet.py --task train
# dist eval
python -m tw.api.launch --np 2 --dev cuda python research/classification/Classifier.py --config research/classification/configs/Classifier_ImageNet_AlexNet.py --task test
# single train
python research/classification/Classifier.py --config research/classification/configs/Classifier_ImageNet_AlexNet.py --task train
# single eval
python research/classification/Classifier.py --config research/classification/configs/Classifier_ImageNet_AlexNet.py --task test
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
tw-3.9.0.tar.gz
(311.8 kB
view details)
File details
Details for the file tw-3.9.0.tar.gz
.
File metadata
- Download URL: tw-3.9.0.tar.gz
- Upload date:
- Size: 311.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/34.0 requests/2.27.1 requests-toolbelt/0.9.1 urllib3/1.26.9 tqdm/4.64.0 importlib-metadata/3.10.1 keyring/23.4.1 rfc3986/1.5.0 colorama/0.4.4 CPython/3.6.13
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | c172d7486fbba6adc09e741a3336d2f21390a6e1f3ce66b6fc3cc0eafbd7dcf0 |
|
MD5 | f81ec9ea2b44618af231da93d1ec06a8 |
|
BLAKE2b-256 | 7acd5ebf3a1631139efa3bbf0a5649eb00c21304820ffe9e1b294c87369a21eb |