
TorchWrapper is a deep learning helper.


TensorWrapper

TensorWrapper is an extension library for the PyTorch framework. It aims to supplement a few common components: recent optimizers, operators, utilities, drawing tools, common structures, and so on.

Installation

# install third-party pip dependencies
pip install cython matplotlib opencv-python numpy tensorboard future memory_profiler profilehooks tqdm scipy scikit-image
HOROVOD_GPU_OPERATIONS=NCCL pip install horovod
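
After installing, it can help to confirm that each package is importable by the interpreter you intend to use. The following is a minimal sketch, not part of TensorWrapper: the `check_modules` helper and the `PYTHON_BIN` variable are hypothetical names introduced here. Note that some import names differ from the pip names (`cv2` for opencv-python, `skimage` for scikit-image, `Cython` for cython).

```shell
# Hypothetical helper: report which modules the chosen Python can import.
# PYTHON_BIN is an assumption; set it to python3 if that is your interpreter.
PYTHON_BIN="${PYTHON_BIN:-python}"

check_modules() {
  missing=0
  for mod in "$@"; do
    if "$PYTHON_BIN" -c "import $mod" >/dev/null 2>&1; then
      echo "ok       $mod"
    else
      echo "missing  $mod"
      missing=$((missing + 1))
    fi
  done
  # The exit status is the number of modules that failed to import.
  return "$missing"
}

# Import names for the packages installed above.
check_modules Cython matplotlib cv2 numpy tensorboard future \
  memory_profiler profilehooks tqdm scipy skimage horovod || true
```

Any line starting with `missing` points at a package to reinstall before continuing.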

Distributed Train/Val

Install openmpi

# install Open MPI 4.0.1 from source
curl -O -L https://download.open-mpi.org/release/open-mpi/v4.0/openmpi-4.0.1.tar.gz
tar xvzf openmpi-4.0.1.tar.gz
cd openmpi-4.0.1
./configure --prefix=/usr/local
make all && sudo make install
export PATH=/usr/local/bin:$PATH

# or via conda
conda install openmpi
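
Before moving on to NCCL and Horovod, a quick check that the MPI launcher is reachable can save a failed build later. This is a small sketch; `check_mpi` is a hypothetical helper name, and it only reports what it finds without changing anything.

```shell
# Hypothetical helper: confirm that mpirun is on PATH and print its version.
check_mpi() {
  if command -v mpirun >/dev/null 2>&1; then
    echo "mpirun found at: $(command -v mpirun)"
    # The first line of the version output names the MPI implementation.
    mpirun --version 2>&1 | head -n 1 || true
  else
    echo "mpirun not found; check that /usr/local/bin or your conda env is on PATH"
  fi
}

check_mpi
```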

Install NCCL

# download the NCCL library: https://developer.nvidia.com/nccl/nccl-legacy-downloads
# choose the "O/S agnostic local installer",
# e.g. nccl_2.6.4-1+cuda10.0_x86_64.txz, extract it, and add its lib directory to the library path
export LD_LIBRARY_PATH=`pwd`/nccl_2.6.4-1+cuda10.0_x86_64/lib:$LD_LIBRARY_PATH

# or install the deb packages
# https://developer.nvidia.com/compute/machine-learning/nccl/secure/v2.6/prod/nccl-repo-ubuntu1604-2.6.4-ga-cuda10.0_1-1_amd64.deb
# run only the line that matches your CUDA version
# CUDA 10.0:
sudo apt install libnccl2=2.6.4-1+cuda10.0 libnccl-dev=2.6.4-1+cuda10.0
# CUDA 10.1:
sudo apt install libnccl2=2.6.4-1+cuda10.1 libnccl-dev=2.6.4-1+cuda10.1

Install Horovod

HOROVOD_GPU_OPERATIONS=NCCL pip install horovod --no-cache-dir
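
To see whether the Horovod build picked up PyTorch, MPI, and NCCL, Horovod's own `horovodrun --check-build` command lists the frameworks, controllers, and tensor operations that were compiled in. The wrapper below is a sketch; `check_horovod` is a hypothetical name, and it falls back to a message when Horovod is not installed.

```shell
# Hypothetical helper: print Horovod's build summary if horovodrun is available.
check_horovod() {
  if command -v horovodrun >/dev/null 2>&1; then
    # Lists available frameworks, controllers, and tensor operations.
    horovodrun --check-build || echo "horovodrun --check-build failed"
  else
    echo "horovodrun not found; Horovod is not installed in this environment"
  fi
}

check_horovod
```

If NCCL is not marked as available in the summary, reinstalling with `--no-cache-dir` as shown above forces pip to rebuild rather than reuse a cached wheel.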


Install CMake

# install cmake
# https://cmake.org/files/v3.14/

Train

# demo for verifying distributed training
cd research/Classifier

# run MNIST on a single node; note that the batch size is set to 128
python Classifier.py --config configs/Classifier_Mnist_LeNet.py

# run 4 processes on 4 GPUs; note that the batch size should be set to 32
python -m tw.api.launch --np 4 --device cuda python Classifier.py --config configs/Classifier_Mnist_LeNet.py

# monitor the validation results; the test error should be similar in both runs
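
The two batch sizes above appear to be related: 128 on one process and 32 on each of 4 processes give the same global batch, which is presumably why the test errors are expected to match. That reading is an inference from the comments, not something the project states. Under that assumption, the per-process value can be computed as follows; `per_process_batch` is a hypothetical helper.

```shell
# Hypothetical helper: split a global batch size evenly across processes.
# Assumes the batch size in the config is applied per process.
per_process_batch() {
  global="$1"
  np="$2"
  if [ $((global % np)) -ne 0 ]; then
    echo "error: $global is not divisible by $np" >&2
    return 1
  fi
  echo $((global / np))
}

per_process_batch 128 4   # prints 32
```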

Usage

# distributed training
python -m tw.api.launch --np 2 --dev cuda python research/classification/Classifier.py --config research/classification/configs/Classifier_ImageNet_AlexNet.py --task train

# distributed evaluation
python -m tw.api.launch --np 2 --dev cuda python research/classification/Classifier.py --config research/classification/configs/Classifier_ImageNet_AlexNet.py --task test

# single-process training
python research/classification/Classifier.py --config research/classification/configs/Classifier_ImageNet_AlexNet.py --task train

# single-process evaluation
python research/classification/Classifier.py --config research/classification/configs/Classifier_ImageNet_AlexNet.py --task test
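
The four commands above differ only in the task and in whether the launcher is used. The sketch below assembles the same command lines from those two inputs and prints them without running anything. `build_cmd` is a hypothetical helper, and the script and config paths are simply the ones used in the commands above.

```shell
# Hypothetical helper: print the command for a task ("train" or "test")
# and a process count. With more than one process, the launcher is used.
build_cmd() {
  task="$1"
  np="${2:-1}"
  script="research/classification/Classifier.py"
  config="research/classification/configs/Classifier_ImageNet_AlexNet.py"
  if [ "$np" -gt 1 ]; then
    echo "python -m tw.api.launch --np $np --dev cuda python $script --config $config --task $task"
  else
    echo "python $script --config $config --task $task"
  fi
}

build_cmd train 2   # distributed training on 2 processes
build_cmd test      # single-process evaluation
```

To execute a printed command rather than just display it, pipe the output to `sh`, for example `build_cmd test | sh`.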
