
A Modular, Configuration-Driven Framework for Knowledge Distillation. Trained models, training logs, and configurations are available to ensure reproducibility.

Project description

torchdistill: A Modular, Configuration-Driven Framework for Knowledge Distillation


torchdistill (formerly kdkit) offers various knowledge distillation methods and enables you to design (new) experiments simply by editing a yaml file instead of Python code. Even when you need to extract intermediate representations from teacher/student models, you will NOT need to reimplement the models, which often requires changing the interface of their forward functions; instead, you specify the module path(s) in the yaml file.

Forward hook manager

Using ForwardHookManager, you can extract intermediate representations from a model without modifying the interface of its forward function.
This example notebook will give you a better idea of the usage.
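
Below is a minimal sketch of that idea, assuming a torchvision ResNet-18 and the module paths 'conv1' and 'fc' purely for illustration; the API calls follow the example notebook, so check them against your installed version:

import torch
from torchvision import models
from torchdistill.core.forward_hook import ForwardHookManager

# Any PyTorch model works; ResNet-18 is just an illustrative choice
device = torch.device('cpu')
model = models.resnet18()

# Register hooks by module path instead of rewriting the model's forward function
forward_hook_manager = ForwardHookManager(device)
forward_hook_manager.add_hook(model, 'conv1', requires_input=True, requires_output=False)
forward_hook_manager.add_hook(model, 'fc', requires_input=False, requires_output=True)

# Run the model as usual, then pop the captured intermediate representations
x = torch.rand(16, 3, 224, 224)
y = model(x)
io_dict = forward_hook_manager.pop_io_dict()
conv1_input = io_dict['conv1']['input']
fc_output = io_dict['fc']['output']

The same module-path mechanism is what the yaml configurations rely on, which is why a new distillation experiment does not require touching the model code.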

Top-1 validation accuracy for ILSVRC 2012 (ImageNet)

T: ResNet-34*  Pretrained  KD     AT     FT     CRD    Tf-KD  SSKD   L2     PAD-L2
S: ResNet-18   69.76*      71.37  70.90  71.56  70.93  70.52  70.09  71.08  71.71
Original work  N/A         N/A    70.70  N/A**  71.17  70.42  71.62  70.90  71.71

* The pretrained ResNet-34 and ResNet-18 are provided by torchvision.
** FT is assessed with ILSVRC 2015 in the original work.
For the 2nd row (S: ResNet-18), the checkpoints (trained weights), configuration files, and training logs are available, and the configurations reuse the hyperparameters (e.g., number of epochs) used in the original work, except for KD.

Examples

Executable code can be found in examples/.

Google Colab Examples

CIFAR-10 and CIFAR-100

  • Training without teacher models
  • Knowledge distillation

These notebooks are available in demo/. Note that they are intended for Google Colab users; if you have your own GPU(s), examples/ is usually a better reference.

Citation

[Preprint]

@article{matsubara2020torchdistill,
  title={torchdistill: A Modular, Configuration-Driven Framework for Knowledge Distillation},
  author={Matsubara, Yoshitomo},
  year={2020},
  eprint={2011.12913},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}

How to set up

  • Python >= 3.6
  • pipenv (optional)

Install with pip/pipenv

pip3 install torchdistill
# or use pipenv
pipenv install torchdistill

Install from this repository

git clone https://github.com/yoshitomo-matsubara/torchdistill.git
cd torchdistill/
pip3 install -e .
# or use pipenv
pipenv install "-e ."

Issues / Contact

The documentation is a work in progress. In the meantime, feel free to create an issue if you have a feature request, or email me (yoshitom@uci.edu) if you would like to ask something in private.

Download files

Download the file for your platform.

Source Distribution

torchdistill-0.1.3.tar.gz (58.3 kB)

Built Distribution

torchdistill-0.1.3-py3-none-any.whl (74.5 kB)
