Introduction
PLSC is an open-source repository of Paddle Large Scale Classification tools. It supports pre-training of large-scale classification models as well as fine-tuning for downstream tasks.
Available Models
Top News 🔥
Update (2023-01-11): PLSC v2.4 is released. We refactored the entire repository by task type and adapted it to PaddlePaddle release 2.4. We added four new models: FaceViT, CaiT, MoCo v3, and MAE. Each model in the repository can be trained from scratch to reach the officially reported accuracy, including ViT-Large on the ImageNet21K dataset. We also provide a method for ImageNet21K data preprocessing. For AMP training, PLSC uses FP16 O2 by default, which speeds up training while maintaining accuracy.
Update (2022-07-18): PLSC v2.3 is released, a new upgrade that is more modular and highly extensible. It supports more tasks, such as ViT and DeiT. The static graph mode will no longer be maintained as of this release.
Update (2022-01-11): Added support for the NHWC data format with FP16, which improved throughput by 10% and reduced GPU memory usage by 30%. PLSC supported 92 million classes on a single node with 8 NVIDIA V100 (32G) GPUs while keeping high training throughput. Added saving of the best checkpoint. We also released 18 pretrained models and PLSC v2.2.
Update (2021-12-11): Released Zhihu Technical Article and Bilibili Open Class.
Update (2021-10-10): Added FP16 training, which improved throughput and optimized GPU memory usage. PLSC supported 60 million classes on a single node with 8 NVIDIA V100 (32G) GPUs while keeping high training throughput.
Update (2021-09-10): This repository supported both static mode and dynamic mode with PaddlePaddle v2.2, and supported 48 million classes on a single node with 8 NVIDIA V100 (32G) GPUs. It added PartialFC, SparseMomentum, ArcFace, and CosFace (which we refer to as MarginLoss). Backbones include IResNet and MobileNet.
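To put the class counts quoted in the news items in perspective, here is a rough back-of-envelope sketch of how much memory the final classification layer's weights alone would occupy per GPU. It is not taken from PLSC: the embedding size of 512 and the even split of classes across the 8 GPUs are assumptions made for illustration, and the estimate ignores gradients, optimizer state, and activations.

```python
def classifier_shard_bytes(num_classes: int, embed_dim: int,
                           num_gpus: int, bytes_per_param: int) -> int:
    """Bytes of final-layer weights held by one GPU.

    Assumes (hypothetically) that the classes are split evenly across GPUs,
    so each GPU stores ceil(num_classes / num_gpus) rows of size embed_dim.
    """
    classes_per_gpu = -(-num_classes // num_gpus)  # integer ceiling division
    return classes_per_gpu * embed_dim * bytes_per_param


if __name__ == "__main__":
    EMBED_DIM = 512  # assumed feature size, not stated in this README
    NUM_GPUS = 8     # single node with 8 GPUs, as in the news items
    for num_classes in (48_000_000, 60_000_000, 92_000_000):
        fp32 = classifier_shard_bytes(num_classes, EMBED_DIM, NUM_GPUS, 4)
        fp16 = classifier_shard_bytes(num_classes, EMBED_DIM, NUM_GPUS, 2)
        print(f"{num_classes:>11,d} classes: "
              f"FP32 {fp32 / 2**30:.2f} GiB/GPU, FP16 {fp16 / 2**30:.2f} GiB/GPU")
```

Under these assumptions, halving the bytes per parameter halves the weight footprint, which is consistent with the larger class counts reported after FP16 training was added.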
Installation
PLSC can be used in two ways: as an external third-party library, which users bring into their own projects with import plsc; or developed and used locally from this repository.
Note: PLSC v2.4 is adapted to PaddlePaddle v2.4. Because PaddlePaddle continues to evolve, later versions may introduce API mismatches.
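One way to catch a mismatched PaddlePaddle version before training is to compare the installed version against the 2.4 series. The following is a minimal sketch using only the Python standard library (importlib.metadata, Python 3.8+). The helper names are hypothetical and not part of PLSC, and the distribution names paddlepaddle-gpu and paddlepaddle are assumed to be the ones installed.

```python
import re
from importlib import metadata
from typing import Iterable, Optional, Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '2.4.2'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def matches_release(installed: str, expected: str = "2.4") -> bool:
    """True if the installed version is in the same major.minor series."""
    return major_minor(installed) == major_minor(expected)


def installed_version(candidates: Iterable[str]) -> Optional[str]:
    """Return the version of the first installed distribution, or None."""
    for name in candidates:
        try:
            return metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
    return None


if __name__ == "__main__":
    # Assumed distribution names for the GPU and CPU builds.
    version = installed_version(("paddlepaddle-gpu", "paddlepaddle"))
    if version is None:
        print("PaddlePaddle is not installed.")
    elif matches_release(version):
        print(f"PaddlePaddle {version} matches the 2.4 series.")
    else:
        print(f"PaddlePaddle {version} is outside the 2.4 series; "
              "API mismatches are possible.")
```

The check only compares major and minor numbers, so patch releases in the 2.4 series pass.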
Install plsc as a third-party library
pip install plsc==2.4
Install plsc locally
git clone https://github.com/PaddlePaddle/PLSC.git
cd /path/to/PLSC/
git checkout -b release/2.4 remotes/origin/release/2.4
# [optional] pip install -r requirements.txt
python setup.py develop
See Installation instructions.
Getting Started
See Quick Run Recognition for the basic usage of PLSC.
Tutorials
See more tutorials.
Documentation
See documentation for the usage of more APIs or modules.
License
This project is released under the Apache 2.0 license.
Citation
@misc{plsc,
title={PLSC: An Easy-to-use and High-Performance Large Scale Classification Tool},
author={PLSC Contributors},
howpublished = {\url{https://github.com/PaddlePaddle/PLSC}},
year={2022}
}
Built Distribution
File details
Details for the file plsc-2.4.0-py3-none-any.whl.
File metadata
- Download URL: plsc-2.4.0-py3-none-any.whl
- Upload date:
- Size: 130.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.7.10
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | da40f46b18ecb6c8f96e4bedb02840d645db5568e168941f9841a0774b7b135f |
MD5 | c7196781b0086f35f36906e5c9a6c07e |
BLAKE2b-256 | f6f056c224a4b4facf1bd87ef75d3480569ec7d3f657da0be6f4d760b2295670 |
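The SHA256 digest listed for plsc-2.4.0-py3-none-any.whl can be used to check a downloaded copy of the wheel. Here is a minimal sketch using Python's standard hashlib module. The function names are hypothetical, and the script assumes the wheel sits in the current working directory.

```python
import hashlib
from pathlib import Path

WHEEL_NAME = "plsc-2.4.0-py3-none-any.whl"
# SHA256 digest as listed in the file hashes for this wheel.
EXPECTED_SHA256 = "da40f46b18ecb6c8f96e4bedb02840d645db5568e168941f9841a0774b7b135f"


def sha256_of(path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected: str = EXPECTED_SHA256) -> bool:
    """True if the file's SHA256 digest equals the expected hex digest."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    wheel = Path(WHEEL_NAME)  # assumed location: current directory
    if not wheel.exists():
        print(f"{WHEEL_NAME} not found in the current directory.")
    elif verify(wheel):
        print("SHA256 digest matches.")
    else:
        print("SHA256 digest does NOT match; do not install this file.")
```

Reading in chunks keeps memory use constant regardless of file size.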