🚣 Some handy little tools that paddle does not ship~
Paddle Toolbox [Early WIP]
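A collection of handy utilities, modeled on the API design of Paddle and of Torch Toolbox.
:warning: The project is at an early design stage, and the development plans for most features are still drafts~
Installation
Install with pip
Note: Python 3.7.0 or later and PaddlePaddle 2.3.0 or later are required (the minimum PaddlePaddle version will change as paddle evolves).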
pip install pptb==0.2.0
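Because the project is still in early development and design, the API is rather unstable, so be sure to pin the version number when installing.
Pull the latest code directly from GitHub
AI Studio is used as the example here.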
git clone https://github.com/cattidea/paddle-toolbox.git work/paddle-toolbox/
# If the download is too slow and fails, use the command below instead
# git clone https://hub.fastgit.org/cattidea/paddle-toolbox.git work/paddle-toolbox/
Then add the following code to your Notebook or Python file
import sys
sys.path.append('/home/aistudio/work/paddle-toolbox/')
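If the clone step failed or the path is mistyped, the `sys.path.append` call above succeeds silently and the error only shows up later at `import pptb`. The sketch below is one possible guard, using only the standard library; `add_to_sys_path` is a hypothetical helper and is not part of pptb.

```python
import sys
from pathlib import Path


def add_to_sys_path(repo_dir):
    """Append repo_dir to sys.path once, failing early if it does not exist."""
    path = Path(repo_dir).expanduser().resolve()
    if not path.is_dir():
        raise FileNotFoundError(f"{path} does not exist; clone the repository first")
    path_str = str(path)
    # Avoid duplicate entries when a Notebook cell is re-run.
    if path_str not in sys.path:
        sys.path.append(path_str)
    return path_str


# On AI Studio, with the clone location used above:
# add_to_sys_path('/home/aistudio/work/paddle-toolbox/')
```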
Supported tools
LabelSmoothingLoss
import paddle
from pptb.nn import LabelSmoothingLoss, LabelSmoothingCrossEntropyLoss
label_smooth_epsilon = 0.1
loss_function = paddle.nn.CrossEntropyLoss()
# For a label-smoothed loss, replace the line above with the following
loss_function = LabelSmoothingLoss(
    paddle.nn.CrossEntropyLoss(soft_label=True),
    label_smooth_epsilon
)
# Label smoothing is most often used with CrossEntropyLoss, so the following alias is also available
loss_function = LabelSmoothingCrossEntropyLoss(label_smooth_epsilon)
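For intuition, label smoothing turns a hard class index into a soft distribution before the loss is computed, which is why the wrapped loss above is created with `soft_label=True`. Below is a minimal pure-Python sketch, assuming the common formulation `(1 - epsilon) * one_hot + epsilon / num_classes`. `smooth_labels` is a hypothetical helper, not part of pptb, and pptb's internals may differ.

```python
def smooth_labels(label, num_classes, epsilon=0.1):
    """Return a smoothed one-hot distribution for a single class index.

    Assumed formulation: (1 - epsilon) * one_hot + epsilon / num_classes.
    """
    if not 0 <= label < num_classes:
        raise ValueError("label must be in [0, num_classes)")
    uniform = epsilon / num_classes
    return [
        (1.0 - epsilon) + uniform if k == label else uniform
        for k in range(num_classes)
    ]


soft = smooth_labels(label=2, num_classes=4, epsilon=0.1)
print(soft)       # most of the mass stays on class 2
print(sum(soft))  # the result is still a probability distribution
```

With `epsilon = 0` this reduces to the ordinary one-hot target, so the smoothed loss matches the plain cross entropy.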
CosineWarmup
import paddle
from pptb.optimizer.lr import CosineWarmup
# ...
train_batch_size = 32
learning_rate = 3e-4
step_each_epoch = len(train_set) // train_batch_size
num_epochs = 40
warmup_epochs = 3
lr_scheduler = CosineWarmup(
    learning_rate,
    total_steps=num_epochs * step_each_epoch,
    warmup_steps=warmup_epochs * step_each_epoch,
    warmup_start_lr=0.0,
    cosine_end_lr=0.0,
    last_epoch=-1
)
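To make the parameters above concrete, here is a rough sketch of the schedule they describe. It assumes a linear ramp from `warmup_start_lr` to `learning_rate` over `warmup_steps`, followed by a cosine decay to `cosine_end_lr` over the remaining steps. `cosine_warmup_lr` is a hypothetical function and the exact formula inside pptb may differ. The value `step_each_epoch = 100` is also an assumption, standing in for `len(train_set) // train_batch_size`.

```python
import math


def cosine_warmup_lr(step, base_lr, total_steps, warmup_steps,
                     warmup_start_lr=0.0, cosine_end_lr=0.0):
    """Learning rate at a given step (assumed formulation, not pptb's code)."""
    if step < warmup_steps:
        # Linear warmup from warmup_start_lr up to base_lr.
        return warmup_start_lr + (base_lr - warmup_start_lr) * step / warmup_steps
    # Cosine decay from base_lr down to cosine_end_lr.
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return cosine_end_lr + (base_lr - cosine_end_lr) * cosine


step_each_epoch = 100  # assumed dataset size, for illustration only
total_steps = 40 * step_each_epoch
warmup_steps = 3 * step_each_epoch
for step in (0, 150, 300, 2150, 4000):
    print(step, cosine_warmup_lr(step, 3e-4, total_steps, warmup_steps))
```

Because `total_steps` and `warmup_steps` are counted in iterations rather than epochs, the scheduler would be stepped once per batch. If `CosineWarmup` follows paddle's `LRScheduler` interface (an assumption suggested by the `last_epoch` argument), that means calling `lr_scheduler.step()` after each optimizer step.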
Mixup && Cutmix
Mixup
import paddle
from pptb.tools import mixup_data, mixup_criterion, mixup_metric
# ...
use_mixup = True
mixup_alpha = 0.2
for X_batch, y_batch in train_loader():
    # Forward pass with and without mixup, side by side for comparison
    if use_mixup:
        X_batch_mixed, y_batch_a, y_batch_b, lam = mixup_data(X_batch, y_batch, mixup_alpha)
        predicts = model(X_batch_mixed)
        loss = mixup_criterion(loss_function, predicts, y_batch_a, y_batch_b, lam)
        acc = mixup_metric(paddle.metric.accuracy, predicts, y_batch_a, y_batch_b, lam)
    else:
        predicts = model(X_batch)
        loss = loss_function(predicts, y_batch)
        acc = paddle.metric.accuracy(predicts, y_batch)
    # ...
Besides mixup_data, which handles paddle Tensors, mixup_data_numpy is available for numpy ndarrays.
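The four return values in the loop above follow the usual mixup recipe: a blended input, the two label sets that were blended, and the mixing weight. Below is a minimal numpy sketch of that recipe, assuming `lam` is drawn from `Beta(alpha, alpha)` and partners are chosen by shuffling the batch. `mixup_numpy_sketch` and `mixup_loss_sketch` are hypothetical names and are not the pptb implementations.

```python
import numpy as np


def mixup_numpy_sketch(x, y, alpha=0.2, rng=None):
    """Blend each sample with a randomly chosen partner from the same batch."""
    rng = np.random.default_rng() if rng is None else rng
    # alpha <= 0 disables mixing (lam = 1 keeps the original batch).
    lam = float(rng.beta(alpha, alpha)) if alpha > 0 else 1.0
    index = rng.permutation(x.shape[0])
    x_mixed = lam * x + (1.0 - lam) * x[index]
    return x_mixed, y, y[index], lam


def mixup_loss_sketch(loss_fn, pred, y_a, y_b, lam):
    """Weight the loss against both label sets by the same mixing factor."""
    return lam * loss_fn(pred, y_a) + (1.0 - lam) * loss_fn(pred, y_b)


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3, 8, 8)).astype("float32")  # NCHW batch
y = np.array([0, 1, 2, 3])
x_mixed, y_a, y_b, lam = mixup_numpy_sketch(x, y, alpha=0.2, rng=rng)
print(x_mixed.shape, y_a, y_b, lam)
```

The metric can be combined in the same weighted way as the loss, which is presumably what `mixup_metric` does with `paddle.metric.accuracy`.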
Cutmix
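Cutmix works the same way as Mixup: simply replace mixup_data with cutmix_data. Unlike mixup_data, cutmix_data accepts an additional axes parameter that controls which axes are mixed. The default is axes = [2, 3], that is, the H and W axes of image data in NCHW format.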
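To illustrate what mixing along the H and W axes means, the sketch below samples a rectangular box the way the original CutMix recipe does: the box covers roughly `1 - lam` of the image, and `lam` is then recomputed from the clipped box. This is an assumed formulation, not pptb's code, and `sample_cutmix_box` is a hypothetical helper.

```python
import math
import random


def sample_cutmix_box(height, width, lam, rng=random):
    """Sample a box covering roughly (1 - lam) of an H x W image."""
    cut_ratio = math.sqrt(1.0 - lam)
    cut_h, cut_w = int(height * cut_ratio), int(width * cut_ratio)
    # Random box centre; the box is clipped to the image borders.
    cy, cx = rng.randrange(height), rng.randrange(width)
    top, bottom = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, height)
    left, right = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, width)
    # Recompute lam so it matches the area that is actually replaced.
    lam_adjusted = 1.0 - (bottom - top) * (right - left) / (height * width)
    return (top, bottom, left, right), lam_adjusted


box, lam_adjusted = sample_cutmix_box(32, 32, lam=0.7, rng=random.Random(0))
print(box, lam_adjusted)
# For an NCHW batch x and a shuffled index, the patch would then be pasted as:
# x[:, :, top:bottom, left:right] = x[index][:, :, top:bottom, left:right]
```

With axes = [2, 3] the box spans the third and fourth dimensions. For data in another layout, such as NHWC, the axes would presumably be [1, 2].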
MixingDataController
A convenience wrapper for managing Mixup and Cutmix together.
import paddle
from pptb.tools import MixingDataController
# ...
mixing_data_controller = MixingDataController(
    mixup_prob=0.3,
    cutmix_prob=0.3,
    mixup_alpha=0.2,
    cutmix_alpha=0.2,
    cutmix_axes=[2, 3],
    loss_function=paddle.nn.CrossEntropyLoss(),
    metric_function=paddle.metric.accuracy,
)
for X_batch, y_batch in train_loader():
    X_batch_mixed, y_batch_a, y_batch_b, lam = mixing_data_controller.mix(X_batch, y_batch, is_numpy=False)
    predicts = model(X_batch_mixed)
    loss = mixing_data_controller.loss(predicts, y_batch_a, y_batch_b, lam)
    acc = mixing_data_controller.metric(predicts, y_batch_a, y_batch_b, lam)
    # ...
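One plausible reading of `mixup_prob` and `cutmix_prob` is that each batch is independently assigned to Mixup, Cutmix, or no mixing, with the remaining probability (0.4 in the configuration above) left unmixed. The sketch below illustrates that assumed selection rule only. `choose_mixing_strategy` is a hypothetical function, and the actual controller may choose differently.

```python
import random
from collections import Counter


def choose_mixing_strategy(mixup_prob, cutmix_prob, rng=random):
    """Pick 'mixup', 'cutmix', or 'none' for one batch (assumed rule)."""
    if mixup_prob < 0 or cutmix_prob < 0 or mixup_prob + cutmix_prob > 1.0:
        raise ValueError("probabilities must be non-negative and sum to at most 1")
    r = rng.random()
    if r < mixup_prob:
        return "mixup"
    if r < mixup_prob + cutmix_prob:
        return "cutmix"
    return "none"


rng = random.Random(0)
counts = Counter(choose_mixing_strategy(0.3, 0.3, rng) for _ in range(10000))
print(counts)
```

Under this reading, an unmixed batch can still flow through the same `loss` and `metric` calls: with `lam = 1` and identical label sets, the weighted loss reduces to the plain loss.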
Vision models
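Provides a richer set of backbones; pretrained weights are provided for every model.
Models merged into the paddle mainline are removed when a new version is released, to avoid problems caused by APIs drifting out of sync.
Some pretrained models from PaddleClas are already supported, as well as the fairly recent ConvMixer.
- GoogLeNet (merged into the paddle mainline and removed here; use paddle.vision.models.GoogLeNet directly)
- InceptionV3 (merged into the paddle mainline and removed here; use paddle.vision.models.InceptionV3 directly)
- ResNeXt (merged into the paddle mainline and removed here; use paddle.vision.models.ResNet directly)
- ShuffleNetV2 (merged into the paddle mainline and removed here; use paddle.vision.models.ShuffleNetV2 directly)
- MobileNetV3 (merged into the paddle mainline and removed here; use paddle.vision.models.MobileNetV3Large and paddle.vision.models.MobileNetV3Small directly)
- ConvMixer (pretrained weights converted from PyTorch)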
import paddle
import pptb.vision.models as ppmodels
model = ppmodels.convmixer_768_32(pretrained=True)
PS: If these models do not meet your needs, try ppim, which covers many fairly recent models~
ConvMixer
Model Name | Kernel Size | Patch Size | Top-1 | Top-5 |
---|---|---|---|---|
convmixer_768_32 | 7 | 7 | 0.7974(-0.0042) | 0.9486 |
convmixer_1024_20_ks9_p14 | 9 | 14 | 0.7681(-0.0013) | 0.9335 |
convmixer_1536_20 | 9 | 7 | 0.8083(-0.0054) | 0.9557 |
TODO List
Features planned for the near future
- Cutmix
- Activation, Mish
- RandomErasing
- AutoAugment, RandAugment
- Transform Layer (implementing some Transforms as Layers)
- More vision models
  - Xception
  - Swin Transformer
  - CvT
- Complete unit tests