Skip to main content

A PyTorch landmarks-only library with 100+ data augmentations, training and inference, can easily install with pip and compatible with albumentations and torchvision.

Project description

torchlm-logo

English | 中文文档 | 知乎专栏 | 下载统计

🤗 Introduction

torchlm is aims to build a high level pipeline for face landmarks detection, it supports training, inference and 100+ data augmentations, can easily install with pip.

❤️ Star 🌟👆🏻 this repo to support me if it does any helps to you, thanks ~

👋 Core Features

  • High level pipeline for training and inference.
  • Provides 30+ native landmarks data augmentations.
  • Can bind 80+ transforms from torchvision and albumentations with one-line-code.
  • Support awesome models for face landmarks detection, such as YOLOX, YOLOv5, ResNet, MobileNet, ShuffleNet and PIPNet, etc.

🆕 What's New

✅ Supported Models Matrix

✅ = known work and official supported, ❔ = in my plan, but not coming soon.

PIPNet YOLOX YOLOv5 NanoDet ResNet MobileNet ShuffleNet VIT ...

🔥🔥Performance(@NME)

Model Backbone Head 300W COFW AFLW WFLW Download
PIPNet MobileNetV2 Heatmap+Regression+NRM 3.40 3.43 1.52 4.79 link
PIPNet ResNet18 Heatmap+Regression+NRM 3.36 3.31 1.48 4.47 link
PIPNet ResNet50 Heatmap+Regression+NRM 3.34 3.18 1.44 4.48 link
PIPNet ResNet101 Heatmap+Regression+NRM 3.19 3.08 1.42 4.31 link

🛠️Installation

you can install torchlm directly from pypi.

pip3 install torchlm
# install from specific pypi mirrors use '-i'
pip3 install torchlm -i https://pypi.org/simple/

or install from source if you want the latest torchlm and install it in editable mode with -e.

# clone torchlm repository locally if you want the latest torchlm
git clone --depth=1 https://github.com/DefTruth/torchlm.git 
cd torchlm
# install in editable mode
pip install -e .

🌟🌟Data Augmentation

torchlm provides 30+ native data augmentations for landmarks and can bind with 80+ transforms from torchvision and albumentations through torchlm.bind method. The layout format of landmarks is xy with shape (N, 2), N denotes the number of the input landmarks.

  • use almost 30+ native transforms from torchlm directly
import torchlm
transform = torchlm.LandmarksCompose([
    torchlm.LandmarksRandomScale(prob=0.5),
    torchlm.LandmarksRandomMask(prob=0.5),
    torchlm.LandmarksRandomBlur(kernel_range=(5, 25), prob=0.5),
    torchlm.LandmarksRandomBrightness(prob=0.),
    torchlm.LandmarksRandomRotate(40, prob=0.5, bins=8),
    torchlm.LandmarksRandomCenterCrop((0.5, 1.0), (0.5, 1.0), prob=0.5)
])

Also, a user-friendly API build_default_transform is available to build a default transform pipeline.

transform = torchlm.build_default_transform(
    input_size=(input_size, input_size),
    mean=[0.485, 0.456, 0.406],
    std=[0.229, 0.224, 0.225],
    force_norm_before_mean_std=True,  # img/=255. first
    rotate=30,
    keep_aspect=False,
    to_tensor=True  # array -> Tensor & HWC -> CHW
)

See transforms.md for supported transforms sets and more example can be found at test/transforms.py.

bind 80+ torchvision and albumentations's transforms

NOTE: Please install albumentations first if you want to bind albumentations's transforms. If you have the conflict problem between different installed version of opencv (opencv-python and opencv-python-headless, ablumentations need opencv-python-headless). Please uninstall the opencv-python and opencv-python-headless first, and then reinstall albumentations. See albumentations#1140 for more details.

# first uninstall confilct opencvs
pip uninstall opencv-python
pip uninstall opencv-python-headless
pip uninstall albumentations  # if you have installed albumentations
# then reinstall torchlm
pip install albumentations # will also install deps, e.g opencv

Then, check albumentations whether is available.

torchlm.albumentations_is_available()
transform = torchlm.LandmarksCompose([
    torchlm.bind(torchvision.transforms.GaussianBlur(kernel_size=(5, 25)), prob=0.5),  
    torchlm.bind(albumentations.ColorJitter(p=0.5))
])
bind custom callable array or Tensor transform functions
# First, defined your custom functions
def callable_array_noop(img: np.ndarray, landmarks: np.ndarray) -> Tuple[np.ndarray, np.ndarray]: # do some transform here ...
    return img.astype(np.uint32), landmarks.astype(np.float32)

def callable_tensor_noop(img: Tensor, landmarks: Tensor) -> Tuple[Tensor, Tensor]: # do some transform here ...
    return img, landmarks
# Then, bind your functions and put it into the transforms pipeline.
transform = torchlm.LandmarksCompose([
        torchlm.bind(callable_array_noop, bind_type=torchlm.BindEnum.Callable_Array),
        torchlm.bind(callable_tensor_noop, bind_type=torchlm.BindEnum.Callable_Tensor, prob=0.5)
])
some global debug setting for torchlm's transform
  • setup logging mode as True globally might help you figure out the runtime details
# some global setting
torchlm.set_transforms_debug(True)
torchlm.set_transforms_logging(True)
torchlm.set_autodtype_logging(True)

some detail information will show you at each runtime, the infos might look like

LandmarksRandomScale() AutoDtype Info: AutoDtypeEnum.Array_InOut
LandmarksRandomScale() Execution Flag: False
BindTorchVisionTransform(GaussianBlur())() AutoDtype Info: AutoDtypeEnum.Tensor_InOut
BindTorchVisionTransform(GaussianBlur())() Execution Flag: True
BindAlbumentationsTransform(ColorJitter())() AutoDtype Info: AutoDtypeEnum.Array_InOut
BindAlbumentationsTransform(ColorJitter())() Execution Flag: True
BindTensorCallable(callable_tensor_noop())() AutoDtype Info: AutoDtypeEnum.Tensor_InOut
BindTensorCallable(callable_tensor_noop())() Execution Flag: False
Error at LandmarksRandomTranslate() Skip, Flag: False Error Info: LandmarksRandomTranslate() have 98 input landmarks, but got 96 output landmarks!
LandmarksRandomTranslate() Execution Flag: False
  • Execution Flag: True means current transform was executed successful, False means it was not executed because of the random probability or some Runtime Exceptions(torchlm will should the error infos if debug mode is True).

  • AutoDtype Info:

    • Array_InOut means current transform need a np.ndnarray as input and then output a np.ndarray.
    • Tensor_InOut means current transform need a torch Tensor as input and then output a torch Tensor.
    • Array_In means current transform needs a np.ndarray input and then output a torch Tensor.
    • Tensor_In means current transform needs a torch Tensor input and then output a np.ndarray.

    But, is ok if you pass a Tensor to a np.ndarray-like transform, torchlm will automatically be compatible with different data types and then wrap it back to the original type through a autodtype wrapper.

more details about transform in torchlm

Further, torchlm.bind provide a prob param at bind-level to force any transform or callable be a random-style augmentation. The data augmentations in torchlm are safe and simplest. Any transform operations at runtime cause landmarks outside will be auto dropped to keep the number of landmarks unchanged.

🎉🎉Training

In torchlm, each model have two high level and user-friendly APIs named apply_training and apply_freezing for training. apply_training handle the training process and apply_freezing decide whether to freeze the backbone for fune-tuning.

Quick Start

Here is a example of PIPNet. You can freeze backbone before fine-tuning through apply_freezing.

from torchlm.models import pipnet
# will auto download pretrained weights from latest release if pretrained=True
model = pipnet(backbone="resnet18", pretrained=True, num_nb=10, num_lms=98, net_stride=32,
               input_size=256, meanface_type="wflw", backbone_pretrained=True)
model.apply_freezing(backbone=True)
model.apply_training(
    annotation_path="../data/WFLW/convertd/train.txt",  # or fine-tuning your custom data
    num_epochs=10,
    learning_rate=0.0001,
    save_dir="./save/pipnet",
    save_prefix="pipnet-wflw-resnet18",
    save_interval=1,
    logging_interval=1,
    device="cuda",
    batch_size=16,
    num_workers=4,
    shuffle=True
)

Please jump to the entry point of the function for the detail documentations of apply_training API for each defined models in torchlm, e.g pipnet/_impls.py#L166. You might see some logs if the training process is running:

Parameters for DataLoader:  {'batch_size': 16, 'num_workers': 4, 'shuffle': True}
Built _PIPTrainDataset: train count is 7500 !
Epoch 0/9
----------
[Epoch 0/9, Batch 0/468] <Total loss: 0.968761> <cls loss: 0.115902> <x loss: 0.154434> <y loss: 0.217170> <nbx loss: 0.200751> <nby loss: 0.280504>
[Epoch 0/9, Batch 1/468] <Total loss: 0.529577> <cls loss: 0.082347> <x loss: 0.113045> <y loss: 0.083137> <nbx loss: 0.159639> <nby loss: 0.091410>
[Epoch 0/9, Batch 2/468] <Total loss: 0.764886> <cls loss: 0.094967> <x loss: 0.139947> <y loss: 0.142193> <nbx loss: 0.189724> <nby loss: 0.198055>
[Epoch 0/9, Batch 3/468] <Total loss: 0.607258> <cls loss: 0.081174> <x loss: 0.108801> <y loss: 0.125346> <nbx loss: 0.134875> <nby loss: 0.157063>

Dataset Format

The annotation_path parameter is denotes the path to a custom annotation file, the format must be:

"img0_path x0 y0 x1 y1 ... xn-1,yn-1"
"img1_path x0 y0 x1 y1 ... xn-1,yn-1"
"img2_path x0 y0 x1 y1 ... xn-1,yn-1"
"img3_path x0 y0 x1 y1 ... xn-1,yn-1"
...

If the label in annotation_path is already normalized by image size, please set coordinates_already_normalized as True in apply_training API.

"img0_path x0/w y0/h x1/w y1/h ... xn-1/w,yn-1/h"
"img1_path x0/w y0/h x1/w y1/h ... xn-1/w,yn-1/h"
"img2_path x0/w y0/h x1/w y1/h ... xn-1/w,yn-1/h"
"img3_path x0/w y0/h x1/w y1/h ... xn-1/w,yn-1/h"
...

Here is a example of WFLW to show you how to prepare the dataset, also see test/data.py.

How to train PIPNet in your own dataset and custom meanface settings?

Setting up your custom meanface and nearest-neighbor landmarks through set_custom_meanface method, this method will calculate the distance between landmarks in meanface and auto setup the nearest-neighbors for each landmark. NOTE: The PIPNet will reshape the detection headers if the number of landmarks in custom dataset is not equal with the num_lms you initialized.

def set_custom_meanface(custom_meanface_file_or_string: str) -> bool:
    """
    :param custom_meanface_file_or_string: a long string or a file contains normalized
    or un-normalized meanface coords, the format is "x0,y0,x1,y1,x2,y2,...,xn-1,yn-1".
    :return: status, True if successful.
    """

Also, a generate_meanface API is available in torchlm to help you get meanface in your custom dataset.

# generate your custom meanface.
custom_meanface, custom_meanface_string = torchlm.data.annotools.generate_meanface(
  annotation_path="../data/WFLW/convertd/train.txt")
# check your generated meanface.
rendered_meanface = torchlm.data.annotools.draw_meanface(meanface=custom_meanface)
cv2.imwrite("./logs/wflw_meanface.jpg", rendered_meanface)
# setting up your custom meanface
model.set_custom_meanface(custom_meanface_file_or_string=custom_meanface_string)

👀👇 Inference

C++ API

The ONNXRuntime(CPU/GPU), MNN, NCNN and TNN C++ inference of torchlm will be release at lite.ai.toolkit.

Python API

In torchlm, a high level API named runtime.bind can bind face detection and landmarks models together, then you can run the runtime.forward API to get the output landmarks and bboxes, here is a example of PIPNet. Pretrained weights of PIPNet, Download.

import torchlm
from torchlm.tools import faceboxesv2
from torchlm.models import pipnet

torchlm.runtime.bind(faceboxesv2())
torchlm.runtime.bind(
  pipnet(backbone="resnet18", pretrained=True,  
         num_nb=10, num_lms=98, net_stride=32, input_size=256,
         meanface_type="wflw", map_location="cpu", checkpoint=None)
) # will auto download pretrained weights from latest release if pretrained=True
landmarks, bboxes = torchlm.runtime.forward(image)
image = torchlm.utils.draw_bboxes(image, bboxes=bboxes)
image = torchlm.utils.draw_landmarks(image, landmarks=landmarks)

📖 Documentations

🎓 License

The code of torchlm is released under the MIT License.

❤️ Contribution

Please consider ⭐ this repo if you like it, as it is the simplest way to support me.

👋 Acknowledgement

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

torchlm-0.1.6.4.tar.gz (8.8 MB view hashes)

Uploaded Source

Built Distribution

torchlm-0.1.6.4-py3-none-any.whl (8.8 MB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page