Torchvision+ Deformable Convolutional Networks

This package contains PyTorch implementations of the 2D deformable convolution operation (the commonly used torchvision.ops.deform_conv2d) proposed in https://arxiv.org/abs/1811.11168, as well as its 1D and 3D equivalents, which are not available in torchvision (hence the name).

Beyond that, the package also provides transposed versions of these operations, which, interestingly, no one has proposed using so far. The main idea is this: in deformable convolution, the offset tells the convolution kernel where to read the inputs used to compute each output; in transposed deformable convolution, it tells the kernel where to write the outputs.
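The read-versus-write distinction can be illustrated with a minimal pure-Python sketch in 1D. This is not the package's C++/CUDA implementation: the function names, the list-of-lists offset layout, and the use of linear interpolation with zero padding are all assumptions made for illustration.

```python
import math


def lerp_taps(pos, length):
    """Return (index, weight) pairs for linear interpolation at a fractional
    position. Indices outside [0, length - 1] are dropped (zero padding)."""
    lo = math.floor(pos)
    frac = pos - lo
    taps = [(lo, 1.0 - frac), (lo + 1, frac)]
    return [(i, w) for i, w in taps if 0 <= i < length and w != 0.0]


def deform_conv1d_ref(x, weight, offset):
    """Gather: output[i] READS the input at position i + k + offset[i][k]."""
    k_size = len(weight)
    out = [0.0] * (len(x) - k_size + 1)
    for i in range(len(out)):
        for k in range(k_size):
            pos = i + k + offset[i][k]
            sample = sum(w * x[j] for j, w in lerp_taps(pos, len(x)))
            out[i] += weight[k] * sample
    return out


def deform_conv_transpose1d_ref(x, weight, offset):
    """Scatter: input[i] WRITES to the output at position i + k + offset[i][k]."""
    k_size = len(weight)
    out = [0.0] * (len(x) + k_size - 1)
    for i in range(len(x)):
        for k in range(k_size):
            pos = i + k + offset[i][k]
            for j, w in lerp_taps(pos, len(out)):
                out[j] += w * weight[k] * x[i]
    return out


x = [1.0, 2.0, 3.0, 4.0]
weight = [1.0, 1.0]

# Zero offsets reduce both functions to their ordinary counterparts.
print(deform_conv1d_ref(x, weight, [[0.0, 0.0]] * 3))            # [3.0, 5.0, 7.0]
print(deform_conv_transpose1d_ref(x, weight, [[0.0, 0.0]] * 4))  # [1.0, 3.0, 5.0, 7.0, 4.0]

# A half-step offset makes every tap read between two neighbouring inputs.
print(deform_conv1d_ref(x, weight, [[0.5, 0.5]] * 3))            # [4.0, 6.0, 5.5]
```

In the gather version the offset is indexed by output position, while in the scatter version it is indexed by input position; that is the only structural difference between the two loops.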

Highlights

  • Supported operations (all implemented in C++/CUDA):

    • tvdcn.ops.deform_conv1d
    • tvdcn.ops.deform_conv2d
    • tvdcn.ops.deform_conv3d
    • tvdcn.ops.deform_conv_transpose1d
    • tvdcn.ops.deform_conv_transpose2d
    • tvdcn.ops.deform_conv_transpose3d
  • And the following supplementary operations (for activating mask):

    • tvdcn.ops.mask_softmax1d
    • tvdcn.ops.mask_softmax2d
    • tvdcn.ops.mask_softmax3d
  • Both offset and mask are optional, and each can be applied with its own number of groups.

  • nn.Module wrappers are implemented for all of these operations, and everything is @torch.jit.script-able! Please check Usage.

Note: ONNX export is not a priority for this project, but if you need it, you can check this repo: https://github.com/masamitsu-murase/deform_conv2d_onnx_exporter.

Requirements

  • torch>=1.9.0

Installation

From PyPI:

tvdcn provides some prebuilt wheels on PyPI. Run this command to install:

pip install tvdcn

The Linux and Windows wheels are built with CUDA 11.8. If you cannot find a wheel for your Arch/Python/CUDA combination, or if there is a library linking problem when importing, please follow the instructions below to build from source; the steps are straightforward.

                   Linux/Windows                                  MacOS
Python version:    3.8-3.11                                       3.8-3.11
PyTorch version:   torch==2.0.1                                   torch==2.0.1
CUDA version:      11.8                                           -
GPU CCs:           3.7,5.0,6.0,6.1,7.0,7.5,8.0,8.6,8.9,9.0+PTX    -

From Source:

To install from source, you need a C++ compiler (gcc/msvc) and a CUDA compiler (nvcc) with C++17 features enabled. Clone this repo and execute the following command:

pip install .

Or just compile the binary for in-place use:

python setup.py build_ext --inplace

A binary (.so file for Unix and .pyd file for Windows) should be compiled inside the tvdcn folder. To check if installation is successful, try:

import tvdcn

print('Library loaded successfully:', tvdcn.has_ops())
print('Compiled with Cuda:', tvdcn.with_cuda())

Note: We use soft CUDA version compatibility checking between the built binary and the installed PyTorch, which means that only the major versions need to match. However, we suggest building the binaries with the same CUDA version as the installed PyTorch's CUDA version to prevent any possible conflict.
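A major-version-only comparison of this kind can be sketched as follows. The helper names are hypothetical and this is not the package's actual check; it only illustrates what "soft" compatibility means. In practice, the runtime version string could come from torch.version.cuda, which is None on CPU-only builds of PyTorch.

```python
def cuda_major(version):
    """Extract the major component from a version string such as '11.8'."""
    return int(version.split(".")[0])


def is_soft_compatible(built_with, runtime):
    """Hypothetical soft check: only the major CUDA versions must match."""
    if built_with is None or runtime is None:
        # One side was built without CUDA, so there is nothing to match.
        return False
    return cuda_major(built_with) == cuda_major(runtime)


print(is_soft_compatible("11.8", "11.7"))  # True: same major version
print(is_soft_compatible("11.8", "12.1"))  # False: major versions differ
```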

Usage

Functions:

Functionally, the package offers 6 functions (listed in Highlights) that are similar to torchvision.ops.deform_conv2d. However, the order of parameters is slightly different, so be cautious (check this comparison). Specifically, the signatures of deform_conv2d and deform_conv_transpose2d look like this:

def deform_conv2d(
        input: Tensor,
        weight: Tensor,
        offset: Optional[Tensor] = None,
        mask: Optional[Tensor] = None,
        bias: Optional[Tensor] = None,
        stride: Union[int, Tuple[int, int]] = 1,
        padding: Union[int, Tuple[int, int]] = 0,
        dilation: Union[int, Tuple[int, int]] = 1,
        groups: int = 1) -> Tensor:
    ...


def deform_conv_transpose2d(
        input: Tensor,
        weight: Tensor,
        offset: Optional[Tensor] = None,
        mask: Optional[Tensor] = None,
        bias: Optional[Tensor] = None,
        stride: Union[int, Tuple[int, int]] = 1,
        padding: Union[int, Tuple[int, int]] = 0,
        output_padding: Union[int, Tuple[int, int]] = 0,
        dilation: Union[int, Tuple[int, int]] = 1,
        groups: int = 1) -> Tensor:
    ...

If offset=None and mask=None, the executed operations are identical to conventional convolution.
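The reason for this equivalence can be seen in a small pure-Python reference in 1D. It is a sketch under assumptions (hypothetical function names, linear interpolation, zero padding, a per-output list layout for offset and mask), not the package's kernel: a missing offset is treated as zero displacement and a missing mask as a scale of one, so the sum collapses to an ordinary convolution.

```python
import math


def sample_linear(x, pos):
    """Linearly interpolate x at a fractional position; out-of-range reads as 0."""
    lo = math.floor(pos)
    frac = pos - lo
    total = 0.0
    for idx, w in ((lo, 1.0 - frac), (lo + 1, frac)):
        if 0 <= idx < len(x):
            total += w * x[idx]
    return total


def modulated_deform_conv1d_ref(x, weight, offset=None, mask=None):
    """out[i] = sum_k weight[k] * mask[i][k] * x(i + k + offset[i][k])."""
    k_size = len(weight)
    out = []
    for i in range(len(x) - k_size + 1):
        acc = 0.0
        for k in range(k_size):
            delta = 0.0 if offset is None else offset[i][k]  # no offset: regular grid
            scale = 1.0 if mask is None else mask[i][k]      # no mask: no modulation
            acc += weight[k] * scale * sample_linear(x, i + k + delta)
        out.append(acc)
    return out


def conv1d_ref(x, weight):
    """Ordinary (cross-correlation style) 1D convolution, for comparison."""
    k_size = len(weight)
    return [sum(weight[k] * x[i + k] for k in range(k_size))
            for i in range(len(x) - k_size + 1)]


x = [1.0, 2.0, 4.0, 8.0]
weight = [0.5, -1.0]
print(modulated_deform_conv1d_ref(x, weight) == conv1d_ref(x, weight))  # True
```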

Neural Network Layers:

The nn.Module wrappers are:

  • tvdcn.ops.DeformConv1d
  • tvdcn.ops.DeformConv2d
  • tvdcn.ops.DeformConv3d
  • tvdcn.ops.DeformConvTranspose1d
  • tvdcn.ops.DeformConvTranspose2d
  • tvdcn.ops.DeformConvTranspose3d

They are subclasses of torch.nn.modules.conv._ConvNd, but you have to pass offset, and optionally mask, as extra inputs to the forward function. For example:

import torch

from tvdcn import DeformConv2d

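input = torch.rand(2, 3, 64, 64)
# offset needs 2 channels (one per spatial axis) for each of the 3 x 3 kernel taps
offset = torch.rand(2, 2 * 3 * 3, 62, 62)
# mask needs 1 channel per kernel tap; if mask is None, the original (v1)
# deformable convolution is performed, without the modulation introduced in v2
mask = torch.rand(2, 1 * 3 * 3, 62, 62)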

conv = DeformConv2d(3, 16, kernel_size=(3, 3))

output = conv(input, offset, mask)
print(output.shape)
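The offset and mask shapes in the example above follow from the kernel size and the output resolution. The helper below is a hypothetical sketch for working them out ahead of time; the output-size formula is the standard convolution one, and the multiplication by offset_groups and mask_groups follows torchvision's documented convention, which is assumed here to carry over to tvdcn.

```python
def _pair(value):
    """Expand an int into a 2-tuple, leaving tuples untouched."""
    return (value, value) if isinstance(value, int) else tuple(value)


def conv_out_size(size, kernel, stride=1, padding=0, dilation=1):
    """Standard convolution output length along one spatial axis."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def expected_offset_mask_shapes(input_shape, kernel_size, stride=1, padding=0,
                                dilation=1, offset_groups=1, mask_groups=1):
    """Return (offset_shape, mask_shape) for a 2D deformable convolution."""
    n, _, h, w = input_shape
    kh, kw = _pair(kernel_size)
    sh, sw = _pair(stride)
    ph, pw = _pair(padding)
    dh, dw = _pair(dilation)
    out_h = conv_out_size(h, kh, sh, ph, dh)
    out_w = conv_out_size(w, kw, sw, pw, dw)
    # Two offset channels (one per spatial axis) and one mask channel per kernel tap.
    offset_shape = (n, 2 * offset_groups * kh * kw, out_h, out_w)
    mask_shape = (n, mask_groups * kh * kw, out_h, out_w)
    return offset_shape, mask_shape


print(expected_offset_mask_shapes((2, 3, 64, 64), kernel_size=(3, 3)))
# ((2, 18, 62, 62), (2, 9, 62, 62))
```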

Additionally, following many other implementations out there, we also implemented the packed wrappers:

  • tvdcn.ops.PackedDeformConv1d
  • tvdcn.ops.PackedDeformConv2d
  • tvdcn.ops.PackedDeformConv3d
  • tvdcn.ops.PackedDeformConvTranspose1d
  • tvdcn.ops.PackedDeformConvTranspose2d
  • tvdcn.ops.PackedDeformConvTranspose3d

These are easy-to-use classes that contain ordinary convolution layers with appropriate hyperparameters to generate offset (and mask if initialized with modulated=True), at the cost of less customization. The only tunable hyperparameters that affect these supplementary conv layers are offset_groups and mask_groups, which are decoupled from groups and behave somewhat similarly to it.

To use the softmax activation for mask proposed in Deformable Convolution v3, set mask_activation='softmax'. offset_activation and mask_activation also accept any nn.Module.

import torch

from tvdcn import PackedDeformConv1d

input = torch.rand(2, 3, 128)

conv = PackedDeformConv1d(3, 16,
                          kernel_size=5,
                          modulated=True,
                          mask_activation='softmax')
# jit scripting
scripted_conv = torch.jit.script(conv)
print(scripted_conv)

output = scripted_conv(input)
print(output.shape)
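For intuition about the softmax mask activation, the sketch below normalizes the raw mask values of one output location across its kernel taps, so that the modulation weights sum to one. It is plain Python for illustration only: the normalization axis used by tvdcn.ops.mask_softmax1d/2d/3d is assumed here, not taken from the package. By contrast, the sigmoid used in Deformable Convolution v2 squashes each tap independently into (0, 1).

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(value):
    """Independent squashing of a single raw score, as in DCNv2."""
    return 1.0 / (1.0 + math.exp(-value))


# Raw mask scores for the 5 taps of a kernel_size=5 kernel at one output location.
raw_mask = [0.2, -1.0, 3.0, 0.0, 0.5]

soft = softmax(raw_mask)
print(soft)                              # weights compete across taps
print(sum(soft))                         # approximately 1.0
print([sigmoid(v) for v in raw_mask])    # each weight is independent of the others
```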

Note: For transposed packed modules, offset and mask are generated with pointwise convolutions, as we have not found a better way to do it.

Check the examples folder for more usage examples.

Acknowledgements

This for-fun project is directly modified and extended from torchvision.ops.deform_conv2d.

License

The code is released under the MIT license. See LICENSE.txt for details.
