Torchvision+ Deformable Convolutional Networks

This package contains PyTorch implementations of the Deformable Convolution operation (the commonly used torchvision.ops.deform_conv2d) proposed in https://arxiv.org/abs/1811.11168, and of the Transposed Deformable Convolution proposed in https://arxiv.org/abs/2210.09446 (currently without interpolation kernel scaling). It also supports their 1D and 3D equivalents, which are not available in torchvision (hence the name).

Highlights

  • Supported operators: (All are implemented in C++/Cuda)

    • tvdcn.ops.deform_conv1d
    • tvdcn.ops.deform_conv2d (faster than torchvision.ops.deform_conv2d by approximately 25% on forward pass and 14% on backward pass using a GeForce RTX 4060 according to this test)
    • tvdcn.ops.deform_conv3d
    • tvdcn.ops.deform_conv_transpose1d
    • tvdcn.ops.deform_conv_transpose2d
    • tvdcn.ops.deform_conv_transpose3d
  • And the following supplementary operators (mask activation proposed in https://arxiv.org/abs/2211.05778):

    • tvdcn.ops.mask_softmax1d
    • tvdcn.ops.mask_softmax2d
    • tvdcn.ops.mask_softmax3d
  • Both offset and mask can be turned off, and can be applied in separate groups.

  • All the nn.Module wrappers for these operators are implemented, and everything is @torch.jit.script-able! Please check Usage.

Note: tvdcn doesn't support ONNX export.

Requirements

  • torch>=2.7.0,<2.8.0 (torch>=1.9.0 if installed from source)

Notes:

Since torch extensions are not forward compatible, I have to pin a maximum torch version for the PyPI package and regularly update it on GitHub (but I am not always available). If you use a different version of torch or your platform is not supported, please follow the instructions to install from source.

Installation

From PyPI:

tvdcn provides some prebuilt wheels on PyPI. Run this command to install:

pip install tvdcn

Our Linux and Windows wheels are built with Cuda 12.8 but should be compatible with all 12.x versions.

                  Linux/Windows                                   MacOS
Python version:   3.9-3.13                                        3.9-3.13
PyTorch version:  torch==2.7.0                                    torch==2.7.0
Cuda version:     12.8                                            -
GPU CCs:          5.0,6.0,6.1,7.0,7.5,8.0,8.6,9.0,10.0,12.0+PTX   -

When the Cuda versions of torch and tvdcn mismatch, you will see an error like this:

RuntimeError: Detected that PyTorch and Extension were compiled with different CUDA versions.
PyTorch has CUDA Version=11.8 and Extension has CUDA Version=12.8.
Please reinstall the Extension that matches your PyTorch install.

If you see the following error instead, there are other issues related to Python, PyTorch, device architecture, etc. In that case, please follow the instructions to build from source; all the steps are easy.

RuntimeError: Couldn't load custom C++ ops. Recompile C++ extension with:
     python setup.py build_ext --inplace

From Source:

To install from source, you need a C++ compiler (gcc/msvc) and a Cuda compiler (nvcc) with C++17 features enabled. Clone this repo and execute the following command:

pip install .

Or just compile the binary for inplace usage:

python setup.py build_ext --inplace

A binary (.so file for Unix and .pyd file for Windows) should be compiled inside the tvdcn folder. To check if installation is successful, try:

import tvdcn

print('Library loaded successfully:', tvdcn.has_ops())
print('Compiled with Cuda:', tvdcn.with_cuda())
print('Cuda version:', tvdcn.cuda_version())
print('Cuda arch list:', tvdcn.cuda_arch_list())

Note: We use soft Cuda version compatibility checking between the built binary and the installed PyTorch, which means only the major versions are required to match. However, we suggest building the binaries with the same Cuda version as the installed PyTorch's Cuda version to prevent any possible conflict.
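The soft check described in the note above can be pictured as a comparison of major version numbers only. The following is a minimal sketch of that idea, not tvdcn's actual implementation: the helper name cuda_major_matches is hypothetical, and the version strings are typed in by hand rather than read from torch or tvdcn.

```python
def cuda_major_matches(torch_cuda: str, ext_cuda: str) -> bool:
    """Return True when both Cuda version strings share the same major version."""
    torch_major = int(torch_cuda.split(".")[0])
    ext_major = int(ext_cuda.split(".")[0])
    return torch_major == ext_major


# 12.4 vs 12.8: same major version, so a soft check would pass.
print(cuda_major_matches("12.4", "12.8"))
# 11.8 vs 12.8: different major versions, as in the RuntimeError shown earlier.
print(cuda_major_matches("11.8", "12.8"))
```

In practice the two arguments would presumably come from torch.version.cuda and tvdcn.cuda_version(), assuming the latter returns a dotted version string; that return type is an assumption and is not confirmed here.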

Usage

Operators:

Functionally, the package offers 6 functions (listed in Highlights) that are very similar to torchvision.ops.deform_conv2d. However, the order of parameters is slightly different, so be cautious (check this comparison).

torchvision:

import torch
from torchvision.ops import deform_conv2d

input = torch.rand(4, 3, 10, 10)
kh, kw = 3, 3
weight = torch.rand(5, 3, kh, kw)
offset = torch.rand(4, 2 * kh * kw, 8, 8)
mask = torch.rand(4, kh * kw, 8, 8)
bias = torch.rand(5)

output = deform_conv2d(input, offset, weight, bias,
                       stride=(1, 1),
                       padding=(0, 0),
                       dilation=(1, 1),
                       mask=mask)
print(output)

tvdcn:

import torch
from tvdcn.ops import deform_conv2d

input = torch.rand(4, 3, 10, 10)
kh, kw = 3, 3
weight = torch.rand(5, 3, kh, kw)
offset = torch.rand(4, 2 * kh * kw, 8, 8)
mask = torch.rand(4, kh * kw, 8, 8)
bias = torch.rand(5)

output = deform_conv2d(input, weight, offset, mask, bias,
                       stride=(1, 1),
                       padding=(0, 0),
                       dilation=(1, 1),
                       groups=1)
print(output)
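When porting code written against torchvision, the difference in argument order is easy to get wrong. One possible way to bridge it is a thin adapter that accepts torchvision's order and forwards to a tvdcn-style function. The sketch below is only an illustration: torchvision_order and fake_deform_conv2d are hypothetical names, and the stand-in function merely records its arguments so the reordering can be checked without installing anything.

```python
from typing import Any, Callable


def torchvision_order(tvdcn_deform_conv2d: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap a tvdcn-style deform_conv2d so it accepts torchvision's argument order."""

    def wrapper(input, offset, weight, bias=None,
                stride=(1, 1), padding=(0, 0), dilation=(1, 1),
                mask=None, groups=1):
        # torchvision order: (input, offset, weight, bias, ..., mask)
        # tvdcn order:       (input, weight, offset, mask, bias, ...)
        return tvdcn_deform_conv2d(input, weight, offset, mask, bias,
                                   stride=stride, padding=padding,
                                   dilation=dilation, groups=groups)

    return wrapper


# Stand-in for tvdcn.ops.deform_conv2d: it only records what it receives.
def fake_deform_conv2d(input, weight, offset=None, mask=None, bias=None, **kwargs):
    return {"input": input, "weight": weight, "offset": offset,
            "mask": mask, "bias": bias, **kwargs}


call = torchvision_order(fake_deform_conv2d)
received = call("x", "off", "w", "b", mask="m")
print(received["weight"], received["offset"], received["mask"], received["bias"])
```

With the real package, the wrapped function would be tvdcn.ops.deform_conv2d. Note that groups is an explicit argument on the tvdcn side, so the adapter exposes it as an extra keyword that torchvision's function does not have.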

Specifically, the signatures of deform_conv2d and deform_conv_transpose2d look like these:

def deform_conv2d(
        input: Tensor,
        weight: Tensor,
        offset: Optional[Tensor] = None,
        mask: Optional[Tensor] = None,
        bias: Optional[Tensor] = None,
        stride: Union[int, Tuple[int, int]] = 1,
        padding: Union[int, Tuple[int, int]] = 0,
        dilation: Union[int, Tuple[int, int]] = 1,
        groups: int = 1) -> Tensor:
    ...


def deform_conv_transpose2d(
        input: Tensor,
        weight: Tensor,
        offset: Optional[Tensor] = None,
        mask: Optional[Tensor] = None,
        bias: Optional[Tensor] = None,
        stride: Union[int, Tuple[int, int]] = 1,
        padding: Union[int, Tuple[int, int]] = 0,
        output_padding: Union[int, Tuple[int, int]] = 0,
        dilation: Union[int, Tuple[int, int]] = 1,
        groups: int = 1) -> Tensor:
    ...

If offset=None and mask=None, the executed operators are identical to conventional convolution.
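The offset and mask tensors in the examples above have shapes that depend on the kernel size and on the output resolution. The helpers below sketch how those shapes can be worked out by hand. They use the standard convolution and transposed convolution output size formulas (the same ones documented for torch.nn.Conv2d and torch.nn.ConvTranspose2d). The function names are hypothetical, and the channel counts for 1D/3D inputs and for separate offset and mask groups (one offset coordinate per spatial dimension per kernel point per group) are an assumption extrapolated from the 2D examples, not something taken from the tvdcn source.

```python
import math
from typing import Sequence, Tuple


def conv_output_size(size: int, kernel: int, stride: int = 1,
                     padding: int = 0, dilation: int = 1) -> int:
    """Output length of an ordinary convolution along one dimension."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def conv_transpose_output_size(size: int, kernel: int, stride: int = 1,
                               padding: int = 0, output_padding: int = 0,
                               dilation: int = 1) -> int:
    """Output length of a transposed convolution along one dimension."""
    return ((size - 1) * stride - 2 * padding
            + dilation * (kernel - 1) + output_padding + 1)


def offset_and_mask_channels(kernel_size: Sequence[int],
                             offset_groups: int = 1,
                             mask_groups: int = 1) -> Tuple[int, int]:
    """Assumed channel counts of the offset and mask tensors."""
    n_points = math.prod(kernel_size)  # number of kernel sampling points
    ndim = len(kernel_size)            # one offset coordinate per spatial dim
    return ndim * offset_groups * n_points, mask_groups * n_points


# 10x10 input with a 3x3 kernel, as in the comparison above.
print(conv_output_size(10, 3))           # spatial size of offset and mask
print(offset_and_mask_channels((3, 3)))  # (offset channels, mask channels)
# Transposing the same configuration maps 8 back to 10.
print(conv_transpose_output_size(8, 3))
```

For the 2D example this reproduces the 2 * kh * kw offset channels, the kh * kw mask channels, and the 8x8 spatial size used earlier.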

Neural Network Layers:

The nn.Module wrappers are:

  • tvdcn.ops.DeformConv1d
  • tvdcn.ops.DeformConv2d
  • tvdcn.ops.DeformConv3d
  • tvdcn.ops.DeformConvTranspose1d
  • tvdcn.ops.DeformConvTranspose2d
  • tvdcn.ops.DeformConvTranspose3d

They are subclasses of torch.nn.modules.conv._ConvNd, but you have to specify offset and optionally mask as extra inputs for the forward function. For example:

import torch

from tvdcn import DeformConv2d

input = torch.rand(2, 3, 64, 64)
offset = torch.rand(2, 2 * 3 * 3, 62, 62)
# if mask is None, the original deformable convolution is performed, without the modulation introduced in v2
mask = torch.rand(2, 1 * 3 * 3, 62, 62)

conv = DeformConv2d(3, 16, kernel_size=(3, 3))

output = conv(input, offset, mask)
print(output.shape)

Additionally, following many other implementations out there, we also implemented the packed wrappers:

  • tvdcn.ops.PackedDeformConv1d
  • tvdcn.ops.PackedDeformConv2d
  • tvdcn.ops.PackedDeformConv3d
  • tvdcn.ops.PackedDeformConvTranspose1d
  • tvdcn.ops.PackedDeformConvTranspose2d
  • tvdcn.ops.PackedDeformConvTranspose3d

These are easy-to-use classes that contain ordinary convolution layers with appropriate hyperparameters to generate offset (and mask if initialized with modulated=True), but that means less customization. The only tunable hyperparameters that affect these supplementary conv layers are offset_groups and mask_groups, which have been decoupled from groups and behave somewhat similarly to it.

To use the softmax activation for mask proposed in Deformable Convolution v3, set mask_activation='softmax'. offset_activation and mask_activation also accept any nn.Module.

import torch

from tvdcn import PackedDeformConv1d

input = torch.rand(2, 3, 128)

conv = PackedDeformConv1d(3, 16,
                          kernel_size=5,
                          modulated=True,
                          mask_activation='softmax')
# jit scripting
scripted_conv = torch.jit.script(conv)
print(scripted_conv)

output = scripted_conv(input)
print(output.shape)
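To give a feel for what a softmax mask activation changes, the sketch below compares it with a sigmoid (the activation used for modulation in Deformable ConvNets v2) on the mask logits of a single output location. It is written in plain Python for illustration and is not the tvdcn kernel. It assumes, following the Deformable Convolution v3 paper, that the softmax is taken across the kernel sampling points, and the logit values are made up.

```python
import math
from typing import List


def sigmoid_mask(logits: List[float]) -> List[float]:
    """Each sampling point gets an independent weight in (0, 1)."""
    return [1.0 / (1.0 + math.exp(-x)) for x in logits]


def softmax_mask(logits: List[float]) -> List[float]:
    """Weights are normalized across sampling points and sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up mask logits for the 5 sampling points of a kernel_size=5 1D kernel.
logits = [0.5, -1.0, 2.0, 0.0, 1.5]

print(sum(sigmoid_mask(logits)))  # not constrained to any particular sum
print(sum(softmax_mask(logits)))  # sums to 1 up to rounding
```

The practical difference is that sigmoid weights are independent per sampling point, while softmax weights compete with each other within the kernel.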

Note: For transposed packed modules, we are generating offset and mask with pointwise convolution as we haven't found a better way to do it.

Do check the examples folder; maybe you can find something helpful.

Acknowledgements

This for-fun project is directly modified and extended from torchvision.ops.deform_conv2d.

Citation

@software{hoang_nhat_tran_2025_14699342,
  author       = {Hoang-Nhat Tran and
                  /},
  title        = {inspiros/tvdcn: v1.0.0},
  month        = jan,
  year         = 2025,
  publisher    = {Zenodo},
  version      = {v1.0.0},
  doi          = {10.5281/zenodo.14699342},
  url          = {https://doi.org/10.5281/zenodo.14699342},
  swhid        = {swh:1:dir:a60bb533b28fa3e84241f7cf2bda1cdb084f9572
                   ;origin=https://doi.org/10.5281/zenodo.14699341;vi
                   sit=swh:1:snp:be2c4dc4b3857e7294684a032516ea03c50f
                   a170;anchor=swh:1:rel:fd575bdd9b90aef2c0e339eea6d3
                   d384112654e5;path=inspiros-tvdcn-4a03dfc
                  },
}

License

The code is released under the MIT license. See LICENSE.txt for details.
