Fast differentiable resizing and warping of arbitrary grids
Resampler Notebook
Hugues Hoppe Aug 2022.
This Python notebook has several roles:
- Source code for the resampler library.
- Illustrated documentation.
- Usage examples.
- Unit tests.
- Signal-processing experiments to justify choices.
- Lint, build, and export the package and its documentation.
Overview of resampler library
resampler enables fast differentiable resizing and warping of arbitrary grids.
It supports:
- grids of arbitrary dimension (e.g., 1D audio, 2D images, 3D video, 4D batches of videos), containing
- sample values of arbitrary shape (e.g., scalars, RGB colors, motion vectors, Jacobian matrices) and
- arbitrary numeric type (integer, floating, and complex);
- either dual ("half-integer") or primal grid-type for each dimension;
- many boundary rules, specified per dimension, extensible via subclassing;
- an extensible set of parameterized filter kernels, selectable per dimension;
- optional gamma transfer functions for correct linear-space filtering;
- prefiltering for accurate antialiasing when downsampling;
- processing within several array libraries (numpy, tensorflow, and torch);
- efficient backpropagation of gradients for both tensorflow and torch;
- easy installation, without any native-code extension module, yet
- faster resizing than the C++ implementations in tf.image, torch.nn, and torchvision.
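To make the dual and primal grid types concrete, here is a minimal sketch of where the samples of each type sit on the unit domain [0, 1]. It follows the same position formulas used by the plotting example in this document; the helper name sample_positions is hypothetical and is not part of the resampler API.

```python
def sample_positions(n, gridtype='dual'):
    """Return the positions of n samples on the unit domain [0, 1].

    Hypothetical helper for illustration; not part of the resampler API.
    """
    if gridtype == 'dual':
        # Samples sit at cell centers ("half-integer" positions).
        return [(i + 0.5) / n for i in range(n)]
    if gridtype == 'primal':
        # Samples sit at cell corners, including both domain endpoints.
        if n < 2:
            raise ValueError('This sketch needs at least two primal samples.')
        return [i / (n - 1) for i in range(n)]
    raise ValueError(f'Unknown gridtype: {gridtype!r}')


print(sample_positions(4, 'dual'))    # [0.125, 0.375, 0.625, 0.875]
print(sample_positions(4, 'primal'))  # Endpoints are exactly 0.0 and 1.0.
```

With a dual grid the first and last samples lie half a cell inside the domain, whereas a primal grid places them exactly on the domain boundary.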
Example usage
!pip install -q mediapy resampler

import mediapy as media
import numpy as np
import resampler

array = np.random.rand(4, 4, 3)  # 4x4 RGB image.
upsampled = resampler.resize(array, (128, 128))  # To 128x128 resolution.
media.show_images({'4x4': array, '128x128': upsampled}, height=128)

image = media.read_image('https://github.com/hhoppe/data/raw/main/image.png')
downsampled = resampler.resize(image, (32, 32))
media.show_images({'128x128': image, '32x32': downsampled}, height=128)

import matplotlib.pyplot as plt
array = [3.0, 5.0, 8.0, 7.0]
new_dual = resampler.resize(array, (32,))  # (default gridtype='dual') 8x resolution.
new_primal = resampler.resize(array, (25,), gridtype='primal')  # 8x resolution.
_, axs = plt.subplots(1, 2, figsize=(7, 1.5))
axs[0].set_title('gridtype dual')
axs[0].plot((np.arange(len(array)) + 0.5) / len(array), array, 'o')
axs[0].plot((np.arange(len(new_dual)) + 0.5) / len(new_dual), new_dual, '.')
axs[1].set_title('gridtype primal')
axs[1].plot(np.arange(len(array)) / (len(array) - 1), array, 'o')
axs[1].plot(np.arange(len(new_primal)) / (len(new_primal) - 1), new_primal, '.')

batch_size = 4
batch_of_images = media.moving_circle((16, 16), batch_size)
spacer = np.ones((64, 16, 3))
upsampled = resampler.resize(batch_of_images, (batch_size, 64, 64))
media.show_images([*batch_of_images, spacer, *upsampled], border=True, height=64)
media.show_videos({'original': batch_of_images, 'upsampled': upsampled}, fps=1)
Most examples above use the default resize() settings:
- gridtype='dual' for both source and destination arrays,
- boundary='auto' which uses 'reflect' for upsampling and 'clamp' for downsampling,
- filter='lanczos3' (a Lanczos kernel with radius 3),
- gamma=None which by default uses the 'power2' transfer function for the uint8 image in the second example,
- scale=1.0, translate=0.0 (no domain transformation),
- default precision and output dtype.
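As an illustration of the default 'lanczos3' filter, the following sketch evaluates the standard Lanczos kernel (a sinc windowed by a wider sinc) and derives the interpolation weights for one output position. The function names are hypothetical, and normalizing the weights to sum to 1 is a common practice assumed here rather than a statement about the library's internals.

```python
import math


def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def lanczos(x, radius=3):
    """Standard Lanczos kernel; zero at and beyond the given radius."""
    if abs(x) >= radius:
        return 0.0
    return sinc(x) * sinc(x / radius)


def upsample_weights(t, radius=3):
    """Weights of the 2 * radius nearest samples for an offset t in [0, 1).

    Hypothetical helper: sample k sits at integer position k, and the
    output position sits at t, between samples 0 and 1.
    """
    offsets = range(-radius + 1, radius + 1)
    weights = [lanczos(t - k, radius) for k in offsets]
    total = sum(weights)
    return [w / total for w in weights]  # Assumed normalization to sum 1.


print([round(w, 4) for w in upsample_weights(0.5)])
```

With radius 3, each output sample depends on six input samples per dimension, and the negative lobes of the kernel are what give Lanczos its sharper result compared with linear interpolation.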
Advanced usage:
Map an image to a wider grid using custom scale and translate vectors, with horizontal 'reflect' and vertical 'natural' boundary rules, providing a constant value for the exterior, using different filters (Lanczos and O-MOMS) in the two dimensions, disabling gamma correction, performing computations in double-precision, and returning an output array in single-precision:
new = resampler.resize(
    image, (128, 512), boundary=('natural', 'reflect'), cval=(0.2, 0.7, 0.3),
    filter=('lanczos3', 'omoms5'), gamma='identity', scale=(0.8, 0.25),
    translate=(0.1, 0.35), precision='float64', dtype='float32')
media.show_images({'image': image, 'new': new})
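The call above passes gamma='identity' to disable gamma correction. To show what that choice changes, here is a small sketch comparing an average taken directly on encoded values with one taken in linear space. It assumes that a 'power2' transfer function decodes by squaring the normalized value and encodes by taking the square root; that reading of the name, and all function names below, are assumptions for illustration.

```python
import math


def decode_power2(encoded):
    """Encoded value in [0, 1] to linear intensity (assumed: square)."""
    return encoded ** 2


def encode_power2(linear):
    """Linear intensity in [0, 1] to encoded value (assumed: square root)."""
    return math.sqrt(linear)


def average_identity(values):
    """Average the encoded values directly, i.e. without gamma correction."""
    return sum(values) / len(values)


def average_linear(values):
    """Decode to linear space, average there, then re-encode."""
    linear = [decode_power2(v) for v in values]
    return encode_power2(sum(linear) / len(linear))


pixels = [0.0, 1.0]  # One black and one white pixel, gamma-encoded.
print(average_identity(pixels))  # 0.5
print(average_linear(pixels))    # About 0.707.
```

The linear-space average is brighter, which is why downsampled high-contrast detail tends to look too dark when filtering is applied directly to gamma-encoded values.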
Warp an image by transforming it using polar coordinates:
shape = image.shape[:2]
yx = ((np.indices(shape).T + 0.5) / shape - 0.5).T # [-0.5, 0.5]^2
radius, angle = np.linalg.norm(yx, axis=0), np.arctan2(*yx)
angle += (0.8 - radius).clip(0, 1) * 2.0 - 0.6
coords = np.dstack((np.sin(angle) * radius, np.cos(angle) * radius)) + 0.5
resampled = resampler.resample(image, coords, boundary='constant')
media.show_images({'image': image, 'resampled': resampled})
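To clarify what a coordinate array means in the warp above, the following 1D sketch samples a signal at arbitrary locations. It assumes coordinates are expressed on the unit domain [0, 1] with dual-grid samples at (i + 0.5) / n, matching the plotting convention used earlier. For brevity it uses linear interpolation instead of a Lanczos filter and a simplified constant exterior, so it is not how the library computes its result; the function names are hypothetical.

```python
import math


def sample_linear_1d(samples, coord, cval=0.0):
    """Linearly interpolate a dual-grid 1D signal at a unit-domain coordinate."""
    n = len(samples)
    if not 0.0 <= coord <= 1.0:
        return cval  # Outside the domain: use the constant exterior value.
    position = coord * n - 0.5  # Continuous index; sample i sits at index i.
    i0 = math.floor(position)
    frac = position - i0
    # Clamp neighbor indices so the half-cell margins reuse the edge samples.
    left = samples[min(max(i0, 0), n - 1)]
    right = samples[min(max(i0 + 1, 0), n - 1)]
    return (1.0 - frac) * left + frac * right


def resample_1d(samples, coords, cval=0.0):
    """Evaluate the signal at each coordinate; the output has the shape of coords."""
    return [sample_linear_1d(samples, c, cval) for c in coords]


signal = [3.0, 5.0, 8.0, 7.0]
print(resample_1d(signal, [0.125, 0.25, 0.5, 1.5]))  # [3.0, 4.0, 6.5, 0.0]
```

The output shape is determined by the coordinate array rather than by the source, which is what lets an arbitrary warp such as the polar transform be expressed as a single lookup table of source locations.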
Limitations:
- Filters are assumed to be separable. For rotation equivariance (e.g., to bandlimit the signal uniformly in all directions), it would be nice to support the (non-separable) 2D rotationally symmetric sombrero function $f(\textbf{x}) = \text{jinc}(|\textbf{x}|)$, where $\text{jinc}(r) = 2J_1(\pi r)/(\pi r)$. (The Fourier transform of a disk involves the first-order Bessel function of the first kind.)
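For reference, the jinc function named above can be evaluated with the standard library alone. This sketch computes $J_1$ from Bessel's integral representation using the midpoint rule; the function names and the step count are choices made for this illustration, and nothing here is part of the resampler library.

```python
import math


def bessel_j1(x, num_steps=200):
    """J1(x) from Bessel's integral (1/pi) * int_0^pi cos(tau - x sin(tau)) dtau."""
    step = math.pi / num_steps
    total = 0.0
    for k in range(num_steps):
        tau = (k + 0.5) * step  # Midpoint of the k-th subinterval.
        total += math.cos(tau - x * math.sin(tau))
    return total * step / math.pi


def jinc(r):
    """jinc(r) = 2 J1(pi r) / (pi r), with the limit jinc(0) = 1."""
    if r == 0.0:
        return 1.0
    return 2.0 * bessel_j1(math.pi * r) / (math.pi * r)


def sombrero(x, y):
    """Rotationally symmetric 2D kernel value; depends only on the radius."""
    return jinc(math.hypot(x, y))


print(jinc(0.0))              # 1.0
print(round(jinc(0.5), 4))
```

Because sombrero(x, y) depends on the radius rather than on x and y independently, it cannot be written as a product of two 1D kernels, which is the separability assumption the library relies on. Where SciPy is available, scipy.special.j1 would be a more conventional way to obtain $J_1$.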