Frequency-domain native image processing -- skip the pixel decode.

DCT-Vision

Frequency-domain native image processing. Operates directly on JPEG DCT coefficients, skipping the pixel decode step entirely.

Why?

JPEG images are already stored as quantized DCT coefficients, so decoding to pixels just to blur, sharpen, or adjust them is wasteful. DCT-Vision works directly on the coefficients, where many operations become simple per-coefficient multiplications instead of expensive convolutions.
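
To make this concrete, here is a minimal sketch on a single 8x8 block. It does not use DCT-Vision's API: `dct2`, `idct2`, `adjust_brightness_dct` and `adjust_contrast_dct` are hypothetical helpers written for illustration, and the sketch assumes the orthonormal 2-D DCT-II that JPEG uses while ignoring quantization and clipping. Under those assumptions, adding a constant to every pixel changes only the DC coefficient, and scaling contrast about the block mean is a multiplication of the AC coefficients.

```python
import math

N = 8  # JPEG block size


def dct_matrix(n=N):
    """Orthonormal DCT-II matrix; row k holds the basis vector for frequency k."""
    return [
        [math.sqrt((1 if k == 0 else 2) / n)
         * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
         for i in range(n)]
        for k in range(n)
    ]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


C = dct_matrix()
CT = transpose(C)


def dct2(block):
    """Forward 2-D DCT of one block: C * block * C^T."""
    return matmul(matmul(C, block), CT)


def idct2(coeffs):
    """Inverse 2-D DCT: C^T * coeffs * C (C is orthonormal)."""
    return matmul(matmul(CT, coeffs), C)


def adjust_brightness_dct(coeffs, offset):
    """Hypothetical helper: shift the block mean by touching only the DC term."""
    out = [row[:] for row in coeffs]
    out[0][0] += N * offset  # DC equals 8 * block mean for an 8x8 block
    return out


def adjust_contrast_dct(coeffs, factor):
    """Hypothetical helper: scale every AC term, leave DC (the mean) alone."""
    out = [[c * factor for c in row] for row in coeffs]
    out[0][0] = coeffs[0][0]
    return out


# Synthetic 8x8 block standing in for one luma block of an image.
block = [[float((3 * r + 5 * c) % 17) for c in range(N)] for r in range(N)]
coeffs = dct2(block)
mean = sum(map(sum, block)) / (N * N)

# Compare the coefficient-domain result with the same edit done on pixels.
brighter = idct2(adjust_brightness_dct(coeffs, 20))
max_err_brightness = max(
    abs(brighter[r][c] - (block[r][c] + 20)) for r in range(N) for c in range(N)
)

stretched = idct2(adjust_contrast_dct(coeffs, 1.5))
max_err_contrast = max(
    abs(stretched[r][c] - (mean + 1.5 * (block[r][c] - mean)))
    for r in range(N) for c in range(N)
)
```

Both errors come out at floating-point noise level. In a real JPEG the coefficients are stored divided by the quantization table and rounded, so an actual implementation has to account for that scaling.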

Performance

Benchmarked on a 1024x1024 JPEG (operation time only, image already loaded):

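| Operation | DCT-Vision | Pillow | OpenCV | Speedup vs Pillow | Speedup vs OpenCV |
| --- | --- | --- | --- | --- | --- |
| Blur | 2.0ms | 21.1ms | 1.2ms | 10.5x | 0.6x |
| Sharpen | 1.9ms | 19.3ms | 2.9ms | 10.1x | 1.5x |
| Brightness | 0.2ms | 5.3ms | 5.6ms | 26.5x | 28.0x |
| Contrast | 0.6ms | 14.5ms | 15.4ms | 24.2x | 25.7x |
| Noise | 13.6ms | 58.4ms | 53.5ms | 4.3x | 3.9x |
| Edge detect | 0.8ms | 11.6ms | 0.9ms | 14.5x | 1.1x |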

Full pipeline (load + flip + brightness + noise + save, 1024x1024):

  • DCT-Vision: 83ms | Pillow: 117ms | OpenCV: 107ms
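
The README does not describe the benchmark harness, so the following is only a sketch of one common way to collect numbers like these: warm up, time repeated calls with `time.perf_counter`, and report the median. `median_ms` and `speedup` are hypothetical helpers, and the workload is a stand-in rather than a DCT-Vision call. The speedup columns in the benchmark table are consistent with baseline time divided by DCT-Vision time, which is what `speedup` computes.

```python
import statistics
import time


def median_ms(fn, repeats=20, warmup=3):
    """Median wall-clock time of fn() in milliseconds, after a few warm-up calls."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def speedup(baseline_ms, candidate_ms):
    """How many times faster the candidate is than the baseline (>1 means faster)."""
    return baseline_ms / candidate_ms


def workload():
    # Stand-in for an image operation; swap in the call you want to measure.
    return sum(i * i for i in range(10_000))


elapsed = median_ms(workload)

# Reproducing two table entries from the reported timings for Blur.
blur_vs_pillow = speedup(21.1, 2.0)
blur_vs_opencv = speedup(1.2, 2.0)
```

A value below 1, as for Blur against OpenCV, means the baseline was faster in that measurement.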

Install

pip install dct-vision

Quick start

Python API

from dct_vision.core.dct_image import DCTImage
from dct_vision.ops.blur import blur
from dct_vision.ops.color import adjust_brightness

# Load JPEG (extracts DCT coefficients directly, no pixel decode)
img = DCTImage.from_file("photo.jpg")

# Process in frequency domain
img = blur(img, sigma=2.0)
img = adjust_brightness(img, offset=20)

# Save (writes coefficients directly, no pixel encode)
img.save("output.jpg")

CLI

dv blur photo.jpg -o blurred.jpg --sigma 2.0
dv sharpen photo.jpg -o sharp.jpg --amount 1.5
dv brightness photo.jpg -o bright.jpg --offset 30
dv contrast photo.jpg -o contrast.jpg --factor 1.5
dv downscale photo.jpg -o small.jpg --factor 2
dv edges photo.jpg -o edges.jpg --method laplacian
dv info photo.jpg --json
dv quality photo.jpg
dv convert input.png -o output.jpg --quality 85
dv augment photo.jpg -o aug.jpg --flip horizontal --noise 3.0 --seed 42

ML Augmentation Pipeline

from dct_vision.core.dct_image import DCTImage
from dct_vision.augment.flip import horizontal_flip
from dct_vision.augment.jitter import brightness_jitter
from dct_vision.augment.noise import gaussian_noise

img = DCTImage.from_file("train/img_001.jpg")
img = horizontal_flip(img)
img = brightness_jitter(img, max_offset=20, seed=42)
img = gaussian_noise(img, sigma=2.0, seed=42)
img.save("augmented/img_001.jpg")
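
Why a flip needs no pixel decode can be sketched independently of the library. Mirroring an 8x8 block left to right multiplies the coefficient at horizontal frequency v by (-1)^v, so a horizontal flip amounts to negating the odd-indexed columns of each block and reversing the order of blocks in every row. The helpers below (`flip_block_horizontal`, `flip_row_of_blocks`, `dct2`, `idct2`) are hypothetical and not part of DCT-Vision; the sketch assumes the orthonormal DCT-II, an image width that is a multiple of 8, and no quantization.

```python
import math

N = 8  # JPEG block size


def dct_matrix(n=N):
    """Orthonormal DCT-II matrix; row k holds the basis vector for frequency k."""
    return [
        [math.sqrt((1 if k == 0 else 2) / n)
         * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
         for i in range(n)]
        for k in range(n)
    ]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


C = dct_matrix()
CT = [list(col) for col in zip(*C)]


def dct2(block):
    return matmul(matmul(C, block), CT)


def idct2(coeffs):
    return matmul(matmul(CT, coeffs), C)


def flip_block_horizontal(coeffs):
    """Negate coefficients whose horizontal frequency (column index) is odd."""
    return [[-c if v % 2 else c for v, c in enumerate(row)] for row in coeffs]


def flip_row_of_blocks(blocks):
    """Mirror one row of blocks: reverse block order, then mirror each block."""
    return [flip_block_horizontal(b) for b in reversed(blocks)]


# Synthetic 8x16 image, i.e. two 8x8 blocks side by side.
image = [[float((7 * r + 3 * c * c) % 23) for c in range(2 * N)] for r in range(N)]
left = [row[:N] for row in image]
right = [row[N:] for row in image]

# Flip entirely in the coefficient domain, then decode only to check the result.
flipped = flip_row_of_blocks([dct2(left), dct2(right)])
decoded = [idct2(b) for b in flipped]
rebuilt = [decoded[0][r] + decoded[1][r] for r in range(N)]

expected = [row[::-1] for row in image]
max_err_flip = max(
    abs(rebuilt[r][c] - expected[r][c]) for r in range(N) for c in range(2 * N)
)
```

The same argument applies to a vertical flip with rows and columns swapped: negate odd-indexed rows of each block and reverse the order of block rows.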

Operations

| Operation | Type | How it works |
| --- | --- | --- |
| Gaussian blur | Tier 1/2 | Multiply coefficients by Gaussian envelope (cross-block for sigma > 2) |
| Sharpening | Tier 1 | Boost high-frequency coefficients |
| Brightness | Tier 1 | Offset DC coefficient (block mean) |
| Contrast | Tier 1 | Scale AC coefficients (deviation from mean) |
| Downscale 2x | Tier 1 | Merge 2x2 block groups via transform matrix |
| Edge detection | Tier 2 | Laplacian or gradient in frequency domain |
| Sobel edge detection | Tier 1 | Directional frequency gradient weights |
| Scharr edge detection | Tier 1 | Weighted directional gradient (more accurate) |
| Box blur | Tier 1 | Sinc-like frequency envelope |
| Emboss | Tier 1 | Directional frequency emphasis |
| Band-pass filter | Tier 1 | Keep mid-frequency coefficients (no OpenCV equivalent) |
| Unsharp mask | Tier 1 | 1 + amount * (1 - Gaussian envelope) |
| Color temperature | Tier 1 | Shift Cb/Cr DC coefficients |
| Saturation | Tier 1 | Scale Cb/Cr coefficients |
| Wiener denoising | Tier 1 | Optimal frequency-domain noise filter |
| JPEG deblocking | Tier 1 | Attenuate high-freq quantization artifacts |
| Perceptual hash (pHash) | Tier 1 | Hash from DC coefficients (native DCT advantage) |
| Blur detection | Analysis | High-freq to total energy ratio |
| Noise estimation | Analysis | Std of highest-frequency coefficients |
| Texture complexity | Analysis | Nonzero AC coefficient ratio |
| Image similarity | Analysis | Normalized cross-correlation of coefficients |
| Vignette | Photo | Distance-weighted block attenuation |
| Sepia / tint | Photo | Set Cb/Cr to fixed warm values |
| Grayscale conversion | Photo | Drop Cb/Cr channels (zero cost) |
| Posterize | Photo | Aggressive coefficient requantization |
| Solarize | Photo | Invert coefficients above threshold |
| Requantize (change JPEG quality) | Compression | Apply new quant table without decode |
| Coefficient pruning | Compression | Zero small AC coefficients to reduce file size |
| Quality estimation | Tier 1 | Reverse-engineer quality from quant tables |
| Horizontal/vertical flip | Augment | Negate odd-indexed frequency coefficients |
| Block crop | Augment | Slice coefficient array directly |
| Brightness/contrast jitter | Augment | Random DC/AC perturbation |
| Gaussian noise | Augment | Add noise to AC coefficients |
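
As an illustration of the "multiply by an envelope" rows, here is a minimal sketch of per-block gain tables for Gaussian blur and unsharp mask, plus a high-frequency energy ratio in the spirit of the blur-detection row. None of this is DCT-Vision's code. The function names are hypothetical, and so are the specific choices: mapping DCT index k to angular frequency pi*k/8, using the continuous Gaussian response exp(-sigma^2 * w^2 / 2), calling a coefficient "high-frequency" when u + v >= 8, and leaving DC out of the energy total. Cross-block effects at larger sigma are ignored.

```python
import math

N = 8  # JPEG block size


def gaussian_envelope(sigma, n=N):
    """8x8 table of gains; entry [u][v] attenuates the coefficient at (u, v)."""
    omega = [math.pi * k / n for k in range(n)]  # assumed index-to-frequency map
    return [
        [math.exp(-0.5 * sigma ** 2 * (wu ** 2 + wv ** 2)) for wv in omega]
        for wu in omega
    ]


def unsharp_gains(sigma, amount, n=N):
    """Gains of the form 1 + amount * (1 - Gaussian envelope)."""
    return [
        [1.0 + amount * (1.0 - g) for g in row]
        for row in gaussian_envelope(sigma, n)
    ]


def apply_gains(coeffs, gains):
    """Element-wise multiply one block of coefficients by a gain table."""
    return [[c * g for c, g in zip(crow, grow)] for crow, grow in zip(coeffs, gains)]


def high_freq_energy_ratio(coeffs, n=N):
    """Share of AC energy in coefficients with u + v >= n (assumed band split)."""
    high = total = 0.0
    for u in range(n):
        for v in range(n):
            if u == 0 and v == 0:
                continue  # DC carries the block mean, not detail
            energy = coeffs[u][v] ** 2
            total += energy
            if u + v >= n:
                high += energy
    return high / total if total else 0.0


envelope = gaussian_envelope(sigma=1.0)
sharpen = unsharp_gains(sigma=1.0, amount=1.5)

# A block with a flat spectrum makes the effect of the envelope easy to see.
flat = [[1.0] * N for _ in range(N)]
before = high_freq_energy_ratio(flat)
after = high_freq_energy_ratio(apply_gains(flat, envelope))
```

The blur envelope leaves DC untouched and pushes the ratio down, while the unsharp gains leave DC untouched and approach `1 + amount` at the highest frequencies.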

Requirements

  • Python 3.10+
  • libjpeg-turbo (for native DCT extraction; falls back to Pillow if unavailable)

License

MIT

