Frequency-domain native image processing -- skip the pixel decode.

DCT-Vision

Frequency-domain native image processing. Operates directly on JPEG DCT coefficients, skipping the pixel decode step entirely.

Why?

JPEG images already store their data as quantized DCT coefficients of 8x8 blocks. Decoding to pixels just to blur, sharpen, or adjust them is wasteful. DCT-Vision works directly on the coefficients, so many operations become simple per-coefficient multiplications instead of expensive spatial convolutions.
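To make the idea concrete, here is a minimal pure-Python sketch of one 8x8 block. It is not DCT-Vision's implementation: the helper names (`dct2`, `idct2`, `adjust_brightness_dct`, `adjust_contrast_dct`) are hypothetical, and it assumes an orthonormal DCT on unquantized values, whereas real JPEG coefficients are level-shifted, quantized, and scaled differently. It shows that a brightness shift touches only the DC term and a contrast change is a plain multiplication of the AC terms.

```python
import math

N = 8  # JPEG block size


def dct_matrix(n=N):
    """Orthonormal DCT-II basis matrix C, so that coeffs = C * block * C^T."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * x + 1) * k / (2 * n))
                     for x in range(n)])
    return rows


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def dct2(block):
    """Forward 2D DCT of a square block."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct2(coeffs):
    """Inverse 2D DCT (C is orthonormal, so its inverse is its transpose)."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


def adjust_brightness_dct(coeffs, offset):
    """Add `offset` to every pixel by shifting only the DC coefficient."""
    out = [row[:] for row in coeffs]
    # With this normalization the DC term equals N * (block mean).
    out[0][0] += offset * len(coeffs)
    return out


def adjust_contrast_dct(coeffs, factor):
    """Scale deviations from the block mean by scaling only the AC terms."""
    out = [[v * factor for v in row] for row in coeffs]
    out[0][0] = coeffs[0][0]  # leave the DC term (the mean) unchanged
    return out


# A synthetic block, used only for illustration.
block = [[(3 * x + 5 * y) % 17 + 100 for x in range(N)] for y in range(N)]
coeffs = dct2(block)
brighter = idct2(adjust_brightness_dct(coeffs, 20))
punchier = idct2(adjust_contrast_dct(coeffs, 1.5))
```

Decoding `brighter` gives the original block with 20 added to every pixel, up to floating-point error, although only one of the 64 coefficients changed.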

Performance

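Benchmarked on a 1024x1024 JPEG (operation time only, image already loaded). The speedup columns are the other library's time divided by DCT-Vision's time, so values below 1x mean DCT-Vision was slower:

| Operation | DCT-Vision | Pillow | OpenCV | Speedup vs Pillow | Speedup vs OpenCV |
| --- | --- | --- | --- | --- | --- |
| Blur | 2.0ms | 21.1ms | 1.2ms | 10.5x | 0.6x |
| Sharpen | 1.9ms | 19.3ms | 2.9ms | 10.1x | 1.5x |
| Brightness | 0.2ms | 5.3ms | 5.6ms | 26.5x | 28.0x |
| Contrast | 0.6ms | 14.5ms | 15.4ms | 24.2x | 25.7x |
| Noise | 13.6ms | 58.4ms | 53.5ms | 4.3x | 3.9x |
| Edge detect | 0.8ms | 11.6ms | 0.9ms | 14.5x | 1.1x |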
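The project does not describe its benchmark harness, so the following is only a sketch of how "operation time only" numbers like these could be collected and how the "vs Pillow" and "vs OpenCV" ratios relate to the raw timings. The helper names `median_ms` and `speedup` are hypothetical.

```python
import statistics
import time


def median_ms(fn, repeats=20):
    """Call `fn` repeatedly and return the median wall-clock time in ms.

    The image should be loaded before calling this, so that only the
    operation itself is timed.
    """
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def speedup(baseline_ms, candidate_ms):
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


# Ratios recomputed from the table's blur row (2.0ms, 21.1ms, 1.2ms):
blur_vs_pillow = speedup(21.1, 2.0)  # Pillow time / DCT-Vision time
blur_vs_opencv = speedup(1.2, 2.0)   # below 1: OpenCV was faster here
```

Using the median rather than the mean keeps a single slow run (for example, a cache miss on the first call) from skewing the result.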

Full pipeline (load + flip + brightness + noise + save, 1024x1024):

  • DCT-Vision: 83ms | Pillow: 117ms | OpenCV: 107ms

Install

pip install dct-vision

Quick start

Python API

from dct_vision.core.dct_image import DCTImage
from dct_vision.ops.blur import blur
from dct_vision.ops.color import adjust_brightness

# Load JPEG (extracts DCT coefficients directly, no pixel decode)
img = DCTImage.from_file("photo.jpg")

# Process in frequency domain
img = blur(img, sigma=2.0)
img = adjust_brightness(img, offset=20)

# Save (writes coefficients directly, no pixel encode)
img.save("output.jpg")

CLI

dv blur photo.jpg -o blurred.jpg --sigma 2.0
dv sharpen photo.jpg -o sharp.jpg --amount 1.5
dv brightness photo.jpg -o bright.jpg --offset 30
dv contrast photo.jpg -o contrast.jpg --factor 1.5
dv downscale photo.jpg -o small.jpg --factor 2
dv edges photo.jpg -o edges.jpg --method laplacian
dv info photo.jpg --json
dv quality photo.jpg
dv convert input.png -o output.jpg --quality 85
dv augment photo.jpg -o aug.jpg --flip horizontal --noise 3.0 --seed 42
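For batch work, the CLI can be driven from a short script. The sketch below uses only the `dv blur` invocation shown above (`-o` and `--sigma`); the function names `build_blur_command` and `blur_directory` are hypothetical, and it assumes `dv` is on the `PATH`. It defaults to a dry run so the generated commands can be checked before anything is executed.

```python
import subprocess
from pathlib import Path


def build_blur_command(src, dst, sigma=2.0):
    """Return the argument list for one `dv blur` call."""
    return ["dv", "blur", str(src), "-o", str(dst), "--sigma", str(sigma)]


def blur_directory(src_dir, dst_dir, sigma=2.0, dry_run=True):
    """Build (and optionally run) a `dv blur` command per *.jpg in src_dir."""
    dst_dir = Path(dst_dir)
    commands = [
        build_blur_command(path, dst_dir / path.name, sigma)
        for path in sorted(Path(src_dir).glob("*.jpg"))
    ]
    if not dry_run:
        dst_dir.mkdir(parents=True, exist_ok=True)
        for cmd in commands:
            # check=True raises CalledProcessError if dv exits non-zero.
            subprocess.run(cmd, check=True)
    return commands
```

Passing the arguments as a list, rather than a shell string, avoids quoting problems with file names that contain spaces.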

ML Augmentation Pipeline

from dct_vision.core.dct_image import DCTImage
from dct_vision.augment.flip import horizontal_flip
from dct_vision.augment.jitter import brightness_jitter
from dct_vision.augment.noise import gaussian_noise

img = DCTImage.from_file("train/img_001.jpg")
img = horizontal_flip(img)
img = brightness_jitter(img, max_offset=20, seed=42)
img = gaussian_noise(img, sigma=2.0, seed=42)
img.save("augmented/img_001.jpg")
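When augmenting a whole dataset, passing the same `seed` for every image and every operation would give each image the same random draw. One possible approach, sketched here with a hypothetical helper `derive_seed`, is to derive a distinct but reproducible seed from a base seed, the file name, and the operation name. It assumes the `seed` parameters accept any non-negative 32-bit integer, which the project description does not confirm.

```python
import zlib


def derive_seed(base_seed, image_name, op_name):
    """Derive a reproducible 32-bit seed for one (image, operation) pair.

    zlib.crc32 is used instead of the built-in hash(), because string
    hashes are randomized per process and would not be reproducible.
    """
    key = f"{base_seed}:{image_name}:{op_name}".encode("utf-8")
    return zlib.crc32(key)


# Possible use with the calls shown above (not executed here):
# img = brightness_jitter(img, max_offset=20,
#                         seed=derive_seed(42, "img_001.jpg", "brightness"))
# img = gaussian_noise(img, sigma=2.0,
#                      seed=derive_seed(42, "img_001.jpg", "noise"))
```

Re-running the pipeline with the same base seed then reproduces the same augmented set, while changing the base seed produces a new one.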

Operations

| Operation | Type | How it works |
| --- | --- | --- |
| Gaussian blur | Tier 1 | Multiply coefficients by Gaussian envelope |
| Sharpening | Tier 1 | Boost high-frequency coefficients |
| Brightness | Tier 1 | Offset DC coefficient (block mean) |
| Contrast | Tier 1 | Scale AC coefficients (deviation from mean) |
| Downscale 2x | Tier 1 | Merge 2x2 block groups via transform matrix |
| Edge detection | Tier 2 | Laplacian or gradient in frequency domain |
| Quality estimation | Tier 1 | Reverse-engineer quality from quant tables |
| Horizontal/vertical flip | Augment | Negate odd-indexed frequency coefficients |
| Block crop | Augment | Slice coefficient array directly |
| Brightness/contrast jitter | Augment | Random DC/AC perturbation |
| Gaussian noise | Augment | Add noise to AC coefficients |
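The flip entry is the least obvious one in the list, so here is a small sketch of the identity behind it, in one dimension. This is an illustration of the math, not the library's code; `dct_1d` and `flip_coeffs` are hypothetical names. Reversing a sequence multiplies its k-th DCT-II coefficient by (-1)^k, so the odd-indexed coefficients change sign and the even-indexed ones are unchanged.

```python
import math


def dct_1d(signal):
    """Unnormalized DCT-II of a sequence."""
    n = len(signal)
    return [
        sum(signal[x] * math.cos(math.pi * (2 * x + 1) * k / (2 * n))
            for x in range(n))
        for k in range(n)
    ]


def flip_coeffs(coeffs):
    """Coefficients of the reversed signal: negate the odd-indexed terms."""
    return [-c if k % 2 else c for k, c in enumerate(coeffs)]


# One row of sample values, used only for illustration.
row = [52, 55, 61, 66, 70, 61, 64, 73]

direct = dct_1d(row[::-1])             # flip the samples, then transform
via_coeffs = flip_coeffs(dct_1d(row))  # transform, then negate odd terms

max_error = max(abs(a - b) for a, b in zip(direct, via_coeffs))
```

Here `max_error` is zero up to floating-point rounding. For a 2D block the same rule applies along the flipped axis only, and for a whole image the order of the blocks along that axis has to be reversed as well.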

Documentation

Requirements

  • Python 3.10+
  • libjpeg-turbo (for native DCT extraction; falls back to Pillow if unavailable)

License

MIT
