
Train models for native camera formats supported by edge platforms


EdgeFirst CameraAdaptor

Train deep learning models that consume camera formats natively supported by target edge platforms, avoiding costly runtime conversions.

Why CameraAdaptor?

When deploying computer vision models to edge devices, there's often a mismatch between:

  1. Training data format: RGB images from standard datasets (ImageNet, COCO, etc.)
  2. Inference input format: Native camera/hardware formats (YUV, Bayer, BGR, RGBA)

Traditional approach: Convert camera output → RGB → Model inference

Problem: this conversion occupies hardware blocks (ISP, GPU, or 2D accelerator) and adds latency to every frame.

Solution: Train the model to expect the native camera format directly.

How It Works

graph LR
  subgraph training ["Training Time"]
    direction LR
    A1["RGB Dataset"] --> B1["CameraAdaptorTransform"] --> C1["Target Format<br>(e.g., YUYV, BGR)"] --> D1["Model with<br>CameraAdaptor"]
  end

  subgraph inference ["Inference Time"]
    direction LR
    A2["Camera/<br>Hardware"] --> D2["Model<br>(native format)"]
  end

  style training fill:#e3f2fd,stroke:#1976d2
  style inference fill:#e8f5e9,stroke:#4caf50

Key Components

| Component | Purpose |
| --- | --- |
| CameraAdaptorTransform | Preprocessing: convert RGB training data to the target format |
| CameraAdaptor (PyTorch) | Model layer: handle format-specific input processing |
| CameraAdaptor (TensorFlow) | Model layer: handle format-specific input processing |
| CameraAdaptorConfig | Configuration and metadata for model export |

Important Design Note

The CameraAdaptor layer does NOT perform color space conversion. Color conversion is handled by CameraAdaptorTransform during training data loading:

  • Training: CameraAdaptorTransform converts RGB images → target format (e.g., YUYV, BGR)
  • Inference: Camera/ISP provides data directly in target format → no conversion needed

The CameraAdaptor layer only performs:

  1. Layout permutation (NHWC ↔ NCHW) when channels_last/channels_first is enabled
  2. Alpha channel dropping for RGBA/BGRA inputs
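The two operations above can be sketched in a few lines of numpy. This is an illustrative equivalent, not the library's implementation; the adaptor_forward name and the permute-before-slice ordering are assumptions for the sketch:

```python
import numpy as np

def adaptor_forward(x: np.ndarray, drop_alpha: bool = False,
                    channels_last: bool = False) -> np.ndarray:
    """Illustrative equivalent of the only two ops the adaptor layer performs."""
    if channels_last:
        # Layout permutation: NHWC -> NCHW
        x = np.transpose(x, (0, 3, 1, 2))
    if drop_alpha:
        # Alpha channel dropping: keep the first 3 channels (RGBA/BGRA -> 3ch)
        x = x[:, :3, :, :]
    return x

# NHWC RGBA batch from a camera pipeline -> NCHW, alpha dropped
batch = np.zeros((1, 224, 224, 4), dtype=np.float32)
out = adaptor_forward(batch, drop_alpha=True, channels_last=True)
# out.shape == (1, 3, 224, 224)
```

Note that no pixel values are recomputed; the tensor is only rearranged and sliced, which is why color conversion must happen in the data loader (training) or the camera/ISP (inference).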

EdgeFirst Ecosystem

EdgeFirst CameraAdaptor is part of the EdgeFirst AI ecosystem:

  • EdgeFirst HAL: Runtime library with optimized pre-processing pipelines for edge deployment. Use HAL for on-target inference and benchmarking of models trained with CameraAdaptor.
  • EdgeFirst CameraAdaptor: Training library (this project) for creating models that accept native camera formats.

The on-target benchmarks use edgefirst-hal to measure pre-processing pipelines under various CameraAdaptor configurations.

Installation

# Core library (numpy only)
pip install edgefirst-cameraadaptor

# With preprocessing support (OpenCV)
pip install edgefirst-cameraadaptor[transform]

# With PyTorch support
pip install edgefirst-cameraadaptor[torch]

# With TensorFlow support
pip install edgefirst-cameraadaptor[tensorflow]

# With PyTorch Lightning support
pip install edgefirst-cameraadaptor[lightning]

# Everything
pip install edgefirst-cameraadaptor[all]

Quick Start

Preprocessing Transform

Convert training images to your target camera format:

from edgefirst.cameraadaptor import CameraAdaptorTransform

# Create transform for BGR format (RGB source by default)
transform = CameraAdaptorTransform("bgr")
bgr_frame = transform(rgb_frame)

# If using OpenCV's default BGR loading
transform = CameraAdaptorTransform("yuyv", source_format="bgr")
yuyv_frame = transform(bgr_frame)  # cv2.imread() returns BGR
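For the RGB-to-BGR case, the underlying operation is simply a channel reversal. A minimal numpy equivalent (illustrative only; the rgb_to_bgr helper is not part of the library):

```python
import numpy as np

def rgb_to_bgr(rgb: np.ndarray) -> np.ndarray:
    """Reverse the channel order of an (H, W, 3) image: RGB <-> BGR."""
    return rgb[..., ::-1]

# One RGB pixel (R=10, G=20, B=30) becomes BGR (30, 20, 10)
px = np.array([[[10, 20, 30]]], dtype=np.uint8)
bgr = rgb_to_bgr(px)
```

Because the same reversal maps BGR back to RGB, the operation is its own inverse, which is why source_format and adaptor can name either order.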

PyTorch Model

Add the adaptor as the first layer of your model:

from edgefirst.cameraadaptor.pytorch import CameraAdaptor
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self, adaptor="rgb"):
        super().__init__()
        self.adaptor = CameraAdaptor(adaptor)
        self.backbone = nn.Sequential(
            nn.Conv2d(CameraAdaptor.compute_output_channels(adaptor), 64, 3),
            # ... rest of your model
        )

    def forward(self, x):
        x = self.adaptor(x)
        return self.backbone(x)

# Model for RGBA input (4 channels -> 3 channels after adaptor)
model = MyModel(adaptor="rgba")

TensorFlow/Keras Model

from edgefirst.cameraadaptor.tensorflow import CameraAdaptor
import tensorflow as tf

inputs = tf.keras.Input(shape=(224, 224, 4))  # RGBA input
x = CameraAdaptor("rgba")(inputs)  # Drops alpha -> 3 channels
x = tf.keras.layers.Conv2D(64, 3, padding="same")(x)
# ... rest of your model

Channels-Last Input (Camera Pipeline Direct)

For models receiving data directly from camera pipelines in NHWC format:

# PyTorch: accept channels-last input, convert to channels-first internally
adaptor = CameraAdaptor("yuyv", channels_last=True)
x = torch.randn(1, 224, 224, 2)  # NHWC from camera
y = adaptor(x)  # Output: (1, 2, 224, 224) in NCHW

# TensorFlow: accept channels-first input if needed
from edgefirst.cameraadaptor.tensorflow import CameraAdaptor
layer = CameraAdaptor("yuyv", channels_first=True)
x = tf.random.normal((1, 2, 224, 224))  # NCHW
y = layer(x)  # Output: (1, 224, 224, 2) in NHWC

Ultralytics YAML Configuration

# YOLOv8 model with RGBA input
backbone:
  - [-1, 1, CameraAdaptor, [rgba]]  # First layer
  - [-1, 1, Conv, [64, 3, 2]]
  # ... rest of backbone

Source Format (Data Loader Compatibility)

Different image loading libraries return different formats:

| Library | Default Format | Transform Setup |
| --- | --- | --- |
| PIL/Pillow | RGB | source_format="rgb" (default) |
| torchvision | RGB | source_format="rgb" (default) |
| OpenCV cv2.imread() | BGR | source_format="bgr" |
| OpenCV cv2.IMREAD_UNCHANGED | BGRA | source_format="bgra" |
| imageio | RGB | source_format="rgb" (default) |
| skimage | RGB | source_format="rgb" (default) |

Important: OpenCV loads images as BGR by default. If you're using cv2.imread() without explicit conversion, set source_format="bgr":

import cv2
from edgefirst.cameraadaptor import CameraAdaptorTransform

# CORRECT: Tell the transform your source is BGR
img = cv2.imread("image.jpg")
transform = CameraAdaptorTransform("yuyv", source_format="bgr")
yuyv = transform(img)

Supported Color Spaces

Currently Supported

| Format | Input Channels | Output Channels | Description |
| --- | --- | --- | --- |
| RGB | 3 | 3 | Standard RGB |
| BGR | 3 | 3 | OpenCV native |
| RGBA | 4 | 3 | RGB + alpha (dropped) |
| BGRA | 4 | 3 | BGR + alpha (dropped) |
| YUYV | 2 | 2 | YUV 4:2:2, ch0=Y, ch1=UV |
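To make the YUYV layout concrete, here is a sketch of an RGB to YUYV 4:2:2 conversion using the common BT.601 full-range coefficients. This is an assumption-laden illustration, not the library's conversion (its exact coefficients and rounding may differ); the rgb_to_yuyv name is hypothetical:

```python
import numpy as np

def rgb_to_yuyv(rgb: np.ndarray) -> np.ndarray:
    """Convert an (H, W, 3) uint8 RGB image to (H, W, 2) YUYV.

    Channel 0 carries Y per pixel; channel 1 interleaves U and V,
    each shared by a horizontal pixel pair (4:2:2). Width must be even.
    Coefficients: BT.601 full-range (an assumption for this sketch).
    """
    r = rgb[..., 0].astype(np.float32)
    g = rgb[..., 1].astype(np.float32)
    b = rgb[..., 2].astype(np.float32)
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.169 * r - 0.331 * g + 0.500 * b + 128.0
    v = 0.500 * r - 0.419 * g - 0.081 * b + 128.0
    # 4:2:2 chroma subsampling: average each horizontal pixel pair
    u_pair = (u[:, 0::2] + u[:, 1::2]) / 2.0
    v_pair = (v[:, 0::2] + v[:, 1::2]) / 2.0
    # Pack U and V alternately into the second channel
    uv = np.empty_like(y)
    uv[:, 0::2] = u_pair
    uv[:, 1::2] = v_pair
    out = np.stack([y, uv], axis=-1)
    return np.clip(np.round(out), 0, 255).astype(np.uint8)

# A solid gray image yields Y equal to the gray level and neutral chroma (128)
gray = np.full((2, 4, 3), 200, dtype=np.uint8)
yuyv = rgb_to_yuyv(gray)  # shape (2, 4, 2)
```

This is the conversion CameraAdaptorTransform performs at training time so that the model sees the same 2-channel layout the camera delivers at inference time.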

Planned

  • NV12, NV21 (semi-planar YUV 4:2:0)
  • Bayer patterns (RGGB, BGGR, GRBG, GBRG)

See FORMATS.md for detailed format documentation.

Platform-Specific Guidance

See PLATFORMS.md for i.MX platform-specific recommendations:

  • i.MX 93: PXP outputs BGR - train models with BGR format
  • i.MX 8M Plus: G2D outputs RGBA - use RGBA to auto-slice alpha
  • i.MX 95: ISI/ISP pipeline considerations

Configuration

Use CameraAdaptorConfig for model metadata:

from edgefirst.cameraadaptor import CameraAdaptorConfig

config = CameraAdaptorConfig(
    adaptor="yuyv",
    input_dtype="uint8",   # For quantized models
    output_dtype="uint8",
)

# Embed in model metadata
metadata = config.to_metadata()

PyTorch Lightning Integration

from pytorch_lightning import Trainer
from edgefirst.cameraadaptor.pytorch.lightning import create_callback

callback = create_callback("yuyv")
trainer = Trainer(callbacks=[callback])

Migration from Existing Code

From ultralytics/edgefirst

# Before
from ultralytics.edgefirst.camera.adaptor import CameraAdaptorTransform
from ultralytics.edgefirst.nn.modules import CameraAdaptor

# After
from edgefirst.cameraadaptor import CameraAdaptorTransform
from edgefirst.cameraadaptor.pytorch import CameraAdaptor

From modelpack

# Before
from deepview.modelpack.datasets.color import ColorAdaptor
from deepview.modelpack.layers.conv2d import ColorAdaptor as TFColorAdaptor

# After
from edgefirst.cameraadaptor import CameraAdaptorTransform
from edgefirst.cameraadaptor.tensorflow import CameraAdaptor

API Reference

CameraAdaptorTransform

Preprocessing transform for converting images to target formats.

transform = CameraAdaptorTransform(
    adaptor="yuyv",           # Target format
    source_format="rgb",      # Source format (default: "rgb")
)
output = transform(image)  # or transform.convert(image)

Parameters:

  • adaptor: Target color space (str or ColorSpace enum)
  • source_format: Source color space from data loader (str or ColorSpace enum, default: "rgb")

Properties:

  • adaptor: Target adaptor name (str)
  • source_format: Source format name (str)
  • channels: Output channel count
  • input_channels: Source format channel count
  • output_channels: Channel count the model backbone receives

CameraAdaptor (PyTorch)

from edgefirst.cameraadaptor.pytorch import CameraAdaptor

adaptor = CameraAdaptor(
    adaptor="yuyv",           # Target format
    channels_last=False,      # True for NHWC input
)
output = adaptor(input_tensor)

Parameters:

  • adaptor: Target color space (str or ColorSpace enum)
  • channels_last: If True, input is NHWC, permuted to NCHW (default: False)

Static Methods:

  • compute_input_channels(args): Get input channels from YAML args
  • compute_output_channels(args): Get output channels from YAML args

CameraAdaptor (TensorFlow)

from edgefirst.cameraadaptor.tensorflow import CameraAdaptor

layer = CameraAdaptor(
    adaptor="yuyv",           # Target format (None for auto-detect)
    channels_first=False,     # True for NCHW input
)
output = layer(input_tensor)

Parameters:

  • adaptor: Target color space (str, None for auto-detect)
  • channels_first: If True, input is NCHW, permuted to NHWC (default: False)

CameraAdaptorConfig

from edgefirst.cameraadaptor import CameraAdaptorConfig

config = CameraAdaptorConfig(
    adaptor="yuyv",
    input_dtype="float32",
    output_dtype="float32",
)

Properties:

  • input_channels: Input channel count
  • output_channels: Output channel count
  • is_quantized: Whether config uses quantized dtypes

Methods:

  • to_dict(): Convert to dictionary
  • to_metadata(): Convert to model metadata format
  • from_dict(data): Create from dictionary
  • from_metadata(metadata): Create from model metadata
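A rough sketch of how such a config could round-trip through a dictionary, using a stand-in dataclass. The ConfigSketch class and its is_quantized rule (integer dtypes imply quantization) are assumptions for illustration; the real CameraAdaptorConfig may behave differently:

```python
from dataclasses import dataclass, asdict

@dataclass
class ConfigSketch:
    """Hypothetical stand-in mirroring the documented fields and methods."""
    adaptor: str
    input_dtype: str = "float32"
    output_dtype: str = "float32"

    @property
    def is_quantized(self) -> bool:
        # Assumption: quantized means integer activation dtypes
        return self.input_dtype in ("uint8", "int8")

    def to_dict(self) -> dict:
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict) -> "ConfigSketch":
        return cls(**data)

cfg = ConfigSketch(adaptor="yuyv", input_dtype="uint8", output_dtype="uint8")
restored = ConfigSketch.from_dict(cfg.to_dict())  # restored == cfg
```

Round-tripping through a plain dictionary is what lets the configuration be embedded in exported model metadata and recovered at deployment time.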

License

Apache 2.0
