Dataloader using Habana hardware media pipeline

Project description

Habana Media Python package

habana_media_loader is a package designed for easy integration of media processing on Gaudi2. The main entry point (Python import) is the habana_frameworks.mediapipe module, which contains all the functions needed to work with Gaudi2.

Structure

A properly built wheel contains:

The habana_frameworks Python namespace (with its full folder structure).
A mediapipe folder for media execution on the device, and a medialoaders folder with pre-built mediapipes for the PyTorch framework.
Proper licensing.

Media package (habana_frameworks.mediapipe and habana_frameworks.medialoaders)

The first part of the media package contains the mediapipe, which is responsible for media processing on the device.

The steps to create a mediapipe are:

Create a class derived from the MediaPipe super class (habana_frameworks.mediapipe.mediapipe.MediaPipe).
In the class constructor, initialize the super class.
Create the nodes required for execution, along with their parameters.
Define a definegraph() method that describes the data flow between the nodes created in the constructor.

The steps to execute a standalone mediapipe are:

Instantiate an object of the defined mediapipe class.
Build the mediapipe by calling the build() method of the mediapipe object.
Initialize the iterator by calling the iter_init() method of the mediapipe object.
To produce one batch of the dataset, call the run() method of the mediapipe object. Each run() call executes the pipeline and produces one batch of device tensors.
To view or manipulate tensors on the host, call the as_cpu() method of a device tensor object, which yields a host tensor object.
For numpy manipulation, call the as_nparray() method of the host tensor object to get a numpy host array.

Example:

from habana_frameworks.mediapipe import fn
from habana_frameworks.mediapipe.mediapipe import MediaPipe
from habana_frameworks.mediapipe.media_types import imgtype as it
from habana_frameworks.mediapipe.media_types import dtype as dt
from habana_frameworks.mediapipe.media_types import layout as lt
import time

class myMediaPipe(MediaPipe):
    def __init__(self, device, queue_depth, batch_size, channel, height, width):
        super().__init__(
            device,
            queue_depth,
            batch_size,
            channel,
            height,
            width,
            self.__class__.__name__,
            layout=lt.NHWC)
        # derive the shuffle seed from the current time
        mediapipe_seed = int(time.time_ns() % (2**31 - 1))
        # create reader node and set its params
        self.input = fn.ReadImageDatasetFromDir(dir="/path/to/jpeg/dir/",
                                                format="JPEG",
                                                shuffle=True,
                                                seed=mediapipe_seed)
        # create decoder node and set its params
        self.decode = fn.ImageDecoder(output_format=it.RGB_P,
                                      resize=[224, 224])

        # create transpose node and set its params
        self.transpose = fn.Transpose(permutation=[2, 0, 1, 3], tensorDim=4)

    def definegraph(self):
        # define actual data flow of nodes
        jpegs, data = self.input()
        images = self.decode(jpegs)
        images = self.transpose(images)
        # return output nodes of the graph
        return images, data


# test specific params
batch_size = 4
img_width = 224
img_height = 224
channels = 3
queue_depth = 3
iterations = 5

# instantiating defined class
pipe = myMediaPipe("hpu", queue_depth, batch_size,
                   channels, img_height, img_width)
# build the pipe
pipe.build()
# initialize iterator
pipe.iter_init()

batch_count = 0
while batch_count < iterations:
    try:
        # execute the pipe and produce one batch of the dataset;
        # images and labels are device tensors
        images, labels = pipe.run()
    except StopIteration:
        print("stop iteration")
        break
    # as_cpu() brings the device data to the host and produces host tensors;
    # as_nparray() converts host tensors to numpy arrays
    images = images.as_cpu().as_nparray()
    labels = labels.as_cpu().as_nparray()
    batch_count += 1

The second part of the media package contains pre-built mediapipes for PyTorch.

The torch folder contains the media_dataloader_mediapipe module, which provides `HPUMediaPipe`. It can be used to create ResNet and SSD mediapipes for PyTorch.

The steps to use HPUMediaPipe for PyTorch are:

Import HPUMediaPipe from habana_frameworks.medialoaders.torch.media_dataloader_mediapipe.
Instantiate an HPUMediaPipe object with the following parameters:
- a_torch_transforms: transforms to be applied in the mediapipe.
- a_root: directory path from which to load the images.
- a_annotation_file: path of the annotation file to load for SSD.
- a_batch_size: mediapipe output batch size.
- a_shuffle: whether the images are shuffled (True/False).
- a_drop_last: whether to drop the last incomplete batch or round up (True/False).
- a_prefetch_count: queue depth for media processing.
- a_num_instances: number of devices.
- a_instance_id: instance id of the current device.
- a_model_ssd: whether the mediapipe is created for SSD (True/False).
- a_device: media device on which to run the mediapipe.
Separate HPUMediaPipe objects can be created for training and validation.
Instantiate an HPUResnetPytorchIterator (for ResNet) or HPUSsdPytorchIterator (for SSD) object with the following parameter:
- mediapipe: the mediapipe object.

Example for a ResNet mediapipe:

from torchvision import transforms

from habana_frameworks.medialoaders.torch.media_dataloader_mediapipe import HPUMediaPipe
from habana_frameworks.mediapipe.plugins.iterator_pytorch import HPUResnetPytorchIterator

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])
torch_transforms = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    normalize,
])

root = "/JPEG/path"
batch_size = 256
shuffle = True
drop_last = False
prefetch_factor = 3
num_instances = 1
instance_id = 0

pipeline = HPUMediaPipe(a_torch_transforms=torch_transforms, a_root=root, a_batch_size=batch_size,
                        a_shuffle=shuffle, a_drop_last=drop_last, a_prefetch_count=prefetch_factor,
                        a_num_instances=num_instances, a_instance_id=instance_id, a_device="hpu")

iterator = HPUResnetPytorchIterator(mediapipe=pipeline)
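
The effect of a_batch_size and a_drop_last on the number of batches per pass can be worked out with simple arithmetic. The sketch below assumes the usual dataloader convention (floor when the last incomplete batch is dropped, ceiling when it is rounded up). batches_per_epoch is a hypothetical helper and the dataset size is made up for illustration; how HPUMediaPipe fills a rounded-up batch is not covered here.

```python
import math


def batches_per_epoch(num_images, batch_size, drop_last):
    """Number of batches one pass over num_images yields (assumed semantics)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    if drop_last:
        # the trailing incomplete batch is discarded
        return num_images // batch_size
    # the trailing incomplete batch counts as one more batch
    return math.ceil(num_images / batch_size)


# Hypothetical dataset size, chosen only for illustration.
num_images = 1000
print(batches_per_epoch(num_images, batch_size=256, drop_last=True))   # 3
print(batches_per_epoch(num_images, batch_size=256, drop_last=False))  # 4
```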

Project details

Release history

Version	Release date
1.24.0.1007 (this version)	Apr 16, 2026
1.23.0.695	Jan 8, 2026
1.22.2.32	Nov 29, 2025
1.22.1.6	Sep 17, 2025
1.22.0.740	Sep 4, 2025
1.21.5.6	Sep 17, 2025
1.21.4.3	Aug 14, 2025
1.21.3.57	Jul 25, 2025
1.21.2.76	Jul 1, 2025
1.21.1.16	May 29, 2025
1.21.0.555	May 19, 2025
1.20.1.97	Mar 31, 2025
1.20.0.543	Feb 26, 2025
1.19.2.32	Feb 9, 2025
1.19.1.26	Jan 12, 2025
1.19.0.561	Dec 19, 2024
1.18.0.524	Oct 10, 2024
1.17.1.40	Aug 23, 2024
1.17.0.495	Aug 6, 2024
1.16.2.2	Jun 25, 2024
1.16.1.7	Jun 18, 2024
1.16.0.526	Jun 4, 2024
1.15.3.5	Aug 1, 2024
1.15.2.12	Jun 25, 2024
1.15.1.15	Apr 2, 2024
1.15.0.479	Mar 26, 2024
1.14.0.493	Jan 23, 2024
1.13.0.463	Nov 22, 2023
1.12.1.10	Oct 24, 2023
1.12.0.480	Oct 3, 2023
1.11.0.587	Aug 8, 2023
1.10.0.494	May 30, 2023

Download files

Source Distributions

No source distribution files available for this release.

Built Distribution

habana_media_loader-1.24.0.1007-py3-none-any.whl (808.4 kB)

Uploaded Apr 16, 2026 Python 3

File details

Details for the file habana_media_loader-1.24.0.1007-py3-none-any.whl.

File metadata

Download URL: habana_media_loader-1.24.0.1007-py3-none-any.whl
Upload date: Apr 16, 2026
Size: 808.4 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: twine/6.1.0 CPython/3.8.20

File hashes

Hashes for habana_media_loader-1.24.0.1007-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f7d2caa7c343158a7b50cd60a843daab1ba2f39098ca8b2bd214f1a8b3f0ce78`
MD5	`4b3969613b50b429ed035d17ad77aa49`
BLAKE2b-256	`6d30938347ead731ebbacf4798512b1d77292176026dc8c568b9f5abf6ad4368`
