Dataloader using Habana hardware media pipeline
Habana Media Python package
habana_media_loader is a package designed for easy integration of media processing on Gaudi2.
The main entry point (Python import) is the habana_frameworks.mediapipe module, which contains all the functions necessary to work with Gaudi2.
Structure
A properly built wheel contains:
- the habana_frameworks Python namespace (with the full folder structure inside): the mediapipe folder caters for media execution on device, and the medialoaders folder caters for the pre-built mediapipe for the PyTorch framework
- proper licensing
Media package (habana_frameworks.mediapipe and habana_frameworks.medialoaders)
The first part of the media package contains the media pipe, which is responsible for media processing on device.
Following are the steps to create a mediapipe:
- Create a class derived from the habana_frameworks.mediapipe super class.
- In the class constructor, initialize the super class.
- Create the nodes required for execution, along with their parameters.
- Define a method definegraph() which defines the data flow between the nodes created in the constructor.
Following are the steps to execute a standalone mediapipe:
- Instantiate an object of the defined mediapipe class.
- Build the mediapipe by executing the build() method of the mediapipe object.
- Initialize the iterator by calling the iter_init() method of the mediapipe object.
- To produce one batch of the dataset, execute the run() method of the mediapipe object. Each run() call executes and produces one batch of device tensors.
- To view or manipulate tensors on the host, call the as_cpu() method of a device tensor object, which yields a host tensor object.
- For numpy manipulation, call the as_nparray() method of a host tensor object to get a numpy host array.
Example:
```python
from habana_frameworks.mediapipe import fn
from habana_frameworks.mediapipe.mediapipe import MediaPipe
from habana_frameworks.mediapipe.media_types import imgtype as it
from habana_frameworks.mediapipe.media_types import dtype as dt
from habana_frameworks.mediapipe.media_types import layout as lt
import time


class myMediaPipe(MediaPipe):
    def __init__(self, device, queue_depth, batch_size, channel, height, width):
        super(myMediaPipe, self).__init__(device,
                                          queue_depth,
                                          batch_size,
                                          channel,
                                          height,
                                          width,
                                          self.__class__.__name__,
                                          layout=lt.NHWC)
        mediapipe_seed = int(time.time_ns() % (2**31 - 1))

        # create reader node and set its params
        self.input = fn.ReadImageDatasetFromDir(dir="/path/to/jpeg/dir/",
                                                format="JPEG",
                                                shuffle=True,
                                                seed=mediapipe_seed)

        # create decoder node and set its params
        self.decode = fn.ImageDecoder(output_format=it.RGB_P,
                                      resize=[224, 224])

        # create transpose node and set its params
        self.transpose = fn.Transpose(permutation=[2, 0, 1, 3], tensorDim=4)

    def definegraph(self):
        # define the actual data flow between nodes
        jpegs, data = self.input()
        images = self.decode(jpegs)
        images = self.transpose(images)
        # return the output nodes of the graph
        return images, data


# test-specific params
batch_size = 4
img_width = 224
img_height = 224
channels = 3
queue_depth = 3
iterations = 5

# instantiate the defined class
pipe = myMediaPipe("hpu", queue_depth, batch_size,
                   channels, img_height, img_width)

# build the pipe
pipe.build()

# initialize the iterator
pipe.iter_init()

batch_count = 0
while batch_count < iterations:
    try:
        # execute and produce one batch of the dataset;
        # images and labels are device tensors.
        images, labels = pipe.run()
    except StopIteration:
        print("stop iteration")
        break
    # as_cpu() brings the device data to host, producing host tensors;
    # as_nparray() converts host tensors to numpy arrays.
    images = images.as_cpu().as_nparray()
    labels = labels.as_cpu().as_nparray()
    batch_count = batch_count + 1
```
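The example's transpose node permutes the decoder output with permutation=[2, 0, 1, 3]. Assuming the operator follows the usual axis-permutation convention (output axis i takes input axis permutation[i], as numpy.transpose does — an assumption, since the device operator's exact layout semantics are not spelled out here), its effect on tensor shape can be previewed on host with NumPy:

```python
import numpy as np

# Illustration only: a hypothetical 4D tensor with shape
# (height, width, channels, batch); the device operator itself
# may define its layouts differently.
x = np.zeros((224, 224, 3, 4))

# Output axis i is input axis permutation[i], as in numpy.transpose.
y = np.transpose(x, (2, 0, 1, 3))
print(y.shape)  # (3, 224, 224, 4)
```

This moves the channel axis to the front while leaving the batch axis in place, which is a common preparation step for frameworks expecting channel-first image tensors.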
The second part of the media package contains a pre-built media pipe for PyTorch.
The torch folder contains media_dataloader_mediapipe, which provides HPUMediaPipe; it can be used to create ResNet and SSD media pipes for PyTorch.
Following are the steps to use HPUMediaPipe for PyTorch:
- Import HPUMediaPipe from habana_frameworks.medialoaders.torch.media_dataloader_mediapipe.
- Instantiate an object of HPUMediaPipe with the following parameters:
  - a_torch_transforms: transforms to be applied on the mediapipe.
  - a_root: directory path from which to load the images.
  - a_annotation_file: path from which to load the annotation file for SSD.
  - a_batch_size: mediapipe output batch size.
  - a_shuffle: whether images have to be shuffled. <True/False>
  - a_drop_last: whether to drop the last incomplete batch or round up. <True/False>
  - a_prefetch_count: queue depth for media processing.
  - a_num_instances: number of devices.
  - a_instance_id: instance id of the current device.
  - a_model_ssd: whether the mediapipe is to be created for SSD. <True/False>
  - a_device: media device to run the mediapipe on.
- Separate HPUMediaPipe objects can be created for training and validation.
- Instantiate an object of HPUResnetPytorchIterator (for ResNet) or HPUSsdPytorchIterator (for SSD) with the following parameter:
  - mediapipe: media pipe object.
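The a_num_instances / a_instance_id pair describes how the dataset is partitioned across devices in a multi-device run. The partitioning itself happens inside the library; the following standalone sketch (with a hypothetical shard helper, not a library API) only illustrates the idea of instance i out of n taking every n-th sample:

```python
# Hypothetical illustration of per-instance dataset sharding; this is
# NOT the library's implementation, only the concept behind
# a_num_instances and a_instance_id.
def shard(samples, num_instances, instance_id):
    # instance i out of n takes every n-th sample, starting at offset i
    return samples[instance_id::num_instances]

files = [f"img_{i}.jpg" for i in range(10)]
print(shard(files, 2, 0))  # ['img_0.jpg', 'img_2.jpg', 'img_4.jpg', 'img_6.jpg', 'img_8.jpg']
print(shard(files, 2, 1))  # ['img_1.jpg', 'img_3.jpg', 'img_5.jpg', 'img_7.jpg', 'img_9.jpg']
```

Each device then sees a disjoint subset of the dataset, so the instances together cover it exactly once per epoch.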
Example for a ResNet media pipe:
```python
from habana_frameworks.medialoaders.torch.media_dataloader_mediapipe import HPUMediaPipe
from habana_frameworks.mediapipe.plugins.iterator_pytorch import HPUResnetPytorchIterator
# torchvision is assumed here: the original snippet uses transforms
# without showing its import
from torchvision import transforms

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])
torch_transforms = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    normalize,
])

root = "/JPEG/path"
batch_size = 256
shuffle = True
drop_last = False
prefetch_factor = 3
num_instances = 1
instance_id = 0

pipeline = HPUMediaPipe(a_torch_transforms=torch_transforms, a_root=root, a_batch_size=batch_size,
                        a_shuffle=shuffle, a_drop_last=drop_last, a_prefetch_count=prefetch_factor,
                        a_num_instances=num_instances, a_instance_id=instance_id, a_device="hpu")
iterator = HPUResnetPytorchIterator(mediapipe=pipeline)
```
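The a_drop_last flag decides what happens when the dataset size is not a multiple of a_batch_size: the final incomplete batch is either dropped or kept (rounded up). The difference in batches per epoch is plain arithmetic, independent of the library; the dataset size below is a made-up figure for illustration:

```python
import math

# Hypothetical dataset of 1000 images with the example's batch size of 256.
dataset_size = 1000
batch_size = 256

# a_drop_last=True: the final incomplete batch is dropped.
print(dataset_size // batch_size)            # 3

# a_drop_last=False: the final incomplete batch is kept (rounded up).
print(math.ceil(dataset_size / batch_size))  # 4
```

Dropping the last batch keeps every batch exactly batch_size samples, which some training loops require; rounding up guarantees every sample is seen each epoch.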
File details
Details for the file habana_media_loader-1.23.0.695-py3-none-any.whl.
File metadata
- Download URL: habana_media_loader-1.23.0.695-py3-none-any.whl
- Upload date:
- Size: 808.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.8.16
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | `255d3b1ecdef53612b143a30fa9709acb712d21ee8ccb389c25fad942c6a586d` |
| MD5 | `dbb3ed2dbd0763308bc27a77b24b80c1` |
| BLAKE2b-256 | `7c6a44d5c3f109866874d87525bccc5b78de734d07ef319c6c6fc0d51b5d7665` |