Large Scale 3D Convolution Net Inference

chunkflow


Chunk operations for large scale 3D image dataset processing

Introduction

A 3D image dataset can be too large to process on a single computer, so distributed processing is required. In most cases, the dataset can be chopped into chunks and distributed to multiple computers for processing. This package provides a framework for distributed chunk processing of large scale 3D image datasets. For each task on a single machine, it offers a set of composable chunk operators for flexible real-world usage, including convolutional network inference and meshing of segmentation.
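As a rough illustration of the chopping idea (this is a hypothetical helper, not chunkflow's actual API — real task generation is driven by the command line and AWS SQS):

```python
def chop_into_chunks(volume_shape, chunk_size):
    """Yield (start, stop) index tuples covering a 3D volume.

    Illustrative sketch only: each (start, stop) pair describes one
    chunk that could be dispatched to a worker for processing.
    """
    starts = [range(0, dim, size) for dim, size in zip(volume_shape, chunk_size)]
    for z in starts[0]:
        for y in starts[1]:
            for x in starts[2]:
                start = (z, y, x)
                # clip the last chunk in each axis to the volume boundary
                stop = tuple(min(s + c, d) for s, c, d in
                             zip(start, chunk_size, volume_shape))
                yield start, stop

# a 128^3 volume chopped into 64^3 chunks yields 8 tasks
tasks = list(chop_into_chunks((128, 128, 128), (64, 64, 64)))
```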

Features

  • Decoupled frontend and backend. The computationally heavy backend can be any computer with an internet connection and Amazon Web Services (AWS) authentication.
  • Composable command-line interface. Chunk operators can be freely composed on the command line for flexible usage. This is also very useful for tests and experiments.
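Conceptually, composing chunk operators works like chaining functions, where each operator consumes the previous operator's output. This is a sketch of the idea only; the operator names below are made up and this is not chunkflow's implementation:

```python
def create_chunk(_):
    # hypothetical generator operator: produce a small 3D array of zeros
    return [[[0] * 4 for _ in range(4)] for _ in range(4)]

def normalize(chunk):
    # hypothetical operator: shift every voxel value by one
    return [[[v + 1 for v in row] for row in plane] for plane in chunk]

def compose(*operators):
    """Chain operators so the output of one feeds the next, mirroring
    how commands are composed on the chunkflow command line."""
    def pipeline(data=None):
        for op in operators:
            data = op(data)
        return data
    return pipeline

pipeline = compose(create_chunk, normalize)
chunk = pipeline()  # every voxel is now 1
```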

Usage

Installation

This package is registered on PyPI, so a single command installs it:

pip install chunkflow

or download and install manually:

pip install .

Note: do not install with python setup.py install, since some dependencies cannot be installed that way.

Run unit tests

python -m unittest

Get Help

chunkflow --help

Get help for a specific command: chunkflow command --help

Examples

The commands can be composed and used flexibly. Note that the first command should be a generator.

chunkflow create-chunk view
chunkflow create-chunk 

A typical pipeline to run ConvNet inference looks like this:

chunkflow --verbose \
    fetch-task --queue-name="$QUEUE_NAME" --visibility-timeout=$VISIBILITY_TIMEOUT \
    cutout --volume-path="$IMAGE_LAYER_PATH" --expand-margin-size 4 64 64 \
    inference --convnet-model=your-model-name --convnet-weight-path=path/of/net/weight \
        --patch-size 20 256 256 --patch-overlap 4 64 64 \
        --output-key your-output-key --framework='identity' --batch-size 2 \
    crop-margin \
    save --volume-path="$OUTPUT_LAYER_PATH" --upload-log --nproc 4 --create-thumbnail \
    delete-task-in-queue
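In the inference step, patches tile each chunk with a stride of patch size minus overlap. As a rough sketch of that tiling along a single axis (not chunkflow's actual code, and using a made-up 640-voxel chunk extent):

```python
def patch_starts(extent, patch, overlap):
    """Start positions of ConvNet patches along one axis.

    Illustrative only; real inference also blends the overlapping
    regions of neighboring patches, which is omitted here.
    """
    stride = patch - overlap
    starts = list(range(0, extent - patch + 1, stride))
    # shift in a final patch if the regular grid does not reach the end
    if starts and starts[-1] + patch < extent:
        starts.append(extent - patch)
    return starts

# e.g. 256-voxel patches with 64-voxel overlap, as in the x axis above
starts = patch_starts(640, 256, 64)  # [0, 192, 384]
```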

Some Typical Operators

  • Convolutional Network Inference. Currently we support PyTorch and pznet.
  • Task Generator. Fetch tasks from an AWS SQS queue.
  • Cutout Service. Cut out a chunk from a dataset in neuroglancer precomputed format using cloudvolume.
  • Save. Save a chunk to neuroglancer precomputed storage.
  • Save Images. Save a chunk as a series of PNG images in a local directory.
  • Read File. Read images from HDF5 and TIFF files.
  • View. View a chunk using the cloudvolume viewer.
  • Mask. Mask out a chunk using a precomputed dataset.
  • Cloud Watch. Real-time speedometer using AWS CloudWatch.

Use a specific GPU device

We can set an environment variable to select a specific GPU device.

CUDA_VISIBLE_DEVICES=2 chunkflow

Produce tasks to AWS SQS queue

In the bin directory:

python produce_tasks.py --help

Terminology

  • patch: an ndarray used as input to the ConvNet. It is normally pretty small due to the limited memory capacity of GPUs.
  • chunk: an ndarray with a global offset and arbitrary shape.
  • block: an array whose shape and global offset are aligned with the storage backend. A block can be saved directly to the storage backend; the alignment with storage files ensures there are no write conflicts when blocks are saved in parallel.
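A minimal way to picture a chunk as data plus a global offset (again a sketch, and a hypothetical stand-in for chunkflow's internal chunk type):

```python
class Chunk:
    """Illustrative chunk: a shape plus the global offset that locates
    it inside the whole dataset. Not chunkflow's actual class."""

    def __init__(self, shape, global_offset):
        self.shape = tuple(shape)
        self.global_offset = tuple(global_offset)

    def slices(self):
        # the region of the whole dataset that this chunk covers
        return tuple(slice(o, o + s)
                     for o, s in zip(self.global_offset, self.shape))

# a 64^3 chunk located at global offset (0, 2048, 2048)
chunk = Chunk(shape=(64, 64, 64), global_offset=(0, 2048, 2048))
```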

Development

Create a new release on PyPI

python setup.py sdist
twine upload dist/chunkflow-version.tar.gz

Add a new operator

To be added.
