Large Scale 3D Convolution Net Inference
chunkflow
Chunk operations for large scale 3D image dataset processing
Motivation
Benefiting from the rapid development of microscopy technologies, we can acquire large-scale 3D volumetric datasets with both high resolution and a large field of view. These 3D image datasets are too big to be processed on a single computer, so distributed processing is required. In most cases, the image dataset can be chopped into chunks, with or without overlap, and distributed to multiple computers for processing. This package provides a framework for distributed chunk processing of large-scale 3D image datasets. For each task on a single machine, it offers a few composable chunk operators for flexible real-world usage.
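The chopping step above can be sketched in a few lines of NumPy. This is a minimal illustration of splitting a volume into chunks with a configurable overlap, not chunkflow's actual API; the `chop` helper and its signature are hypothetical.

```python
import numpy as np

def chop(volume, chunk_size, overlap):
    """Chop a 3D array into chunks of chunk_size with the given overlap.

    Returns a list of (offset, chunk) pairs, where offset is the global
    coordinate of the chunk's corner.  Hypothetical helper for illustration.
    """
    stride = tuple(c - o for c, o in zip(chunk_size, overlap))
    chunks = []
    for z in range(0, volume.shape[0] - overlap[0], stride[0]):
        for y in range(0, volume.shape[1] - overlap[1], stride[1]):
            for x in range(0, volume.shape[2] - overlap[2], stride[2]):
                chunk = volume[z:z + chunk_size[0],
                               y:y + chunk_size[1],
                               x:x + chunk_size[2]]
                chunks.append(((z, y, x), chunk))
    return chunks

volume = np.arange(4 ** 3).reshape(4, 4, 4)
# zero overlap: a 4x4x4 volume yields eight non-overlapping 2x2x2 chunks
pairs = chop(volume, chunk_size=(2, 2, 2), overlap=(0, 0, 0))
```

In a distributed setting, each `(offset, chunk)` pair would become an independent task handed to a worker.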
Features
- Composable command-line interface. The chunk operators can be freely composed on the command line for flexible usage. This is also very useful for tests and experiments.
- Decoupled frontend and backend. The computationally heavy backend can be any computer with an internet connection and Amazon Web Services (AWS) authentication.
Some Typical Operators
- Convolutional Network Inference. Currently, we support PyTorch and pznet.
- Task Generator. Fetch tasks from an AWS SQS queue.
- Cutout service. Cut out a chunk from a dataset in neuroglancer precomputed format using cloudvolume.
- Save. Save a chunk to neuroglancer precomputed storage.
- Save Images. Save a chunk as a series of PNG images in a local directory.
- Read File. Read images from HDF5 and TIFF files.
- View. View a chunk using the cloudvolume viewer.
- Mask. Mask out a chunk using a precomputed dataset.
- Cloud Watch. Real-time speedometer using AWS CloudWatch.
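The inference operator processes a chunk patch by patch and blends the overlapping patch outputs. A minimal NumPy sketch of this overlap-blending idea follows; the `infer_chunk` helper is hypothetical (not chunkflow's API), and an identity function stands in for a real ConvNet.

```python
import numpy as np

def infer_chunk(chunk, patch_size, overlap, net):
    """Apply `net` to overlapping patches and blend with uniform weights.

    Hypothetical sketch, not chunkflow's actual implementation.  Assumes
    the patch grid tiles the chunk exactly.
    """
    output = np.zeros(chunk.shape, dtype=np.float32)
    weight = np.zeros(chunk.shape, dtype=np.float32)
    stride = tuple(p - o for p, o in zip(patch_size, overlap))
    for z in range(0, chunk.shape[0] - overlap[0], stride[0]):
        for y in range(0, chunk.shape[1] - overlap[1], stride[1]):
            for x in range(0, chunk.shape[2] - overlap[2], stride[2]):
                sl = (slice(z, z + patch_size[0]),
                      slice(y, y + patch_size[1]),
                      slice(x, x + patch_size[2]))
                output[sl] += net(chunk[sl])  # accumulate patch output
                weight[sl] += 1.0             # count contributions per voxel
    return output / weight

chunk = np.random.rand(4, 4, 4).astype(np.float32)
# with an identity "network", the blended result reproduces the input chunk
result = infer_chunk(chunk, patch_size=(2, 2, 2), overlap=(1, 1, 1),
                     net=lambda patch: patch)
```

Real inference backends (PyTorch, pznet) typically use tapered rather than uniform blending weights to suppress boundary artifacts, but the accumulation pattern is the same.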
Use specific GPU device
We can set the CUDA_VISIBLE_DEVICES environment variable to use a specific GPU device:
CUDA_VISIBLE_DEVICES=2 chunkflow
Produce tasks to AWS SQS queue
In the bin directory:
python produce_tasks.py --help
Terminology
- patch: an ndarray used as input to the ConvNet. It is normally quite small due to the limited memory capacity of GPUs.
- chunk: an ndarray with a global offset and arbitrary shape.
- block: an array whose shape and global offset are aligned with the storage backend. A block can be saved directly to the storage backend; the alignment with storage files ensures that there are no write conflicts when blocks are saved in parallel.
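The chunk concept above can be illustrated with a tiny sketch: an ndarray paired with its global offset, from which the chunk's position in the whole dataset follows directly. This is a simplified stand-in, not chunkflow's actual Chunk class.

```python
import numpy as np

class Chunk:
    """Minimal sketch of a chunk: an ndarray plus its global offset.

    Illustrative only; chunkflow's real Chunk class differs.
    """
    def __init__(self, array, global_offset):
        self.array = array
        self.global_offset = global_offset

    @property
    def slices(self):
        """Global slices this chunk occupies in the whole dataset."""
        return tuple(slice(o, o + s)
                     for o, s in zip(self.global_offset, self.array.shape))

chunk = Chunk(np.zeros((64, 64, 64)), global_offset=(128, 256, 512))
# chunk.slices gives the region of the full volume this chunk covers
```

A block is the special case where both the offset and the shape are multiples of the storage backend's file size, so the chunk can be written without touching any neighbor's files.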
Development
Create a new release in PyPi
python setup.py sdist
twine upload dist/chunkflow-version.tar.gz
Citation
If you used this tool and are writing a paper, please cite:
@article{wu2019chunkflow,
  title={Chunkflow: Distributed Hybrid Cloud Processing of Large 3D Images by Convolutional Nets},
  author={Wu, Jingpeng and Silversmith, William M and Seung, H Sebastian},
  journal={arXiv preprint arXiv:1904.10489},
  year={2019}
}