Neural Network Dataflow Scheduling

This Python tool lets you explore energy-efficient dataflow scheduling for neural networks (NNs), including array mapping, loop blocking and reordering, and parallel partitioning.

For hardware, we assume an Eyeriss-style NN accelerator [Chen16], i.e., a 2D array of processing elements (PEs) with a local register file in each PE, and a global SRAM buffer shared by all PEs. We further support a tiled architecture with multiple nodes that can partition and process the NN computations in parallel. Each node is an Eyeriss-style engine as above.
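The hardware assumptions above can be summarized in a small data model. This is only an illustrative standalone Python 3 sketch, not the tool's own classes: the names `NodeConfig` and `TiledAccelerator`, the capacities in words, and the arrangement of nodes as a 2D grid are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeConfig:
    """One Eyeriss-style engine (hypothetical model, not the tool's API)."""
    array_height: int  # number of PE rows
    array_width: int   # number of PE columns
    regf_words: int    # register file capacity per PE, in words (assumed unit)
    gbuf_words: int    # global SRAM buffer capacity per node, in words

    @property
    def num_pes(self) -> int:
        return self.array_height * self.array_width

    @property
    def on_chip_words(self) -> int:
        # Every PE has its own register file; the global buffer is shared.
        return self.num_pes * self.regf_words + self.gbuf_words


@dataclass(frozen=True)
class TiledAccelerator:
    """Several identical nodes, assumed here to form a 2D grid."""
    node: NodeConfig
    nodes_height: int
    nodes_width: int

    @property
    def num_nodes(self) -> int:
        return self.nodes_height * self.nodes_width

    @property
    def total_pes(self) -> int:
        return self.num_nodes * self.node.num_pes


# Illustrative sizes only; they are not taken from the tool or the papers.
node = NodeConfig(array_height=16, array_width=16, regf_words=64, gbuf_words=65536)
chip = TiledAccelerator(node=node, nodes_height=4, nodes_width=4)
print(chip.num_nodes, chip.total_pes, node.on_chip_words)  # 16 4096 81920
```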

In software, we decouple the dataflow scheduling into three subproblems:

  • Array mapping, which deals with mapping one 2D convolution computation (one 2D ifmap convolved with one 2D filter to produce one 2D ofmap) onto the hardware PE array. We support row-stationary mapping [Chen16].

  • Loop blocking and reordering, which decides the order among all 2D convolutions by blocking and reordering the nested loops. We support exhaustive search over all blocking and reordering schemes [Yang16], and analytical bypass solvers [Gao17].

  • Partitioning, which partitions the NN computations for parallel processing. We support batch partitioning, fmap partitioning, output partitioning, input partitioning, and combinations of them (hybrid) [Gao17]. We use layer-wise greedy beam search.
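To give a feel for the search space behind exhaustive loop blocking and reordering, the following standalone sketch enumerates candidate schemes for a toy loop nest. It is not the tool's implementation: the loop names, the helpers `factorizations` and `blocking_schemes`, and the simplification that one loop order is shared by all memory levels are assumptions made for illustration.

```python
from itertools import permutations, product


def factorizations(n, parts):
    """Yield every tuple of `parts` positive integers whose product is n."""
    if parts == 1:
        yield (n,)
        return
    for f in range(1, n + 1):
        if n % f == 0:
            for rest in factorizations(n // f, parts - 1):
                yield (f,) + rest


def blocking_schemes(loop_bounds, levels):
    """Enumerate (blocking, order) candidates for a loop nest.

    loop_bounds: dict mapping a loop name to its total trip count.
    levels: number of memory hierarchy levels to block across.
    Simplification: a single loop order is used for all levels.
    """
    names = sorted(loop_bounds)
    per_loop = [list(factorizations(loop_bounds[k], levels)) for k in names]
    for factors in product(*per_loop):
        blocking = dict(zip(names, factors))
        for order in permutations(names):
            yield blocking, order


# Hypothetical loop bounds: batch size, input channels, output channels.
bounds = {"batch": 4, "ifmap": 6, "ofmap": 8}
schemes = list(blocking_schemes(bounds, levels=2))
print(len(schemes))  # 3 * 4 * 4 blockings times 3! orders = 288
```

Even this toy nest yields hundreds of candidates, and the count grows quickly with more levels and larger bounds, which suggests why analytical solvers are offered as an alternative to exhaustive search.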

For details, see our ASPLOS’17 paper [Gao17].
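The layer-wise greedy beam search mentioned under partitioning can be sketched generically as follows. This is a textbook-style beam search under assumed inputs, not the tool's code: the function `beam_search`, the cost table, and the fixed penalty for switching schemes between layers are hypothetical and chosen only to make the example concrete.

```python
import heapq


def beam_search(layers, candidates, cost, beam_width):
    """Layer-wise beam search over per-layer schemes.

    layers: layer names in execution order.
    candidates(layer): iterable of candidate schemes for that layer.
    cost(layer, scheme, prev): cost of `scheme` given the previous layer's
        scheme `prev` (None for the first layer).
    Returns (total_cost, [scheme per layer]) for the best schedule kept.
    """
    beam = [(0, [])]
    for layer in layers:
        extended = []
        for total, chosen in beam:
            prev = chosen[-1] if chosen else None
            for scheme in candidates(layer):
                extended.append((total + cost(layer, scheme, prev),
                                 chosen + [scheme]))
        # Keep only the `beam_width` cheapest partial schedules.
        beam = heapq.nsmallest(beam_width, extended, key=lambda t: t[0])
    return min(beam, key=lambda t: t[0])


# Made-up per-layer costs for three partitioning schemes.
BASE = {
    ("conv1", "fmap"): 10, ("conv1", "batch"): 12, ("conv1", "output"): 20,
    ("conv2", "fmap"): 8, ("conv2", "batch"): 9, ("conv2", "output"): 15,
    ("fc1", "fmap"): 30, ("fc1", "batch"): 14, ("fc1", "output"): 6,
}


def toy_cost(layer, scheme, prev):
    # Hypothetical penalty of 3 when the scheme changes between layers.
    switch = 0 if prev in (None, scheme) else 3
    return BASE[(layer, scheme)] + switch


best = beam_search(["conv1", "conv2", "fc1"],
                   lambda layer: ["batch", "fmap", "output"],
                   toy_cost, beam_width=2)
print(best)  # (27, ['fmap', 'fmap', 'output'])
```

A wider beam keeps more partial schedules per layer and so trades search time for a better chance of finding the cheapest overall schedule.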

If you use this tool in your work, we kindly request that you reference our paper(s) below, and send us a citation of your work.

  • Gao et al., “TETRIS: Scalable and Efficient Neural Network Acceleration with 3D Memory”, in ASPLOS, April 2017 [Gao17].

Usage

First, define the NN structure in nn_dataflow/nns. Several popular NNs are already defined there, including AlexNet, VGG-16, GoogLeNet, ResNet-152, etc.

Then, use nn_dataflow/tools/nn_dataflow_search.py to search for the optimal dataflow for the NN. For detailed options, type:

> python ./nn_dataflow/tools/nn_dataflow_search.py -h

You can specify the NN batch size and word size, PE array dimensions, number of tile nodes, register file and global buffer capacity, and the energy cost of all components. Note that the energy cost of the array bus should be the average energy of transferring data from the buffer to one PE, not that of a local neighbor transfer; the unit static energy cost should be the static energy of all nodes in one clock cycle.
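As a rough illustration of how these cost parameters combine, the sketch below adds up dynamic and static energy and then evaluates the three optimization goals (E, D, ED) that the tool accepts. The component names, access counts, and unit costs are invented for the example, and the formula is a plausible simplification rather than the tool's actual cost model.

```python
def total_energy(access_counts, unit_costs, unit_static_cost, cycles):
    """Dynamic energy from per-component accesses plus static energy.

    unit_static_cost is the static energy of ALL nodes for one clock
    cycle, so it is multiplied by the cycle count only, not by the
    number of nodes.
    """
    dynamic = sum(access_counts[k] * unit_costs[k] for k in access_counts)
    static = unit_static_cost * cycles
    return dynamic + static


def objective(goal, energy, delay):
    """Scalar cost for goal E (energy), D (delay), or ED (their product)."""
    if goal == "E":
        return energy
    if goal == "D":
        return delay
    if goal == "ED":
        return energy * delay
    raise ValueError("goal must be E, D, or ED")


# Hypothetical unit costs and access counts (arbitrary energy units).
unit_costs = {"op": 1, "regf": 1, "array_bus": 2, "gbuf": 6, "dram": 200}
access_counts = {"op": 1000, "regf": 3000, "array_bus": 500, "gbuf": 200, "dram": 10}
cycles = 100

energy = total_energy(access_counts, unit_costs, unit_static_cost=2, cycles=cycles)
print(energy)                           # 8200 dynamic + 200 static = 8400
print(objective("ED", energy, cycles))  # 840000
```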

Other options include:

  • -g, --goal: E, D, or ED. The optimization goal: E(nergy), D(elay), or ED product.

  • --mem-type: 2D or 3D. With 2D memory, memory channels are only on the four corners of the chip; with 3D memory, memory channels are on top of all tile nodes (one per node).

  • --bus-width: the multicast bus bit width in the PE array for one data type. Set to 0 to ignore multicast overheads.

  • --dram-bw: float or inf. Total DRAM bandwidth for all tile nodes, in bytes per cycle.

  • --disable-bypass: a combination of i, o, f, selecting whether to disallow global buffer bypass for ifmaps (i), ofmaps (o), and weights (f), respectively.

  • --solve-loopblocking: whether to use analytical bypass solvers for loop blocking and reordering. See [Gao17].

  • --hybrid-partitioning: whether to use hybrid partitioning in [Gao17]. If not enabled, use naive partitioning, i.e., fmap partitioning for CONV layers, and output partitioning for FC layers.

  • --batch-partitioning and --ifmap-partitioning: whether the hybrid partitioning also explores batch and input partitioning, respectively.
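When scripting many runs, it can help to assemble these options programmatically. The helper below is a hypothetical convenience, not part of the tool: it only uses the option names listed above, and it assumes that the on/off options are plain flags without values and that the letters for --disable-bypass are passed as separate values. Check the output of -h before relying on either assumption.

```python
import math


def build_options(goal="E", mem_type="2D", bus_width=0, dram_bw=math.inf,
                  disable_bypass=(), solve_loopblocking=False,
                  hybrid_partitioning=False, batch_partitioning=False,
                  ifmap_partitioning=False):
    """Return the documented options as an argv-style list of strings."""
    if goal not in ("E", "D", "ED"):
        raise ValueError("goal must be E, D, or ED")
    if mem_type not in ("2D", "3D"):
        raise ValueError("mem_type must be 2D or 3D")
    if not set(disable_bypass) <= {"i", "o", "f"}:
        raise ValueError("disable_bypass may only contain i, o, f")

    args = ["--goal", goal,
            "--mem-type", mem_type,
            "--bus-width", str(bus_width),
            "--dram-bw", str(float(dram_bw))]  # float("inf") prints as "inf"
    if disable_bypass:
        # Assumption: one value per data type, e.g. "--disable-bypass i o".
        args += ["--disable-bypass", *disable_bypass]
    # Assumption: these are boolean flags that take no value.
    flags = [
        (solve_loopblocking, "--solve-loopblocking"),
        (hybrid_partitioning, "--hybrid-partitioning"),
        (batch_partitioning, "--batch-partitioning"),
        (ifmap_partitioning, "--ifmap-partitioning"),
    ]
    args += [flag for enabled, flag in flags if enabled]
    return args


opts = build_options(goal="ED", mem_type="3D", disable_bypass=("i", "o"),
                     hybrid_partitioning=True)
print(" ".join(opts))
```

The resulting list would be appended to the nn_dataflow_search.py command line together with whatever other arguments -h reports as required; those are deliberately not guessed here.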

Code Structure

  • nn_dataflow
    • core
      • Top-level dataflow exploration: nn_dataflow, nn_dataflow_scheme.

      • Layer scheduling: scheduling.

      • Array mapping: map_strategy.

      • Loop blocking and reordering: loop_blocking, loop_blocking_scheme, loop_blocking_solver.

      • Partitioning: partition, partition_scheme.

      • Network and layer: network, layer.

    • nns: example NN definitions.

    • tests: unit tests.

    • tools: executables.

Verification and Testing

To verify the tool against the Eyeriss result [Chen16], see nn_dataflow/tests/dataflow_test/test_nn_dataflow.py.

To run (unit) tests, do one of the following:

> python -m unittest discover

> python -m pytest

> pytest

To check code coverage with the pytest-cov plug-in:

> pytest --cov=nn_dataflow

References

[Chen16]

Chen, Emer, and Sze, Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks, in ISCA. June, 2016.

[Gao17]

Gao, Pu, Yang, Horowitz, and Kozyrakis, TETRIS: Scalable and Efficient Neural Network Acceleration with 3D Memory, in ASPLOS. April, 2017.

[Yang16]

Yang, Pu, Rister, Bhagdikar, Richardson, Kvatinsky, Ragan-Kelley, Pedram, and Horowitz, A Systematic Approach to Blocking Convolutional Neural Networks, arXiv preprint, 2016.
