Extending High-Level Synthesis for Task-Parallel Programs

TAPA

TAPA is a dataflow HLS framework that offers fast compilation and an expressive programming model, and generates high-frequency FPGA accelerators.

High-Frequency

  • TAPA explicitly decouples communication and computation for better QoR.

  • TAPA integrates the AutoBridge floorplanner to optimize the RTL generation process.

  • TAPA achieves a higher frequency on average than Vivado. [1]
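
To make the idea of decoupled communication and computation concrete, here is a minimal software model in plain C++. It is only a sketch and does not use the TAPA API: the `Fifo` class and the `RunPipeline` function are hypothetical names invented for illustration. A producer step and a consumer step interact only through a bounded FIFO, so either side can stall without the other knowing how its peer is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical bounded FIFO: the only channel between the two tasks.
class Fifo {
 public:
  // A depth of 0 is treated as 1 so that the pipeline can always progress.
  explicit Fifo(std::size_t depth) : depth_(depth == 0 ? 1 : depth) {}

  // Returns false when the FIFO is full; the producer must retry later.
  bool TryWrite(int value) {
    if (queue_.size() >= depth_) return false;
    queue_.push_back(value);
    return true;
  }

  // Returns false when the FIFO is empty; the consumer must retry later.
  bool TryRead(int& value) {
    if (queue_.empty()) return false;
    value = queue_.front();
    queue_.pop_front();
    return true;
  }

 private:
  std::size_t depth_;
  std::deque<int> queue_;
};

// Steps a producer task (communication only) and a consumer task
// (computation only) in a round-robin loop until all inputs are processed.
// The consumer squares each value it receives.
std::vector<int> RunPipeline(const std::vector<int>& input, std::size_t depth) {
  Fifo fifo(depth);
  std::vector<int> output;
  std::size_t next = 0;
  while (output.size() < input.size()) {
    // Producer step: push the next input if the FIFO has room.
    if (next < input.size() && fifo.TryWrite(input[next])) ++next;
    // Consumer step: compute on whatever has arrived.
    int value = 0;
    if (fifo.TryRead(value)) output.push_back(value * value);
  }
  return output;
}
```

Calling `RunPipeline({1, 2, 3}, 2)` yields the squared values in input order. In hardware the two tasks would run concurrently; the round-robin loop here is only a single-threaded stand-in for that behavior.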

Speed

  • TAPA compiles faster than Vitis HLS. [2]

  • TAPA provides faster software simulation than Vitis HLS. [2]

  • TAPA provides faster RTL simulation than Vitis.

  • [in-progress] TAPA is integrating RapidStream, which is up to 10× faster than Vivado. [3]

Expressiveness

  • TAPA extends the Vitis HLS syntax for richer expressiveness at the C++ level.

  • TAPA provides dedicated APIs for arbitrary external memory access patterns.

  • TAPA allows users to explicitly specify parallelism.

  • In addition to static burst analysis, TAPA supports runtime burst detection by transparently merging small memory transactions into large bursts.
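
The burst-merging idea in the last bullet can be illustrated with a small software model. This is a sketch under stated assumptions, not TAPA's hardware implementation: the `Burst` struct and the `MergeIntoBursts` function are hypothetical names, addresses are counted in elements, and only strictly consecutive addresses are merged.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One merged memory transaction: `length` consecutive elements from `base`.
struct Burst {
  std::uint64_t base;
  std::uint32_t length;
};

// Merges a sequence of single-element addresses into bursts. An address
// extends the current burst only if it is the next consecutive element and
// the burst is still shorter than `max_burst_len`; otherwise it starts a new
// burst. `max_burst_len` is expected to be at least 1.
std::vector<Burst> MergeIntoBursts(const std::vector<std::uint64_t>& addrs,
                                   std::uint32_t max_burst_len) {
  std::vector<Burst> bursts;
  for (std::uint64_t addr : addrs) {
    if (!bursts.empty()) {
      Burst& last = bursts.back();
      if (addr == last.base + last.length && last.length < max_burst_len) {
        ++last.length;  // Consecutive address: grow the current burst.
        continue;
      }
    }
    bursts.push_back({addr, 1});  // Gap or full burst: start a new one.
  }
  return bursts;
}
```

For the address sequence 0, 1, 2, 3, 8, 9, 20 and a maximum burst length of 256, this model produces three bursts: four elements from 0, two elements from 8, and one element at 20. Lowering the maximum to 2 splits the first run into two bursts of two elements each.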

HBM-Specific Optimizations

  • TAPA significantly reduces the area overhead of HBM interface IPs compared to Vitis HLS.

  • TAPA includes an automated design space exploration tool to balance the resource pressure and the wire pressure for HBM FPGAs.

  • TAPA automatically selects the physical channel for each top-level argument of your accelerator.

Successful Cases

  • Serpens, DAC'22, achieves 270 MHz on the Xilinx Alveo U280 HBM board when using 24 HBM channels. The Vitis HLS baseline failed in routing.
  • Sextans, FPGA'22, achieves 260 MHz on the Xilinx Alveo U250 board when using 4 DDR channels. The Vivado baseline achieves only 189 MHz.
  • SPLAG, FPGA'22, achieves up to a 4.9× speedup over state-of-the-art FPGA accelerators, up to a 2.6× speedup over a 32-thread CPU running at 4.4 GHz, and up to a 0.9× speedup over an A100 GPU (which has a 4.1× power budget and 3.4× HBM bandwidth).
  • AutoSA Systolic-Array Compiler, FPGA'21. [Figure: AutoSA frequency]
  • KNN, FPT'20, achieves 252 MHz on the Xilinx Alveo U280 board. The Vivado baseline achieves only 165 MHz.

Getting Started

TAPA Publications

Download files

Source Distribution

tapa-0.0.20240301.1.tar.gz (99.8 kB)

Uploaded Source

Built Distribution

tapa-0.0.20240301.1-py3-none-any.whl (128.3 kB)

Uploaded Python 3
