General compute framework for Tenstorrent devices

Hardware | Install | Discord | Join Us | Bounty $

TT-NN is a Python & C++ Neural Network OP library.

API Reference | Model Demos

Quick Links

Featured Models

The Models team is focused on developing the following models, optimizing them for performance, accuracy, and compatibility. Follow each model link for more details.

Important: For a full model list, see the Model Matrix or visit the Developer Hub.

Note: Performance metrics:

  • Time to First Token (TTFT) measures the time (in milliseconds) it takes to generate the first output token after input is received.
  • T/S/U (Tokens per Second per User): Represents per-user decode throughput, that is, the rate at which each user receives tokens once prefill has produced the first token. It is calculated as 1 / inter-token latency, with the latency expressed in seconds.
  • T/S (Tokens per Second): Represents total token throughput across the batch, calculated as T/S = T/S/U × batch size.
  • TP (Tensor Parallel) and DP (Data Parallel): Indicate the parallelization factors across multiple devices.
  • Reported LLM Performance: Based on an input sequence length of 128 tokens for all models.
  • Performance Data Source: Metrics were collected using the tt-metal model demos (linked above). Results may vary when using other runtimes such as the vLLM inference server.
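
As a rough illustration of the two throughput formulas above, the following sketch computes T/S/U from an inter-token latency and T/S from T/S/U and batch size. The function names and the 50 ms latency are hypothetical and chosen only for illustration; they are not part of the tt-metal demos.

```python
def tokens_per_second_per_user(inter_token_latency_s: float) -> float:
    """T/S/U = 1 / inter-token latency, with the latency in seconds."""
    if inter_token_latency_s <= 0:
        raise ValueError("inter-token latency must be positive")
    return 1.0 / inter_token_latency_s


def total_tokens_per_second(tsu: float, batch_size: int) -> float:
    """T/S = T/S/U x batch size."""
    if batch_size < 1:
        raise ValueError("batch size must be at least 1")
    return tsu * batch_size


# Hypothetical example: 50 ms between tokens for each user, batch of 32.
tsu = tokens_per_second_per_user(0.050)
print(f"T/S/U: {tsu:.1f}")
print(f"T/S:   {total_tokens_per_second(tsu, 32):.1f}")
```

The same relation can be used to cross-check the tables in this section: each reported T/S value should equal the reported T/S/U multiplied by the batch size.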

Llama 3.3 70B (TP=8)

| Batch | Hardware | TTFT (ms) | T/S/U | Target T/S/U | T/S | TT-Metalium Release | vLLM Tenstorrent Repo Release |
|---|---|---|---|---|---|---|---|
| 32 | QuietBox (Wormhole) | 159 | 15.9 | 20 | 508.8 | v0.59.0-rc53 | f028da1 |

Qwen 2.5 7B (TP=2)

| Batch | Hardware | TTFT (ms) | T/S/U | Target T/S/U | T/S | TT-Metalium Release | vLLM Tenstorrent Repo Release |
|---|---|---|---|---|---|---|---|
| 32 | n300 (Wormhole) | 109 | 22.1 | 30 | 707.2 | v0.62.0-rc35 | ced0161 |

Qwen 2.5 72B (TP=8)

| Batch | Hardware | TTFT (ms) | T/S/U | Target T/S/U | T/S | TT-Metalium Release | vLLM Tenstorrent Repo Release |
|---|---|---|---|---|---|---|---|
| 32 | QuietBox (Wormhole) | 223 | 15.4 | 20 | 492.8 | v0.62.0-rc25 | e7c329b |

Whisper (distil-large-v3)

| Batch | Hardware | TTFT (ms) | T/S/U | Target T/S/U | T/S | TT-Metalium Release |
|---|---|---|---|---|---|---|
| 1 | n150 (Wormhole) | 232 | 58.1 | 45 | 58.1 | v0.59.0-rc52 |
| 1 | p150 (Blackhole) | 113 | 101.5 | | 101.5 | v0.62.0-dev20251015 |

Mixtral 8x7B (TP=8)

| Batch | Hardware | TTFT (ms) | T/S/U | Target T/S/U | T/S | TT-Metalium Release |
|---|---|---|---|---|---|---|
| 32 | QuietBox (Wormhole) | 122 | 24.9 | 33 | 796.8 | v0.62.0-dev20251015 |

Blackhole software optimization is under active development. Please join us in shaping the future of open source AI!
[Discord] [Developer Hub]

For more information regarding vLLM installation and environment creation visit the Tenstorrent vLLM repository.

Model Updates

For the latest model updates and features, please see MODEL_UPDATES.md.

Model Bring-Up and Testing

For information on initial model procedures, please see Model Bring-Up and Testing.

TT-NN Tech Reports

Benchmarks


TT-Metalium is our low-level programming model, enabling kernel development for Tenstorrent hardware.

Programming Guide | API Reference

Getting started

Get started with simple kernels.

TT-Metalium Tech Reports

TT-Metalium Programming Examples

Hello World

Add Integers

Simple Tensor Manipulation

DRAM Data Movement

Eltwise

Matmul

Tools and Instruments

TT-NN Visualizer

A comprehensive tool for visualizing and analyzing model execution, offering interactive graphs, memory plots, tensor details, buffer overviews, operation flow graphs, and multi-instance support with file or SSH-based report loading.

TT-Exalens

The TT-Exalens repository describes TT-Lensium, a low-level debugging tool for Tenstorrent hardware. It allows developers to access and communicate with Wormhole and Blackhole devices.

TT-SMI

The TT-SMI repository describes the Tenstorrent System Management Interface. This command-line utility interacts with Tenstorrent devices on the host. TT-SMI provides an easy-to-use interface that displays device, telemetry, and firmware information.

Model Explorer

The Model Explorer is an intuitive and hierarchical visualization tool using model graphs. It organizes model operations into nested layers and provides features for model exploration and debugging.

Tracy Profiler

The Tracy Profiler is a real-time, nanosecond-resolution, remote-telemetry, hybrid frame and sampling profiler. Tracy supports profiling CPU, GPU, memory allocation, locks, context switches, and more.

Kernel Print Debug

DPRINT can print variables, addresses, and circular buffer data from kernels to the host terminal or log file. This feature is useful for debugging issues with kernels.

Watcher

Watcher monitors firmware and kernels for common programming errors, and overall device status. If an error or hang occurs, Watcher displays log data of that occurrence.

Inspector

Inspector provides insights into host runtime. It logs necessary data for investigation and allows queries to host runtime data.

Related Tenstorrent Projects

Latest Releases

| Release | Release Date |
|---|---|
| 0.65.0 | ETA Nov End |
| 0.64.0 | Oct 29, 2025 |
| 0.63.0 | Sep 22, 2025 |
| 0.62.2 | Aug 20, 2025 |
| 0.61.0 | Skipped |
| 0.60.1 | Jul 22, 2025 |
| 0.59.0 | Jun 18, 2025 |
| 0.58.0 | May 13, 2025 |
| 0.57.0 | Apr 15, 2025 |
| 0.56.0 | Mar 7, 2025 |

Visit the releases folder for details on releases, release notes, and estimated release dates.

Tenstorrent Bounty Program Terms and Conditions

This repo is part of Tenstorrent’s bounty program. If you are interested in helping to improve tt-metal, please read the Tenstorrent Bounty Program Terms and Conditions before heading to the issues tab. Look for issues that are tagged with both “bounty” and a difficulty level.

License

TT-Metalium and TT-NN are licensed under the Apache 2.0 License, as detailed in LICENSE and LICENSE_understanding.txt.

Some distributable forms of this project—such as manylinux-compliant wheels—may need to bundle additional libraries beyond the standard Linux system libraries. For example:

  • libnuma
  • libhwloc
  • openmpi (when built with multihost support)
  • libevent (when built with multihost support)

These libraries are bound by their own license terms.


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

ttnn-0.64.3-cp310-cp310-manylinux_2_34_x86_64.whl (32.1 MB)

Uploaded for: CPython 3.10, manylinux (glibc 2.34+), x86-64
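
To make the wheel file name easier to read, the sketch below splits it into the fields defined by the wheel naming convention (distribution, version, optional build tag, Python tag, ABI tag, platform tag). The `WheelName` type and `split_wheel_filename` function are hypothetical helpers written for this illustration. For real tooling, `packaging.utils.parse_wheel_filename` from the `packaging` library is likely the more robust choice.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    """Fields of a wheel file name (hypothetical helper type)."""
    distribution: str
    version: str
    build_tag: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def split_wheel_filename(filename: str) -> WheelName:
    """Split '{dist}-{version}[-{build}]-{python}-{abi}-{platform}.whl'."""
    suffix = ".whl"
    if not filename.endswith(suffix):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(suffix)].split("-")
    if len(parts) == 5:
        dist, version, py_tag, abi_tag, plat_tag = parts
        build = None
    elif len(parts) == 6:
        dist, version, build, py_tag, abi_tag, plat_tag = parts
    else:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    return WheelName(dist, version, build, py_tag, abi_tag, plat_tag)


info = split_wheel_filename("ttnn-0.64.3-cp310-cp310-manylinux_2_34_x86_64.whl")
print(info)
```

Here the `cp310` Python and ABI tags correspond to CPython 3.10, and `manylinux_2_34_x86_64` corresponds to x86-64 Linux with glibc 2.34 or newer, matching the upload details listed for this file.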

File details

Details for the file ttnn-0.64.3-cp310-cp310-manylinux_2_34_x86_64.whl.

File metadata

File hashes

Hashes for ttnn-0.64.3-cp310-cp310-manylinux_2_34_x86_64.whl
| Algorithm | Hash digest |
|---|---|
| SHA256 | 4495ba13d66e338f91602d3aba52359108df78b4ae3be7e6a4d8c7232ccaab53 |
| MD5 | 5fe8f4002dfea865526b73a70310d482 |
| BLAKE2b-256 | 6bba8050dc5d73201709ec732a557c135ed294674d20ea315dd9249e708d7f07 |

See more details on using hashes here.
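
One way to use the SHA256 digest above is to recompute it for the downloaded wheel and compare the two values. The sketch below does this with Python's standard `hashlib` module. The helper names and the command-line usage are assumptions made for this example, not part of any Tenstorrent or PyPI tooling.

```python
import hashlib
import hmac
import sys

# SHA256 digest published above for the ttnn 0.64.3 wheel.
PUBLISHED_SHA256 = "4495ba13d66e338f91602d3aba52359108df78b4ae3be7e6a4d8c7232ccaab53"


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def file_matches_sha256(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    expected = expected_hex.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)


# Hypothetical usage: python verify_wheel.py path/to/downloaded.whl
if __name__ == "__main__" and len(sys.argv) == 2:
    ok = file_matches_sha256(sys.argv[1], PUBLISHED_SHA256)
    print("digest matches" if ok else "digest MISMATCH")
```

Reading in chunks keeps memory use flat for a file of this size (32.1 MB). Alternatively, pip's hash-checking mode can enforce the digest at install time.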

Provenance

The following attestation bundles were made for ttnn-0.64.3-cp310-cp310-manylinux_2_34_x86_64.whl:

Publisher: package-and-release.yaml on tenstorrent/tt-metal

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
