General compute framework for Tenstorrent devices
Project description
Hardware | Install | Discord | Join Us | Bounty $
TT-NN is a Python & C++ Neural Network OP library.
Quick Links
Featured Models
The Models team is focused on developing the following models, optimizing them for performance, accuracy, and compatibility. Follow each model link for more details.
[!IMPORTANT] For a full model list see the Model Matrix, or visit the Developer Hub.
[!NOTE] Performance Metrics:
- Time to First Token (TTFT) measures the time (in milliseconds) it takes to generate the first output token after input is received.
- T/S/U (Tokens per Second per User): Represents the per-user decode throughput, that is, the rate at which each user receives tokens once prefill has finished. It is calculated as 1 / inter-token latency.
- T/S (Tokens per Second): Represents total token throughput, calculated as T/S = T/S/U x batch size.
- TP (Tensor Parallel) and DP (Data Parallel): Indicate the parallelization factors across multiple devices.
- Reported LLM Performance: Based on an input sequence length of 128 tokens for all models.
- Performance Data Source: Metrics were collected using the tt-metal model demos (linked above). Results may vary when using other runtimes such as the vLLM inference server.
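The relationship between T/S/U, T/S, and batch size described above can be sketched in a few lines of Python. The function names are illustrative only and are not part of tt-metal or TT-NN; the sketch also assumes inter-token latency is expressed in seconds. The sample figures are the T/S/U and batch size from the Qwen 2.5 7B table.

```python
def tokens_per_second_per_user(inter_token_latency_s: float) -> float:
    """T/S/U is the reciprocal of the inter-token latency (assumed to be in seconds)."""
    if inter_token_latency_s <= 0:
        raise ValueError("inter-token latency must be positive")
    return 1.0 / inter_token_latency_s


def total_tokens_per_second(t_s_u: float, batch_size: int) -> float:
    """T/S is the per-user throughput multiplied by the batch size."""
    return t_s_u * batch_size


# Figures taken from the Qwen 2.5 7B table: 22.1 T/S/U at batch size 32.
qwen_t_s = total_tokens_per_second(22.1, 32)
print(f"{qwen_t_s:.1f} tokens/s")
```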
Llama 3.3 70B (TP=32)
| Batch | Hardware | TTFT (ms) | T/S/U | Target T/S/U | T/S | TT-Metalium Release | vLLM Tenstorrent Repo Release |
|---|---|---|---|---|---|---|---|
| 32 | Galaxy (Wormhole) | 53 | 72.5 | 80 | 2268.8 | v0.65.0-rc7 | 59be953 |
Qwen 2.5 7B (TP=2)
| Batch | Hardware | TTFT (ms) | T/S/U | Target T/S/U | T/S | TT-Metalium Release | vLLM Tenstorrent Repo Release |
|---|---|---|---|---|---|---|---|
| 32 | n300 (Wormhole) | 109 | 22.1 | 30 | 707.2 | v0.62.0-rc35 | ced0161 |
Qwen 2.5 72B (TP=8)
| Batch | Hardware | TTFT (ms) | T/S/U | Target T/S/U | T/S | TT-Metalium Release | vLLM Tenstorrent Repo Release |
|---|---|---|---|---|---|---|---|
| 32 | QuietBox (Wormhole) | 223 | 15.4 | 20 | 492.8 | v0.62.0-rc25 | e7c329b |
Whisper (distil-large-v3)
| Batch | Hardware | TTFT (ms) | T/S/U | Target T/S/U | T/S | TT-Metalium Release |
|---|---|---|---|---|---|---|
| 1 | n150 (Wormhole) | 163 | 105.0 | 45 | 105.0 | v0.65.0-dev20251208 |
| 1 | p150 (Blackhole) | 63 | 263.4 | | 263.4 | v0.65.0-dev20251208 |
Mixtral 8x7B (TP=8)
| Batch | Hardware | TTFT (ms) | T/S/U | Target T/S/U | T/S | TT-Metalium Release |
|---|---|---|---|---|---|---|
| 32 | QuietBox (Wormhole) | 122 | 24.9 | 33 | 796.8 | v0.62.0-dev20251015 |
Blackhole software optimization is under active development. Please join us in shaping the future of open source AI!
[Discord] [Developer Hub]
For more information on vLLM installation and environment creation, visit the Tenstorrent vLLM repository.
Model Updates
For the latest model updates and features, please see MODEL_UPDATES.md.
Model Bring-Up and Testing
For information on initial model bring-up procedures, please see Model Bring-Up and Testing.
TT-NN Tech Reports
- Advanced Performance Optimizations for Models (updated March 4th, 2025)
- Programming Mesh of Devices (updated Sept 9th, 2024)
- ViT Implementation in TT-NN on GS (updated Sept 22nd, 2024)
- LLMs Bring up in TT-NN (updated Oct 29th, 2024)
- CNN Bring up & Optimization in TT-NN (updated Jan 22nd, 2025)
Benchmarks
- Matrix Multiply FLOPS on Wormhole and Blackhole (updated June 17th, 2025)
TT-Metalium is our low-level programming model, enabling kernel development for Tenstorrent hardware.
Getting started
Get started with simple kernels.
TT-Metalium Tech Reports
- Matrix Engine (updated Sept 6th, 2024)
- Data Formats (updated Sept 7th, 2024)
- Reconfiguring Data Formats (updated Oct 17th, 2024)
- Handling special floating-point numbers (updated Oct 5th, 2024)
- Allocator (updated Dec 19th, 2024)
- Tensor Layouts (updated Sept 6th, 2024)
- Saturating DRAM Bandwidth (updated Sept 6th, 2024)
- Flash Attention on Wormhole (updated Sept 6th, 2024)
- CNNs on TT Architectures (updated Sept 6th, 2024)
- Ethernet and Multichip Basics (updated Sept 20th, 2024)
- Blackhole Bring-Up Programming Guide (updated Dec 18th, 2024)
- Sub-Devices (updated Jan 7th, 2025)
TT-Metalium Programming Examples
Hello World
Add Integers
Simple Tensor Manipulation
DRAM Data Movement
Eltwise
Matmul
- Matmul OP on a Single_core
- Matmul OP on Multi_core (Basic)
- Matmul Multi_core Reuse (Optimized)
- Matmul Multi_core Multi-Cast (Optimized)
Tools and Instruments
TT-NN Visualizer
A comprehensive tool for visualizing and analyzing model execution, offering interactive graphs, memory plots, tensor details, buffer overviews, operation flow graphs, and multi-instance support with file or SSH-based report loading.
TT-Exalens
The TT-Exalens repository describes TT-Lensium, a low-level debugging tool for Tenstorrent hardware. It allows developers to access and communicate with Wormhole and Blackhole devices.
TT-SMI
The TT-SMI repository describes the Tenstorrent System Management Interface. This command-line utility interacts with Tenstorrent devices on the host. TT-SMI provides an easy-to-use interface that displays device, telemetry, and firmware information.
Model Explorer
The Model Explorer is an intuitive and hierarchical visualization tool using model graphs. It organizes model operations into nested layers and provides features for model exploration and debugging.
Tracy Profiler
The Tracy Profiler is a real-time, nanosecond-resolution, remote-telemetry, hybrid frame and sampling profiler. Tracy supports profiling CPU, GPU, memory allocation, locks, context switches, and more.
Kernel Print Debug
DPRINT can print variables, addresses, and circular buffer data from kernels to the host terminal or log file. This feature is useful for debugging issues with kernels.
Watcher
Watcher monitors firmware and kernels for common programming errors, and overall device status. If an error or hang occurs, Watcher displays log data of that occurrence.
Inspector
Inspector provides insights into the host runtime. It logs the data needed for investigation and allows queries against host runtime data.
Related Tenstorrent Projects
Latest Releases
| Release | Release Date | FW Version |
|---|---|---|
| 0.65.0 | ETA Dec 15, 2025 | 19.2.0 |
| 0.64.5 | Dec 1, 2025 | 18.12.0 |
| 0.64.4 | Nov 24, 2025 | 18.12.0 |
| 0.64.3 | Nov 14, 2025 | 18.12.0 |
| 0.64.0 | Oct 29, 2025 | 18.12.0 |
| 0.63.0 | Sep 22, 2025 | 18.8.0 |
| 0.62.2 | Aug 20, 2025 | 18.6.0 |
| 0.61.0 | Skipped | - |
| 0.60.1 | Jul 22, 2025 | 18.6.0 |
| 0.59.0 | Jun 18, 2025 | - |
| 0.58.0 | May 13, 2025 | - |
| 0.57.0 | Apr 15, 2025 | - |
| 0.56.0 | Mar 7, 2025 | - |
Visit the releases folder for details on releases, release notes, and estimated release dates.
Tenstorrent Bounty Program Terms and Conditions
This repo is part of Tenstorrent’s bounty program. If you are interested in helping to improve tt-metal, please read the Tenstorrent Bounty Program Terms and Conditions before heading to the issues tab. Look for issues that are tagged with both “bounty” and a difficulty level!
License
TT-Metalium and TT-NN are licensed under the Apache 2.0 License, as detailed in LICENSE and LICENSE_understanding.txt.
Some distributable forms of this project—such as manylinux-compliant wheels—may need to bundle additional libraries beyond the standard Linux system libraries. For example:
- libnuma
- libhwloc
- openmpi (when built with multihost support)
- libevent (when built with multihost support)
These libraries are bound by their own license terms.
File details
Details for the file ttnn-0.65.1rc5-cp310-cp310-manylinux_2_34_x86_64.whl.
File metadata
- Download URL: ttnn-0.65.1rc5-cp310-cp310-manylinux_2_34_x86_64.whl
- Upload date:
- Size: 33.1 MB
- Tags: CPython 3.10, manylinux: glibc 2.34+ x86-64
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
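The tags listed above are encoded in the wheel's file name, which follows the standard `name-version[-build]-python-abi-platform.whl` layout. As a minimal sketch, the helper below (a hypothetical function written for this illustration, not part of ttnn or pip) splits the file name into those components using only the standard library.

```python
def split_wheel_filename(filename: str) -> dict:
    """Split a wheel file name into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        # No optional build tag present.
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected number of components in wheel file name")
    return {
        "name": name,
        "version": version,
        "build": build_tag,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


info = split_wheel_filename("ttnn-0.65.1rc5-cp310-cp310-manylinux_2_34_x86_64.whl")
print(info["python"], info["abi"], info["platform"])
```

Here `cp310` corresponds to the CPython 3.10 tag, and `manylinux_2_34_x86_64` to the glibc 2.34+ x86-64 platform tag.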
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d422514f044af0397de495552e39ff197bb30d013fb6d446706a515fade68a8a |
| MD5 | 377219898e3e4438ef170297d49306a0 |
| BLAKE2b-256 | df0c52029a026da5dadf8c85c50de9ada9b87a56da4b93392b195a7580689c67 |
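A downloaded wheel can be checked against the published SHA256 digest before installation. The following is a minimal sketch using Python's standard `hashlib` module; the helper names and the local file path are assumptions made for illustration, and the check is skipped if the file is not present in the current directory.

```python
import hashlib
from pathlib import Path

# SHA256 digest published for ttnn-0.65.1rc5-cp310-cp310-manylinux_2_34_x86_64.whl.
EXPECTED_SHA256 = "d422514f044af0397de495552e39ff197bb30d013fb6d446706a515fade68a8a"


def sha256_of_file(path, chunk_size=1 << 20) -> str:
    """Hash the file in chunks so a large wheel is never loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected=EXPECTED_SHA256) -> bool:
    """Return True when the file's SHA256 digest equals the expected hex string."""
    return sha256_of_file(path) == expected.lower()


# Hypothetical download location; adjust to wherever the wheel was saved.
wheel = Path("ttnn-0.65.1rc5-cp310-cp310-manylinux_2_34_x86_64.whl")
if wheel.exists():
    print("digest OK" if matches_published_digest(wheel) else "digest MISMATCH")
```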
Provenance
The following attestation bundles were made for ttnn-0.65.1rc5-cp310-cp310-manylinux_2_34_x86_64.whl:
Publisher: package-and-release.yaml on tenstorrent/tt-metal
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: ttnn-0.65.1rc5-cp310-cp310-manylinux_2_34_x86_64.whl
- Subject digest: d422514f044af0397de495552e39ff197bb30d013fb6d446706a515fade68a8a
- Sigstore transparency entry: 773248540
- Sigstore integration time:
- Permalink: tenstorrent/tt-metal@0814b6ac4eba0c7fb3cb9a284b7b613720c9873a
- Branch / Tag: refs/heads/stable
- Owner: https://github.com/tenstorrent
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: package-and-release.yaml@0814b6ac4eba0c7fb3cb9a284b7b613720c9873a
- Trigger Event: workflow_dispatch
-
Statement type: