ethernet topology configuration tool for Tenstorrent silicon
Project description
TT-Topology
Tenstorrent Topology (TT-Topology) is a command line utility used to flash multiple NB cards on a system to use specific eth routing configurations.
It currently supports three configurations - mesh, linear and torus
Official Repository
https://github.com/tenstorrent/tt-topology/
Warning
tt-topology is not applicable on the following:
- BH pcie cards
- WH 6U Galaxy systems
- BH 6U Galaxy systems
The tool will error out if used with unsupported boards
Getting started
Build and editing instructions are as follows -
Building from Git
Install and source rust for the luwen library
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source "$HOME/.cargo/env"
Optional
Generate and source a python environment. This isolates your setup and can make the tools easier to debug and use. The environment can be shared if you want a single environment for all your Tenstorrent tools
python3 -m venv .venv
source .venv/bin/activate
Required
Install tt-topology - clone the repo, enter the folder and pip install
git clone https://github.com/tenstorrent/tt-topology.git
cd tt-topology
pip3 install --upgrade pip
pip3 install .
Optional - for TT-Topology developers
Generate and source a python3 environment
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install pre-commit
For users who would like to edit the code without re-building, install TT-Topology in editable mode.
pip install --editable .
Recommended: install the pre-commit hooks so that all files are auto-formatted on commit.
pre-commit install
Usage
Command line arguments
usage: tt-topology [-h] [-v] [-l {linear,torus,mesh,isolated}] [-o] [-f [filename]] [-g] [-ls] [--log [log]] [-p [plot]] [-r [config.json ...]]
Tenstorrent Topology (TT-Topology) is a command line utility to flash ethernet coordinates when multiple NB's are connected together.
options:
-h, --help show this help message and exit
-v, --version show program's version number and exit
-l {linear,torus,mesh,isolated}, --layout {linear,torus,mesh,isolated}
Select the layout (linear, torus, mesh, isolated). Default is linear.
-o, --octopus
-f [filename], --filename [filename]
Change filename for test log. Default: ~/tt_smi/<timestamp>_snapshot.json
-g, --generate_reset_json
Generate default reset json file that reset consumes. Update the generated file and use it as an input for the --reset option
-ls, --list List out all the boards on host with their coordinates and layout.
--log [log] Change filename for the topology flash log. Default: ~/tt_topology_logs/<timestamp>_log.json
-p [plot], --plot_filename [plot]
Change the plot of the png that will have the graph layout of the chips. Default: chip_layout.png
-r [config.json ...], --reset [config.json ...]
Provide a valid reset JSON
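When the tool is driven from a script, it can help to assemble the argument list in one place. The following is a minimal sketch that only uses the `-l`, `-p` and `--log` flags listed above; the helper name `build_command` and its parameters are hypothetical and are not part of tt-topology.

```python
# Layout names accepted by -l/--layout, taken from the usage text above.
LAYOUTS = ("linear", "torus", "mesh", "isolated")


def build_command(layout="linear", plot=None, log=None):
    """Build an argument list for tt-topology (hypothetical helper).

    layout: one of LAYOUTS, passed to -l
    plot:   optional png filename, passed to -p
    log:    optional log filename, passed to --log
    """
    if layout not in LAYOUTS:
        raise ValueError(f"unknown layout {layout!r}, expected one of {LAYOUTS}")
    command = ["tt-topology", "-l", layout]
    if plot:
        command += ["-p", plot]
    if log:
        command += ["--log", log]
    return command
```

The returned list can be handed to `subprocess.run(command, check=True)` on a host where tt-topology is installed.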
TT-Topology Procedure
TT-Topology does the following when calculating and flashing the coordinates -
- Flash all the boards to default - set all eth port disables to 0 and reset coordinates to (0,0) for local chips and (1,0) for n300 remote chips.
- Issue a board level reset to apply the new flash to the chips.
- Generate a mapping of all possible connections and their type between the available chips.
- Using a graph algorithm, generate coordinates for each chip based on user input. These layouts are discussed in detail in the sections below.
- Write the new coordinates to the chips.
- Issue a board level reset to apply the new flash to the chips.
- Return a png with a graphic representation of the layout and a .json log file with details of the above steps.
Chip layouts
TT-Topology can be used to flash one of three chip layouts - mesh, linear and torus.
Mesh
The mesh layout is a trivalent graph in which each node can have at most 3 connections. A BFS algorithm is used to assign the coordinates. Command to generate a mesh layout
$ tt-topology -l mesh -p mesh_layout.png
For hosts with 2 n300 cards and 4 n300 cards, the command will generate layouts that look as follows -
[Figures: mesh layouts for 2 and 4 n300 cards]
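To make the idea of BFS-based coordinate assignment concrete, here is a minimal sketch. It is an illustrative greedy heuristic and not necessarily the algorithm tt-topology implements: the function names, the adjacency-dict input and the scoring rule are all assumptions. Starting from one chip at (0, 0), each newly discovered chip is placed on the free grid cell next to its parent that touches the most of its already placed neighbours.

```python
from collections import deque

# The four grid directions a neighbouring chip can be placed in.
OFFSETS = [(1, 0), (0, 1), (-1, 0), (0, -1)]


def grid_neighbors(cell):
    """Return the four grid cells adjacent to `cell`."""
    x, y = cell
    return [(x + dx, y + dy) for dx, dy in OFFSETS]


def bfs_assign_coords(adjacency, start):
    """Assign (x, y) coordinates to chips by BFS (hypothetical sketch).

    adjacency: dict mapping each chip to a list of connected chips
    start:     chip that receives coordinate (0, 0)
    """
    coords = {start: (0, 0)}
    used = {(0, 0)}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor in coords:
                continue
            free = [c for c in grid_neighbors(coords[node]) if c not in used]
            if not free:
                raise ValueError(f"no free cell next to chip {node!r}")
            # Cells of the neighbour's connections that are already placed.
            placed = [coords[n] for n in adjacency[neighbor] if n in coords]
            # Prefer the cell adjacent to the most already placed connections.
            best = max(free, key=lambda c: sum(p in grid_neighbors(c) for p in placed))
            coords[neighbor] = best
            used.add(best)
            queue.append(neighbor)
    return coords
```

For four chips wired as a square, `bfs_assign_coords({0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}, 0)` places them on a 2x2 grid. On irregular graphs this heuristic can produce negative coordinates or fail to keep every link grid-adjacent.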
Linear
The linear layout, as the name suggests, is a layout where all chips are connected by a single line. The coordinates are assigned by finding a cycle in the graph and then assigning coordinates in order. Command to generate a linear layout
$ tt-topology -l linear -p linear_layout.png
For hosts with 2 n300 cards and 4 n300 cards, the command will generate layouts that look as follows -
[Figures: linear layouts for 2 and 4 n300 cards]
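The "find a cycle, then assign coordinates in order" step can be sketched as follows. This is a plain backtracking search for a cycle that visits every chip once, chosen here for brevity; it is not necessarily the search tt-topology uses. The function names are hypothetical, and placing chip i at (i, 0) is an assumption about what "in order" means.

```python
def find_cycle(adjacency):
    """Return all chips in an order that forms a closed cycle, or None.

    adjacency: dict mapping each chip to a list of connected chips.
    Uses simple backtracking, which is fine for a handful of chips.
    """
    nodes = list(adjacency)
    if not nodes:
        return None
    start = nodes[0]
    path = [start]
    visited = {start}

    def extend():
        # All chips placed: the cycle closes if the last chip links to the first.
        if len(path) == len(nodes):
            return start in adjacency[path[-1]]
        for nxt in adjacency[path[-1]]:
            if nxt not in visited:
                visited.add(nxt)
                path.append(nxt)
                if extend():
                    return True
                path.pop()
                visited.remove(nxt)
        return False

    return path if extend() else None


def linear_coords(order):
    """Assign (i, 0) to the i-th chip in the cycle order (assumed convention)."""
    return {chip: (index, 0) for index, chip in enumerate(order)}
```

For the square graph `{0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}` the search returns the order 0, 1, 3, 2, so chip 2 ends up at (3, 0).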
Torus
The torus layout is a cyclic graph in which a single line connects all chips in a closed loop. The coordinates are assigned by finding a cycle in the graph and then assigning coordinates in order. Command to generate a torus layout
$ tt-topology -l torus -p torus_layout.png
For a host with four n300 cards, the command will generate a layout that looks as follows -
[Figure: torus layout for 4 n300 cards]
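Because the torus is cyclic, the last chip in the order links back to the first. A minimal sketch of that wrap-around, assuming the cycle order is already known; the helper name `ring_neighbors` is hypothetical:

```python
def ring_neighbors(order):
    """Map each chip to its (previous, next) chip in a closed ring.

    order: list of chips in cycle order. Modular indexing supplies the
    wrap-around link between the last chip and the first.
    """
    count = len(order)
    return {
        chip: (order[(index - 1) % count], order[(index + 1) % count])
        for index, chip in enumerate(order)
    }
```

With `order = ["a", "b", "c", "d"]`, chip "a" gets neighbours ("d", "b"), which is the link a purely linear chain would not have.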
Octopus (TGG/TG) Support in TT-Topology
- TGG setting: 8 n150s connected to 2 galaxies
- TG setting: 4 n150s connected to 1 galaxy
Usage
- Generate a default mobo reset json file, saved at ~/.config/tenstorrent/reset_config.json, by running the following command
  $ tt-topology -g
- Fill in "mobo", "credo", and "disabled_ports" under "wh_mobo_reset"
  Here is an example of what your reset_config.json file may look like:
  {
    "time": "2024-03-06T20:12:27.640859",
    "host_name": "yyz-lab-212",
    "gs_tensix_reset": {
      "pci_index": []
    },
    "wh_link_reset": {
      "pci_index": [0, 1, 2, 3]
    },
    "re_init_devices": true,
    "wh_mobo_reset": [
      {
        "nb_host_pci_idx": [0, 1, 2, 3],
        "mobo": "mobo-ce-44",
        "credo": ["6:0", "6:1", "7:0", "7:1"],
        "disabled_ports": ["0:2", "1:2", "6:2", "7:2"]
      }
    ]
  }
- Flash multiple NB cards to use specific eth routing configurations by running the following command
  $ tt-topology -o -r ~/.config/tenstorrent/reset_config.json
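Before passing the edited file to `-r`, a quick sanity check can catch fields that were left unfilled. The sketch below is an assumption-laden example and not part of tt-topology: the helper names are hypothetical, and the "number:number" port format is inferred only from the example file above.

```python
import json
import re
from pathlib import Path

# Port strings in the example look like "6:0"; this format is an assumption.
PORT_PATTERN = re.compile(r"^\d+:\d+$")
REQUIRED_FIELDS = ("mobo", "credo", "disabled_ports")


def check_mobo_entries(config):
    """Return a list of human-readable problems found in a reset config dict."""
    problems = []
    entries = config.get("wh_mobo_reset") or []
    if not entries:
        problems.append("wh_mobo_reset is missing or empty")
    for index, entry in enumerate(entries):
        prefix = f"wh_mobo_reset[{index}]"
        for field in REQUIRED_FIELDS:
            if field not in entry:
                problems.append(f"{prefix}.{field} is missing")
        if "mobo" in entry and not entry["mobo"]:
            problems.append(f"{prefix}.mobo is empty")
        for field in ("credo", "disabled_ports"):
            for port in entry.get(field) or []:
                if not PORT_PATTERN.match(str(port)):
                    problems.append(f"{prefix}.{field}: unexpected port {port!r}")
    return problems


def check_file(path):
    """Load a reset config JSON file and return the problems found in it."""
    return check_mobo_entries(json.loads(Path(path).read_text()))
```

Calling `check_file(Path.home() / ".config" / "tenstorrent" / "reset_config.json")` returns an empty list when all three fields are present and the ports match the assumed format.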
Internal Procedure
- Setup mobo_eth_en on every local n150 to train with the Galaxy
- Program the shelf/rack of the Galaxies
- Program all local n150s to rack 0, shelf 0, x 0, y 0
- Reset with the following retimer_sel and disable_sel and wait for training
  - retimer_sel: From the credo field of the reset json file for the specific Galaxy
  - disable_sel: All the other ports not specified by the retimer_sel
- Check QSFP link and change shelf number for each n150 according to the shelf on the connected Galaxy
- Program the x, y coords of the local n150s based on the other side of the link
- Reset again with the retimer_sel and disable_sel and wait for training, and verify all chips show up
  - retimer_sel: From the credo field of the reset json file for the specific Galaxy
  - disable_sel: From the disabled_ports field of the reset json file for the specific Galaxy
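In the first reset, disable_sel is the complement of retimer_sel. A minimal sketch of that relationship follows; the full set of ports on a Galaxy is not given in this document, so `all_ports` is a hypothetical input the caller has to supply, and the function name is made up for illustration.

```python
def complement_ports(retimer_sel, all_ports):
    """Return every port in all_ports that is not in retimer_sel.

    retimer_sel: ports from the "credo" field of the reset json file
    all_ports:   the full port list of the Galaxy (assumed to be known)
    The original order of all_ports is preserved.
    """
    selected = set(retimer_sel)
    return [port for port in all_ports if port not in selected]
```

In the second reset this computation is not needed, since disable_sel is read directly from the disabled_ports field.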
Logging
TT-Topology records the relevant pre- and post-flash SPI registers, the connection map and the coordinates of the chips in a .json file for record keeping and debugging.
By default it is stored at ~/tt_topology_logs/<timestamp>_log.json. This can be changed by using the --log command line argument as follows
$ tt-topology --log new_log.json ...
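For debugging, it is often handy to open the most recent log. The sketch below assumes the default directory and the `<timestamp>_log.json` naming described above; the helper names are hypothetical, and because the log schema is not documented here the code only loads the JSON without interpreting it.

```python
import json
from pathlib import Path

# Default log directory as described above.
DEFAULT_LOG_DIR = Path.home() / "tt_topology_logs"


def latest_log(log_dir=DEFAULT_LOG_DIR):
    """Return the most recently modified *_log.json in log_dir, or None."""
    log_dir = Path(log_dir)
    if not log_dir.is_dir():
        return None
    logs = sorted(log_dir.glob("*_log.json"), key=lambda p: p.stat().st_mtime)
    return logs[-1] if logs else None


def load_log(path):
    """Parse a log file and return its contents as Python objects."""
    with open(path) as handle:
        return json.load(handle)
```

A typical use is `path = latest_log()` followed by `load_log(path)` when `path` is not None.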
License
Apache 2.0 - https://www.apache.org/licenses/LICENSE-2.0.txt
File details
Details for the file tt_topology-1.2.13.tar.gz.
File metadata
- Download URL: tt_topology-1.2.13.tar.gz
- Upload date:
- Size: 105.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | db1aa795253a98df9768fabb23efa465f14a6024e80b0ea4290f0b23b857913f |
| MD5 | 116c2ad7280e1870a0373c63eb30f193 |
| BLAKE2b-256 | 5e970c9bfaf69589edcc9da75f367451c9705257a4dfc1c54fe6871f4764b61f |
Provenance
The following attestation bundles were made for tt_topology-1.2.13.tar.gz:
Publisher: release.yml on tenstorrent/tt-topology
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: tt_topology-1.2.13.tar.gz
  - Subject digest: db1aa795253a98df9768fabb23efa465f14a6024e80b0ea4290f0b23b857913f
  - Sigstore transparency entry: 408871996
  - Sigstore integration time:
- Permalink: tenstorrent/tt-topology@5f3c0779d692fc5ef661ac52c66cc0a7f0193af6
- Branch / Tag: refs/heads/main
- Owner: https://github.com/tenstorrent
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@5f3c0779d692fc5ef661ac52c66cc0a7f0193af6
- Trigger Event: workflow_dispatch
File details
Details for the file tt_topology-1.2.13-py3-none-any.whl.
File metadata
- Download URL: tt_topology-1.2.13-py3-none-any.whl
- Upload date:
- Size: 100.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c20971d74a6106cabd36fc4265d50d8faf0c5f9bde990dd6200c312c44bb7003 |
| MD5 | 6f33edbc4c7027b56a83836dab1c40b6 |
| BLAKE2b-256 | a5538842640ea52047a12acc20ee67a5e2b8ac13f0ec8665febf6f544f3eb185 |
Provenance
The following attestation bundles were made for tt_topology-1.2.13-py3-none-any.whl:
Publisher: release.yml on tenstorrent/tt-topology
- Statement:
  - Statement type: https://in-toto.io/Statement/v1
  - Predicate type: https://docs.pypi.org/attestations/publish/v1
  - Subject name: tt_topology-1.2.13-py3-none-any.whl
  - Subject digest: c20971d74a6106cabd36fc4265d50d8faf0c5f9bde990dd6200c312c44bb7003
  - Sigstore transparency entry: 408872012
  - Sigstore integration time:
- Permalink: tenstorrent/tt-topology@5f3c0779d692fc5ef661ac52c66cc0a7f0193af6
- Branch / Tag: refs/heads/main
- Owner: https://github.com/tenstorrent
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@5f3c0779d692fc5ef661ac52c66cc0a7f0193af6
- Trigger Event: workflow_dispatch