A Python CLI wrapper for profiling Tenstorrent's TT-Metal tests
🚀 ttperf - TT-Metal Performance Profiler
A streamlined CLI tool for profiling Tenstorrent's TT-Metal tests and extracting device kernel performance metrics
✨ Features
- 🔍 Automated Profiling: Seamlessly runs Tenstorrent's TT-Metal profiler with pytest
- 📊 CSV Analysis: Automatically extracts and parses performance CSV files
- ⚡ Real-time Output: Shows profiling progress in real-time
- 📈 Performance Metrics: Calculates total DEVICE KERNEL DURATION
- 🎯 Simple CLI: Easy-to-use command-line interface
- 🛠️ Flexible: Supports named profiles and various test paths
- 🚀 Operation-based Profiling: Profile specific operations by name (e.g., ttperf add)
- ⚙️ Dynamic Configuration: Customize tensor shape, dtype, and layout for operations
🚀 Quick Start
Installation
# Install from PyPI (recommended)
pip install ttperf
Or install from source:
# Clone the repository
git clone https://github.com/Aswintechie/ttperf.git
cd ttperf
# Install the package
pip install -e .
Basic Usage
# Run profiling on a specific test
ttperf test_performance.py
# Run with a custom profile name
ttperf my_profile pytest test_performance.py
# Run on a specific test method
ttperf tests/test_ops.py::test_matmul
# Profile specific operations by name
ttperf add
ttperf relu
ttperf matmul
# Profile operations with custom profile names
ttperf my_add_profile add
ttperf my_relu_profile relu
# Profile operations with custom configuration
ttperf add --shape 1,1,32,32 --dtype bfloat16 --layout tile
ttperf relu --shape 1,1,64,64 --dtype float32 --layout row_major
# Profile operations with memory configuration
ttperf add --dram # Use DRAM memory (default)
ttperf relu --l1 # Use L1 memory
ttperf add --shape 1,1,64,64 --l1 # Combined options
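The invocation forms above differ only in their positional arguments: an optional profile name, an optional literal pytest token, and either a test path or an operation name. The following is a minimal sketch of how such arguments could be told apart. It is not ttperf's actual parser; `KNOWN_OPS`, `Invocation`, `looks_like_test_path`, and `classify` are hypothetical names, and the rule "a target ending in .py is a test" is an assumption.

```python
from typing import List, NamedTuple, Optional

# Hypothetical subset of operation names; the real list comes from `ttperf --list-ops`.
KNOWN_OPS = {"add", "relu", "matmul"}


class Invocation(NamedTuple):
    profile_name: Optional[str]  # None means "let the tool pick a name"
    kind: str                    # "operation" or "test"
    target: str                  # operation name or pytest target


def looks_like_test_path(arg: str) -> bool:
    """Assumption: anything naming a .py file (optionally with ::test_id) is a test."""
    return arg.split("::", 1)[0].endswith(".py")


def classify(args: List[str]) -> Invocation:
    """Split positional arguments into an optional profile name and a target."""
    # Drop the optional literal "pytest" token, as in `ttperf my_profile pytest test.py`.
    words = [a for a in args if a != "pytest"]
    if not words or len(words) > 2:
        raise ValueError(f"expected [profile_name] target, got {args!r}")
    profile_name = words[0] if len(words) == 2 else None
    target = words[-1]
    if looks_like_test_path(target):
        return Invocation(profile_name, "test", target)
    if target in KNOWN_OPS:
        return Invocation(profile_name, "operation", target)
    raise ValueError(f"unknown operation or test path: {target!r}")


print(classify(["my_profile", "pytest", "test_performance.py"]))
print(classify(["add"]))
```

Checking for a test path before checking the operation table keeps a file such as add.py from being mistaken for the add operation.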
📋 Usage Examples
Test File Profiling
ttperf test_conv.py
Named Profile
ttperf conv_benchmark pytest test_conv.py
Specific Test Method
ttperf tests/ops/test_matmul.py::test_basic_matmul
Operation-based Profiling
# Basic operations
ttperf add
ttperf subtract
ttperf multiply
ttperf divide
# Activation functions
ttperf relu
ttperf sigmoid
ttperf tanh
ttperf gelu
# Mathematical operations
ttperf sqrt
ttperf exp
ttperf log
ttperf sin
ttperf cos
# Comparison operations
ttperf gt
ttperf lt
ttperf eq
ttperf ne
# Reduction operations
ttperf max
ttperf min
ttperf mean
ttperf sum
# Backward operations
ttperf add_bw
ttperf relu_bw
ttperf sigmoid_bw
Dynamic Configuration
# Custom tensor shape
ttperf add --shape 1,1,32,32
ttperf relu --shape 2,3,64,128
# Custom data type
ttperf add --dtype float32
ttperf multiply --dtype int32
# Custom memory layout
ttperf add --layout row_major
ttperf relu --layout tile
# Combined configuration
ttperf add --shape 1,1,64,64 --dtype float32 --layout row_major
ttperf gelu --shape 2,1,32,32 --dtype bfloat16 --layout tile
# Memory configuration options
ttperf add --memory-config dram # Explicit DRAM
ttperf relu --memory-config l1 # Explicit L1
ttperf add --dram --shape 1,1,128,128 # DRAM with custom shape
ttperf relu --l1 --dtype float32 # L1 with custom dtype
List All Supported Operations
ttperf --list-ops
# or
ttperf -l
Output Example
🔧 Using custom configuration:
Shape: (1, 1, 32, 32)
Dtype: bfloat16
Layout: tile
🏷️ Auto-generated profile name: temp_test_add
▶️ Running: ./tools/tracy/profile_this.py -n temp_test_add -c "pytest temp_test_add.py"
... (profiling output) ...
📁 Found CSV path: /path/to/profile_results.csv
⏱️ DEVICE KERNEL DURATION [ns] total: 1234567.89 ns
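The "Running:" line in the sample output shows the underlying profiler command. As a rough sketch, that command could be assembled as an argument list like this. The script path and the -n and -c flags are copied from the sample output; `build_profiler_command` is a hypothetical helper, not a function exported by ttperf.

```python
import shlex
from typing import List


def build_profiler_command(profile_name: str, test_target: str) -> List[str]:
    """Return an argv list mirroring the invocation shown in the sample output."""
    return [
        "./tools/tracy/profile_this.py",
        "-n", profile_name,
        # The pytest command line is passed to the profiler as one argument.
        "-c", f"pytest {shlex.quote(test_target)}",
    ]


argv = build_profiler_command("temp_test_add", "temp_test_add.py")
# shlex.join renders the list the way a shell user would type it.
print("Running:", shlex.join(argv))
```

Keeping the command as a list rather than a single string avoids shell quoting problems when a test path contains spaces or other special characters.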
🛠️ How It Works
- Command Parsing: Analyzes input arguments to determine profile name and test path/operation
- Operation Detection: If an operation name is provided, maps it to the corresponding test method
- Dynamic Configuration: If custom configuration is provided, generates a temporary test file with the specified parameters
- Profile Execution: Runs Tenstorrent's TT-Metal profiler with the specified test
- Output Monitoring: Streams profiling output in real-time
- CSV Extraction: Parses the output to find the generated CSV file path
- Performance Analysis: Reads the CSV and calculates total device kernel duration
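The output monitoring and CSV extraction steps above can be sketched with the standard library alone. This is an illustration under stated assumptions, not ttperf's source: `CSV_PATH_RE`, `find_csv_path`, and `run_and_find_csv` are hypothetical names, and the pattern assumes the profiler prints an absolute path ending in .csv somewhere in its output.

```python
import re
import subprocess
import sys
from typing import List, Optional

# Assumed marker: any absolute path ending in .csv. The real profiler log
# format may differ, so adjust the pattern to what the profiler prints.
CSV_PATH_RE = re.compile(r"(/\S+\.csv)")


def find_csv_path(line: str) -> Optional[str]:
    """Return the first CSV-looking path on the line, or None."""
    match = CSV_PATH_RE.search(line)
    return match.group(1) if match else None


def run_and_find_csv(argv: List[str]) -> Optional[str]:
    """Echo the child's output line by line and return the last CSV path seen."""
    csv_path = None
    with subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so nothing is lost
        text=True,
        bufsize=1,                 # line-buffered on the reading side
    ) as proc:
        for line in proc.stdout:
            print(line, end="", flush=True)  # show progress as it arrives
            csv_path = find_csv_path(line) or csv_path
    if proc.returncode != 0:
        raise RuntimeError(f"profiler exited with status {proc.returncode}")
    return csv_path


if __name__ == "__main__":
    # Stand-in child process so the sketch runs without TT-Metal installed.
    demo = [sys.executable, "-c", "print('report saved to /tmp/demo/ops.csv')"]
    print("Found CSV path:", run_and_find_csv(demo))
```

Reading the pipe line by line is what lets the output appear while the profiler is still running, instead of all at once when it exits.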
📊 Performance Metrics
The tool extracts the following key metrics:
- DEVICE KERNEL DURATION [ns]: Total time spent in device kernels
- CSV Path: Location of the detailed profiling results
- Real-time Progress: Live output during profiling
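To make the headline metric concrete, here is a small sketch that totals the DEVICE KERNEL DURATION [ns] column of a report. The requirements list pandas, so the tool presumably reads the file with it; this sketch swaps in the standard-library csv module so it runs without extra dependencies. The function name `total_kernel_duration_ns`, the OP CODE column, and the sample rows are made up for illustration.

```python
import csv
import io

COLUMN = "DEVICE KERNEL DURATION [ns]"


def total_kernel_duration_ns(csv_file) -> float:
    """Sum the duration column, skipping rows where it is blank or missing."""
    total = 0.0
    for row in csv.DictReader(csv_file):
        value = (row.get(COLUMN) or "").strip()
        if value:
            total += float(value)
    return total


# Made-up rows for illustration only; real reports have many more columns.
sample = io.StringIO(
    "OP CODE,DEVICE KERNEL DURATION [ns]\n"
    "add,1200\n"
    "relu,\n"
    "matmul,345.5\n"
)
print(f"{COLUMN} total: {total_kernel_duration_ns(sample):.2f} ns")
```

Skipping blank cells is a design choice in this sketch: rows without a recorded kernel duration contribute nothing to the total rather than raising an error.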
⚙️ Configuration Options
Shape Configuration
- Format: Comma-separated integers (e.g., 1,1,32,32)
- Default: (1, 1, 1024, 1024)
- Example: --shape 2,3,64,128
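A comma-separated shape string has to become a tuple of integers before it can describe a tensor. The sketch below shows one way to do that conversion. `parse_shape` and `DEFAULT_SHAPE` are hypothetical names, and rejecting non-positive dimensions is an assumption, not documented ttperf behavior.

```python
from typing import Tuple

DEFAULT_SHAPE = (1, 1, 1024, 1024)  # default taken from the list above


def parse_shape(text: str) -> Tuple[int, ...]:
    """Turn '1,1,32,32' into (1, 1, 32, 32), rejecting malformed input."""
    parts = [p.strip() for p in text.split(",")]
    if any(not p for p in parts):
        raise ValueError(f"empty dimension in shape: {text!r}")
    try:
        dims = tuple(int(p) for p in parts)
    except ValueError:
        raise ValueError(f"shape must be comma-separated integers: {text!r}") from None
    # Assumption: every dimension must be positive.
    if any(d <= 0 for d in dims):
        raise ValueError(f"shape dimensions must be positive: {text!r}")
    return dims


print(parse_shape("2,3,64,128"))
```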
Data Type Configuration
- Valid Options: bfloat16, float32, int32
- Default: bfloat16
- Example: --dtype float32
Layout Configuration
- Valid Options: tile, row_major
- Default: tile
- Example: --layout row_major
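Options with a fixed set of valid values map naturally onto argparse choices. The following sketch wires up the three options with the defaults and valid values listed in this section. It is only an illustration of the pattern: the parser name ttperf-sketch and the helper `build_parser` are hypothetical, and ttperf's real argument handling may be organized differently.

```python
import argparse

DTYPES = ("bfloat16", "float32", "int32")
LAYOUTS = ("tile", "row_major")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that rejects values outside the documented options."""
    parser = argparse.ArgumentParser(prog="ttperf-sketch")
    parser.add_argument("operation", help="operation name, e.g. add")
    parser.add_argument("--shape", default="1,1,1024,1024",
                        help="comma-separated integers")
    parser.add_argument("--dtype", choices=DTYPES, default="bfloat16")
    parser.add_argument("--layout", choices=LAYOUTS, default="tile")
    return parser


args = build_parser().parse_args(["relu", "--dtype", "float32", "--layout", "row_major"])
print(args.operation, args.shape, args.dtype, args.layout)
```

With choices in place, an unsupported value such as --dtype float64 makes argparse print a usage error and exit before any profiling starts.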
🔧 Requirements
- Python 3.8+
- pandas
- Tenstorrent's TT-Metal development environment
- pytest
📁 Project Structure
ttperf/
├── ttperf.py # Main CLI implementation
├── pyproject.toml # Project configuration
├── README.md # This file
└── .gitignore # Git ignore rules
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
📝 License
This project is licensed under the MIT License - see the LICENSE file for details.
⚠️ Disclaimer
This tool is an independent utility that interfaces with Tenstorrent's TT-Metal profiling tools. It is not affiliated with or endorsed by Tenstorrent Inc. The tool serves as a convenience wrapper around existing TT-Metal profiling infrastructure.
🐛 Issues
If you encounter any issues, please create an issue on GitHub.
👨‍💻 Author
Aswin Z
- GitHub: @Aswintechie
- Portfolio: aswinlocal.in
🌟 Acknowledgments
- Tenstorrent's TT-Metal development team for the profiling tools
- Python community for excellent libraries like pandas