XProf (+ Tensorboard Profiler Plugin)

An open, scalable, and extensible profiler for the modern ML stack.

About

XProf offers a number of tools to analyse and visualize the performance of your model across multiple devices. Some of the tools include:

Overview Page

A high-level overview of the performance of your model. This is an aggregated overview for your host and all devices. It includes:
  • Performance summary and breakdown of step times.
  • A graph of individual step times.
  • High level details of the run environment.

Trace Viewer

Displays a timeline of the execution of your model that shows:
  • The duration of each op.
  • Which part of the system (host or device) executed an op.
  • The communication between devices.

Memory Profile

Monitors the memory usage of your model.

Graph Viewer

A visualization of the graph structure of HLOs of your model.

To learn more about the various XProf tools, check out the XProf documentation.

[!TIP] New to profiling? Come and check out this Colab Demo.

Installation

To get the most recent release version of XProf, install it via pip:

$ pip install xprof

[!NOTE] For Python 3.12+ users, if you encounter ModuleNotFoundError: No module named 'pkg_resources', install an older version of setuptools:

pip install "setuptools<70"
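
Before downgrading setuptools, it may help to confirm that pkg_resources is really missing from the active environment. The following is a minimal sketch using only the Python standard library; the helper name has_pkg_resources is hypothetical and not part of XProf.

```python
import importlib.util


def has_pkg_resources() -> bool:
    """Return True if the pkg_resources module can be found in this environment."""
    return importlib.util.find_spec("pkg_resources") is not None


if __name__ == "__main__":
    if has_pkg_resources():
        print("pkg_resources is available")
    else:
        # Mirrors the workaround from the note above.
        print('pkg_resources is missing; try: pip install "setuptools<70"')
```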

Alternative installation options:

Installation with TensorBoard

$ pip install xprof tensorboard

Google Cloud

If you use Google Cloud to run your workloads, we recommend the xprofiler tool. It provides a streamlined profile collection and viewing experience using VMs running XProf.

Nightly Releases

Every night, a nightly version of the package is released under the name of xprof-nightly. This package contains the latest changes made by the XProf developers.

To install the nightly version of XProf, first remove any existing release packages:

$ pip uninstall xprof tensorboard-plugin-profile
$ pip install xprof-nightly

Build from Source

If the pip packages don't work, you can build XProf from source using Bazel.

1. Set up Bazel

Bazel is the build system used for XProf. Bazelisk is a wrapper for Bazel that simplifies Bazel version management. Download the appropriate .deb package for your system from the Bazelisk releases page and install the downloaded package:

sudo apt install ~/Downloads/bazelisk-amd64.deb

2. Obtain the Repository

Clone the XProf GitHub repository to your local machine:

git clone https://github.com/openxla/xprof.git
cd xprof

3. Build the Project

Build the pip Package: Use Bazel to build the XProf pip package:

bazel run --config=public_cache plugin:build_pip_package

Navigate to the Bazel Output Directory and install:

cd /tmp/profile-pip
pip install .

Usage

[!IMPORTANT] XProf requires access to the Internet to load the Google Chart library. Some charts and tables may be missing if you run XProf entirely offline on your local machine, behind a corporate firewall, or in a datacenter.

Standalone

If you have profile data in a directory (e.g., profiler/demo), you can view it by running:

$ xprof --logdir=profiler/demo --port=6006

Or with the optional command name:

$ xprof server --logdir=profiler/demo --port=6006

With TensorBoard

If you have TensorBoard installed, you can run:

$ tensorboard --logdir=profiler/demo

If you are behind a corporate firewall, you may need to include the --bind_all TensorBoard flag.

Open localhost:6006/#profile in your browser. You should see the demo overview page. Congratulations! You're now ready to capture a profile.

Command-Line Arguments

When launching the XProf server from the command line, you can use the following arguments:

| Command | Shorthand | Default | Description |
|---|---|---|---|
| --logdir <path> | -l <path> | | The directory containing XProf profile data (files ending in .xplane.pb). If provided, XProf will load and display profiles from this directory. If omitted, XProf will start without loading any profiles. [1] |
| --port <port> | -p <port> | 8791 | The port for the XProf web server. |
| --grpc_port <port> | -gp <port> | 50051 | The port for the gRPC server used for distributed processing. This must be different from --port. |
| --worker_service_address <addresses> | -wsa <addresses> | 0.0.0.0:<grpc_port> | A comma-separated list of worker addresses (e.g., host1:50051,host2:50051) for distributed processing. |
| --hide_capture_profile_button | -hcpb | N/A | If set, hides the 'Capture Profile' button in the UI. |

[1] You can dynamically load profiles using session_path or run_path URL parameters, as described in the Log Directory Structure section.
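
If you launch XProf from a script, a small wrapper can catch the port conflict described in the table before the server starts. This is a sketch only: the function xprof_server_command and its parameter names are hypothetical, and it uses just the flags and defaults listed above.

```python
import shlex


def xprof_server_command(logdir=None, port=8791, grpc_port=50051,
                         hide_capture_button=False):
    """Build the argv list for `xprof server` from the documented flags."""
    # The table states that --grpc_port must differ from --port.
    if port == grpc_port:
        raise ValueError("--grpc_port must be different from --port")
    argv = ["xprof", "server"]
    if logdir is not None:
        argv.append(f"--logdir={logdir}")
    argv += [f"--port={port}", f"--grpc_port={grpc_port}"]
    if hide_capture_button:
        argv.append("--hide_capture_profile_button")
    return argv


if __name__ == "__main__":
    # Print a copy-pasteable command line; pass the list to subprocess.run to launch it.
    print(shlex.join(xprof_server_command(logdir="profiler/demo", port=6006)))
```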

Log Directory Structure

When using XProf, profile data must be placed in a specific directory structure. XProf expects .xplane.pb files to be in the following path:

<log_dir>/plugins/profile/<session_name>/
  • <log_dir>: This is the root directory that you supply to tensorboard --logdir.
  • plugins/profile/: This is a required subdirectory.
  • <session_name>/: Each subdirectory inside plugins/profile/ represents a single profiling session. The name of this directory will appear in the TensorBoard UI dropdown to select the session.

Example:

If your log directory is structured like this:

/path/to/your/log_dir/
└── plugins/
    └── profile/
        ├── my_experiment_run_1/
        │   └── host0.xplane.pb
        └── benchmark_20251107/
            └── host1.xplane.pb

You would launch TensorBoard with:

tensorboard --logdir /path/to/your/log_dir/

The runs my_experiment_run_1 and benchmark_20251107 will be available in the "Sessions" tab of the UI.
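
If your .xplane.pb files are not yet in this layout, a short script can arrange them. The sketch below uses only the Python standard library; add_session and list_sessions are hypothetical helper names, and the placeholder file it writes is empty, not a real profile.

```python
import shutil
import tempfile
from pathlib import Path


def add_session(log_dir, session_name, xplane_files):
    """Copy .xplane.pb files into <log_dir>/plugins/profile/<session_name>/."""
    session_dir = Path(log_dir) / "plugins" / "profile" / session_name
    session_dir.mkdir(parents=True, exist_ok=True)
    for src in xplane_files:
        shutil.copy2(src, session_dir / Path(src).name)
    return session_dir


def list_sessions(log_dir):
    """Return the names of session directories holding at least one .xplane.pb file."""
    profile_root = Path(log_dir) / "plugins" / "profile"
    if not profile_root.is_dir():
        return []
    return sorted(
        d.name for d in profile_root.iterdir()
        if d.is_dir() and any(d.glob("*.xplane.pb"))
    )


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "host0.xplane.pb"
        src.write_bytes(b"")  # empty placeholder standing in for a captured profile
        log_dir = Path(tmp) / "log_dir"
        add_session(log_dir, "my_experiment_run_1", [src])
        print(list_sessions(log_dir))
```

The names returned by list_sessions should correspond to the session directories that appear in the UI, assuming the files are valid profiles.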

You can also dynamically load sessions from a GCS bucket or local filesystem by passing URL parameters when loading XProf in your browser. This method works whether or not you provided a logdir at startup and is useful for viewing profiles from various locations without restarting XProf.

For example, if you start XProf with no log directory:

xprof server

You can load sessions using the following URL parameters.

Assume you have profile data stored on GCS or locally, structured like this:

gs://your-bucket/profile_runs/
├── my_experiment_run_1/
│   ├── host0.xplane.pb
│   └── host1.xplane.pb
└── benchmark_20251107/
    └── host0.xplane.pb

There are two URL parameters you can use:

  • session_path: Use this to load a single session directly. The path should point to a directory containing .xplane.pb files for one session.

    • GCS Example: http://localhost:8791/?session_path=gs://your-bucket/profile_runs/my_experiment_run_1
    • Local Path Example: http://localhost:8791/?session_path=/path/to/profile_runs/my_experiment_run_1
    • Result: XProf will load the my_experiment_run_1 session, and you will see its data in the UI.
  • run_path: Use this to point to a directory that contains multiple session directories.

    • GCS Example: http://localhost:8791/?run_path=gs://your-bucket/profile_runs/
    • Local Path Example: http://localhost:8791/?run_path=/path/to/profile_runs/
    • Result: XProf will list all session directories found under run_path (i.e., my_experiment_run_1 and benchmark_20251107) in the "Sessions" dropdown in the UI, allowing you to switch between them.
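
When sharing links to sessions, it can be convenient to generate these URLs programmatically. The following sketch assumes XProf is served at http://localhost:8791/ (the default port from the arguments table); the helper xprof_url is hypothetical. It keeps ":" and "/" unescaped so the output matches the examples above.

```python
from urllib.parse import urlencode


def xprof_url(base="http://localhost:8791/", session_path=None, run_path=None):
    """Return an XProf URL carrying exactly one of session_path or run_path."""
    if (session_path is None) == (run_path is None):
        raise ValueError("pass exactly one of session_path or run_path")
    if session_path is not None:
        key, value = "session_path", session_path
    else:
        key, value = "run_path", run_path
    # safe=":/" leaves gs:// and local paths readable, as in the examples above.
    return f"{base}?{urlencode({key: value}, safe=':/')}"


if __name__ == "__main__":
    print(xprof_url(session_path="gs://your-bucket/profile_runs/my_experiment_run_1"))
    print(xprof_url(run_path="/path/to/profile_runs/"))
```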

Loading Precedence

If multiple sources are provided, XProf uses the following order of precedence to determine which profiles to load:

  1. session_path URL parameter
  2. run_path URL parameter
  3. logdir command-line argument
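
The precedence order can be expressed as a simple first-match lookup. This is an illustration of the rule as stated here, not XProf's actual implementation; resolve_profile_source is a hypothetical name.

```python
def resolve_profile_source(session_path=None, run_path=None, logdir=None):
    """Return (source_name, value) for the highest-precedence source that is set."""
    candidates = (
        ("session_path", session_path),  # 1. URL parameter
        ("run_path", run_path),          # 2. URL parameter
        ("logdir", logdir),              # 3. command-line argument
    )
    for name, value in candidates:
        if value:
            return name, value
    return None  # nothing provided: no profiles to load


if __name__ == "__main__":
    print(resolve_profile_source(run_path="gs://your-bucket/profile_runs/",
                                 logdir="profiler/demo"))
```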

Distributed Profiling

[!WARNING] Currently, distributed processing only benefits the following tools: overview_page, framework_op_stats, input_pipeline, and pod_viewer.

XProf supports distributed profile processing by using an aggregator that distributes work to multiple XProf workers. This is useful for processing large profiles or handling multiple users.

[!NOTE] The ports used in these examples (6006 for the aggregator HTTP server, 9999 for the worker HTTP server, and 50051 for the worker gRPC server) are suggestions and can be customized.

Worker Node

Each worker node should run XProf with a gRPC port exposed so it can receive processing requests. You should also hide the capture button as workers are not meant to be interacted with directly.

$ xprof server --grpc_port=50051 --port=9999 --hide_capture_profile_button

Aggregator Node

The aggregator node runs XProf with the --worker_service_address flag pointing to all available workers. Users interact with the aggregator node's UI.

$ xprof server --worker_service_address=<worker1_ip>:50051,<worker2_ip>:50051 --port=6006 --logdir=profiler/demo

Replace <worker1_ip>, <worker2_ip> with the addresses of your worker machines. Requests sent to the aggregator on port 6006 will be distributed among the workers for processing.
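
For deployments with many workers, the worker and aggregator commands can be generated from a list of hosts. This is a sketch under stated assumptions: the function names are hypothetical, the IP addresses are placeholders, and the ports are the suggested values from the note above.

```python
import shlex


def worker_command(grpc_port=50051, port=9999):
    """Command for one worker node, with the capture button hidden."""
    return ["xprof", "server", f"--grpc_port={grpc_port}", f"--port={port}",
            "--hide_capture_profile_button"]


def aggregator_command(worker_hosts, logdir, grpc_port=50051, port=6006):
    """Command for the aggregator node, listing every worker's gRPC address."""
    if not worker_hosts:
        raise ValueError("at least one worker host is required")
    # --worker_service_address takes a comma-separated list of host:port pairs.
    addresses = ",".join(f"{host}:{grpc_port}" for host in worker_hosts)
    return ["xprof", "server", f"--worker_service_address={addresses}",
            f"--port={port}", f"--logdir={logdir}"]


if __name__ == "__main__":
    hosts = ["10.0.0.1", "10.0.0.2"]  # placeholder worker addresses
    print(shlex.join(worker_command()))
    print(shlex.join(aggregator_command(hosts, "profiler/demo")))
```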

To deploy a distributed XProf setup in a Kubernetes environment, see the Kubernetes Deployment Guide.

Resources

Citing XProf

To cite XProf, please use the following BibTeX entry for the MLSys 2026 paper:

@inproceedings{1076558,
  title     = {XProf: An Open, Scalable and Extensible Profiling System for the Modern ML Stack},
  author    = {Robert Hundt and Naveen Kumar and Jose Baiocchi Paredes and Scott Goodson and Clive Verghese and Prasanna Rengasamy and Kelvin Le and Jiya Zhang and Charles Alaras and Yin Zhang and Kan Cai and Jiten Thakkar and Sai Ganesh Bandiatmakuri and Yogesh SY and Ani Udipi and Vikas Aggarwal},
  year      = {2026},
  booktitle = {Ninth Conference on Machine Learning and Systems}
}
