High-level support for reading and visualizing job information reported by the EAR Library.

ear-job-visualizer

ear-job-visualizer is a CLI tool written in Python that visualizes runtime data collected by the EAR software. It reads application and loop signatures produced by EAR and renders them as timeline graphs, making it easy to inspect per-node performance metrics across the duration of a job.

The tool can retrieve data in two ways:

  • From the EAR Database, by internally calling the eacct command when only a job ID is provided.
  • From signature files, by reading application and loop signature CSV files directly. These files can be generated by the EAR CSV report plug-in at job runtime, or exported from the database using eacct.

This tool is compatible with EAR v6. The major version of the tool must match the major version of EAR you are collecting data from. See the compatibility section if you are working with data from an older EAR version.

The tool currently supports two output formats:

  1. Static images — heatmap-based timeline graphs showing job runtime metrics per node.
  2. Paraver traces — job data converted to Paraver Trace Format, for use with Paraver and other BSC Tools.

You can find more information about running jobs with EAR and about Paraver in their respective documentation.

Features

  • Generate static images showing runtime metrics of your job monitored by EARL.
  • Generate Paraver traces to visualize runtime metrics within Paraver or any other tool from the BSC Tools team.
  • Normalize CSV files from older EAR versions to the current format using ear-normalize-csv.

Requirements

Python package dependencies are installed automatically when installing with pip. The eacct command is only needed if you intend to let the tool query the EAR Database directly; it is not required when providing signature files manually. See Providing signature files for details.

Installation

Recommended: install from PyPI

The simplest way to install the tool is directly from PyPI:

pip install ear-job-visualization

We recommend installing inside a virtual environment:

python -m venv my_env && source my_env/bin/activate
pip install ear-job-visualization

Install from source

You need the build and setuptools packages to build and install from source. Clone the repository (or download from the latest release), then run:

python -m venv my_env && source my_env/bin/activate
pip install -U pip && pip install build setuptools wheel
python -m build
pip install .

Tool developers may want to use pip install -e . instead of pip install . to install in editable mode, avoiding reinstalls on every change.

Making the tool available to other users

On shared HPC systems, you can install the tool to a shared location and expose it via a module file. One approach:

  1. Export PYTHONUSERBASE to set a shared installation prefix.
  2. Prepend <prefix>/lib/python<version>/site-packages to PYTHONPATH.
  3. Prepend <prefix>/bin to PATH.

The script create_module.py included in this repository generates an Lmod modulefile automatically. Pass --prefix with the installation prefix (e.g., your virtual environment root); the version is read from the installed package metadata by default:

python create_module.py --prefix /path/to/venv

The script writes the modulefile to ear-job-visualizer/<version>.lua by default. You can override the output path with --output, and the Python version with --python-version if needed. The generated file looks like:

-- -*- lua -*-
-- Lmod modulefile for ear-job-visualizer <version>

whatis("Name:        ear-job-visualizer")
whatis("Version:     <version>")
whatis("Description: Visualisation tool for performance metrics collected by EAR.")

local prefix      = "/path/to/venv"
local python_ver  = "<python-version>"

prepend_path("PATH",       pathJoin(prefix, "bin"))
prepend_path("PYTHONPATH", pathJoin(prefix, "lib", "python" .. python_ver, "site-packages"))

Place the generated file in a directory on your module path and load it with module load ear-job-visualizer.

EAR version compatibility

ear-job-visualizer v6 requires data from EAR v6. The major version of the tool must always match the major version of EAR you collected data from.

Using v5 CSV files

EAR v6 renamed two columns in the application-level CSV:

EAR v5 column   EAR v6 column
START_TIME      JOB_START_TIME
END_TIME        JOB_END_TIME

To use v5 CSV files without modifying them, export the default configuration and update the app_info section to use the old column names:

ear-job-visualizer --print-config > my_config.json

Edit the app_info section in my_config.json:

"app_info": {
    "job_id": "JOBID",
    "step_id": "STEPID",
    "app_id": "APPID",
    "start_time": "START_TIME",
    "end_time": "END_TIME",
    "node_name": "NODENAME"
}

Then pass it with -c my_config.json when invoking the tool.
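
Alternatively, if you prefer to rewrite the files rather than override the configuration, the change amounts to a header substitution. A minimal sketch using only the standard library (illustrative; the normalize_v5_header helper is not part of the tool):

```python
import csv
import io

# EAR v5 -> EAR v6 application-CSV column renames (from the table above).
RENAMES = {"START_TIME": "JOB_START_TIME", "END_TIME": "JOB_END_TIME"}

def normalize_v5_header(csv_text: str) -> str:
    """Return the CSV text with v5 header columns renamed to their v6 names."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    rows[0] = [RENAMES.get(col, col) for col in rows[0]]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()
```

Data rows are left untouched; only the header line changes.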

Using EAR v4.3 CSV files

EAR v4.3 CSVs differ substantially from the current format: they use different application identifiers, are missing the APPID column, and have different column names for several metrics. Use the ear-normalize-csv command to convert them before visualizing. See the dedicated section below.

Usage

Running ear-job-visualizer without arguments shows:

usage: ear-job-visualizer [-h] [--version] [-c CONFIG_FILE]
                          (--format {runtime,ear2prv} | --print-config | --avail-metrics)
                          [--loops-file LOOPS_FILE] [--apps-file APPS_FILE]
                          [-j JOB_ID] [-s STEP_ID] [-o OUTPUT] [-k] [-t TITLE]
                          [-r] [-m metric [metric ...]]
ear-job-visualizer: error: one of the arguments --format --print-config --avail-metrics is required

One of --format, --print-config, or --avail-metrics is always required. The most commonly used is --format, but reading this document in order is recommended for new users.

--print-config

Prints the active configuration to stdout. Redirect it to a file to use as a starting point for a custom configuration:

ear-job-visualizer --print-config > my_config.json

--avail-metrics

Lists all metric names recognized by the tool. These are read from the configuration file:

ear-job-visualizer --avail-metrics

To check metrics in a custom configuration:

ear-job-visualizer --avail-metrics -c my_config.json

--format

Requests a plotting or conversion operation. The two available formats are runtime and ear2prv.

--job-id is required with both formats to identify which job to process. --step-id is also required for runtime, but optional for ear2prv, since ear2prv can include multiple steps and applications (e.g., a full workflow) in a single trace.

Querying the EAR Database directly

When no signature files are provided, the tool calls eacct internally to retrieve the job signatures from the EAR Database. Temporary CSV files are created during the process and removed at the end; use --keep-csv to retain them.

Make sure eacct is on your PATH and the EAR_ETC environment variable is set correctly. Loading the ear module typically handles this. If you encounter issues, contact your system administrator to verify EAR Database access.

ear-job-visualizer --format runtime --job-id <job-id> --step-id <step-id> -m gflops dc_power

Providing signature files

If you already have signature files — or are working on a system without EAR Database access — you can pass them directly with --loops-file and --apps-file. Both options must be provided together.

Signature files can be obtained in two ways:

  • Exported from the database: eacct -j <jobid>[.stepid] -r -c <loops_file> and eacct -j <jobid>[.stepid] -l -c <apps_file>.
  • Generated automatically at runtime by the EARL CSV report plug-in, by setting --ear-user-db=<prefix> when submitting your job.

When using the CSV report plug-in, one file per compute node is generated (e.g., <prefix>_<nodename>_loops.csv). For multi-node jobs, you can collect all files into directories and pass those directories to the tool:

mkdir apps_dir && mv *_apps.csv apps_dir
mkdir loops_dir && mv *_loops.csv loops_dir

ear-job-visualizer --format <format> --job-id <job-id> \
    --loops-file loops_dir --apps-file apps_dir <format-specific-options>

runtime format

Generates a heatmap-based timeline figure for each metric listed in the --metrics argument. Each figure shows one row per node, all sharing the same time axis, which makes it straightforward to compare behaviour across the cluster.

Requires both --job-id and --step-id, as it only supports a single job step at a time.

ear-job-visualizer --format runtime --job-id <job-id> --step-id <step-id> -m io_mbs gflops perc_mpi

Use --avail-metrics to list all supported metric names.

The above command generates the following figures:

An example of OpenRadioss GFLOPS across the execution time.

An example of OpenRadioss I/O rate across the execution time.

An example of OpenRadioss %MPI rate across the execution time.

GPU metrics

When GPU metrics are requested, the graph shows per-GPU data. GPUs with a constant zero value throughout the execution are automatically filtered out.
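
The filtering rule can be pictured as a simple predicate over each GPU's series (an illustrative sketch, not the tool's actual code):

```python
def filter_idle_gpus(gpu_series: dict[int, list[float]]) -> dict[int, list[float]]:
    """Drop GPUs whose metric is zero for the whole execution."""
    return {gpu: values for gpu, values in gpu_series.items() if any(values)}

# GPU 1 never reports a non-zero value, so it would be omitted from the graph.
series = {0: [102.0, 118.5, 110.3], 1: [0.0, 0.0, 0.0]}
```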

ear-job-visualizer --format runtime --job-id 69478 --step-id 0 \
    --loops-file /examples/runtime_format/69478_loops.csv \
    --apps-file /examples/runtime_format/69478_apps.csv \
    -m gpu_util gpu_power -o 69478.0.png

The above command generates the following figures:

An example of GPU utilization of a single node application using just one GPU device.

An example of GPU power consumption of a single node application using just one GPU device.

Starting from EAR 6.0, signature files can include extended GPU profiling metrics from the NVIDIA® Data Center GPU Manager (DCGM) and the NVIDIA Management Library (NVML) GPM interface, including SM activity, tensor/FP64/FP32/FP16 engine activity, memory bandwidth utilization, NVLink and PCIe bandwidth, and more.

Note: DCGM metrics are not collected by default. They must be explicitly enabled via EAR environment variables before submitting your job. See the Extended GPU metrics section of the EAR documentation for details.

These metrics are available in the default tool configuration once collected. Run --avail-metrics to see the full list.

Colormap range

By default the colormap is scaled to the data range found in the input, across all nodes and GPUs. Pass --manual-range to use the fixed ranges defined for each metric in the configuration file instead.

ear2prv format

Converts EARL signature data to Paraver Trace Format. All metrics present in the input are included in the trace. --step-id is optional, so a single trace can cover multiple steps or an entire workflow.

The mapping to Paraver Trace Format is:

  • Node-level data → Paraver task level (Thread 1).
  • GPU data → Paraver thread level.

Two example Paraver configuration files are provided to get started with the output.

ear-normalize-csv

This command converts EAR v4.3 CSV files to the format expected by the current version of the tool. EAR v4.3 CSVs differ substantially from the current format — they use different application identifiers, are missing the APPID column, and use different names for several metric columns. The normalizer handles all of these differences automatically.

ear-normalize-csv --apps-file <apps.csv> --loops-file <loops.csv> [--output-dir <dir>]

Option              Description
-a / --apps-file    Path to the EAR v4.3 apps CSV file (required)
-l / --loops-file   Path to the EAR v4.3 loops CSV file (required)
-o / --output-dir   Directory for the output files (default: current directory)

The command writes two normalized files — normalized_loops_<original_name> and normalized_apps_<original_name> — and prints a ready-to-use ear-job-visualizer invocation.

Example:

ear-normalize-csv -a job_apps.csv -l job_loops.csv -o normalized/

# Then use the printed command, e.g.:
ear-job-visualizer --format runtime --job-id <job-id> --step-id <step-id> \
    --apps-file normalized/normalized_apps_job_apps.csv \
    --loops-file normalized/normalized_loops_job_loops.csv \
    -m gflops dc_power

Configuration

The tool ships with a default config.json. Export it, modify it, and pass it with -c / --config-file:

ear-job-visualizer --print-config > my_config.json
ear-job-visualizer --format runtime ... -c my_config.json

The configuration file has four top-level sections:

runtime

Controls static image generation. Contains four subsections:

metrics — node-level metrics. Each entry key is the short name used with -m:

"gflops": {
    "column_name": "^GFLOPS",
    "range": [3, 60],
    "step": 1,
    "display_name": "CPU GFLOPS"
}
  • column_name: exact column name or regular expression matching column(s) in the CSV.
  • range: [min, max] used when --manual-range is active.
  • step: colormap tick spacing for --manual-range.
  • display_name (optional): label shown in the figure; defaults to the key name.
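
To illustrate how range and step interact under --manual-range (a hedged sketch; the actual tick logic lives inside the tool):

```python
def manual_range_ticks(lo: float, hi: float, step: float) -> list[float]:
    """Colormap tick positions: lo, lo + step, ... up to hi."""
    ticks = []
    t = lo
    while t <= hi:
        ticks.append(t)
        t += step
    return ticks

# For the gflops entry above: range [3, 60] with step 1 yields ticks 3, 4, ..., 60.
```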

gpu_metrics — GPU-level metrics, same structure as metrics. The column_name field uses a regex with a GPU index capture group (e.g., GPU(\\d)_POWER_W). Includes standard NVML metrics as well as the extended DCGM/NVML GPM metrics collected by EAR 6.0.
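
The GPU index capture group behaves like a standard Python regex group. A sketch of how such a pattern maps columns to device indices (the column list here is made up for illustration):

```python
import re

# Pattern as it would appear in the config, once JSON-decoded to a single backslash.
pattern = re.compile(r"GPU(\d)_POWER_W")

columns = ["NODENAME", "GPU0_POWER_W", "GPU1_POWER_W", "TIMESTAMP"]
per_gpu = {}
for col in columns:
    m = pattern.fullmatch(col)
    if m:
        per_gpu[int(m.group(1))] = col  # GPU index -> matching column name
```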

socket_metrics — socket-level metrics (currently: CPU socket temperature). Same structure; column_name uses a socket index capture group.

app_info — maps logical field names to the actual CSV column names. Edit this section to handle CSV files from different EAR versions (see EAR version compatibility):

"app_info": {
    "job_id": "JOBID",
    "step_id": "STEPID",
    "app_id": "APPID",
    "start_time": "JOB_START_TIME",
    "end_time": "JOB_END_TIME",
    "node_name": "NODENAME"
}

ear2prv

Defines which CSV columns are exported to the Paraver trace and their data types. The job subsection covers application-level columns; loop covers loop-level columns. Column names support regular expressions.

events and phases

Define human-readable labels for EARL internal state values (policy states, optimization accuracy, application phases) that appear in the Paraver trace.

Adding metrics not in the default configuration

Not all metrics collected by EAR 6.0 are included in the default config.json. To add one, export the default config, add an entry under runtime.metrics (or runtime.gpu_metrics for GPU columns), and pass the file with -c. The column_name value must match the exact column name or a regex pattern as it appears in the CSV.

For example, to add CPU instructions:

"instructions": {
    "column_name": "INSTRUCTIONS",
    "range": [0, 1e12],
    "step": 1e11,
    "display_name": "Instructions"
}

Contact

For any question or suggestion, contact support@ear.energy or open an issue in this repository.
