High-level support for reading and visualizing job information provided by the EAR Library.
ear-job-visualizer
ear-job-visualizer is a CLI tool written in Python that visualizes runtime data collected by the EAR software. It reads application and loop signatures produced by EAR and renders them as timeline graphs, making it easy to inspect per-node performance metrics across the duration of a job.
The tool can retrieve data in two ways:
- From the EAR Database, by internally calling the `eacct` command when only a job ID is provided.
- From signature files, by reading application and loop signature CSV files directly. These files can be generated by the EAR CSV report plug-in at job runtime, or exported from the database using `eacct`.
This tool is compatible with EAR v6. The major version of the tool must match the major version of EAR you are collecting data from. See the compatibility section if you are working with data from an older EAR version.
The tool currently supports two output formats:
- Static images — heatmap-based timeline graphs showing job runtime metrics per node.
- Paraver traces — job data converted to Paraver Trace Format, for use with Paraver and other BSC Tools.
You can find more information about running jobs with EAR and about Paraver in their respective documentation.
Features
- Generate static images showing runtime metrics of your job monitored by EARL.
- Generate Paraver traces to visualize runtime metrics within Paraver or any other tool from the BSC Tools team.
- Normalize CSV files from older EAR versions to the current format using `ear-normalize-csv`.
Requirements
Python package dependencies:
- pandas
- importlib_resources
- rich
- ear_analytics_core
These are installed automatically when using pip. The eacct command is only needed if you intend to let the tool query the EAR Database directly — it is not required when providing signature files manually. See Providing signature files for details.
Installation
Recommended: install from PyPI
The simplest way to install the tool is directly from PyPI:
pip install ear-job-visualization
We recommend installing inside a virtual environment:
python -m venv my_env && source my_env/bin/activate
pip install ear-job-visualization
Install from source
You need the build and setuptools packages to build and install from source. Clone the repository (or download from the latest release), then run:
python -m venv my_env && source my_env/bin/activate
pip install -U pip && pip install build setuptools wheel
python -m build
pip install .
Tool developers may want to use `pip install -e .` instead of `pip install .` to install in editable mode, avoiding reinstalls on every change.
Making the tool available to other users
On shared HPC systems, you can install the tool to a shared location and expose it via a module file. One approach:
- Export `PYTHONUSERBASE` to set a shared installation prefix.
- Prepend `<prefix>/lib/python<version>/site-packages` to `PYTHONPATH`.
- Prepend `<prefix>/bin` to `PATH`.
The script create_module.py included in this repository generates an Lmod modulefile automatically. Pass --prefix with the installation prefix (e.g., your virtual environment root); the version is read from the installed package metadata by default:
python create_module.py --prefix /path/to/venv
The script writes the modulefile to ear-job-visualizer/<version>.lua by default. You can override the output path with --output, and the Python version with --python-version if needed. The generated file looks like:
-- -*- lua -*-
-- Lmod modulefile for ear-job-visualizer <version>
whatis("Name: ear-job-visualizer")
whatis("Version: <version>")
whatis("Description: Visualisation tool for performance metrics collected by EAR.")
local prefix = "/path/to/venv"
local python_ver = "<python-version>"
prepend_path("PATH", pathJoin(prefix, "bin"))
prepend_path("PYTHONPATH", pathJoin(prefix, "lib", "python" .. python_ver, "site-packages"))
Place the generated file in a directory on your module path and load it with module load ear-job-visualizer.
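For illustration, the core of such a generator can be sketched in a few lines of Python. This is a simplified sketch, not the actual create_module.py (which, among other things, reads the version from the installed package metadata by default):

```python
from pathlib import Path

def write_modulefile(prefix: str, version: str, python_ver: str,
                     out_dir: str = "ear-job-visualizer") -> Path:
    """Render a minimal Lmod modulefile like the one shown above."""
    body = f"""-- -*- lua -*-
-- Lmod modulefile for ear-job-visualizer {version}
whatis("Name: ear-job-visualizer")
whatis("Version: {version}")
local prefix = "{prefix}"
local python_ver = "{python_ver}"
prepend_path("PATH", pathJoin(prefix, "bin"))
prepend_path("PYTHONPATH", pathJoin(prefix, "lib", "python" .. python_ver, "site-packages"))
"""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    path = out / f"{version}.lua"
    path.write_text(body)
    return path
```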
EAR version compatibility
ear-job-visualizer v6 requires data from EAR v6. The major version of the tool must always match the major version of EAR you collected data from.
Using v5 CSV files
EAR v6 renamed two columns in the application-level CSV:
| EAR v5 column | EAR v6 column |
|---|---|
| START_TIME | JOB_START_TIME |
| END_TIME | JOB_END_TIME |
To use v5 CSV files without modifying them, export the default configuration and update the app_info section to use the old column names:
ear-job-visualizer --print-config > my_config.json
Edit the app_info section in my_config.json:
"app_info": {
"job_id": "JOBID",
"step_id": "STEPID",
"app_id": "APPID",
"start_time": "START_TIME",
"end_time": "END_TIME",
"node_name": "NODENAME"
}
Then pass it with -c my_config.json when invoking the tool.
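If you script this workflow, the same edit can be applied programmatically. A minimal sketch, assuming the exported configuration nests app_info under the runtime section as described in the Configuration chapter (adjust the path if your exported file differs):

```python
import json

def use_v5_columns(config_path):
    """Rewrite app_info in place so the tool reads EAR v5 time columns."""
    with open(config_path) as fh:
        cfg = json.load(fh)
    app_info = cfg["runtime"]["app_info"]
    app_info["start_time"] = "START_TIME"
    app_info["end_time"] = "END_TIME"
    with open(config_path, "w") as fh:
        json.dump(cfg, fh, indent=2)
```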
Using EAR v4.3 CSV files
EAR v4.3 CSVs differ substantially from the current format: they use different application identifiers, are missing the APPID column, and have different column names for several metrics. Use the ear-normalize-csv command to convert them before visualizing. See the dedicated section below.
Usage
Running ear-job-visualizer without arguments shows:
usage: ear-job-visualizer [-h] [--version] [-c CONFIG_FILE]
(--format {runtime,ear2prv} | --print-config | --avail-metrics)
[--loops-file LOOPS_FILE] [--apps-file APPS_FILE]
[-j JOB_ID] [-s STEP_ID] [-o OUTPUT] [-k] [-t TITLE]
[-r] [-m metric [metric ...]]
ear-job-visualizer: error: one of the arguments --format --print-config --avail-metrics is required
One of --format, --print-config, or --avail-metrics is always required. The most commonly used is --format, but reading this document in order is recommended for new users.
--print-config
Prints the active configuration to stdout. Redirect it to a file to use as a starting point for a custom configuration:
ear-job-visualizer --print-config > my_config.json
--avail-metrics
Lists all metric names recognized by the tool. These are read from the configuration file:
ear-job-visualizer --avail-metrics
To check metrics in a custom configuration:
ear-job-visualizer --avail-metrics -c my_config.json
--format
Requests a plotting or conversion operation. The two available formats are runtime and ear2prv.
--job-id is required with both formats to identify which job to process. --step-id is also required for runtime, and optional for ear2prv — since ear2prv can include multiple steps and applications (e.g., a full workflow) in a single trace.
Querying the EAR Database directly
When no signature files are provided, the tool calls eacct internally to retrieve the job signatures from the EAR Database. Temporary CSV files are created during the process and removed at the end; use --keep-csv to retain them.
Make sure `eacct` is on your PATH and the `EAR_ETC` environment variable is set correctly. Loading the `ear` module typically handles this. If you encounter issues, contact your system administrator to verify EAR Database access.
ear-job-visualizer --format runtime --job-id <job-id> --step-id <step-id> -m gflops dc_power
Providing signature files
If you already have signature files — or are working on a system without EAR Database access — you can pass them directly with --loops-file and --apps-file. Both options must be provided together.
Signature files can be obtained in two ways:
- Exported from the database: `eacct -j <jobid>[.stepid] -r -c <loops_file>` and `eacct -j <jobid>[.stepid] -l -c <apps_file>`.
- Generated automatically at runtime by the EARL CSV report plug-in, by setting `--ear-user-db=<prefix>` when submitting your job.
When using the CSV report plug-in, one file per compute node is generated (e.g., <prefix>_<nodename>_loops.csv). For multi-node jobs, you can collect all files into directories and pass those directories to the tool:
mkdir apps_dir && mv *_apps.csv apps_dir
mkdir loops_dir && mv *_loops.csv loops_dir
ear-job-visualizer --format <format> --job-id <job-id> \
--loops-file loops_dir --apps-file apps_dir <format-specific-options>
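Before plotting, it can be handy to check that every compute node actually produced a loops file. A small stdlib-only sketch (assuming the default NODENAME column name):

```python
import csv
import glob

def nodes_in_dir(loops_dir, node_col="NODENAME"):
    """Collect the node names appearing across all per-node loop CSVs."""
    nodes = set()
    for path in glob.glob(f"{loops_dir}/*_loops.csv"):
        with open(path, newline="") as fh:
            for row in csv.DictReader(fh):
                if node_col in row:
                    nodes.add(row[node_col])
    return nodes
```

Compare the returned set against the node list reported by your scheduler to spot nodes whose files went missing.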
runtime format
Generates a heatmap-based timeline figure for each metric listed in the --metrics argument. Each figure shows one row per node, all sharing the same time axis, which makes it straightforward to compare behaviour across the cluster.
Requires both `--job-id` and `--step-id`, as it only supports a single job step at a time.
ear-job-visualizer --format runtime --job-id <job-id> --step-id <step-id> -m io_mbs gflops perc_mpi
Use `--avail-metrics` to list all supported metric names.
The command above generates one timeline figure per requested metric (io_mbs, gflops and perc_mpi in this example).
GPU metrics
When GPU metrics are requested, the graph shows per-GPU data. GPUs with a constant zero value throughout the execution are automatically filtered out.
ear-job-visualizer --format runtime --job-id 69478 --step-id 0 \
--loops-file /examples/runtime_format/69478_loops.csv \
--apps-file /examples/runtime_format/69478_apps.csv \
-m gpu_util gpu_power -o 69478.0.png
The command above generates per-GPU figures for the requested metrics (gpu_util and gpu_power).
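The zero-filtering idea is simple to sketch. The snippet below is an illustration of the concept, not the tool's actual implementation; it assumes GPU columns share a common name prefix:

```python
def nonzero_gpu_columns(rows, prefix="GPU"):
    """Keep only GPU columns that have at least one non-zero value.

    rows: an iterable of dicts mapping column names to string values,
    as produced by e.g. csv.DictReader.
    """
    keep = set()
    for row in rows:
        for col, val in row.items():
            if col.startswith(prefix) and float(val or 0) != 0.0:
                keep.add(col)
    return keep
```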
Starting from EAR 6.0, signature files can include extended GPU profiling metrics from the NVIDIA® Data Center GPU Manager (DCGM) and the NVIDIA Management Library (NVML) GPM interface, including SM activity, tensor/FP64/FP32/FP16 engine activity, memory bandwidth utilization, NVLink and PCIe bandwidth, and more.
Note: DCGM metrics are not collected by default. They must be explicitly enabled via EAR environment variables before submitting your job. See the Extended GPU metrics section of the EAR documentation for details.
These metrics are available in the default tool configuration once collected. Run --avail-metrics to see the full list.
Colormap range
By default the colormap is scaled to the data range found in the input, across all nodes and GPUs. Pass --manual-range to use the fixed ranges defined for each metric in the configuration file instead.
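The two modes amount to choosing the colormap limits differently; a minimal sketch of the idea:

```python
def colormap_limits(values, manual_range=None):
    """Pick (vmin, vmax): the metric's configured range, or the data range."""
    if manual_range is not None:
        return tuple(manual_range)  # fixed range from the configuration file
    return (min(values), max(values))  # default: scale to the data
```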
ear2prv format
Converts EARL signature data to Paraver Trace Format. All metrics present in the input are included in the trace. --step-id is optional, so a single trace can cover multiple steps or an entire workflow.
The mapping to Paraver Trace Format is:
- Node-level data → Paraver task level (Thread 1).
- GPU data → Paraver thread level.
Two example Paraver configuration files are provided to get started with the output.
ear-normalize-csv
This command converts EAR v4.3 CSV files to the format expected by the current version of the tool. EAR v4.3 CSVs differ substantially from the current format — they use different application identifiers, are missing the APPID column, and use different names for several metric columns. The normalizer handles all of these differences automatically.
ear-normalize-csv --apps-file <apps.csv> --loops-file <loops.csv> [--output-dir <dir>]
| Option | Description |
|---|---|
| `-a` / `--apps-file` | Path to the EAR v4.3 apps CSV file (required) |
| `-l` / `--loops-file` | Path to the EAR v4.3 loops CSV file (required) |
| `-o` / `--output-dir` | Directory for the output files (default: current directory) |
The command writes two normalized files — normalized_loops_<original_name> and normalized_apps_<original_name> — and prints a ready-to-use ear-job-visualizer invocation.
Example:
ear-normalize-csv -a job_apps.csv -l job_loops.csv -o normalized/
# Then use the printed command, e.g.:
ear-job-visualizer --format runtime --job-id <job-id> --step-id <step-id> \
--apps-file normalized/normalized_apps_job_apps.csv \
--loops-file normalized/normalized_loops_job_loops.csv \
-m gflops dc_power
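Conceptually, the normalizer renames columns and fills in the missing APPID field. The sketch below illustrates the idea with a caller-supplied rename mapping; the real command ships its own complete v4.3 mapping:

```python
def normalize_rows(rows, renames, app_id="unknown"):
    """Rename columns per `renames` and insert a missing APPID in each row.

    rows: iterable of dicts (e.g. from csv.DictReader).
    renames: dict mapping old column names to new ones.
    """
    out = []
    for row in rows:
        new = {renames.get(k, k): v for k, v in row.items()}
        new.setdefault("APPID", app_id)
        out.append(new)
    return out
```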
Configuration
The tool ships with a default config.json. Export it, modify it, and pass it with -c / --config-file:
ear-job-visualizer --print-config > my_config.json
ear-job-visualizer --format runtime ... -c my_config.json
The configuration file has four top-level sections:
runtime
Controls static image generation. Contains four subsections:
metrics — node-level metrics. Each entry key is the short name used with -m:
"gflops": {
"column_name": "^GFLOPS",
"range": [3, 60],
"step": 1,
"display_name": "CPU GFLOPS"
}
- `column_name`: exact column name or regular expression matching column(s) in the CSV.
- `range`: `[min, max]` used when `--manual-range` is active.
- `step`: colormap tick spacing for `--manual-range`.
- `display_name` (optional): label shown in the figure; defaults to the key name.
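The example pattern ^GFLOPS suggests standard regular-expression semantics. A sketch of how a column_name value might select CSV columns (an illustration, not the tool's actual matching code):

```python
import re

def matching_columns(pattern, columns):
    """Return CSV columns selected by a column_name value (exact or regex)."""
    rx = re.compile(pattern)
    return [c for c in columns if rx.search(c)]
```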
gpu_metrics — GPU-level metrics, same structure as metrics. The column_name field uses a regex with a GPU index capture group (e.g., GPU(\\d)_POWER_W). Includes standard NVML metrics as well as the extended DCGM/NVML GPM metrics collected by EAR 6.0.
socket_metrics — socket-level metrics (currently: CPU socket temperature). Same structure; column_name uses a socket index capture group.
app_info — maps logical field names to the actual CSV column names. Edit this section to handle CSV files from different EAR versions (see EAR version compatibility):
"app_info": {
"job_id": "JOBID",
"step_id": "STEPID",
"app_id": "APPID",
"start_time": "JOB_START_TIME",
"end_time": "JOB_END_TIME",
"node_name": "NODENAME"
}
ear2prv
Defines which CSV columns are exported to the Paraver trace and their data types. The job subsection covers application-level columns; loop covers loop-level columns. Column names support regular expressions.
events and phases
Define human-readable labels for EARL internal state values (policy states, optimization accuracy, application phases) that appear in the Paraver trace.
Adding metrics not in the default configuration
Not all metrics collected by EAR 6.0 are included in the default config.json. To add one, export the default config, add an entry under runtime.metrics (or runtime.gpu_metrics for GPU columns), and pass the file with -c. The column_name value must match the exact column name or a regex pattern as it appears in the CSV.
For example, to add CPU instructions:
"instructions": {
"column_name": "INSTRUCTIONS",
"range": [0, 1e12],
"step": 1e11,
"display_name": "Instructions"
}
Contact
For any question or suggestion, contact support@ear.energy or open an issue in this repository.