tensorboard-reducer

Reduce multiple TensorBoard runs to new event (or CSV) files

These details have not been verified by PyPI

Project links

Homepage

GitHub Statistics

View statistics for this project via Libraries.io, or by using our public dataset on Google BigQuery

Project description

TensorBoard Reducer

This project was inspired by tensorboard-aggregator (a similar project built for TensorFlow rather than PyTorch) and a Stack Overflow answer.

Compute reduced statistics (mean, std, min, max, median or any other numpy operation) across multiple TensorBoard runs matching a directory glob pattern. This is useful, for example, when training several identical models: aggregating their loss/accuracy/error curves reduces noise and helps establish whether a performance improvement is statistically significant. The aggregated results can be saved to disk either as new TensorBoard event files or as CSV.
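To make the idea concrete, here is a minimal sketch of what reducing several runs means for a single scalar tag. It uses plain NumPy rather than this package. The runs array and the reduce_runs helper are hypothetical and only illustrate looking up reduction ops by name.

```python
import numpy as np

# Hypothetical data: one tag recorded in 3 runs over 4 steps
# (rows = steps, columns = runs).
runs = np.array(
    [
        [1.0, 1.2, 0.8],
        [0.7, 0.9, 0.5],
        [0.5, 0.6, 0.4],
        [0.4, 0.4, 0.4],
    ]
)


def reduce_runs(values, reduce_ops):
    """Apply each named NumPy reduction across runs (axis 1), giving one value per step."""
    return {op: getattr(np, op)(values, axis=1) for op in reduce_ops}


reduced = reduce_runs(runs, ["mean", "std", "min", "max", "median"])
for op, curve in reduced.items():
    print(op, curve)
```

Each reduced curve has one value per step, so it can be plotted just like a single run.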

Requires PyTorch and TensorBoard. No TensorFlow installation required.

Installation

pip install tensorboard-reducer

Usage

Through CLI

tb-reducer -i 'glob_pattern/of_dirs_to_reduce*' -o output_dir -r mean,std,min,max
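Before running the command, it can help to check which directories a glob pattern actually matches. The sketch below uses only Python's standard glob module on a temporary, hypothetical directory layout (logs/run_1 etc. are made-up names). It does not call tb-reducer, and it assumes the tool resolves the pattern in a comparable way.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: three runs to reduce plus one unrelated directory.
    for name in ["run_1", "run_2", "run_3", "baseline"]:
        os.makedirs(os.path.join(root, "logs", name))

    # Only directories whose names start with "run_" match the pattern.
    pattern = os.path.join(root, "logs", "run_*")
    matched = sorted(os.path.basename(path) for path in glob.glob(pattern))

print(matched)
```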

Note: By default, TensorBoard Reducer expects event files containing identical tags and an equal number of steps for all scalars. If, for example, you trained one model for 300 epochs and another for 400 and/or added different tags, see the --lax-tags and --lax-steps flags to remove this restriction.

Mean of 3 TensorBoard logs

tb-reducer has the following flags:

-i/--indirs-glob (required): Glob pattern of the run directories to reduce.
-o/--outdir (required): Name of the directory to save the new reduced run data. If --format is tb-events, a separate directory will be created for each reduce op (mean, std, ...) suffixed by the op's name (outdir-mean, outdir-std, ...). If --format is csv, a single file will be created and outdir must be the full file path ending in .csv.
-f/--format (optional, default: tb-events): Output format of reduced TensorBoard runs. Use tb-events to write regular TensorBoard event files, or csv to write a CSV file. If csv, -o/--outdir must have a .csv extension and all reduction ops will be written to a single CSV file rather than to separate directories for each reduce op. Use pandas.read_csv("path/to/file.csv", header=[0, 1], index_col=0) to read the data back into memory as a multi-index dataframe.
-r/--reduce-ops (optional, default: mean): Comma-separated names of numpy reduction ops (mean, std, min, max, ...). Each reduction is written to a separate outdir suffixed by its op name, e.g. if outdir='my-new-run', the mean reduction will be written to my-new-run-mean.
-w/--overwrite (optional, default: False): Whether to overwrite existing output directories/CSV files.
--lax-tags (optional, default: False): Allow different runs to have different sets of tags. In this mode, each tag reduction will run over as many runs as are available for a given tag, even if that's just one. Proceed with caution as not all tags will have the same statistics in downstream analysis.
--lax-steps (optional, default: False): Allow tags across different runs to have unequal numbers of steps. In this mode, each reduction will only use as many steps as are available in the shortest run (same behavior as zip(short_list, long_list)).
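The behavior described for --lax-tags and --lax-steps can be sketched in plain Python. This is an illustration of the semantics only, not the package's implementation: the reduce_mean helper, its keyword arguments and the run dictionaries are all hypothetical, and runs are modeled as simple dicts mapping a tag to its per-step values.

```python
def reduce_mean(runs, lax_tags=False, lax_steps=False):
    """Average per-step values across runs given as {tag: [value per step]} dicts."""
    all_tags = set().union(*runs)
    common_tags = set.intersection(*(set(run) for run in runs))
    if not lax_tags and all_tags != common_tags:
        raise ValueError(f"tags differ across runs: {sorted(all_tags - common_tags)}")

    reduced = {}
    for tag in sorted(all_tags):
        # With lax tags, use every run that recorded this tag, even if only one did.
        curves = [run[tag] for run in runs if tag in run]
        if not lax_steps and len({len(curve) for curve in curves}) > 1:
            raise ValueError(f"unequal number of steps for tag {tag!r}")
        # zip() stops at the shortest curve, which truncates longer runs.
        reduced[tag] = [sum(step) / len(step) for step in zip(*curves)]
    return reduced


run_a = {"loss": [1.0, 0.5, 0.25], "accuracy": [0.5, 0.75, 1.0]}
run_b = {"loss": [3.0, 1.5]}

try:
    reduce_mean([run_a, run_b])
except ValueError as err:
    print(f"strict mode: {err}")

print(reduce_mean([run_a, run_b], lax_tags=True, lax_steps=True))
```

In the lax result, accuracy is "averaged" over a single run while loss is averaged over two runs and cut to two steps, which is why the statistics of different tags are no longer directly comparable.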

Through Python API

You can also import tensorboard_reducer into a Python script for more complex operations. The following simple example uses the full Python API (load_tb_events, reduce_events, write_csv, write_tb_events) to get you started:

from tensorboard_reducer import load_tb_events, reduce_events, write_csv, write_tb_events

in_dirs_glob = "glob_pattern/of_directories_to_reduce*"
out_dir = "path/to/output_dir"
out_csv = "path/to/out.csv"
overwrite = False
reduce_ops = ["mean", "min", "max"]

# Load all runs matching the glob pattern into a dict keyed by tag
events_dict = load_tb_events(in_dirs_glob)

# Each value is a 2D table with one row per step and one column per run
n_steps, n_events = list(events_dict.values())[0].shape
n_scalars = len(events_dict)

print(
    f"Loaded {n_events} TensorBoard runs with {n_scalars} scalars and {n_steps} steps each"
)
for tag in events_dict:
    print(f" - {tag}")

# Apply every reduce op across runs
reduced_events = reduce_events(events_dict, reduce_ops)

for op in reduce_ops:
    print(f"Writing '{op}' reduction to '{out_dir}-{op}'")

# Write one TensorBoard event directory per reduce op
write_tb_events(reduced_events, out_dir, overwrite)

# Write all reductions to a single CSV file
write_csv(reduced_events, out_csv, overwrite)
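The CSV output can be read back with the pandas.read_csv call mentioned in the flag descriptions above. The sketch below demonstrates what header=[0, 1] and index_col=0 do, using an in-memory stand-in for the file. The column layout (tag on the first header row, reduce op on the second) and the numbers are assumptions for illustration, so the file written by write_csv may be organized differently.

```python
import io

import pandas as pd

# Assumed layout: first column level is the tag, second is the reduce op.
columns = pd.MultiIndex.from_product([["loss", "accuracy"], ["mean", "std"]])
original = pd.DataFrame(
    [[1.0, 0.5, 0.25, 0.125], [0.5, 0.25, 0.75, 0.125]],
    columns=columns,
)

# Write to an in-memory buffer instead of a real file.
buffer = io.StringIO()
original.to_csv(buffer)
buffer.seek(0)

# header=[0, 1] rebuilds the two-level columns, index_col=0 restores the step index.
df = pd.read_csv(buffer, header=[0, 1], index_col=0)
print(df["loss"]["mean"])
```

Selecting df["loss"] returns all reductions for that tag, and df["loss"]["mean"] narrows it down to a single curve.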

Release history

0.3.1

Sep 21, 2023

0.3.0

Feb 20, 2023

0.2.10

Sep 4, 2022

0.2.9

Jul 6, 2022

0.2.8

Jun 29, 2022

0.2.7

Mar 16, 2022

0.2.5

Jan 24, 2022

0.2.4

Jan 8, 2022

0.2.3

Jun 9, 2021

0.2.2

Jun 6, 2021

0.2.1

Jun 1, 2021

This version

0.2.0

Jun 1, 2021

0.1.6

Apr 23, 2021

0.1.5

Apr 21, 2021

0.1.4

Apr 12, 2021

0.1.3

Apr 5, 2021

0.1.2

Apr 5, 2021

0.1.1

Apr 4, 2021

0.1.0

Apr 4, 2021

Download files


Source Distribution

tensorboard-reducer-0.2.0.tar.gz (12.6 kB)

Uploaded Jun 1, 2021 Source

Built Distribution

tensorboard_reducer-0.2.0-py2.py3-none-any.whl (12.6 kB)

Uploaded Jun 1, 2021 Python 2 Python 3

Hashes for tensorboard-reducer-0.2.0.tar.gz

Algorithm	Hash digest
SHA256	`f6e16047f67fe4d9439b77a1bf113c7410574644acc5db5681ab17464067a5e7`
MD5	`ee4f8fd0d278c6e83636cd8cb5745dc7`
BLAKE2b-256	`db927b52dbc3dcd452025e571ec5a904421d1155e5daa3720380354b5ffa274e`

Hashes for tensorboard_reducer-0.2.0-py2.py3-none-any.whl

Algorithm	Hash digest
SHA256	`abdf54345359f0f53e82d2b7b34b07f92582264d3e277516f0752f71511f5366`
MD5	`9c442565b7e21221f19928f8cf3443f8`
BLAKE2b-256	`364b7c025cdb9111b63f46abcaee8b2a7e26d7d4e2d940cec9f19165bf25368a`