I/O profiler for deep learning Python applications, built specifically for dlio_benchmark.

DFTracer

Overview

DFTracer is a tracing tool designed to capture both application-code and I/O-call level events from workflows. It provides a unified tracing interface, optimized trace format, and compression mechanism to enable efficient distributed analysis for large-scale AI-driven workloads.

Prerequisites

Requirements for DFTracer

  1. Python>=3.7
  2. pybind11

Requirements for DFAnalyzer

  1. bokeh>=2.4.2
  2. dask>=2023.5.0
  3. distributed
  4. matplotlib>=3.7.3
  5. numpy>=1.24.3
  6. pandas>=2.0.3
  7. pyarrow>=12.0.1
  8. pybind11
  9. python-intervals>=1.10.0.post1
  10. rich>=13.6.0
  11. seaborn>=0.13.2
  12. zindex_py
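
As a quick sanity check before installing DFAnalyzer, the sketch below reports which of the listed distributions are already present in the current environment. It is an illustration only: the helper names `installed_versions` and `report` are hypothetical, it relies on `importlib.metadata` (Python 3.8 or newer), and it prints installed versions without comparing them against the minimum versions listed above.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names copied from the DFAnalyzer requirements list above.
DFANALYZER_REQUIREMENTS = [
    "bokeh", "dask", "distributed", "matplotlib", "numpy", "pandas",
    "pyarrow", "pybind11", "python-intervals", "rich", "seaborn", "zindex_py",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def report(names=DFANALYZER_REQUIREMENTS):
    """Print one line per distribution: its version, or MISSING."""
    for name, ver in installed_versions(names).items():
        print(f"{name:20s} {ver if ver is not None else 'MISSING'}")
```

Calling `report()` prints one line per requirement, which makes it easy to spot what `pip install dftracer[dfanalyzer]` would still need to pull in.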

Installation

Users can easily install DFTracer using pip, the standard tool for installing Python packages. This method works for both native Python and Conda environments.

From PyPI

pip install dftracer
# Quote the extras so shells such as zsh do not expand the brackets.
pip install "dftracer[dfanalyzer]"

From GitHub

DFTRACER_VERSION=develop
pip install git+https://github.com/LLNL/dftracer.git@${DFTRACER_VERSION}
pip install "dftracer[dfanalyzer] @ git+https://github.com/LLNL/dftracer.git@${DFTRACER_VERSION}"

From Source

git clone git@github.com:LLNL/dftracer.git
cd dftracer
# Optional: check out a specific release. Skip this step to install the dev branch.
# For the latest stable version, use the master branch.
git checkout tags/<Release> -b <Release>
pip install .

For detailed build instructions, see the project documentation.

Usage

import os
from multiprocessing import get_context
from time import sleep

import numpy as np
from dftracer.logger import dftracer, dft_fn

# logfile and data_dir are None here, so DFTracer takes them from the
# DFTRACER_LOG_FILE and DFTRACER_DATA_DIR environment variables.
log_inst = dftracer.initialize_log(logfile=None, data_dir=None, process_id=-1)
dft_fn = dft_fn("COMPUTE")
cwd = os.getcwd()

# Example of using function decorators
@dft_fn.log
def log_events(index):
    sleep(1)

# Pool initializer; this minimal body is an assumption for illustration.
def init():
    print(f"Initializing process {os.getpid()}")

# Example of function spawning and implicit I/O calls
def posix_calls(val):
    index, is_spawn = val
    path = f"{cwd}/data/demofile{index}.txt"
    with open(path, "w+") as f:
        f.write("Now the file has more content!")
    if is_spawn:
        print(f"Calling spawn on {index} with pid {os.getpid()}")
        log_inst.finalize()  # This needs to be called to correctly finalize DFTracer.
    else:
        print(f"Not calling spawn on {index} with pid {os.getpid()}")

# np.savez issues POSIX calls internally.
def npz_calls(index):
    path = f"{cwd}/data/demofile{index}.npz"
    if os.path.exists(path):
        os.remove(path)
    records = np.random.randint(255, size=(8, 8, 1024), dtype=np.uint8)
    record_labels = [0] * 1024
    np.savez(path, x=records, y=record_labels)

def main():
    # Make sure the data directory exists before writing to it.
    os.makedirs(f"{cwd}/data", exist_ok=True)
    log_events(0)
    npz_calls(1)
    with get_context("spawn").Pool(1, initializer=init) as pool:
        pool.map(posix_calls, ((2, True),))
    log_inst.finalize()

if __name__ == "__main__":
    main()

In this example, dftracer.initialize_log is called without logfile or data_dir, so DFTRACER_LOG_FILE and DFTRACER_DATA_DIR must be set in the environment. By default, the DFTracer mode is set to FUNCTION. An example configuration is:

# The process id, app_name and .pfw will be appended by DFTracer for each app and process.
# The name of the final log file will be ~/log_file-<APP_NAME>-<PID>.pfw
DFTRACER_LOG_FILE=~/log_file
# Colon separated paths to include in the tracing
DFTRACER_DATA_DIR=/dev/shm/:/p/gpfs1/$USER/dataset:$PWD/data
# Enable DFTracer
DFTRACER_ENABLE=1
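
The same three variables can also be assembled from Python, for instance when a launcher script starts the traced application. The sketch below is a minimal illustration under stated assumptions: `dftracer_env` and `find_trace_files` are hypothetical helper names, not part of DFTracer, and the file lookup relies only on the `<prefix>-<APP_NAME>-<PID>.pfw` naming described in the comments above.

```python
import glob
import os


def dftracer_env(log_prefix, data_dirs, base_env=None):
    """Return a copy of the environment with the DFTracer variables set."""
    env = dict(os.environ if base_env is None else base_env)
    # "~" is not expanded inside environment values, so expand it here.
    env["DFTRACER_LOG_FILE"] = os.path.expanduser(log_prefix)
    # Colon-separated paths to include in the tracing.
    env["DFTRACER_DATA_DIR"] = ":".join(data_dirs)
    env["DFTRACER_ENABLE"] = "1"
    return env


def find_trace_files(log_prefix):
    """List files matching <prefix>-*.pfw, sorted by name."""
    pattern = glob.escape(os.path.expanduser(log_prefix)) + "-*.pfw"
    return sorted(glob.glob(pattern))
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when launching the traced script, and `find_trace_files("~/log_file")` can then be used to collect the per-process trace files after the run.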

For more examples, see the project documentation.
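
To give a feel for what function-level tracing with a decorator such as `@dft_fn.log` involves, here is a generic sketch of a timing decorator. It is not DFTracer's implementation and does not use DFTracer at all: the `log_event` decorator, the in-memory `events` list, and the event fields are all assumptions chosen for illustration.

```python
import functools
import time

# In-memory stand-in for a trace log (hypothetical; DFTracer writes trace files instead).
events = []


def log_event(category):
    """Return a decorator that records name, category, and duration of each call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Record the event even if the wrapped function raises.
                events.append({
                    "cat": category,
                    "name": func.__qualname__,
                    "dur": time.perf_counter() - start,
                })
        return wrapper
    return decorator


@log_event("COMPUTE")
def square(x):
    return x * x
```

Each call to `square` appends one record to `events`. A real tracer additionally has to capture I/O calls made beneath the function and write events to per-process files, which is the part DFTracer provides.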

Documentation

Citation and Reference

The original SC'24 paper describes the design and implementation of the DFTracer code. Please cite this paper and the code if you use DFTracer in your research.

@inproceedings{devarajan_dftracer_2024,
    address = {Atlanta, GA},
    title = {{DFTracer}: {An} {Analysis}-{Friendly} {Data} {Flow} {Tracer} for {AI}-{Driven} {Workflows}},
    shorttitle = {{DFTracer}},
    urldate = {2024-07-31},
    booktitle = {{SC24}: {International} {Conference} for {High} {Performance} {Computing}, {Networking}, {Storage} and {Analysis}},
    publisher = {IEEE},
    author = {Devarajan, Hariharan and Pottier, Loic and Velusamy, Kaushik and Zheng, Huihuo and Yildirim, Izzet and Kogiou, Olga and Yu, Weikuan and Kougkas, Anthony and Sun, Xian-He and Yeom, Jae Seung and Mohror, Kathryn},
    month = nov,
    year = {2024},
}

@misc{devarajan_dftracer_code_2024,
    type = {Github},
    title = {Github {DFTracer}},
    shorttitle = {{DFTracer}},
    url = {https://github.com/LLNL/dftracer.git},
    urldate = {2024-07-31},
    journal = {DFTracer: A multi-level dataflow tracer for capturing I/O calls from workflows.},
    author = {Devarajan, Hariharan and Pottier, Loic and Velusamy, Kaushik and Zheng, Huihuo and Yildirim, Izzet and Kogiou, Olga and Yu, Weikuan and Kougkas, Anthony and Sun, Xian-He and Yeom, Jae Seung and Mohror, Kathryn},
    month = jun,
    year = {2024},
}

Acknowledgments

This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344, and under the auspices of the National Cancer Institute (NCI) by Frederick National Laboratory for Cancer Research (FNLCR) under Contract 75N91019D00024. This research used resources of the Argonne Leadership Computing Facility, a U.S. Department of Energy (DOE) Office of Science user facility at Argonne National Laboratory, and is based on research supported by the U.S. DOE Office of Science-Advanced Scientific Computing Research Program under Contract No. DE-AC02-06CH11357, and by the Office of Advanced Scientific Computing Research under the DOE Early Career Research Program. This material is also based upon work partially supported by LLNL LDRD 23-ERD-045 and 24-SI-005. LLNL-CONF-857447.
