I/O profiler for deep learning Python applications, specifically for dlio_benchmark.

DFTracer

Overview

DFTracer is a tracing tool designed to capture both application-code and I/O-call level events from workflows. It provides a unified tracing interface, optimized trace format, and compression mechanism to enable efficient distributed analysis for large-scale AI-driven workloads.

Prerequisites

Requirements for DFTracer

  1. Python>=3.7
  2. pybind11

Requirements for DFAnalyzer

  1. bokeh>=2.4.2
  2. dask>=2023.5.0
  3. distributed
  4. matplotlib>=3.7.3
  5. numpy>=1.24.3
  6. pandas>=2.0.3
  7. pyarrow>=12.0.1
  8. pybind11
  9. python-intervals>=1.10.0.post1
  10. rich>=13.6.0
  11. seaborn>=0.13.2
  12. zindex_py
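
A quick way to see which of the packages listed above are already present in an environment is to query the installed distribution metadata. The sketch below uses only the standard library. The helper name `report_requirements` is illustrative and not part of DFTracer, and it only reports installed versions; it does not compare them against the minimum versions in the list.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names copied from the DFAnalyzer requirements list.
DFANALYZER_REQUIREMENTS = [
    "bokeh", "dask", "distributed", "matplotlib", "numpy", "pandas",
    "pyarrow", "pybind11", "python-intervals", "rich", "seaborn", "zindex_py",
]


def report_requirements(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, installed in report_requirements(DFANALYZER_REQUIREMENTS).items():
        print(f"{name:20s} {installed or 'MISSING'}")
```

Any entry printed as MISSING would be pulled in by installing DFTracer with the dfanalyzer extra.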

Installation

Users can easily install DFTracer using pip, the standard tool for installing Python packages. This method works for both native Python and Conda environments.

From PyPI

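# DFTracer only
pip install dftracer
# DFTracer with the DFAnalyzer dependencies (quoted so the shell does not expand the brackets)
pip install "dftracer[dfanalyzer]"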

From GitHub

DFTRACER_VERSION=develop
pip install git+https://github.com/LLNL/dftracer.git@${DFTRACER_VERSION}
pip install "dftracer[dfanalyzer] @ git+https://github.com/LLNL/dftracer.git@${DFTRACER_VERSION}"

From Source

git clone git@github.com:LLNL/dftracer.git
cd dftracer
# Optional: check out a release tag. Skip this step to install the development branch.
# For the latest stable version, use the master branch.
git checkout tags/<Release> -b <Release>
pip install .

For detailed build instructions, see the DFTracer documentation.

Usage

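import os
from multiprocessing import get_context
from time import sleep

import numpy as np

from dftracer.logger import dftracer, dft_fn

log_inst = dftracer.initialize_log(logfile=None, data_dir=None, process_id=-1)
dft_fn = dft_fn("COMPUTE")

# Assumption: cwd is the current working directory. The examples below write
# into cwd/data, so create it up front.
cwd = os.getcwd()
os.makedirs(f"{cwd}/data", exist_ok=True)

# Pool initializer referenced in main(). The original snippet does not define it,
# so this minimal version is an assumption.
def init():
    print(f"Initializing process {os.getpid()}")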

# Example of using function decorators
@dft_fn.log
def log_events(index):
    sleep(1)

# Example of function spawning and implicit I/O calls
def posix_calls(val):
    index, is_spawn = val
    path = f"{cwd}/data/demofile{index}.txt"
    with open(path, "w+") as f:
        f.write("Now the file has more content!")
    if is_spawn:
        print(f"Calling spawn on {index} with pid {os.getpid()}")
        log_inst.finalize()  # This needs to be called to correctly finalize DFTracer.
    else:
        print(f"Not calling spawn on {index} with pid {os.getpid()}")

# NPZ calls make POSIX calls internally.
def npz_calls(index):
    path = f"{cwd}/data/demofile{index}.npz"
    if os.path.exists(path):
        os.remove(path)
    records = np.random.randint(255, size=(8, 8, 1024), dtype=np.uint8)
    record_labels = [0] * 1024
    np.savez(path, x=records, y=record_labels)

def main():
    log_events(0)
    npz_calls(1)
    with get_context('spawn').Pool(1, initializer=init) as pool:
        pool.map(posix_calls, ((2, True),))
    log_inst.finalize()

if __name__ == "__main__":
    main()

In this example, dftracer.initialize_log is called without logfile or data_dir, so DFTRACER_LOG_FILE and DFTRACER_DATA_DIR must be set in the environment. By default, the DFTracer mode is set to FUNCTION. An example configuration for running it:

# The process id, app_name and .pfw will be appended by DFTracer for each app and process.
# The name of the final log file will be ~/log_file-<APP_NAME>-<PID>.pfw
DFTRACER_LOG_FILE=~/log_file
# Colon separated paths to include in the tracing
DFTRACER_DATA_DIR=/dev/shm/:/p/gpfs1/$USER/dataset:$PWD/data
# Enable DFTracer
DFTRACER_ENABLE=1
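
The same settings can also be placed in the process environment from Python. The sketch below is illustrative only: `build_dftracer_env` and `expected_log_file` are hypothetical helpers, not part of DFTracer. The file-name pattern is taken from the comment in the configuration above rather than from the library's code, and the sketch assumes the variables are read when the tracer is initialized.

```python
import os


def build_dftracer_env(log_prefix, data_dirs, enable=True):
    """Build the environment variables described above (hypothetical helper)."""
    env = {
        "DFTRACER_LOG_FILE": os.path.expanduser(log_prefix),
        # DFTRACER_DATA_DIR takes colon-separated paths to include in tracing.
        "DFTRACER_DATA_DIR": ":".join(data_dirs),
    }
    if enable:
        env["DFTRACER_ENABLE"] = "1"
    return env


def expected_log_file(log_prefix, app_name, pid):
    """Apply the <prefix>-<APP_NAME>-<PID>.pfw naming pattern noted above."""
    return f"{os.path.expanduser(log_prefix)}-{app_name}-{pid}.pfw"


if __name__ == "__main__":
    env = build_dftracer_env(
        "~/log_file", ["/dev/shm/", os.path.join(os.getcwd(), "data")]
    )
    # Assumption: this must run before dftracer.initialize_log is called.
    os.environ.update(env)
    print(expected_log_file("~/log_file", "my_app", os.getpid()))
```

Keeping the variables in a dictionary also makes it straightforward to pass them to a child process instead of changing the current one.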

For more examples, see the DFTracer documentation.

Documentation

Citation and Reference

The original SC'24 paper describes the design and implementation of the DFTracer code. Please cite this paper and the code if you use DFTracer in your research.

@inproceedings{devarajan_dftracer_2024,
    address = {Atlanta, GA},
    title = {{DFTracer}: {An} {Analysis}-{Friendly} {Data} {Flow} {Tracer} for {AI}-{Driven} {Workflows}},
    shorttitle = {{DFTracer}},
    urldate = {2024-07-31},
    booktitle = {{SC24}: {International} {Conference} for {High} {Performance} {Computing}, {Networking}, {Storage} and {Analysis}},
    publisher = {IEEE},
    author = {Devarajan, Hariharan and Pottier, Loic and Velusamy, Kaushik and Zheng, Huihuo and Yildirim, Izzet and Kogiou, Olga and Yu, Weikuan and Kougkas, Anthony and Sun, Xian-He and Yeom, Jae Seung and Mohror, Kathryn},
    month = nov,
    year = {2024},
}

@misc{devarajan_dftracer_code_2024,
    type = {Github},
    title = {Github {DFTracer}},
    shorttitle = {{DFTracer}},
    url = {https://github.com/LLNL/dftracer.git},
    urldate = {2024-07-31},
    journal = {DFTracer: A multi-level dataflow tracer for capturing I/O calls from workflows.},
    author = {Devarajan, Hariharan and Pottier, Loic and Velusamy, Kaushik and Zheng, Huihuo and Yildirim, Izzet and Kogiou, Olga and Yu, Weikuan and Kougkas, Anthony and Sun, Xian-He and Yeom, Jae Seung and Mohror, Kathryn},
    month = jun,
    year = {2024},
}

Acknowledgments

This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344; and under the auspices of the National Cancer Institute (NCI) by Frederick National Laboratory for Cancer Research (FNLCR) under Contract 75N91019D00024. This research used resources of the Argonne Leadership Computing Facility, a U.S. Department of Energy (DOE) Office of Science user facility at Argonne National Laboratory, and is based on research supported by the U.S. DOE Office of Science-Advanced Scientific Computing Research Program, under Contract No. DE-AC02-06CH11357, and by the Office of Advanced Scientific Computing Research under the DOE Early Career Research Program. This material is also based upon work partially supported by LLNL LDRD 23-ERD-045 and 24-SI-005. LLNL-CONF-857447.
