Python libs for DFTracer

pydftracer

License: MIT. Requires Python 3.9+.

A lightweight, typed Python interface for DFTracer. Ideal for prototyping, testing, and iterating on code that uses the DFTracer Python API before deploying the complete tracing stack.

Purposes

Use pydftracer if you:

  • Are prototyping applications that will eventually use the full DFTracer
  • Need to keep DFTracer API compatibility in environments where tracing is not needed (e.g. production apps)
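
The second use case can be pictured with a small stand-in. The sketch below is not pydftracer's implementation or API; `NoOpTracer`, `log`, and `enabled` are hypothetical names, used only to illustrate how an API-compatible tracer can leave call sites unchanged while costing nothing when tracing is off.

```python
import functools
import time
from typing import Any, Callable, TypeVar

F = TypeVar("F", bound=Callable[..., Any])


class NoOpTracer:
    """Hypothetical stand-in: call sites look the same with tracing on or off."""

    def __init__(self, enabled: bool = False) -> None:
        self.enabled = enabled
        self.events: list[dict[str, Any]] = []

    def log(self, func: F) -> F:
        """Decorator that records a timing event only when tracing is enabled."""
        if not self.enabled:
            # Tracing disabled: hand back the original function, no wrapper.
            return func

        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                self.events.append(
                    {"name": func.__name__, "dur": time.perf_counter() - start}
                )

        return wrapper  # type: ignore[return-value]


tracer = NoOpTracer(enabled=False)


@tracer.log
def load(n: int) -> int:
    return n * 2


# With tracing off, the decorated function behaves normally and records nothing.
assert load(3) == 6 and tracer.events == []
```

Because the disabled path returns the original function object, the decorated code keeps its own signature and adds no per-call overhead.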

Installation

pip install pydftracer

Development

# Install development dependencies
pip install -e ".[dev]"  # quotes keep shells such as zsh from globbing the brackets

# Using Make (recommended)
make test-parallel       # Run all tests with parallel execution
make test-subprocess     # Run only subprocess-based dftracer tests  
make test-ci             # Run comprehensive tests matching CI configuration
make test-ci-quick       # Run quick tests and checks (faster)
make check-all           # Run all quality checks (lint, format, type-check, test)

# Using pytest directly
pytest tests/ -v -n 4    # All tests with parallel execution
pytest tests/ --cov=dftracer --cov-report=term-missing -v -n 4  # Tests with coverage

Documentation

Full documentation is available at Read the Docs.

To build documentation locally:

pip install .
cd docs
pip install -r requirements.txt
make html

To enable actual profiling, see the full DFTracer project at https://github.com/LLNL/dftracer.

Development

Testing

The test suite runs the dftracer tests in isolated subprocesses, because dftracer state is per-process.

Running Tests

# Install development dependencies
pip install -e ".[dev]"  # quotes keep shells such as zsh from globbing the brackets

# Using Make (recommended)
make test-parallel    # Run all tests with parallel execution
make test-subprocess  # Run only subprocess-based dftracer tests  
make test-ci          # Run tests matching CI configuration
make check-all        # Run all quality checks (lint, format, type-check, test)

# Using pytest directly
pytest tests/ -v -n 2                                    # All tests with parallel execution
pytest tests/ --cov=dftracer --cov-report=term-missing -v -n 2  # Tests with coverage
pytest tests/ -m subprocess -v -n 2                      # Only subprocess tests

# Use the provided test script (matches CI)
./scripts/test.sh

Test Structure

  • Unit Tests: General functionality tests in tests/test_general.py
  • Integration Tests: Subprocess-based dftracer tests in tests/test_dftracer.py
  • Parallel Execution: Tests run in parallel using pytest-xdist for faster execution
  • Process Isolation: dftracer tests run in separate subprocesses to handle the per-process nature of dftracer
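
The process-isolation idea above can be sketched with the standard library alone. This is an illustration under assumptions, not the project's actual test code: `run_isolated` and the `TRACER_INITIALIZED` variable are hypothetical names, and the snippet only simulates per-process state with an environment variable.

```python
import subprocess
import sys
import textwrap


def run_isolated(code: str, timeout: float = 30.0) -> subprocess.CompletedProcess:
    """Run a Python snippet in a fresh interpreter so per-process state cannot leak."""
    return subprocess.run(
        [sys.executable, "-c", textwrap.dedent(code)],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


def test_state_does_not_leak_between_runs() -> None:
    # The snippet marks its own process environment as initialized;
    # a fresh child process never sees the mark left by an earlier child.
    snippet = """
        import os
        first_run = "TRACER_INITIALIZED" not in os.environ
        os.environ["TRACER_INITIALIZED"] = "1"
        print(first_run)
    """
    first = run_isolated(snippet)
    second = run_isolated(snippet)
    assert first.returncode == 0 and second.returncode == 0
    assert first.stdout.strip() == "True"
    assert second.stdout.strip() == "True"
```

Each call starts a new interpreter, so anything the snippet initializes at process level is discarded when the child exits, and tests stay independent even when they run in parallel.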

CI/CD

The project uses GitHub Actions for continuous integration with:

  • Multi-Python version testing (3.9, 3.10, 3.11, 3.12)
  • Parallel test execution with coverage reporting
  • Code linting with ruff
  • Type checking with mypy
  • Package building and installation testing

Citation and Reference

The original SC'24 paper describes the design and implementation of the DFTracer code. Please cite this paper and the code if you use DFTracer in your research.

@inproceedings{devarajan_dftracer_2024,
    address = {Atlanta, GA},
    title = {{DFTracer}: {An} {Analysis}-{Friendly} {Data} {Flow} {Tracer} for {AI}-{Driven} {Workflows}},
    shorttitle = {{DFTracer}},
    urldate = {2024-07-31},
    booktitle = {{SC24}: {International} {Conference} for {High} {Performance} {Computing}, {Networking}, {Storage} and {Analysis}},
    publisher = {IEEE},
    author = {Devarajan, Hariharan and Pottier, Loic and Velusamy, Kaushik and Zheng, Huihuo and Yildirim, Izzet and Kogiou, Olga and Yu, Weikuan and Kougkas, Anthony and Sun, Xian-He and Yeom, Jae Seung and Mohror, Kathryn},
    month = nov,
    year = {2024},
}

@misc{devarajan_dftracer_code_2024,
    type = {GitHub},
    title = {{GitHub} {DFTracer}},
    shorttitle = {{DFTracer}},
    url = {https://github.com/LLNL/dftracer.git},
    urldate = {2024-07-31},
    journal = {DFTracer: A multi-level dataflow tracer to capture I/O calls from workflows.},
    author = {Devarajan, Hariharan and Pottier, Loic and Velusamy, Kaushik and Zheng, Huihuo and Yildirim, Izzet and Kogiou, Olga and Yu, Weikuan and Kougkas, Anthony and Sun, Xian-He and Yeom, Jae Seung and Mohror, Kathryn},
    month = jun,
    year = {2024},
}

Acknowledgments

This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344, and under the auspices of the National Cancer Institute (NCI) by Frederick National Laboratory for Cancer Research (FNLCR) under Contract 75N91019D00024. This research used resources of the Argonne Leadership Computing Facility, a U.S. Department of Energy (DOE) Office of Science user facility at Argonne National Laboratory, and is based on research supported by the U.S. DOE Office of Science-Advanced Scientific Computing Research Program under Contract No. DE-AC02-06CH11357, and by the Office of Advanced Scientific Computing Research under the DOE Early Career Research Program. This material is also based upon work partially supported by LLNL LDRD 23-ERD-045 and 24-SI-005. LLNL-CONF-857447.
