
A precise profiler for Python, optimized for data processing tasks in high-performance computing. It can sample with metadata while using minimal instrumentation.


TraceQ

TraceQ is a specialized tool that provides accurate metric measurements for Python-based data processing applications. It integrates with the Linux /proc filesystem to deliver granular memory profiling, which is essential for optimizing resource allocation and improving the efficiency of large-scale computational tasks. Developed as part of a comprehensive study on memory management in Python, TraceQ is particularly effective in high-performance computing settings where precise memory profiling is critical.

Features

  • High-accuracy memory profiling using direct measurements from the Linux /proc filesystem.
  • Support for multiple memory profiling backends, including psutil and tracemalloc.
  • Granular and detailed memory usage analysis.
  • Optimized for data processing tasks.
  • Useful in high-performance computing environments for optimizing resource allocation.
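
To make the idea of a direct /proc measurement concrete, here is a minimal sketch that reads the resident set size (the VmRSS field) of a process from /proc/<pid>/status. It is an illustration only, not TraceQ's implementation; the helper names parse_vm_rss_kib and current_rss_kib are hypothetical.

```python
import os
import re
from typing import Optional


def parse_vm_rss_kib(status_text: str) -> Optional[int]:
    """Return the VmRSS value (in kB, as reported by the kernel) or None."""
    # The line looks like "VmRSS:\t    1234 kB" in /proc/<pid>/status.
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


def current_rss_kib(pid: str = "self") -> Optional[int]:
    """Read VmRSS for a process; returns None where /proc is unavailable."""
    path = f"/proc/{pid}/status"
    if not os.path.exists(path):
        return None  # not Linux, or /proc is not mounted
    with open(path, encoding="ascii", errors="replace") as fh:
        return parse_vm_rss_kib(fh.read())


if __name__ == "__main__":
    print("resident set size (kB):", current_rss_kib())
```

Keeping the parsing separate from the file access makes the sketch testable on systems without /proc.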

Installation

To install TraceQ, you can use pip:

pip install traceq

Alternatively, you can clone the repository and install it manually:

git clone https://github.com/discovery-unicamp/traceq.git
cd traceq
pip install .

Usage

TraceQ is designed to be easy to integrate into your existing Python projects. Below are some basic usage examples:

Profiling a Python Function

To profile memory usage of a specific function, you can use the profile decorator provided by TraceQ.

from traceq import profile

@profile
def task(data):
    # Your function goes here
    pass
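
For readers unfamiliar with what a memory-profiling decorator does, the following is a minimal stand-in built on the standard-library tracemalloc module (one of the backends listed above). It is a sketch, not TraceQ's profile decorator: the name measure_peak and the last_peak_bytes attribute are hypothetical and exist only for this illustration.

```python
import functools
import tracemalloc


def measure_peak(func):
    """Record the peak traced allocation (in bytes) of each call to func."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Note: this assumes tracemalloc is not already tracing; stop() below
        # would otherwise end an outer tracing session as well.
        tracemalloc.start()
        try:
            result = func(*args, **kwargs)
            _current, peak = tracemalloc.get_traced_memory()
        finally:
            tracemalloc.stop()
        wrapper.last_peak_bytes = peak
        return result

    wrapper.last_peak_bytes = None  # no call measured yet
    return wrapper


@measure_peak
def task(n):
    # Allocate a list so there is something to measure.
    return [0] * n


if __name__ == "__main__":
    task(100_000)
    print("peak bytes during task:", task.last_peak_bytes)
```

tracemalloc only sees allocations made through Python's memory allocator, which is one reason a profiler may combine it with process-level readings such as those from /proc or psutil.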

Configuration

TraceQ's behavior is controlled by a global configuration, which you can set and customize in any of the following ways:

Configuration File

TraceQ reads its configuration from a file named traceq.toml, which should be placed in the root of your project directory. This file lets you specify settings that control TraceQ's behavior. All available options are listed in the traceq.toml file in the repository.

Example Customization

Here’s an example of how you can customize some fields in the traceq.toml file:

output_dir = "./traceq_reports"

[logger]
enabled_transports = "console,file"
level = "debug"

[profiler]
enabled_metrics = "memory_usage"
sign_traces = "true"
precision = "3"

[profiler.memory_usage]
enabled_backends = "psutil,tracemalloc"

In this example, the output directory for reports is changed, logging is enabled to both console and file with a debug level, only the memory usage metric is enabled, trace signing is turned on, and the precision for profiling is increased. Finally, memory usage backends are limited to psutil and tracemalloc.

Runtime Configuration

Alternatively, you can supply the configuration at runtime using the load_config function provided by TraceQ. This lets you inject configuration settings while your application is running.

from traceq import load_config

load_config({
    "output_dir": "./traceq_reports",
    "logger": {
        "enabled_transports": "console,file",
        "level": "debug"
    },
    "profiler": {
        "enabled_metrics": "memory_usage",
        "sign_traces": "true",
        "precision": "3",
        "memory_usage": {
            "enabled_backends": "psutil,tracemalloc"
        }
    }
})

Environment Variables

You can also set configuration options using environment variables. All environment variables should be prefixed with TRACEQ_. This method is useful for dynamically setting configurations without modifying the code or configuration files.

Example Environment Variables

export TRACEQ_OUTPUT_DIR="./traceq_reports"
export TRACEQ_LOGGER_ENABLED_TRANSPORTS="console,file"
export TRACEQ_LOGGER_LEVEL="debug"
export TRACEQ_PROFILER_ENABLED_METRICS="memory_usage"
export TRACEQ_PROFILER_SIGN_TRACES="true"
export TRACEQ_PROFILER_PRECISION="3"
export TRACEQ_PROFILER_MEMORY_USAGE_ENABLED_BACKENDS="psutil,tracemalloc"

This flexibility allows you to tailor TraceQ's behavior to fit the specific requirements of your seismic data processing tasks, ensuring optimal performance and resource utilization.
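
The examples above suggest a naming convention: each environment variable is the TRACEQ_ prefix followed by the upper-cased path of keys to the setting, joined with underscores. The sketch below derives variable names from a nested configuration dictionary under that assumption. The to_env_vars helper is hypothetical and is not part of TraceQ; how TraceQ itself parses these variables is not described here.

```python
from typing import Any, Dict


def to_env_vars(config: Dict[str, Any], prefix: str = "TRACEQ") -> Dict[str, str]:
    """Flatten a nested config dict into PREFIX_SECTION_KEY -> value pairs."""
    env: Dict[str, str] = {}
    for key, value in config.items():
        name = f"{prefix}_{key.upper()}"
        if isinstance(value, dict):
            # Nested sections extend the variable name with their own key.
            env.update(to_env_vars(value, prefix=name))
        else:
            env[name] = str(value)
    return env


config = {
    "output_dir": "./traceq_reports",
    "logger": {"level": "debug"},
    "profiler": {
        "precision": "3",
        "memory_usage": {"enabled_backends": "psutil,tracemalloc"},
    },
}

if __name__ == "__main__":
    for name, value in to_env_vars(config).items():
        print(f'export {name}="{value}"')
```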

Report

After your Python script finishes, TraceQ generates a report containing all the metrics collected during execution. The report is a .prof file encoded as gzip-compressed MessagePack.

TraceQ is still under development, and a tool to visualize the generated reports is in progress.
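
Until a dedicated viewer exists, a report in this encoding could in principle be inspected by hand. The sketch below decompresses the file with the standard-library gzip module and decodes the payload with the third-party msgpack package (an assumption: it must be installed separately). The load_report helper is hypothetical, and the structure of the decoded data is not documented here, so the result should be explored rather than relied upon.

```python
import gzip
from typing import Any, Callable, Optional


def load_report(path: str, decode: Optional[Callable[[bytes], Any]] = None) -> Any:
    """Decompress a gzip file and decode its payload (MessagePack by default)."""
    if decode is None:
        # Imported lazily so the gzip step works even without msgpack installed.
        import msgpack  # third-party: pip install msgpack

        def decode(raw: bytes) -> Any:
            return msgpack.unpackb(raw, raw=False)

    with gzip.open(path, "rb") as fh:
        return decode(fh.read())
```

A call such as load_report("path/to/report.prof") returns the decoded Python objects, typically dictionaries and lists; the path shown is a placeholder.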

Contributing

We welcome contributions to TraceQ! If you have ideas, suggestions, or bug reports, please open an issue on the GitHub repository. If you would like to contribute code, please fork the repository and submit a pull request.

License

TraceQ is licensed under the MIT License. See the LICENSE file for more details.

Acknowledgments

This tool was developed as part of a comprehensive study on memory management in Python-based seismic data processing applications, conducted by Daniel L. Fonseca and Edson Borin at the Institute of Computing, Unicamp, Brazil. Special thanks to Petrobras for their support and collaboration.
