TraceCode toolkit "strace": a dynamic build tracer and grapher

Tracing a build on Linux

TraceCode is a tool to analyze the traced execution of a build, so you can learn which files are built into binaries and ultimately deployed in your distributed software.

This TraceCode toolkit uses strace to capture the system-level trace of a build. From this trace it can reconstruct how the build transforms and compiles files, also known as the build graph.

1. Tracing a build

See README-build-tracing.rst for details on tracing a build.

2. System requirements and installation

Ensure you have Python installed::

python --version

Install it if it is missing, and make sure it is in your path. See your Linux distribution's documentation for details.

Ensure you have Graphviz installed and in your path::

dot -V

Install it if it is missing, and make sure it is in your path. See http://graphviz.org/ for details.

If Graphviz is not installed, you will see ERROR messages and the results are unlikely to be usable.

3. Install TraceCode

Get it from https://github.com/nexb/tracecode-toolkit-strace and unzip it. The path where this is unzipped will be referred to as <tracecode_dir> later in this document.

Then execute this command to set up TraceCode:

./configure --dev

Finally run the built-in selftest to verify your installation:

py.test -vvs tests

4. Install strace

On Debian:

sudo apt-get install strace
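
The prerequisite checks above can also be scripted. This is a minimal sketch, not part of TraceCode: the function name `missing_tools` is hypothetical, and it assumes the Python interpreter is reachable as `python` (on some distributions it is `python3`).

```python
import shutil

# Executables this README asks for. "python" may be named "python3" on
# some distributions (an assumption, adjust as needed).
REQUIRED_TOOLS = ("python", "dot", "strace")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All required tools were found on the PATH.")
```

The `which` parameter exists only so the lookup can be replaced in tests; by default it uses `shutil.which`.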

5. Analyze your build

Analyzing a traced build is a multi-stage process that involves:

  • parsing and checking the initial traces,

    • optionally filtering the parsed traces,

    • optionally collecting the inventory of files read and written during the build,

  • creating the list of source (input) and target (output) files for your build,

  • analyzing the build graph to determine the source to target relationships, such as source code files being built into a binary,

    • optionally creating graphical representations to visualize a subset of your build graph.

Each of these steps is performed by invoking tracecode from the command line with different options and arguments.

Run the trace analysis with:

tracecode <options> <command> <arguments>

For command help use:

tracecode -h

Tutorial

See README-build-tracing.rst for extra details.

0. Trace a command

Use strace this way:

$(which strace) -ff -y -ttt -qq -a1 \
-o {NEW EMPTY tracing_dir}/{trace prefix} \
{build command}
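
If you prefer to launch the trace from a script, the command above could be wrapped as follows. This is only a sketch under stated assumptions: `build_strace_command` and `trace_build` are hypothetical helper names, and the check for an empty directory simply enforces the "NEW EMPTY tracing_dir" requirement.

```python
import os
import shutil
import subprocess


def build_strace_command(tracing_dir, trace_prefix, build_command):
    """Return the strace argv matching the command line shown above."""
    # Same role as $(which strace); fall back to the bare name if not found.
    strace = shutil.which("strace") or "strace"
    output = os.path.join(tracing_dir, trace_prefix)
    options = ["-ff", "-y", "-ttt", "-qq", "-a1", "-o", output]
    return [strace] + options + list(build_command)


def trace_build(tracing_dir, trace_prefix, build_command):
    """Run the build under strace, refusing to reuse a non-empty directory."""
    os.makedirs(tracing_dir, exist_ok=True)
    if os.listdir(tracing_dir):
        raise ValueError("tracing directory is not empty: " + tracing_dir)
    command = build_strace_command(tracing_dir, trace_prefix, build_command)
    # Returns the exit code of the traced build command.
    return subprocess.call(command)
```

For example, `trace_build("/tmp/traces", "build", ["make"])` would trace `make` and write one raw trace file per process under `/tmp/traces`.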

1. Parse the collected raw traces

Create a new empty directory to store parsed traces. Then parse using the “parse” command:

tracecode parse <RAW TRACES DIR INPUT> <PARSED TRACES DIR OUTPUT>

This parses the traces and checks that they are complete and can be processed.

2. Collect the inventory of files processed during the tracing

If the traces are consistent, the next step is to collect the inventory of files read and written. Use the “list” command (which should be called inventory). It creates two files from a parsed trace: a list of files that are only read and a list of files that are written:

tracecode list <PARSED TRACES DIR INPUT> <READS OUTPUT FILE> <WRITES OUTPUT FILE>

The list command extracts all the paths used in the traces.
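
The parse and list steps can be chained from a script. The sketch below only assembles the two command lines documented above; the helper names and the output names `parsed`, `reads.txt` and `writes.txt` are hypothetical choices, not names used by TraceCode.

```python
import os


def parse_command(raw_traces_dir, parsed_traces_dir):
    """argv for: tracecode parse <RAW TRACES DIR INPUT> <PARSED TRACES DIR OUTPUT>"""
    return ["tracecode", "parse", raw_traces_dir, parsed_traces_dir]


def list_command(parsed_traces_dir, reads_file, writes_file):
    """argv for: tracecode list <PARSED TRACES DIR INPUT> <READS> <WRITES>"""
    return ["tracecode", "list", parsed_traces_dir, reads_file, writes_file]


def plan_parse_and_list(raw_traces_dir, work_dir):
    """Return the parse and list commands, in the order they should run."""
    parsed_dir = os.path.join(work_dir, "parsed")    # hypothetical name
    reads_file = os.path.join(work_dir, "reads.txt")    # hypothetical name
    writes_file = os.path.join(work_dir, "writes.txt")  # hypothetical name
    return [
        parse_command(raw_traces_dir, parsed_dir),
        list_command(parsed_dir, reads_file, writes_file),
    ]
```

Each returned command can then be run with `subprocess.check_call`, after creating the new empty directory for the parsed traces.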

4. optional: Guess sources and targets

You can use the “guess” command to guess sources and targets, but the result is only a guess. Guessing works reasonably well on small, simple, well-defined codebases, but is unlikely to work well on larger ones.

The guess works this way:
  • files that are only ever read from are likely the source/devel files,

  • files that are only ever written to are likely the target/deployed files.
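
The heuristic above amounts to two set differences. The following is an illustrative sketch, not the actual implementation of the guess command: it assumes `reads` holds every path that was ever read and `writes` every path that was ever written, and the function name is hypothetical.

```python
def guess_sources_and_targets(reads, writes):
    """Split paths into likely sources and likely targets.

    A path that is read but never written is a likely source; a path that
    is written but never read is a likely target. Paths that are both read
    and written (intermediate files such as object files) are in neither.
    """
    reads, writes = set(reads), set(writes)
    sources = sorted(reads - writes)
    targets = sorted(writes - reads)
    return sources, targets
```

Intermediate files drop out of both lists, which is one reason the result is only a starting point on larger builds.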

5. Assemble the inventory of sources and targets

Once you have filtered your parsed trace, you need to create two lists of files: your sources (the original development files) and your targets (the deployed files). Build each inventory in a separate file. You can try the guess command, but its result is only a rough guess based on the graph. The paths must have exactly the same structure as in the “list” output. The source and target files should be among the reads and writes, so you can use those lists as input. Alternatively, you can save the output of the find command before your tracing (your sources) and again after it, then diff the two to find candidates.

Use these lists to build the final list of devel/sources files and the final list of deployed/targets files.
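
The find-and-diff alternative can be sketched in Python as two directory snapshots. This is a hedged illustration with hypothetical function names; it treats every file present before the build as a candidate source and every file that appears afterwards as a candidate target, so files overwritten in place are not detected.

```python
import os


def snapshot(root):
    """Return the set of file paths under `root`, like a plain `find` listing."""
    paths = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            paths.add(os.path.join(dirpath, name))
    return paths


def candidates(before, after):
    """Split two snapshots into candidate sources and candidate targets."""
    candidate_sources = sorted(before)
    candidate_targets = sorted(after - before)
    return candidate_sources, candidate_targets
```

Take one snapshot before tracing and one after, then review both candidate lists by hand and make sure the paths match the structure of the “list” output.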

6. Analyze sources to targets transformations

Then you can run the analyze command to get the source to target deployment analysis.

7. optional: Graph a selected subset of sources to targets transformations

You can selectively create a graphic tracing the transformation from several sources to one target, or from several targets to one source. Do this selectively, because graphing takes a long time to run and large graphics are impossible to visualize.

FAQ:

Q: When parsing raw traces I am getting this error:

ERROR:tracecode:INCOMPLETE TRACE, 149249 orphaned trace(s) detected. First pid is: 3145728.

A: This is a serious error. It means that your trace is not coherent: some process traces could not be related to the launch graph of the initial command. This can happen if you mistakenly trace several commands and store the strace output in the same directory. You need to recollect your traces starting with a clean empty directory.
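
One quick way to spot a mixed directory before parsing is to look at the raw trace file names. With `-ff` and `-o`, strace writes one file per process named `<prefix>.<pid>`, so more than one prefix in a directory suggests that several commands were traced into it. This is a rough sketch with hypothetical function names, not something TraceCode does itself, and it cannot detect two runs that used the same prefix.

```python
import os
from collections import Counter


def trace_prefixes(tracing_dir):
    """Count raw trace files per prefix, for names shaped like <prefix>.<pid>."""
    counts = Counter()
    for name in os.listdir(tracing_dir):
        prefix, dot, pid = name.rpartition(".")
        # Only names that end with a numeric pid are treated as trace files.
        if dot and prefix and pid.isdigit():
            counts[prefix] += 1
    return counts


def looks_mixed(tracing_dir):
    """True when trace files with more than one prefix share the directory."""
    return len(trace_prefixes(tracing_dir)) > 1
```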

Q: When parsing raw traces I am getting several warnings:

WARNING:tracecode:parse_line: Unable to decode descriptor for pid: 3097012, line: '1399882436.807573 dup2(5</extra/linux-2.6.32/scripts/mksysmap>, 255) = 255\n'

A: This is just a warning that you can ignore most of the time. Here file descriptor 255 does not (and cannot) exist, hence the warning.
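
To see why this line is unusual, note that with the `-y` option strace prints the path next to each file descriptor it can resolve, as in `5</extra/linux-2.6.32/scripts/mksysmap>`. The sketch below extracts those annotations from a trace line. It is an illustration only, not TraceCode's parser; the names are hypothetical and paths containing `>` are not handled.

```python
import re

# With strace -y, a resolved descriptor is printed as <number><<path>>.
DECORATED_FD = re.compile(r"(\d+)<([^>]*)>")


def decorated_descriptors(line):
    """Map each descriptor number in `line` to the path strace printed for it."""
    return {int(fd): path for fd, path in DECORATED_FD.findall(line)}
```

Applied to the line in the warning above, only descriptor 5 carries a path. Descriptor 255 has no path annotation, which is consistent with the warning that it could not be decoded.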

License

  • Apache-2.0

  • Multiple licenses (GPL2/3, LGPL, MIT, BSD, etc.) for third-party dependencies.
