SCM mining utility classes

Code Metrics

Code metrics is a simple Python module that leverages the libraries below to generate insights from a source control management (SCM) tool:

  • pandas: for data munging.

  • lizard: for code complexity calculation.

  • cloc.pl (script): for line counts.

  • and your SCM: for now, only Subversion is supported. Git support is planned.

It can generate reports based on Adam Tornhill's awesome books.

Installation

To install codemetrics, simply use pip:

pip install codemetrics

Usage

This is a simple tool that makes it easy to retrieve information from your Source Control Management (SCM) repository and hopefully gain insight from it.

The reports available for now are:

  • AgeReport: helps you see which files or components have not changed in a while, or who is most familiar with a particular set of files.

  • HotSpotReport: combines line counts from cloc with SCM information to identify files or components that are complex (many lines of code) and that change often. The SCM log can be post-processed to adjust for mass edits or intraday changes.

  • CoChangeReport: helps identify which files or components change when another part of the code base changes. This is useful for finding hidden dependencies.
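
To make the co-change idea concrete, here is a minimal sketch in plain pandas. It does not use codemetrics and is not how CoChangeReport is implemented; the log layout (one row per revision and path) and all names and values are assumptions made for illustration.

```python
import itertools
from collections import Counter

import pandas as pd

# Hypothetical SCM log: one row per (revision, path) touched.
log = pd.DataFrame({
    'revision': [1, 1, 2, 2, 3, 3, 3],
    'path': ['a.py', 'b.py', 'a.py', 'b.py', 'a.py', 'c.py', 'b.py'],
})

# Count how many revisions touched each unordered pair of paths.
pairs = Counter()
for _, paths in log.groupby('revision')['path']:
    pairs.update(itertools.combinations(sorted(set(paths)), 2))

# Turn the counts into a table, most frequently co-changed pair first.
cochanges = (pd.DataFrame([(a, b, n) for (a, b), n in pairs.items()],
                          columns=['path', 'other', 'cochanges'])
             .sort_values('cochanges', ascending=False)
             .reset_index(drop=True))
print(cochanges)
```

Pairs that change together far more often than the rest are candidates for a hidden dependency.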

Recipes

Derive components from path

df['component'] = df['path'].str.split('\\').str.get(-2)

This adds a component column equal to the parent folder of each path (the example assumes Windows-style backslash separators). If a path has no parent folder, the value is missing (NaN).

For more advanced manipulation, such as extractions, see the pandas documentation.
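
As a small, self-contained illustration of this recipe, the sketch below applies the same expression to a hand-made data frame. The paths are hypothetical, and the final fillna step is an optional addition to label files that have no parent folder.

```python
import pandas as pd

# Hypothetical paths; codemetrics itself is not needed for this illustration.
df = pd.DataFrame({'path': ['src\\core\\engine.py',
                            'src\\ui\\window.py',
                            'setup.py']})

# Split on the Windows separator and keep the second-to-last element,
# i.e. the folder that directly contains the file.
df['component'] = df['path'].str.split('\\').str.get(-2)

# Paths without a parent folder yield a missing value; label them explicitly.
df['component'] = df['component'].fillna('N/A')
print(df)
```

For repositories that use forward slashes, the same approach works with '/' as the separator.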

Aggregate hotspots by component

import codemetrics as cm

hotspots_report = cm.HotSpotReport('.')
log, cloc = hotspots_report.get_log(), hotspots_report.get_cloc()
# Derive the component (parent folder) for both data sets.
cloc['component'] = cloc['path'].str.split('\\').str.get(-2)
log['component'] = log['path'].str.split('\\').str.get(-2)
# Sum line counts per component, then generate hotspots by component.
hspots = hotspots_report.generate(log,
                                  cloc.groupby('component').sum().reset_index(),
                                  by='component').dropna()
hspots.set_index(['component']).sort_values(by='score', ascending=False)

This orders hotspots at the component level, in descending order of the score column, which reflects both complexity and number of changes.
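
The idea of ranking components by size and change frequency can be sketched with pandas alone. The scoring formula below (product of line count and change count, each scaled to [0, 1]) is an assumption chosen for illustration, not necessarily what HotSpotReport computes, and the input frames are made up.

```python
import pandas as pd

# Hypothetical inputs standing in for the cloc output and the SCM log.
cloc = pd.DataFrame({'component': ['core', 'ui', 'docs'],
                     'code': [1200, 300, 50]})
log = pd.DataFrame({'component': ['core', 'core', 'core', 'ui', 'docs', 'docs'],
                    'revision': [1, 2, 3, 2, 3, 4]})

# Number of distinct revisions that touched each component.
changes = (log.groupby('component')['revision'].nunique()
           .rename('changes').reset_index())

hspots = cloc.merge(changes, on='component', how='left')
# Assumed score: product of the two measures, each scaled to [0, 1].
hspots['score'] = ((hspots['code'] / hspots['code'].max()) *
                   (hspots['changes'] / hspots['changes'].max()))
hspots = hspots.sort_values('score', ascending=False).reset_index(drop=True)
print(hspots)
```

Here the large, frequently changed core component ranks first, while the small docs component ranks last even though it changes more often than ui.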

Exclude massive changesets

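import codemetrics as cm

age_report = cm.AgeReport('.')
log = age_report.get_log()
# Count the paths changed in each revision and take the 99th percentile.
threshold = int(log.groupby('revision')['path'].count().quantile(.99))
# get_massive_changesets is assumed to have been imported from codemetrics.
massive = get_massive_changesets(log, threshold)
log_ex_massive = log[~log['revision'].isin(massive['revision'])]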

This excludes changesets in which the number of changed paths exceeds the 99th percentile.
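
The same filtering can be approximated in plain pandas, which may help in checking what the threshold does. This is a sketch on a made-up log, not the codemetrics implementation; it uses the 0.75 quantile only because the toy log has four revisions.

```python
import pandas as pd

# Hypothetical log: revision 3 is a mass edit touching many paths.
log = pd.DataFrame({
    'revision': [1, 2, 2] + [3] * 50 + [4],
    'path': ['a.py', 'a.py', 'b.py'] + [f'f{i}.py' for i in range(50)] + ['c.py'],
})

# Number of paths changed in each revision.
sizes = log.groupby('revision')['path'].count()

# Revisions whose size exceeds the chosen quantile count as massive.
threshold = sizes.quantile(.75)
massive = sizes[sizes > threshold].index
log_ex_massive = log[~log['revision'].isin(massive)]
print(sorted(log_ex_massive['revision'].unique()))
```

Only the 50-path revision is dropped; the remaining log can then be passed to a report as usual.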

License

Licensed under the terms of the MIT License. See the attached file LICENSE.txt.
