
A tool for collecting data from git repositories.


What is this?

_gitwalker_ is a tool for collecting data from git repositories. It automates the process of checking out each revision, running some command and logging the output to a JSON file. Commands are specified in the form of Python classes.

What can it do?

Currently _gitwalker_ supports two built-in commands:

  • A LaTeX word count
  • The du disk usage command

It's straightforward to add additional commands - see the file
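The real plugin interface is defined in gitwalker's own source and is not reproduced here. As a rough illustration only, a command written as a Python class might look something like the sketch below. The class name DiskUsageCommand, the name attribute, the run method and the shape of the returned dictionary are all hypothetical and are not taken from gitwalker.

```python
import os


class DiskUsageCommand:
    """Hypothetical command: total size in bytes of a checked-out tree.

    This interface is an assumption for illustration, not gitwalker's API.
    """

    name = "du"  # assumed key under which results would be logged

    def run(self, checkout_dir):
        """Return a JSON-serialisable result for one checked-out revision."""
        total = 0
        for root, dirs, files in os.walk(checkout_dir):
            # Do not descend into .git, so the object store is not counted.
            if ".git" in dirs:
                dirs.remove(".git")
            for filename in files:
                path = os.path.join(root, filename)
                if not os.path.islink(path):
                    total += os.path.getsize(path)
        return {"bytes": total}
```

Returning a plain dictionary keeps the result easy to serialise to JSON, whatever the surrounding driver looks like.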

The included script uses the matplotlib framework to produce time-series graphs overlaying multiple data files.


To word count a git-tracked LaTeX project across all commits:

./ --wordcount myfile.tex --out wordcount.json /path/to/project

This clones the repository at /path/to/project to a temporary directory, then checks out each revision and runs a word count on the file myfile.tex in the repository. The results are written to the file wordcount.json.
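The clone, checkout and log cycle described above can be sketched in a few lines of Python. This is a minimal sketch and not gitwalker's implementation: the helper names git and walk are made up for the example, it assumes a git executable on the PATH, and the output layout (a mapping from commit hash to result) is an assumption about the log format.

```python
import json
import subprocess
import tempfile


def git(args, cwd):
    """Run a git subcommand in cwd and return its stdout as text."""
    return subprocess.check_output(["git"] + args, cwd=cwd).decode("utf-8")


def walk(repo_path, command, out_path):
    """Clone repo_path, call command(checkout_dir) per revision, log to JSON."""
    results = {}
    with tempfile.TemporaryDirectory() as workdir:
        # Work on a throwaway clone so the original checkout is untouched.
        git(["clone", "--quiet", repo_path, workdir], cwd=".")
        # Oldest first, so the log reads chronologically.
        revisions = git(["rev-list", "--reverse", "HEAD"], cwd=workdir).split()
        for sha in revisions:
            git(["checkout", "--quiet", sha], cwd=workdir)
            results[sha] = command(workdir)
    # Assumed log layout: {commit hash: command result}.
    with open(out_path, "w") as handle:
        json.dump(results, handle, indent=2)
    return results
```

Here command is any callable that takes the checkout directory and returns something JSON-serialisable, for instance a function that counts the words in one file.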

gitwalker also supports incremental updates of a previously produced log file. To add newly committed revisions, run:

./ --in wordcount.json --wordcount myfile.tex --out wordcount.json /path/to/project
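One plausible way to implement such an incremental update is to load the existing log and process only the revisions it does not yet contain. The sketch below illustrates that idea; load_log and pending_revisions are hypothetical helper names, and it assumes the log is a JSON object keyed by commit hash, which may differ from gitwalker's actual format.

```python
import json
import os


def load_log(path):
    """Return previously logged results, or an empty dict if there are none."""
    if path and os.path.exists(path):
        with open(path) as handle:
            return json.load(handle)
    return {}


def pending_revisions(all_revisions, log):
    """Return the revisions missing from the log, in their original order."""
    return [sha for sha in all_revisions if sha not in log]
```

Results for the pending revisions can then be merged into the loaded dictionary before it is written back out, so already processed commits are never checked out again.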

The attached script plots a number of such output files on the same axes using matplotlib, e.g.

./ --plot file1.json me red --plot you.json you blue wordcount/wordcount

This plots the files file1.json and you.json on the same axes using the specified labels and colours. The value to plot is extracted from each JSON file via the path given at the end of the command line, in this case wordcount/wordcount. One could also run

./ --plot file1.json me red --plot you.json you blue wordcount/nfigures

to plot the number of LaTeX figures present in each commit.
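The slash-separated path used above can be read as a sequence of keys into nested JSON objects. A minimal sketch of such a lookup follows; the function name dig and the example record are illustrative assumptions, not gitwalker's code or real output.

```python
def dig(record, path):
    """Follow a slash-separated path of keys into nested dictionaries."""
    value = record
    for key in path.split("/"):
        value = value[key]
    return value


# Made-up record for one commit, shaped to match the paths used above.
example = {"wordcount": {"wordcount": 1200, "nfigures": 3}}
print(dig(example, "wordcount/nfigures"))  # prints 3
```

A missing key raises KeyError here, which makes a mistyped path obvious instead of silently plotting nothing.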


  • Add git-notes option
  • Shell command plugin


Files for gitwalker, version 0.1

  • gitwalker-0.1-py2.7.egg (41.4 kB), Egg, Python 2.7
  • gitwalker-0.1.tar.gz (34.4 kB), Source
