Retrieve multi-repo time series Git stats by author

git-author-stats

This package provides a CLI and library for extracting author "stats" (insertions and deletions) for a Git repository or GitHub organization.

Under the hood, these metrics are obtained by:

  1. Cloning truncated versions of all specified repositories (or all repositories in a specified GitHub organization) into temp directories
  2. Calculating a series of date ranges based on the temporal limits and frequency you've specified
  3. Using git log --numstat to get a count of the insertions and deletions made by each author during each date range
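
As a rough illustration of steps 2 and 3, the sketch below splits a time span into fixed-length date ranges and parses text in the shape that git log --numstat produces. It is not this package's implementation: the function names iter_date_ranges and parse_numstat, the fixed-length-in-days frequency, and the "@%h%x09%an" header format are all assumptions made for the example.

```python
from datetime import date, timedelta
from typing import Dict, Iterator, List, Optional, Tuple


def iter_date_ranges(
    since: date, before: date, days: int
) -> Iterator[Tuple[date, date]]:
    """Yield consecutive (since, before) pairs; `before` is non-inclusive."""
    step = timedelta(days=days)
    start = since
    while start < before:
        end = min(start + step, before)
        yield start, end
        start = end


def parse_numstat(output: str) -> List[Dict[str, object]]:
    """Parse output assumed to come from a command such as:

        git log --numstat --pretty=format:@%h%x09%an

    The "@<hash><TAB><author>" header line is a hypothetical format
    chosen for this sketch, not necessarily what the package uses.
    """
    rows: List[Dict[str, object]] = []
    commit: Optional[str] = None
    author: Optional[str] = None
    for line in output.splitlines():
        if line.startswith("@"):
            # Header line: abbreviated hash and author name.
            commit, author = line[1:].split("\t", 1)
        elif line.strip():
            # Stat line: insertions, deletions, path (tab-separated).
            added, removed, path = line.split("\t", 2)
            rows.append(
                {
                    "commit": commit,
                    "author_name": author,
                    "file": path,
                    # git prints "-" for binary files; count them as 0.
                    "insertions": 0 if added == "-" else int(added),
                    "deletions": 0 if removed == "-" else int(removed),
                }
            )
    return rows


# Made-up sample text in the assumed format (one commit, two files).
sample = "@1a2b3c4\tAda\n3\t1\tsrc/app.py\n-\t-\tlogo.png\n"
rows = parse_numstat(sample)
ranges = list(iter_date_ranges(date(2024, 1, 1), date(2024, 1, 20), 7))
```

In a real run, each (since, before) pair would presumably be passed to git log through its --since and --before options, and the parsed rows would be tagged with that pair.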

Please note that this package does not aggregate or analyze the metrics it extracts. Instead, the output is provided in a format suitable for use with tools such as polars, pandas, and pyspark.

All stats obtained from this package will be unique when grouped by url + commit + author_name + file, and include the following fields:

  • url (str): The URL of the repository (provided because stats for multiple repositories can be obtained with one function call or command)
  • since (date|None): The start date for a pre-defined time period as determined by frequency and time range parameters provided by the user
  • before (date|None): The (non-inclusive) end date for a pre-defined time period as determined by frequency and time range parameters provided by the user
  • author_date (datetime.datetime|None): The date and time of the author's commit
  • author_name (str): The name of the author
  • commit (str): The abbreviated commit hash
  • file (str): The relative path of the modified file
  • insertions (int): The number of lines inserted in this commit (please note that this is always 0 for binary files)
  • deletions (int): The number of lines deleted in this commit (please note that this is always 0 for binary files)

Please note that:

  • The fields since and before are provided as a convenience for easy aggregation of stats, based on parameters provided by the user, but do not provide any additional information about the commit or file
  • All dates and times are expressed in Coordinated Universal Time (UTC), as timezone-unaware datetime.datetime or datetime.date objects in Python, and are output in ISO 8601 format when written to CSV/TSV files or to the console
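
Because aggregation is left to the user, a minimal sketch of summing stats per period and author may be helpful. It uses only the standard library and assumes that the CSV output has a header row whose column names match the field names listed above. That layout is an assumption, and the sample rows are invented purely for illustration.

```python
import csv
import io
from collections import defaultdict
from datetime import date
from typing import Dict, Optional, Tuple

# Invented sample data; the column layout is an assumption.
SAMPLE_CSV = """\
url,since,before,author_date,author_name,commit,file,insertions,deletions
https://example.com/repo.git,2024-01-01,2024-01-08,2024-01-02T10:00:00,Ada,1a2b3c4,src/app.py,3,1
https://example.com/repo.git,2024-01-01,2024-01-08,2024-01-03T09:30:00,Ada,5d6e7f8,README.md,10,0
https://example.com/repo.git,2024-01-08,2024-01-15,2024-01-09T12:00:00,Bob,9a8b7c6,src/app.py,2,2
"""

Key = Tuple[Optional[date], str]


def totals_by_period_and_author(csv_text: str) -> Dict[Key, Tuple[int, int]]:
    """Sum insertions and deletions per (since, author_name) pair."""
    totals: Dict[Key, list] = defaultdict(lambda: [0, 0])
    for row in csv.DictReader(io.StringIO(csv_text)):
        # `since` may be empty when no time period applies.
        since = date.fromisoformat(row["since"]) if row["since"] else None
        key = (since, row["author_name"])
        totals[key][0] += int(row["insertions"])
        totals[key][1] += int(row["deletions"])
    return {key: (ins, dels) for key, (ins, dels) in totals.items()}


totals = totals_by_period_and_author(SAMPLE_CSV)
```

Grouping on since works here because every row in the same pre-defined period carries the same since and before values. The same grouping could be expressed with a group-by in polars, pandas, or pyspark.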

Installation

You can install git-author-stats with pip:

pip3 install git-author-stats

Please note that you will need to specify the extra "github" in your pip install command if you want to extract stats from all repositories owned by a GitHub organization without providing each repository URL explicitly:

pip3 install 'git-author-stats[github]'
