
Metadata extraction tool for git repositories.

Project description

diffhouse

diffhouse is a Python tool that extracts git repository metadata, such as commit history, branches, tags and file-level diffs, into pandas DataFrames.

Requirements

Git 2.19 or greater.
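
One way to check this requirement before running diffhouse is to parse the output of `git --version`. The following is a minimal sketch that uses only the Python standard library; the names `MIN_GIT`, `parse_git_version` and `git_is_supported` are hypothetical helpers for illustration and are not part of diffhouse.

```python
import re
import subprocess

MIN_GIT = (2, 19)  # minimum version stated in the requirements


def parse_git_version(text):
    """Extract (major, minor) from output such as 'git version 2.39.2'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized git version string: {text!r}")
    return int(match.group(1)), int(match.group(2))


def git_is_supported(minimum=MIN_GIT):
    """Return True if the git executable on PATH is at least `minimum`."""
    try:
        result = subprocess.run(
            ["git", "--version"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        # git is missing or failed to run
        return False
    return parse_git_version(result.stdout) >= minimum
```

Calling `git_is_supported()` returns `False` when git is missing or older than 2.19, so it can serve as a guard before creating a `Repo`.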

Quick start

  1. Install diffhouse with pip:
pip install diffhouse
  2. Import the Repo class in Python:
from diffhouse import Repo
  3. Create a Repo instance with the git repository URL as an argument. Set blobs to True to load file-level diffs as well.

Note that blobs=True greatly increases processing time, as it requires a complete clone of the repository.

r = Repo(url='https://github.com/user/name.git', blobs=True)
  4. Access data through the following pandas DataFrames:
Table          Description
Repo.commits   Commit history.
Repo.branches  Branch names.
Repo.tags      Tag names.
Repo.diffs     File-level changes. Available only if blobs is True.

For a full list of metadata tables and columns, see the documentation.
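
The quick start steps can be combined into a short script. The sketch below relies only on what is described above: `Repo(url=..., blobs=...)` and the `commits`, `branches`, `tags` and `diffs` tables. The helper names `table_sizes` and `load_and_summarize` are hypothetical, and the code assumes `diffs` should only be read when `blobs` is True, since its behavior otherwise is not specified here.

```python
def table_sizes(repo, blobs=False):
    """Return the number of rows in each metadata table of `repo`.

    `repo` is expected to expose `commits`, `branches` and `tags`
    (plus `diffs` when blobs is True) as objects supporting len(),
    which pandas DataFrames do.
    """
    names = ["commits", "branches", "tags"]
    if blobs:
        # Assumption: only touch `diffs` when file-level data was requested.
        names.append("diffs")
    return {name: len(getattr(repo, name)) for name in names}


def load_and_summarize(url, blobs=False):
    """Load a repository with diffhouse and report table sizes."""
    # Imported here so table_sizes() stays usable without diffhouse installed.
    from diffhouse import Repo

    repo = Repo(url=url, blobs=blobs)
    return table_sizes(repo, blobs=blobs)
```

For example, `load_and_summarize('https://github.com/user/name.git')` would return a dictionary of row counts for the commits, branches and tags tables. Leaving `blobs` at False avoids the complete clone mentioned in step 3.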

Download files

Source Distribution

diffhouse-0.2.0.tar.gz (7.7 kB view details)

Uploaded Source

Built Distribution

diffhouse-0.2.0-py3-none-any.whl (7.5 kB view details)

Uploaded Python 3

File details

Details for the file diffhouse-0.2.0.tar.gz.

File metadata

  • Download URL: diffhouse-0.2.0.tar.gz
  • Size: 7.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.2

File hashes

Hashes for diffhouse-0.2.0.tar.gz
Algorithm    Hash digest
SHA256       ee8cc7ab273e02489a85b9a54986d918baa814d6ee651dd2810d0a410b3a8d41
MD5          adab8b7acaca588624d6585024aa7094
BLAKE2b-256  947f723c59b0b4920e3302a487b928ed893c2be4d1b921d5e23ea795f5c597ec

File details

Details for the file diffhouse-0.2.0-py3-none-any.whl.

File metadata

  • Download URL: diffhouse-0.2.0-py3-none-any.whl
  • Size: 7.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.13.2

File hashes

Hashes for diffhouse-0.2.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       a3768555f83b990865c61f10e4107fafebce8cdc6be118fba3127ad7ea967f54
MD5          b6ca2ed91aa9d4062b67a9a37dec52d5
BLAKE2b-256  eb5b8f67980422740c87cf66adde889a6e8c1d5d53713aa4b962afa62270e96d
