CoCluReMiG - COmmit CLUstering and REpository MIning for Git

A simple-to-use library for mining Git repositories.

Usage

Commit graph

import cocluremig.utils.gitutils as gitutils

# Get the Git repository (saved to a temporary directory by default)
repo = gitutils.get_repo("https://github.com/mmonschau/cocluremig")

# Get the commit graph as an edge list, together with the commits
edges, commits = gitutils.get_edge_list(repo)
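
Once the edge list is available, it can be turned into an adjacency map for traversal. The following is a minimal sketch that assumes each edge is a `(parent_sha, child_sha)` pair, which this README does not confirm. The helpers `build_children_map` and `find_merge_commits` and the sample ids are hypothetical, and the snippet does not call CoCluReMiG itself.

```python
from collections import defaultdict


def build_children_map(edges):
    """Map each parent id to the sorted list of its child ids."""
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)
    return {parent: sorted(kids) for parent, kids in children.items()}


def find_merge_commits(edges):
    """Return the ids that appear as the child of more than one parent."""
    parent_count = defaultdict(int)
    for _parent, child in edges:
        parent_count[child] += 1
    return sorted(c for c, n in parent_count.items() if n > 1)


# Hypothetical stand-in for the `edges` value (assumed pair format).
sample_edges = [("a1", "b2"), ("a1", "c3"), ("b2", "d4"), ("c3", "d4")]

children_map = build_children_map(sample_edges)
merge_commits = find_merge_commits(sample_edges)
```

If the real edge list uses a different orientation or carries commit objects instead of ids, only the tuple unpacking in the two loops would need to change.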

Pre-Defined Commit Metric

import cocluremig.utils.gitutils as gitutils
import cocluremig.analyzer.commit.analyzers as c_analyzers

repo = gitutils.get_repo("https://github.com/mmonschau/cocluremig")

file_type_analyzer = c_analyzers.get_file_number_per_extension_analyzer(repo)

for c in c_analyzers.get_all_commits(repo):
    # Basic data: sha, date_committed, date_authored, signed,
    # author_name, author_mail, committer_name, committer_mail
    c_analyzers.get_basic_commit_data(c)
    # Number of files per extension, e.g. {'py': 26, 'md': 1, 'toml': 1, 'cfg': 1}
    file_type_analyzer.apply_metric(c)
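
Per-commit results of this kind can be compared to see how the file mix changes between two commits. The sketch below assumes that the metric yields a plain dict mapping extension to file count, as the example output suggests. The helper `extension_delta` and the second sample dict are hypothetical and not part of the library.

```python
def extension_delta(before, after):
    """Return the change in file count per extension, omitting unchanged ones."""
    extensions = set(before) | set(after)
    delta = {ext: after.get(ext, 0) - before.get(ext, 0) for ext in extensions}
    return {ext: change for ext, change in sorted(delta.items()) if change != 0}


# Hypothetical metric results for two consecutive commits.
older = {"py": 26, "md": 1, "toml": 1, "cfg": 1}
newer = {"py": 27, "md": 1, "toml": 1}

delta = extension_delta(older, newer)
```

Here `delta` records one added `py` file and one removed `cfg` file, while extensions whose counts did not change are left out.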

Own Commit Metric

import cocluremig.utils.gitutils as gitutils
import cocluremig.analyzer.commit
import cocluremig.analyzer.commit.analyzers
import cocluremig.analyzer.commit.base_analyzer
import cocluremig.analyzer.commit.blob_inspectors

repo = gitutils.get_repo("https://github.com/mmonschau/cocluremig")

def get_tokens(blob):
    text = cocluremig.analyzer.commit.blob_inspectors.get_text_representation(blob)
    # import git ...
    tokens = set(text.split())
    return tokens


# Module paths below match the imports at the top of this example
analyzer = cocluremig.analyzer.commit.base_analyzer.RepoFileMetricAnalyzer(
    repo, get_tokens, lambda x, y: x.union(y), set()
)

for c in cocluremig.analyzer.commit.analyzers.get_all_commits(repo):
    analyzer.apply_metric(c)
    # {if, is, for, in, ...}
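
The call above passes a per-file function, a function that combines two results, and a starting value, which reads like a fold over the files of a commit. The sketch below illustrates that pattern on plain strings. It is an assumption about how the arguments interact, not the library's implementation, and `fold_file_metric`, `get_tokens_from_text`, and the sample file contents are hypothetical.

```python
from functools import reduce


def fold_file_metric(file_texts, metric, combine, initial):
    """Apply `metric` to every file and merge the results with `combine`."""
    return reduce(combine, (metric(text) for text in file_texts), initial)


def get_tokens_from_text(text):
    """Whitespace tokenizer that works on plain strings instead of blobs."""
    return set(text.split())


# Hypothetical file contents standing in for the blobs of one commit.
sample_files = ["if x in y", "for x in y"]

tokens = fold_file_metric(
    sample_files, get_tokens_from_text, lambda x, y: x.union(y), set()
)
```

Under this reading, the starting value should be the identity of the combining function (an empty set for union, `0` for addition), so that a commit without files yields a sensible result.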

Other examples

See the samples folder.

LICENSE

GPLv3+

I chose the GPL because research is hard to reproduce when authors publish only pseudocode in a paper. The license requires that further development based on this library remains public.
