github2pandas supports the aggregation of project activities in a GitHub repository and makes them available in pandas dataframes
Transform GitHub Activities to Pandas Dataframes
General information
This package is being developed by the participating partners (TU Bergakademie Freiberg, OVGU Magdeburg and HU Berlin) as part of the DiP-iT project.
The package implements Python functions for
- aggregating and preprocessing GitHub activities (Commits, Actions, Issues, Pull-Requests) and
- generating project progress summaries according to different metrics (e.g. ratio of changed lines, ratio of aggregated Levenshtein distances).
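To make the Levenshtein-based metric more concrete, here is a minimal sketch of the edit distance itself and of one possible way to turn aggregated distances into ratios. Both `levenshtein` and `distance_shares` are hypothetical helpers written for illustration; the per-author aggregation is an assumption and may differ from how the package computes its metrics.

```python
def levenshtein(a, b):
    """Return the Levenshtein (edit) distance between two strings."""
    # Keep the shorter string in the inner loop to limit row size.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def distance_shares(distance_by_author):
    """Convert summed distances per author into ratios of the total.

    Assumption: the input maps an author name to that author's
    aggregated Levenshtein distance over all edits.
    """
    total = sum(distance_by_author.values())
    if total == 0:
        return {author: 0.0 for author in distance_by_author}
    return {author: value / total for author, value in distance_by_author.items()}
```

For example, `distance_shares({"alice": 3, "bob": 1})` assigns three quarters of the aggregated distance to `alice`.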
github2pandas stores the collected information in a collection of pandas DataFrames, starting from a user-defined root folder. The structure below that folder (file names, folder names) is defined as member variables in the corresponding classes and can be overwritten. The default configuration results in the following file structure.
|-- My_Github_Repository_0 <- Repository name
| |- Repo.json <- Json file containing user and repo name
| |- Issues
| | |- pdIssuesComments.p
| | |- pdIssuesEvents.p
| | |- pdIssues.p
| | |- pdIssuesReactions.p
| |- PullRequests
| | |- pdPullRequestsComments.p
| | |- pdPullRequestsCommits.p
| | |- pdPullRequestsEvents.p
| | |- pdPullRequests.p
| | |- pdPullRequestsReactions.p
| | |- pdPullRequestsReviews.p
| |- Users.p
| |- Versions
| | |- pdCommits.p
| | |- pdEdits.p
| | |- pdBranches.p
| | |- pVersions.db
| | |- repo <- Repository clone
| | | |- ..
| |- Workflows
| | |- pdWorkflows.p
|-- My_Github_Repository_1
...
The internal structure and relations of the data frames are included in the project's wiki.
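The default layout above can be navigated with a small helper. The following is a sketch only: `DEFAULT_LAYOUT`, `expected_files` and `load_frame` are hypothetical names that are not part of github2pandas, the mapping simply mirrors the tree shown above, and `load_frame` assumes that the `.p` files are pandas pickles.

```python
from pathlib import Path

# Mirrors the default tree shown above. Folder and file names can be
# overwritten in the package's classes, so this mapping is an assumption.
DEFAULT_LAYOUT = {
    "Issues": [
        "pdIssues.p", "pdIssuesComments.p",
        "pdIssuesEvents.p", "pdIssuesReactions.p",
    ],
    "PullRequests": [
        "pdPullRequests.p", "pdPullRequestsComments.p",
        "pdPullRequestsCommits.p", "pdPullRequestsEvents.p",
        "pdPullRequestsReactions.p", "pdPullRequestsReviews.p",
    ],
    "Versions": ["pdCommits.p", "pdEdits.p", "pdBranches.p"],
    "Workflows": ["pdWorkflows.p"],
}


def expected_files(root, repo_name):
    """Return {name: Path} for every DataFrame file of the default layout."""
    repo_dir = Path(root) / repo_name
    paths = {"Users": repo_dir / "Users.p"}
    for folder, file_names in DEFAULT_LAYOUT.items():
        for file_name in file_names:
            paths[Path(file_name).stem] = repo_dir / folder / file_name
    return paths


def load_frame(path):
    """Load one stored DataFrame, assuming the file is a pandas pickle."""
    import pandas as pd  # lazy import: the path helper works without pandas
    return pd.read_pickle(path)
```

With a root folder `data`, `expected_files("data", "My_Github_Repository_0")["pdIssues"]` points at `data/My_Github_Repository_0/Issues/pdIssues.p`, which can then be passed to `load_frame`. Only unpickle files you created yourself, since pickle files can execute code when loaded.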
Installation
github2pandas is available on PyPI. Use pip to install the package.
sudo pip3 install github2pandas
Application
A GitHub token is required for authentication. GitHub's documentation describes how to generate one for your account. Customise the username and project name and explore any public or private repository you have access to with your account!
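One common way to keep the token out of scripts and notebooks is to read it from an environment variable. The sketch below assumes a variable named `GITHUB_TOKEN`; that name and the helpers `read_github_token` and `auth_headers` are illustrative and not part of github2pandas. The header format is the one accepted by the GitHub REST API.

```python
import os


def read_github_token(env_var="GITHUB_TOKEN"):
    """Return the token stored in the given environment variable.

    The variable name is an assumption; use whatever name you export.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Environment variable {env_var} is not set or empty")
    return token


def auth_headers(token):
    """Build request headers for authenticated GitHub REST API calls."""
    return {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    }
```

The returned token string can then be handed to whichever function or class expects it, so the secret never needs to be committed to version control.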
The corresponding github2pandas_notebooks repository illustrates the usage with exemplary investigations.
The documentation of the module is available at https://github2pandas.readthedocs.io/.
Working with pipenv
Process | Command |
---|---|
Installation | pipenv install --dev |
Run specific script | pipenv run python file.py |
Run all Tests | pipenv run python -m unittest |
Run all tests in a specific folder | pipenv run python -m unittest discover -s 'tests' |
Run all tests with specific filename | pipenv run python -m unittest discover -p 'test_*.py' |
Start Jupyter server in virtual environment | pipenv run jupyter notebook |
For Contributors
Naming conventions: https://namingconvention.org/python/
File details
Details for the file github2pandas-1.1.8.tar.gz.
File metadata
- Download URL: github2pandas-1.1.8.tar.gz
- Upload date:
- Size: 16.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.8.8
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 5e8a056642082e7f27b59fc265702d4235c2c471b37c34aa277456c986b4aab5 |
MD5 | a930659e3ded59f3859d2846d3ece464 |
BLAKE2b-256 | 756499dc988039deb12494fd9dce60a2ec51fb41b69e9af490d90c5ad004881c |
File details
Details for the file github2pandas-1.1.8-py3-none-any.whl.
File metadata
- Download URL: github2pandas-1.1.8-py3-none-any.whl
- Upload date:
- Size: 19.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/4.0.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.8.8
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | d4e8a1467e6b9dea4226cd42746d1ac3c2442621c3b59c647e835dd0305e3bb7 |
MD5 | ce151be039a04a91cb7ba63c7348f251 |
BLAKE2b-256 | fa7cb6313bba3e19907ffb159208d762480ada9b841cd7b9b03efbcff6f40403 |