radon-repositories-collector
A Python package that queries the GitHub GraphQL API to collect repository metadata.
Install
The package can be installed from PyPI as follows:
pip install repositories-collector
Python usage
import os
from datetime import datetime

from repocollector.github import GithubRepositoriesCollector

github_crawler = GithubRepositoriesCollector(
    access_token=os.getenv('GITHUB_ACCESS_TOKEN'),  # or paste your token
    since=datetime(2019, 12, 31),
    until=datetime(2020, 12, 31),
    pushed_after=datetime(2020, 6, 1),
    min_issues=0,
    min_releases=0,
    min_stars=0,
    min_watchers=0,
    primary_language='language')  # e.g., python

for repo in github_crawler.collect_repositories():
    print('id:', repo['id'])  # e.g., 123456
    print('default_branch:', repo['default_branch'])  # e.g., main
    print('owner:', repo['owner'])  # e.g., radon-h2020
    print('name:', repo['name'])  # e.g., radon-repositories-collector
    print('url:', repo['url'])
    print('description:', repo['description'])
    print('issues:', repo['issues'])
    print('releases:', repo['releases'])
    print('stars:', repo['stars'])
    print('watchers:', repo['watchers'])
    print('primary_language:', repo['primary_language'])
    print('created_at:', repo['created_at'])
    print('pushed_at:', repo['pushed_at'])
    print('dirs:', repo['dirs'])  # list of repo's root directories, e.g., [repocollector]
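The loop above only prints each record. To keep the results, the same dictionaries can be written to a CSV file. The following is a minimal sketch, assuming each record carries the keys printed above; `write_repositories_csv` and the sample record (including its URL and dates) are hypothetical and not part of the package.

```python
import csv
import io

# Field names taken from the keys printed in the usage example.
FIELDS = ['id', 'default_branch', 'owner', 'name', 'url', 'description',
          'issues', 'releases', 'stars', 'watchers', 'primary_language',
          'created_at', 'pushed_at', 'dirs']


def write_repositories_csv(repos, stream):
    """Write an iterable of repository dicts to a text stream as CSV.

    Returns the number of data rows written (header excluded).
    """
    writer = csv.DictWriter(stream, fieldnames=FIELDS, extrasaction='ignore')
    writer.writeheader()
    count = 0
    for repo in repos:
        row = dict(repo)
        # 'dirs' is a list; join it so it fits in a single CSV cell.
        row['dirs'] = ';'.join(row.get('dirs') or [])
        writer.writerow(row)
        count += 1
    return count


# Made-up record for illustration only; real records come from the collector.
sample = [{
    'id': 123456,
    'default_branch': 'main',
    'owner': 'radon-h2020',
    'name': 'radon-repositories-collector',
    'url': 'https://example.com/radon-h2020/radon-repositories-collector',
    'description': 'placeholder description',
    'issues': 0,
    'releases': 0,
    'stars': 0,
    'watchers': 0,
    'primary_language': 'python',
    'created_at': '2020-01-01',
    'pushed_at': '2020-06-02',
    'dirs': ['repocollector', 'tests'],
}]

buffer = io.StringIO()
written = write_repositories_csv(sample, buffer)
```

In a real run, `sample` would be replaced by `github_crawler.collect_repositories()` and `buffer` by a file opened with `newline=''`.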
Command-line usage
usage: repositories-collector [-h] [-v] [--from DATE_FROM]
                              [--to DATE_TO] [--pushed-after DATE_PUSH]
                              [--min-issues MIN_ISSUES]
                              [--min-releases MIN_RELEASES]
                              [--min-stars MIN_STARS]
                              [--min-watchers MIN_WATCHERS] [--verbose]
                              dest

A Python library to collect repositories metadata from GitHub.

positional arguments:
  dest                  destination folder for report

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  --from DATE_FROM      collect repositories created since this date (default:
                        2014-01-01 00:00:00)
  --to DATE_TO          collect repositories created up to this date (default:
                        2014-01-01 00:00:00)
  --pushed-after DATE_PUSH
                        collect only repositories pushed after this date
                        (default: 2019-01-01 00:00:00)
  --min-issues MIN_ISSUES
                        collect repositories with at least <min-issues> issues
                        (default: 0)
  --min-releases MIN_RELEASES
                        collect repositories with at least <min-releases>
                        releases (default: 0)
  --min-stars MIN_STARS
                        collect repositories with at least <min-stars> stars
                        (default: 0)
  --min-watchers MIN_WATCHERS
                        collect repositories with at least <min-watchers>
                        watchers (default: 0)
  --primary-language LANGUAGE
                        collect repositories written in this language
  --verbose             show log (default: False)
Important! The tool requires a personal access token to access the GitHub GraphQL API. See GitHub's documentation on creating a personal access token.
Add GITHUB_ACCESS_TOKEN=<paste your token here> to the environment variables.
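When the token is missing, it helps to fail early with a clear message instead of sending unauthenticated requests. A minimal sketch, assuming the token lives in the GITHUB_ACCESS_TOKEN variable as described above; `read_access_token` is a hypothetical helper, not part of the package.

```python
import os


def read_access_token(env=os.environ, name='GITHUB_ACCESS_TOKEN'):
    """Return the token stored under `name`, or raise if it is missing or blank."""
    token = env.get(name, '').strip()
    if not token:
        raise RuntimeError(
            f'{name} is not set; add it to the environment variables first')
    return token
```

The returned value can then be passed as the `access_token` argument of `GithubRepositoriesCollector`.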
Output
Running the tool from the command line generates an HTML report at <dest>/report.html.
Example
The following command searches for repositories written in Python and created between 2014-02-01 and 2014-02-03. The report is saved in the folder /tmp/.
repositories-collector --from 2014-02-01 --to 2014-02-03 --primary-language python /tmp/
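For scripted runs, the command line can also be assembled from Python. A minimal sketch, assuming the `--from`, `--to`, `--primary-language` and `--verbose` options behave as listed in the usage text; `build_command` is a hypothetical helper, not part of the package.

```python
def build_command(dest, date_from=None, date_to=None,
                  primary_language=None, verbose=False):
    """Build the argument list for the repositories-collector CLI.

    Only options that were actually supplied are added, so the tool's
    own defaults apply to everything else.
    """
    args = ['repositories-collector']
    if date_from:
        args += ['--from', date_from]
    if date_to:
        args += ['--to', date_to]
    if primary_language:
        args += ['--primary-language', primary_language]
    if verbose:
        args.append('--verbose')
    args.append(dest)  # the only positional argument in the usage text
    return args


cmd = build_command('/tmp/', date_from='2014-02-01', date_to='2014-02-03',
                    primary_language='python')
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`, provided the package is installed and GITHUB_ACCESS_TOKEN is set.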
Contributions
To report bugs, visit the issue tracker.
In case you want to play with the source code or contribute improvements, see CONTRIBUTING.
Version
[0.0.2] Fixed missed import of config.json in MANIFEST.in