GitHub Collaboration Relation Extraction
Reason this release was yanked:
NAME ERROR : HTTPError: 400 Bad Request from https://upload.pypi.org/legacy/ Filename 'gh-core-2.3.0.1.tar.gz' is invalid, should be 'gh_core-2.3.0.1.tar.gz'.
Project description
GitHub_Collaboration_Relation_Extraction
Collaboration relation extraction from GitHub logs. Collaboration relations fall into two categories: EventAction relations and Reference relations. This is a relation extraction tool for the project https://github.com/birdflyi/OSDB_STN_reference_coupling_data_analysis.
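To make the idea of a Reference relation more concrete, here is a minimal sketch that scans a piece of text for references to other GitHub entities. The regular expressions, the `PATTERNS` table, and the `find_references` helper are illustrative assumptions; they are not the package's own rules or API, and the relation types the package actually extracts may differ.

```python
import re

# Illustrative patterns only (assumptions, not the package's own rules).
PATTERNS = {
    # "#42" style references to an issue or pull request
    "issue_or_pr": re.compile(r"(?<![\w/])#(\d+)\b"),
    # "@octocat" style user mentions
    "user_mention": re.compile(r"(?<![\w.])@([A-Za-z0-9][A-Za-z0-9-]{0,38})"),
    # full 40-character commit hashes
    "commit_sha": re.compile(r"\b[0-9a-f]{40}\b"),
}


def find_references(text):
    """Return (kind, matched_text) pairs for each reference found in text."""
    refs = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            refs.append((kind, match.group(0)))
    return refs


sample = "Fixes #42, thanks @octocat for the review."
references = find_references(sample)
print(references)
```

Each pair could then be treated as one candidate edge from the entity that contains the text to the entity it mentions.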
Quick Start
- Download the directory etc/ and the file main.py from GitHub_Collaboration_Relation_Extraction into the root directory of your new project.
- Change the default settings in etc/authConf.py.
  - AuthConfig
    - Set DEFAULT_INTMED_MODE to one of [I_AUTH_SETTINGS_LOCAL_HOSTS, I_AUTH_SETTINGS_ALIYUN_HOSTS, I_AUTH_SETTINGS_ALIYUN_INTERMEDIATE_HOSTS], and fill in the corresponding auth_settings_xxx_hosts dict.
    - If you have an Aliyun Cloud or other database service that holds the GitHub log tables, set the server authorization information below the line marked Aliyun.
    - If you want a sample dataset to start with, you can download a ClickHouse sample dataset for your Docker container, and set the server authorization information below the line marked local docker image.
  - GITHUB_TOKENS
    - Replace GITHUB_TOKENS with valid GitHub tokens, which start with 'gh'. If you do not have a GitHub token, create a fine-grained personal access token.
- Change the settings in main.py and run it.
  - Change the repo_names and year settings.
    - Note: It may take a long time to process all records. Set limit to a positive integer to cap the number of records when you just want to run a test.
  - Create the data/ directory.
    - Create the directories in the root directory of your project: data_dirs = ['data', 'data/github_osdb_data', 'data/global_data', 'data/github_osdb_data/repos', 'data/github_osdb_data/repos_dedup_content', 'data/github_osdb_data/GitHub_Collaboration_Network_repos']. Make directories:
import os

# Set a base directory, or leave it empty to use the current working directory.
base_dir = '' or os.getcwd()
data_dirs = ['data', 'data/github_osdb_data', 'data/global_data', 'data/github_osdb_data/repos', 'data/github_osdb_data/repos_dedup_content', 'data/github_osdb_data/GitHub_Collaboration_Network_repos']
for rel_data_dir in data_dirs:
    # exist_ok=True avoids FileExistsError when a directory already exists
    os.makedirs(os.path.join(base_dir, rel_data_dir), exist_ok=True)
Download files
Built Distribution
File details
Details for the file gh_core-2.3.0.1-py2.py3-none-any.whl.
File metadata
- Download URL: gh_core-2.3.0.1-py2.py3-none-any.whl
- Upload date:
- Size: 130.3 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 7e8a8ba50fdfbe91c9ad1785f3aa93b730719195e71db6fbc7d3dfc7de104c25 |
| MD5 | f3efe268ee95bbcd1f2ced4de6386a9a |
| BLAKE2b-256 | c3d274f2782487b714aa9287e7eec195f4ef966e8b64760fd259b04e3961109f |