Extract file paths from GitHub repositories categorized by extension
Project description
GitHub Repository Parser
Hosted on github.com/Blink29
Why parse a repository?
- Collecting every file in a repository, flattened and grouped by extension, is too tedious to do manually.
- The grouped files can then be used for further analysis or testing, as required.
How does the parser work?
- The parser is built on the GitHub API.
- It walks the repository recursively, level by level, to obtain the absolute path of each file.
- As it walks, it groups files of each type together and returns a dictionary of the form { extension: array_of_files }.
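The level-wise walk and grouping described above can be sketched as follows. This is a minimal illustration, not the package's actual implementation: `group_by_extension`, `fake_list_dir`, `FAKE_TREE`, the `example.invalid` URLs, and the "no_extension" key are all hypothetical. The entry fields (`type`, `name`, `path`, `html_url`) mirror the items returned by the GitHub repository contents endpoint, but the directory listing here is an in-memory stand-in so the example runs without network access.

```python
from collections import defaultdict, deque


def group_by_extension(list_dir, root=""):
    """Walk a tree level by level and group entry URLs by file extension.

    list_dir(path) is a hypothetical callable returning a list of dicts
    with "type", "name", "path" and "html_url" keys.
    """
    groups = defaultdict(list)
    queue = deque([root])  # directories still to be listed
    while queue:
        path = queue.popleft()
        for entry in list_dir(path):
            if entry["type"] == "dir":
                groups["directories"].append(entry["html_url"])
                queue.append(entry["path"])  # visit on the next level
            else:
                name = entry["name"]
                # ".gitignore" -> "gitignore"; names without a dot get an
                # assumed "no_extension" key.
                key = name.rsplit(".", 1)[-1] if "." in name else "no_extension"
                groups[key].append(entry["html_url"])
    return dict(groups)


BASE = "https://example.invalid/demo"  # placeholder, not a real repository


def _entry(kind, path):
    """Build one fake listing entry for the demo tree."""
    segment = "tree" if kind == "dir" else "blob"
    return {
        "type": kind,
        "name": path.rsplit("/", 1)[-1],
        "path": path,
        "html_url": f"{BASE}/{segment}/main/{path}",
    }


FAKE_TREE = {
    "": [
        _entry("file", "README.md"),
        _entry("file", ".gitignore"),
        _entry("file", "setup.py"),
        _entry("dir", "pkg"),
    ],
    "pkg": [_entry("file", "pkg/__init__.py")],
}


def fake_list_dir(path):
    """Stand-in for a real directory-listing API call."""
    return FAKE_TREE.get(path, [])


result = group_by_extension(fake_list_dir)
```

A queue is used instead of recursion so that directories are processed one level at a time. Replacing `fake_list_dir` with a function that queries a real API would leave the grouping logic unchanged.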
Installation
pip install github-repo-files-parser
Illustration
from github_repo_files_parser import GitHubRepoFilesParser
parser = GitHubRepoFilesParser()
repo_url = "https://github.com/Blink29/github_repo_files_parser"
# Returns a dictionary of the form { extension: array_of_files }
links = parser.get_raw_repo_links(repo_url)
Sample Output
{
"py": [
"https://github.com/Blink29/github_repo_files_parser/blob/main/index.py",
"https://github.com/Blink29/github_repo_files_parser/blob/main/setup.py",
"https://github.com/Blink29/github_repo_files_parser/blob/main/github_repo_files_parser/__init__.py",
"https://github.com/Blink29/github_repo_files_parser/blob/main/github_repo_files_parser/github_repo_files_parser.py"
],
"md": [
"https://github.com/Blink29/github_repo_files_parser/blob/main/README.md"
],
"gitignore": [
"https://github.com/Blink29/github_repo_files_parser/blob/main/.gitignore"
],
"directories": [
"https://github.com/Blink29/github_repo_files_parser/tree/main/github_repo_files_parser"
],
"cfg": [
"https://github.com/Blink29/github_repo_files_parser/blob/main/github_repo_files_parser/setup.cfg"
]
}
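As one example of further use, the sketch below takes a dictionary shaped like the sample output, counts the entries per key, and rewrites each `blob` page URL into its `raw.githubusercontent.com` equivalent so the file contents can be downloaded directly. The helper `blob_to_raw` and the variable names are hypothetical and not part of the package. The `sample` dictionary is a shortened copy of the output above.

```python
def blob_to_raw(url):
    """Convert a github.com blob URL into its raw.githubusercontent.com form.

    URLs that are not blob links (for example directory "tree" links)
    are returned unchanged.
    """
    prefix = "https://github.com/"
    if not url.startswith(prefix) or "/blob/" not in url:
        return url
    owner_repo, rest = url[len(prefix):].split("/blob/", 1)
    return f"https://raw.githubusercontent.com/{owner_repo}/{rest}"


# Shortened copy of the sample output shown above.
sample = {
    "py": [
        "https://github.com/Blink29/github_repo_files_parser/blob/main/setup.py",
    ],
    "md": [
        "https://github.com/Blink29/github_repo_files_parser/blob/main/README.md",
    ],
    "directories": [
        "https://github.com/Blink29/github_repo_files_parser/tree/main/github_repo_files_parser",
    ],
}

# Number of entries recorded under each key.
counts = {ext: len(urls) for ext, urls in sample.items()}

# Raw download links for files only; directories have no raw form.
raw_links = {
    ext: [blob_to_raw(u) for u in urls]
    for ext, urls in sample.items()
    if ext != "directories"
}
```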
Built Distribution
Hashes for github_repo_files_parser-1.0.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 74c4d99ca7780f9964e71cb35dba9e6b929dce7434ad48ea4b3efe512bb11ba5
MD5 | 02189a3f4169229fe99b8b312e0a67b7
BLAKE2b-256 | d0405aef9009c369ded48f5d80303ef8683f3e838c65d6f79d5805bd25463a31