This Python package crawls links that belong to GitHub domains. Beyond simple crawling, it can list the repositories associated with a given link, download GitHub repositories, and extract their contents.
github-domain-scraper
The github-domain-scraper is a tool for extracting information from GitHub domains, such as the repositories associated with a link. It can be used from the command line or imported into other Python code.
Installation
You can install github-domain-scraper from PyPI:
python -m pip install github-domain-scraper
The package is supported on Python 3.8 and above.
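To check an environment before or after installing, a minimal sketch along the following lines may help. The helper names `python_is_supported` and `package_is_installed` are hypothetical and not part of github-domain-scraper; the only facts taken from this page are the minimum Python version and the module name `github_domain_scraper`.

```python
import importlib.util
import sys

# Minimum interpreter version stated in the installation notes.
MIN_PYTHON = (3, 8)


def python_is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) version meets the stated minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def package_is_installed(module_name="github_domain_scraper"):
    """Return True if the top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None
```

Calling `python_is_supported()` with no arguments checks the running interpreter, and `package_is_installed()` reports whether the install step succeeded in the current environment.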
How to use
The github-domain-scraper can be used in two ways: as a command-line tool or as a library imported into other Python modules.
Command-line Tool
You can use github-domain-scraper as a command-line tool to extract information from GitHub domains:
python -m github_domain_scraper --link=https://github.com/Parth971
You can also specify a JSON output file for the results:
python -m github_domain_scraper --link=https://github.com/Parth971 --json=repo.json
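If you want to drive the command-line tool from a script, one possible approach is to build the same command shown above and run it with `subprocess`. This is a sketch under assumptions: `build_command` and `scrape_to_json` are hypothetical helper names, only the `--link` and `--json` flags shown above are used, and the layout of the JSON output is not documented here, so the result is returned as-is without interpreting it.

```python
import json
import subprocess
import sys


def build_command(link, json_path=None):
    """Build the argument list for the CLI invocation shown above."""
    cmd = [sys.executable, "-m", "github_domain_scraper", f"--link={link}"]
    if json_path is not None:
        cmd.append(f"--json={json_path}")
    return cmd


def scrape_to_json(link, json_path):
    """Run the CLI and load the file it writes.

    Assumes the tool exits with a non-zero status on failure and that the
    output file contains valid JSON; neither is confirmed by this page.
    """
    subprocess.run(build_command(link, json_path), check=True)
    with open(json_path, encoding="utf-8") as fh:
        return json.load(fh)
```

For example, `scrape_to_json("https://github.com/Parth971", "repo.json")` would run the second command above and return whatever the tool wrote to `repo.json`.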
Integration in Python Modules
The github-domain-scraper can also be integrated into other Python modules. Import the LinkExtractor class from github_domain_scraper.link_extractor and use it as follows:
from github_domain_scraper.link_extractor import LinkExtractor
links = LinkExtractor(initial_link="github_link").extract()
This makes it easy to incorporate github-domain-scraper functionality into your custom Python projects.
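When the link comes from user input, it may be worth validating it before handing it to LinkExtractor. The sketch below is one way to do that; `is_github_link` and `extract_links` are hypothetical wrappers, not part of the package, and the rule that a valid link is an http(s) URL on github.com or one of its subdomains is an assumption about what "GitHub domains" means here. The return type of `extract()` is not documented on this page, so it is passed through unchanged.

```python
from urllib.parse import urlparse


def is_github_link(link):
    """Return True for http(s) URLs whose host is github.com or a subdomain."""
    parsed = urlparse(link)
    host = (parsed.hostname or "").lower()
    on_github = host == "github.com" or host.endswith(".github.com")
    return parsed.scheme in ("http", "https") and on_github


def extract_links(link):
    """Validate the link, then delegate to LinkExtractor as shown above."""
    if not is_github_link(link):
        raise ValueError(f"not a GitHub link: {link!r}")
    # Imported lazily so the validation helper works without the package.
    from github_domain_scraper.link_extractor import LinkExtractor

    return LinkExtractor(initial_link=link).extract()
```

Rejecting bad input early keeps the failure close to its cause, rather than relying on however the crawler reacts to an unexpected URL.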
License
This project is licensed under the MIT License - see the LICENSE.md file for details.
Source Distribution
Hashes for github-domain-scraper-1.0.0.tar.gz
Algorithm | Hash digest
---|---
SHA256 | b14bb728b107ab0af9180ab0b8cc3fd067200f9f8ee9f9526535878e6f0ba55c
MD5 | 0147c67529e361eec258cd76652fd839
BLAKE2b-256 | c214372562baa845276c5d67cd8f2817156d4ce0f664b827ff6cb32fb03299f1
Built Distribution
Hashes for github_domain_scraper-1.0.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 523c28b0a18e864fedb57e5be3f0c4b6f74b8d981651c5b7d944c3e228bd314f
MD5 | 4c65283ba3674468408142e206b5fa06
BLAKE2b-256 | 994bab3382bdc4df69703e7a5826537ccfa63ee5d4e4aa92c6c97b1efda351a5