Scraping GitHub repository

Project description

reposcraping

This library lets you list the files and folders of any GitHub repository. The Cloner class clones the file types you choose to the paths you specify.

setup

pip install reposcraping

usage

from reposcraping import RepoScraping
from reposcraping.cloner import Cloner

# Scrape the repository page to collect its folder and file URLs.
scraping = RepoScraping(
    "https://github.com/emresvd/random-video",
    p=True,
)

print(scraping.tree_urls)  # folder (tree) URLs
print(scraping.file_urls)  # file (blob) URLs

# Clone files into a destination folder chosen by file extension.
cloner = Cloner(scraping)
cloner.clone(
    paths={
        ".py": "files/python_files",
        ".txt": "files/text_files",
        ".md": "files/markdown_files",
        ".html": "files/html_files",
        "": "files/other_files",
    },
    only_file_name=True,
    p=True,
)
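
To make the `paths` argument more concrete, here is a minimal sketch of how an extension-to-folder mapping like the one above could be resolved for a single file URL. It is not the library's implementation: `destination_for` is a hypothetical helper, and it assumes that the `""` key acts as a fallback for unmatched extensions, that `only_file_name=True` drops the file's folder structure, and that the branch name contains no slash.

```python
import posixpath
from urllib.parse import urlparse


def destination_for(file_url, paths, only_file_name=True):
    """Hypothetical helper: pick a local destination for one blob URL."""
    # ".../blob/master/video/views.py" -> "video/views.py"
    # (assumes a single-segment branch name such as "master")
    url_path = urlparse(file_url).path
    repo_path = url_path.split("/blob/", 1)[1].split("/", 1)[1]

    # Look up the extension; fall back to the "" key (assumed catch-all).
    extension = posixpath.splitext(repo_path)[1]
    folder = paths.get(extension, paths.get(""))
    if folder is None:
        return None  # no rule for this file type and no fallback

    # Keep only the base name, or preserve the path inside the repository.
    name = posixpath.basename(repo_path) if only_file_name else repo_path
    return posixpath.join(folder, name)


paths = {".py": "files/python_files", "": "files/other_files"}
url = "https://github.com/emresvd/random-video/blob/master/video/views.py"
print(destination_for(url, paths))
```

Under these assumptions, a file with no extension of its own, such as `.gitignore`, would land in the fallback folder.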

output

['https://github.com/emresvd/random-video', 'https://github.com/emresvd/random-video/tree/master/random_video', 'https://github.com/emresvd/random-video/tree/master/special_search', 'https://github.com/emresvd/random-video/tree/master/static', 'https://github.com/emresvd/random-video/tree/master/templates', 'https://github.com/emresvd/random-video/tree/master/video', 'https://github.com/emresvd/random-video/tree/master/random_video/__pycache__', 'https://github.com/emresvd/random-video/tree/master/video/__pycache__', 'https://github.com/emresvd/random-video/tree/master/video/migrations', 'https://github.com/emresvd/random-video/tree/master/video/migrations/__pycache__']

['https://github.com/emresvd/random-video/blob/master/LICENSE.md', 'https://github.com/emresvd/random-video/blob/master/.gitignore', 'https://github.com/emresvd/random-video/blob/master/README.md', 'https://github.com/emresvd/random-video/blob/master/db.sqlite3', 'https://github.com/emresvd/random-video/blob/master/manage.py', 'https://github.com/emresvd/random-video/blob/master/requirements.txt', 'https://github.com/emresvd/random-video/blob/master/words.txt', 'https://github.com/emresvd/random-video/blob/master/static/ic_launcher-playstore.png', 'https://github.com/emresvd/random-video/blob/master/random_video/__init__.py', 'https://github.com/emresvd/random-video/blob/master/random_video/asgi.py', 'https://github.com/emresvd/random-video/blob/master/random_video/settings.py', 'https://github.com/emresvd/random-video/blob/master/random_video/urls.py', 'https://github.com/emresvd/random-video/blob/master/random_video/wsgi.py', 'https://github.com/emresvd/random-video/blob/master/special_search/car.txt', 'https://github.com/emresvd/random-video/blob/master/special_search/food.txt', 'https://github.com/emresvd/random-video/blob/master/special_search/rocket.txt', 'https://github.com/emresvd/random-video/blob/master/special_search/space.txt', 'https://github.com/emresvd/random-video/blob/master/special_search/travel.txt', 'https://github.com/emresvd/random-video/blob/master/static/favicon.ico', 'https://github.com/emresvd/random-video/blob/master/templates/download.html', 'https://github.com/emresvd/random-video/blob/master/templates/index.html', 'https://github.com/emresvd/random-video/blob/master/video/__init__.py', 'https://github.com/emresvd/random-video/blob/master/video/admin.py', 'https://github.com/emresvd/random-video/blob/master/video/apps.py', 'https://github.com/emresvd/random-video/blob/master/video/models.py', 'https://github.com/emresvd/random-video/blob/master/video/random_video.py', 'https://github.com/emresvd/random-video/blob/master/video/tests.py', 
'https://github.com/emresvd/random-video/blob/master/video/views.py', 'https://github.com/emresvd/random-video/blob/master/random_video/__pycache__/__init__.cpython-37.pyc', 'https://github.com/emresvd/random-video/blob/master/random_video/__pycache__/settings.cpython-37.pyc', 'https://github.com/emresvd/random-video/blob/master/random_video/__pycache__/urls.cpython-37.pyc', 'https://github.com/emresvd/random-video/blob/master/random_video/__pycache__/wsgi.cpython-37.pyc', 'https://github.com/emresvd/random-video/blob/master/video/__pycache__/__init__.cpython-37.pyc', 'https://github.com/emresvd/random-video/blob/master/video/__pycache__/admin.cpython-37.pyc', 'https://github.com/emresvd/random-video/blob/master/video/__pycache__/models.cpython-37.pyc', 'https://github.com/emresvd/random-video/blob/master/video/__pycache__/random_video.cpython-37.pyc', 'https://github.com/emresvd/random-video/blob/master/video/__pycache__/views.cpython-37.pyc', 'https://github.com/emresvd/random-video/blob/master/video/migrations/__init__.py', 'https://github.com/emresvd/random-video/blob/master/video/migrations/__pycache__/__init__.cpython-37.pyc']
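
The file URLs in this output point at GitHub's HTML "blob" pages rather than the raw file contents. If you want to fetch a file yourself, a blob URL can be rewritten to its `raw.githubusercontent.com` equivalent. The sketch below is an illustration only: `blob_to_raw` is a hypothetical helper, not part of reposcraping, and how the library itself downloads files is not shown in this description.

```python
from urllib.parse import urlparse


def blob_to_raw(blob_url):
    """Hypothetical helper: rewrite a GitHub blob URL to its raw-content URL."""
    # Expected path layout: /<owner>/<repo>/blob/<branch>/<path...>
    parts = urlparse(blob_url).path.strip("/").split("/")
    if len(parts) < 5 or parts[2] != "blob":
        raise ValueError(f"not a GitHub blob URL: {blob_url}")

    # Drop the "blob" segment and switch hosts.
    owner, repo, _, *rest = parts
    return "https://raw.githubusercontent.com/" + "/".join([owner, repo, *rest])


raw_url = blob_to_raw(
    "https://github.com/emresvd/random-video/blob/master/manage.py"
)
print(raw_url)
```

Tree URLs such as those in the first list are rejected with a `ValueError`, since they name folders rather than files.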

Download files

Source Distribution

reposcraping-1.1.7.tar.gz (3.8 kB)

Uploaded Source

Built Distribution

reposcraping-1.1.7-py3-none-any.whl (4.3 kB)

Uploaded Python 3

File details

Details for the file reposcraping-1.1.7.tar.gz.

File metadata

  • Download URL: reposcraping-1.1.7.tar.gz
  • Size: 3.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.7.8

File hashes

Hashes for reposcraping-1.1.7.tar.gz
Algorithm Hash digest
SHA256 3ef3ba2675594ec9ad0f0a6748f027edf016eae75c95d1407306aee173675a48
MD5 51b63589a63a394da38400f8660faf98
BLAKE2b-256 3e1c11df304708e5be2ab6612dfbd85610f53e60cabc32ad6b0376f01d67b390

File details

Details for the file reposcraping-1.1.7-py3-none-any.whl.

File hashes

Hashes for reposcraping-1.1.7-py3-none-any.whl
Algorithm Hash digest
SHA256 4f515b6c2a9673fd87f6ab64501c5ed6a63359e2692c486947715cfe997042ed
MD5 f9bb93394d2d7221e08c8e791db65328
BLAKE2b-256 6074e020a8637e84c2336db3d4bf63e9dbe19170ae809362a1d99515a7c6e210
