reposcraping

Scraping GitHub repository

This library lets you list the files and folders of any GitHub repository. Its Cloner class then downloads the file types you choose into the local paths you specify.

setup

pip install reposcraping

usage

from reposcraping import RepoScraping
from reposcraping.cloner import Cloner

scraping = RepoScraping(
    "https://github.com/emresvd/random-video",
    p=True,
)

print(scraping.tree_urls)
print(scraping.file_urls)

cloner = Cloner(scraping)
cloner.clone(
    paths={
        ".py": "files/python_files",
        ".txt": "files/text_files",
        ".md": "files/markdown_files",
        ".html": "files/html_files",
    },
    only_file_name=True,
    p=True,
)
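
The paths argument maps a file extension to a destination folder. The sketch below illustrates that idea only; it is not the library's implementation. The helper name destination_for and its flatten parameter are hypothetical, and flatten=True mirrors what only_file_name=True presumably does (keep just the file name and drop the folders inside the repository). It also assumes a branch name without slashes, such as master.

```python
import posixpath
from urllib.parse import urlparse


def destination_for(file_url, paths, flatten=True):
    """Return the local path for a GitHub blob URL, or None if its extension is not mapped.

    Hypothetical helper for illustration; not part of reposcraping.
    """
    # Path component of the URL, e.g. /emresvd/random-video/blob/master/video/views.py
    url_path = urlparse(file_url).path
    _, ext = posixpath.splitext(url_path)
    if ext not in paths:
        return None
    if flatten:
        # Keep only the file name.
        relative = posixpath.basename(url_path)
    else:
        # Drop "/<user>/<repo>/blob/<branch>/" and keep the path inside the repository.
        # Assumes the branch name contains no slash.
        relative = url_path.split("/", 5)[5]
    return posixpath.join(paths[ext], relative)


paths = {".py": "files/python_files", ".txt": "files/text_files"}
url = "https://github.com/emresvd/random-video/blob/master/video/views.py"
print(destination_for(url, paths))                 # files/python_files/views.py
print(destination_for(url, paths, flatten=False))  # files/python_files/video/views.py
```

Files whose extension is not in the mapping, such as db.sqlite3 here, yield None and would simply be skipped.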

output

['https://github.com/emresvd/random-video', 'https://github.com/emresvd/random-video/tree/master/random_video', 'https://github.com/emresvd/random-video/tree/master/special_search', 'https://github.com/emresvd/random-video/tree/master/static', 'https://github.com/emresvd/random-video/tree/master/templates', 'https://github.com/emresvd/random-video/tree/master/video', 'https://github.com/emresvd/random-video/tree/master/random_video/__pycache__', 'https://github.com/emresvd/random-video/tree/master/video/__pycache__', 'https://github.com/emresvd/random-video/tree/master/video/migrations', 'https://github.com/emresvd/random-video/tree/master/video/migrations/__pycache__']

['https://github.com/emresvd/random-video/blob/master/LICENSE.md', 'https://github.com/emresvd/random-video/blob/master/.gitignore', 'https://github.com/emresvd/random-video/blob/master/README.md', 'https://github.com/emresvd/random-video/blob/master/db.sqlite3', 'https://github.com/emresvd/random-video/blob/master/manage.py', 'https://github.com/emresvd/random-video/blob/master/requirements.txt', 'https://github.com/emresvd/random-video/blob/master/words.txt', 'https://github.com/emresvd/random-video/blob/master/static/ic_launcher-playstore.png', 'https://github.com/emresvd/random-video/blob/master/random_video/__init__.py', 'https://github.com/emresvd/random-video/blob/master/random_video/asgi.py', 'https://github.com/emresvd/random-video/blob/master/random_video/settings.py', 'https://github.com/emresvd/random-video/blob/master/random_video/urls.py', 'https://github.com/emresvd/random-video/blob/master/random_video/wsgi.py', 'https://github.com/emresvd/random-video/blob/master/special_search/car.txt', 'https://github.com/emresvd/random-video/blob/master/special_search/food.txt', 'https://github.com/emresvd/random-video/blob/master/special_search/rocket.txt', 'https://github.com/emresvd/random-video/blob/master/special_search/space.txt', 'https://github.com/emresvd/random-video/blob/master/special_search/travel.txt', 'https://github.com/emresvd/random-video/blob/master/static/favicon.ico', 'https://github.com/emresvd/random-video/blob/master/templates/download.html', 'https://github.com/emresvd/random-video/blob/master/templates/index.html', 'https://github.com/emresvd/random-video/blob/master/video/__init__.py', 'https://github.com/emresvd/random-video/blob/master/video/admin.py', 'https://github.com/emresvd/random-video/blob/master/video/apps.py', 'https://github.com/emresvd/random-video/blob/master/video/models.py', 'https://github.com/emresvd/random-video/blob/master/video/random_video.py', 'https://github.com/emresvd/random-video/blob/master/video/tests.py', 
'https://github.com/emresvd/random-video/blob/master/video/views.py', 'https://github.com/emresvd/random-video/blob/master/random_video/__pycache__/__init__.cpython-37.pyc', 'https://github.com/emresvd/random-video/blob/master/random_video/__pycache__/settings.cpython-37.pyc', 'https://github.com/emresvd/random-video/blob/master/random_video/__pycache__/urls.cpython-37.pyc', 'https://github.com/emresvd/random-video/blob/master/random_video/__pycache__/wsgi.cpython-37.pyc', 'https://github.com/emresvd/random-video/blob/master/video/__pycache__/__init__.cpython-37.pyc', 'https://github.com/emresvd/random-video/blob/master/video/__pycache__/admin.cpython-37.pyc', 'https://github.com/emresvd/random-video/blob/master/video/__pycache__/models.cpython-37.pyc', 'https://github.com/emresvd/random-video/blob/master/video/__pycache__/random_video.cpython-37.pyc', 'https://github.com/emresvd/random-video/blob/master/video/__pycache__/views.cpython-37.pyc', 'https://github.com/emresvd/random-video/blob/master/video/migrations/__init__.py', 'https://github.com/emresvd/random-video/blob/master/video/migrations/__pycache__/__init__.cpython-37.pyc']
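
The entries in file_urls point at GitHub's HTML blob pages, and the list above also includes compiled files under __pycache__. If you want to post-process the list yourself, one possible approach is sketched below: drop the bytecode entries and rewrite each blob URL to its raw.githubusercontent.com form. The helper names to_raw_url and skip_bytecode are made up for illustration and are not part of reposcraping.

```python
from urllib.parse import urlparse


def to_raw_url(blob_url):
    """Rewrite https://github.com/<user>/<repo>/blob/<ref>/<path> to its raw-content URL."""
    parts = urlparse(blob_url).path.strip("/").split("/")
    if len(parts) < 5 or parts[2] != "blob":
        raise ValueError("not a GitHub blob URL: " + blob_url)
    user, repo = parts[0], parts[1]
    # Everything after "blob" is <ref>/<path>, which the raw host expects unchanged.
    rest = parts[3:]
    return "https://raw.githubusercontent.com/" + "/".join([user, repo] + rest)


def skip_bytecode(urls):
    """Return the URLs that do not live in a __pycache__ directory."""
    return [u for u in urls if "/__pycache__/" not in u]


urls = [
    "https://github.com/emresvd/random-video/blob/master/manage.py",
    "https://github.com/emresvd/random-video/blob/master/video/__pycache__/views.cpython-37.pyc",
]
for url in skip_bytecode(urls):
    print(to_raw_url(url))
```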

