A web spider to crawl public github repositories

Github Crawler

Built with ❤︎ and ☕ by Karthik Hosur


A web spider that crawls public GitHub repositories to collect data on GitHub user profiles, repositories, and user social counts, for educational purposes only. The project was originally built to collect GitHub data for an academic data analysis project.

Features

  • Extract User Social Status
  • Extract Repository Names
  • Extract User Activity
  • Extract Repository Social Information
  • Extract Repository Data

Installation

  • Install the package with pip:
pip install github-crawler
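
To confirm that the install succeeded, the installed distribution's version can be queried from Python. This is a minimal sketch using only the standard library (Python 3.8+); `installed_version` is a hypothetical helper name and is not part of github-crawler.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version("github-crawler"))
```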

Usage

Extract the profile information of a GitHub user

  • Import it in your Python project
from github_crawler import user_profile

user_profile("karthikhosur")  # Pass the GitHub username

Result

The call returns a dictionary like the following:

{'followers': 2,
 'following': 4,
 'stars': 5,
 'repositories': ['/karthikhosur/Github-Crawler',
  '/karthikhosur/FairCV',
  '/karthikhosur/Stop-Words-Remover-API',
  '/karthikhosur/BlogsParser-API',
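
The entries under `repositories` are paths relative to github.com. A minimal sketch of turning them into full URLs follows; `repo_urls` and `GITHUB_BASE` are hypothetical names introduced here, not part of the package, and the sample dictionary reuses only a subset of the entries shown above.

```python
from urllib.parse import urljoin

GITHUB_BASE = "https://github.com"  # assumed base for the relative paths


def repo_urls(profile):
    """Turn the relative paths under 'repositories' into absolute URLs."""
    return [urljoin(GITHUB_BASE, path) for path in profile.get("repositories", [])]


# Sample shaped like the documented result (subset of the entries shown above).
sample = {
    "followers": 2,
    "following": 4,
    "stars": 5,
    "repositories": ["/karthikhosur/Github-Crawler", "/karthikhosur/FairCV"],
}

for url in repo_urls(sample):
    print(url)
```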

Extract repository information

  • Import it in your Python project
from github_crawler import repo_info

repo_info("/karthikhosur/Github-Crawler")  # Pass the path in the form /<username>/<repository>

Result

The call returns a dictionary like the following:

{'watchers': 1, 'stargazers': 1, 'forks': 1}
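
Because `repo_info` takes a path of the form `/<username>/<repository>`, a small helper can build that string from its two parts and catch malformed input early. This is only a sketch; `repo_path` is a hypothetical helper, not part of the package. The entries in the `repositories` list returned by `user_profile` already appear to use this format, so they should not need it.

```python
def repo_path(owner, name):
    """Build the '/<username>/<repository>' string used in the usage example."""
    owner = owner.strip().strip("/")
    name = name.strip().strip("/")
    if not owner or not name or "/" in owner or "/" in name:
        raise ValueError("owner and name must be non-empty and contain no '/'")
    return f"/{owner}/{name}"


print(repo_path("karthikhosur", "Github-Crawler"))
```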

Source Distribution

github_crawler-0.0.6.tar.gz (2.8 kB)

Uploaded Source

Built Distribution

github_crawler-0.0.6-py3-none-any.whl (6.3 kB)

Uploaded Python 3

File details

Details for the file github_crawler-0.0.6.tar.gz.

File metadata

  • Download URL: github_crawler-0.0.6.tar.gz
  • Size: 2.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/52.0.0 requests-toolbelt/0.9.1 tqdm/4.56.0 CPython/3.7.3

File hashes

Hashes for github_crawler-0.0.6.tar.gz
Algorithm Hash digest
SHA256 e49470425c12c3af2ab57c00734951e570748cc141510da7b96cad8817275213
MD5 4079126f16554fe9c7b2792e5bc114c8
BLAKE2b-256 e21dbf289ce15ada1e868cd39e434e7d66b26cf65eb3a0f3c1fd4588e2462ec9
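
A downloaded file can be checked against the SHA256 digest listed here. The following is a minimal sketch using the standard `hashlib` module; `sha256_of_file` and `matches_published_hash` are hypothetical helper names, and the local file path is an assumption about where the download was saved.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Example call (assumes the sdist was saved in the current directory):
# matches_published_hash(
#     "github_crawler-0.0.6.tar.gz",
#     "e49470425c12c3af2ab57c00734951e570748cc141510da7b96cad8817275213",
# )
```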

File details

Details for the file github_crawler-0.0.6-py3-none-any.whl.

File metadata

  • Download URL: github_crawler-0.0.6-py3-none-any.whl
  • Size: 6.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.3.0 pkginfo/1.7.0 requests/2.25.1 setuptools/52.0.0 requests-toolbelt/0.9.1 tqdm/4.56.0 CPython/3.7.3

File hashes

Hashes for github_crawler-0.0.6-py3-none-any.whl
Algorithm Hash digest
SHA256 40a22837fb3e6c762edd23698cf5a5f7776770e1c6829cd7732ee44ac0cc6c9b
MD5 696ef7e7cfeea28884a34b7c02308a6c
BLAKE2b-256 4b7fef4378ae311aad8bb30e285e260e1cb1fe5ec1163f6b1212a612ecaabb60
