Web Scraper

# default-scraper

Python Web Scraper

## Features

  • Scrapes all search results for a keyword passed as an argument.

  • Saves results as .csv and .json.

  • Also collects data on the users who uploaded the content included in the search results.
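
As an illustration of the .csv and .json output mentioned above, here is a minimal sketch that writes a list of records in either format using only the standard library. It is not the package's own code: `save_records` and the sample records are hypothetical, and the real output layout may differ.

```python
import csv
import json
from pathlib import Path


def save_records(records, path):
    """Write a list of dicts to .json or .csv, chosen by file extension.

    Hypothetical helper for illustration; not part of default-scraper.
    """
    path = Path(path)
    if path.suffix == ".json":
        path.write_text(
            json.dumps(records, ensure_ascii=False, indent=2), encoding="utf-8"
        )
    elif path.suffix == ".csv":
        # Union of keys in first-seen order, so a record that lacks a
        # field simply gets an empty cell in that column.
        fieldnames = list(dict.fromkeys(key for record in records for key in record))
        with path.open("w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(records)
    else:
        raise ValueError(f"unsupported output format: {path.suffix}")
```

Choosing the format from the file extension keeps the call site to a single function; nested values such as `user` would be written to CSV as their string representation.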

## Usage

### Install

```bash
pip install default-scraper
```

or

```bash
pip install git+https://github.com/Seongbuming/crawler.git
```

### Scrape Instagram content in a Python script

```python
from default_scraper.instagram.parser import InstagramParser

# Instagram account credentials
USERNAME = ""
PASSWORD = ""
# Search keyword to scrape
KEYWORD = ""

parser = InstagramParser(USERNAME, PASSWORD, KEYWORD, False)
parser.run()
```
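
Hardcoding a password in a script is easy to leak through version control. One possible alternative, sketched below, reads the credentials from environment variables. The helper `load_credentials` and the variable names `INSTAGRAM_USERNAME` and `INSTAGRAM_PASSWORD` are assumptions made for this example; the package does not define them.

```python
import os


def load_credentials(env=os.environ):
    """Return (username, password) read from environment variables.

    The variable names are hypothetical; default-scraper does not define them.
    """
    try:
        return env["INSTAGRAM_USERNAME"], env["INSTAGRAM_PASSWORD"]
    except KeyError as missing:
        raise RuntimeError(f"environment variable {missing} is not set") from None


# Usage with the constructor call shown above (needs default-scraper installed):
# from default_scraper.instagram.parser import InstagramParser
# username, password = load_credentials()
# InstagramParser(username, password, "some keyword", False).run()
```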

### Scrape Instagram content from the command line

Run the following command to scrape content from Instagram:

```bash
python main.py --platform instagram --keyword {KEYWORD} [--output_file OUTPUT_FILE] [--all]
```

Use the `--all` or `-a` option to also scrape unstructured fields.
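
For readers who want to see how the flags above fit together, the following is a sketch of an `argparse` parser that accepts the same options. It is an assumption about how such a command line could be parsed, not a copy of the project's `main.py`; `build_arg_parser`, the help texts, and the sample keyword are made up for the example.

```python
import argparse


def build_arg_parser():
    """Build a parser mirroring the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(description="Scrape content from a platform.")
    parser.add_argument("--platform", required=True, help="platform to scrape, e.g. instagram")
    parser.add_argument("--keyword", required=True, help="search keyword")
    parser.add_argument("--output_file", default=None, help="optional output path")
    parser.add_argument("-a", "--all", action="store_true",
                        help="also scrape unstructured fields")
    return parser


# Parse an explicit argument list instead of sys.argv, for demonstration.
args = build_arg_parser().parse_args(
    ["--platform", "instagram", "--keyword", "coffee", "-a"]
)
print(args.platform, args.keyword, args.output_file, args.all)
# prints: instagram coffee None True
```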

## Data description

### Instagram

  • Structured fields
      - pk
      - id
      - taken_at
      - media_type
      - code
      - comment_count
      - user
      - like_count
      - caption
      - accessibility_caption
      - original_width
      - original_height
      - images

  • Some fields may be missing depending on Instagram’s response data.
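
Because any of these fields may be absent, code that consumes the output should not index them directly. A minimal sketch, assuming each scraped item is a plain `dict` keyed by the field names listed above (`pick_structured` and the sample item are hypothetical):

```python
# Field names taken from the list of structured fields above.
STRUCTURED_FIELDS = (
    "pk", "id", "taken_at", "media_type", "code", "comment_count", "user",
    "like_count", "caption", "accessibility_caption", "original_width",
    "original_height", "images",
)


def pick_structured(item):
    """Return only the structured fields, using None for any that are missing."""
    return {field: item.get(field) for field in STRUCTURED_FIELDS}


# Made-up sample item that lacks most fields.
sample = {"pk": "123", "code": "abc", "like_count": 10, "extra": "ignored"}
row = pick_structured(sample)
print(row["like_count"], row["caption"])
# prints: 10 None
```

Using `dict.get` gives every record the same set of keys, which also keeps CSV columns aligned when some responses omit a field.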

## Future work

  • Support for scraping more platforms is planned.

Download files

Source Distribution

default-scraper-1.1.1.tar.gz (7.3 kB)

Uploaded Source

Built Distribution

default_scraper-1.1.1-py3-none-any.whl (7.7 MB)

Uploaded Python 3

File details

Details for the file default-scraper-1.1.1.tar.gz.

File metadata

  • Download URL: default-scraper-1.1.1.tar.gz
  • Size: 7.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.7.9

File hashes

Hashes for default-scraper-1.1.1.tar.gz
  • SHA256: 46b571959ab4c15d89aa92e70e97ecd846f0c4722d883fe4ee2d233a0934f0b7
  • MD5: afe0ee78c30f2d197518b4ebb5910768
  • BLAKE2b-256: 9d0f0c3fcaf79384d1d2014dd0103919a7c3729ab2c0b4daa216817346b69109

File details

Details for the file default_scraper-1.1.1-py3-none-any.whl.

File hashes

Hashes for default_scraper-1.1.1-py3-none-any.whl
  • SHA256: 8e8d24e061aad5dd32a99022336b0e562174980f41b62d15a8a8ad77ef97bc7e
  • MD5: 0dcc5df4714591c2cd45ffffb86aded9
  • BLAKE2b-256: 8029834e4a1355e810adf328373ff7717f1b9e2dbc8150b26e30d246299c1b29
