# default-scraper
Python Web Scraper
## Features
- Scrapes all search results for a keyword passed as an argument.
- Results can be saved as `.csv` or `.json`.
- Also collects data about the users who uploaded the content included in the search results.
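The `.csv`/`.json` saving described above can be sketched with the standard library. This is an illustrative sketch only; the record fields here are made up for the example and are not the scraper's actual output schema.

```python
import csv
import json

# Illustrative records only; the scraper's real fields are listed
# under "Data description".
records = [
    {"code": "abc123", "like_count": 42, "comment_count": 3},
    {"code": "def456", "like_count": 7, "comment_count": 1},
]

# Save as .json
with open("results.json", "w", encoding="utf-8") as f:
    json.dump(records, f, ensure_ascii=False, indent=2)

# Save as .csv (one row per record, header from the dict keys)
with open("results.csv", "w", encoding="utf-8", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
```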
## Usage
### Install
```bash
pip install default-scraper
```
or
```bash
pip install git+https://github.com/Seongbuming/crawler.git
```
### Scrape Instagram content in a Python script

```python
from default_scraper.instagram.parser import InstagramParser

USERNAME = ""
PASSWORD = ""
KEYWORD = ""

parser = InstagramParser(USERNAME, PASSWORD, KEYWORD, False)
parser.run()
```
### Scrape Instagram content from the command line

Run the following command to scrape content from Instagram:

```bash
python main.py --platform instagram --keyword {KEYWORD} [--output_file OUTPUT_FILE] [--all]
```

Use the `--all` or `-a` option to also scrape unstructured fields.
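The option handling above can be sketched with `argparse`. This is a hedged reconstruction for illustration only, not the project's actual `main.py`; the choices and defaults are assumptions.

```python
import argparse

# Hypothetical reconstruction of the CLI options shown above.
parser = argparse.ArgumentParser(description="default-scraper CLI sketch")
parser.add_argument("--platform", required=True, choices=["instagram"])
parser.add_argument("--keyword", required=True)
parser.add_argument("--output_file", default=None)
parser.add_argument("--all", "-a", action="store_true",
                    help="also scrape unstructured fields")

# Parse a sample invocation instead of sys.argv for demonstration.
args = parser.parse_args(["--platform", "instagram", "--keyword", "cats", "-a"])
print(args.keyword, args.all)  # the -a flag sets args.all to True
```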
## Data description
Structured fields:

- pk
- id
- taken_at
- media_type
- code
- comment_count
- user
- like_count
- caption
- accessibility_caption
- original_width
- original_height
- images
Some fields may be missing depending on Instagram’s response data.
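Because fields may be absent from Instagram's response, code that consumes the output should read them defensively. A minimal sketch, with an illustrative (not real) post record:

```python
# Illustrative record; real responses may omit fields such as
# like_count or accessibility_caption.
post = {
    "pk": 1,
    "id": "1_99",
    "media_type": 1,
    "code": "abc123",
    "comment_count": 3,
}

# dict.get() returns a default instead of raising KeyError
# when a field is missing.
like_count = post.get("like_count", 0)
caption = post.get("caption", "")
print(like_count, repr(caption))
```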
## Future works
Support for scraping additional platform services is planned.
Hashes for default_scraper-1.1.1-py3-none-any.whl:

| Algorithm | Hash digest |
|---|---|
| SHA256 | 8e8d24e061aad5dd32a99022336b0e562174980f41b62d15a8a8ad77ef97bc7e |
| MD5 | 0dcc5df4714591c2cd45ffffb86aded9 |
| BLAKE2b-256 | 8029834e4a1355e810adf328373ff7717f1b9e2dbc8150b26e30d246299c1b29 |