# default-scraper
Python Web Scraper
## Features
- Scrapes all search results for a keyword passed as an argument.
- Results can be saved as `.csv` or `.json`.
- Also collects data about the users who uploaded the content included in the search results.
## Usage
### Install
```bash
pip install git+https://github.com/Seongbuming/crawler.git
```
### Scrape Instagram content in a Python script
```python
from default_scraper.instagram.parser import InstagramParser

USERNAME = ""
PASSWORD = ""
KEYWORD = ""

parser = InstagramParser(USERNAME, PASSWORD, KEYWORD, False)
parser.run()
```
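If you prefer not to hard-code credentials, a minimal variation of the script above reads them from the environment. This sketch assumes only the `InstagramParser` constructor and `run()` shown here; the environment variable names are hypothetical.

```python
import os

from default_scraper.instagram.parser import InstagramParser

# Hypothetical environment variable names; set them before running.
USERNAME = os.environ["INSTAGRAM_USERNAME"]
PASSWORD = os.environ["INSTAGRAM_PASSWORD"]
KEYWORD = "your keyword"

# The fourth constructor argument is shown as False in the example above;
# the README does not document its meaning, so it is left unchanged here.
parser = InstagramParser(USERNAME, PASSWORD, KEYWORD, False)
parser.run()
```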
### Scrape Instagram content using a bash command
Run the following command to scrape content from Instagram:
```bash
python main.py --platform instagram --keyword {KEYWORD} [--output_file OUTPUT_FILE] [--all]
```
Use the `--all` or `-a` option to also scrape unstructured fields.
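For example, a concrete invocation (the keyword and output file name below are placeholders, not values from this README) might look like:

```bash
python main.py --platform instagram --keyword "coffee" --output_file results.json --all
```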
## Data description
Structured fields:
- pk
- id
- taken_at
- media_type
- code
- comment_count
- user
- like_count
- caption
- accessibility_caption
- original_width
- original_height
- images
Some fields may be missing depending on Instagram’s response data.
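As a rough sketch of working with the saved output, the snippet below assumes the scraper was run with `--output_file results.json` and that the file holds a JSON array of records with the fields listed above; neither detail is spelled out in this README.

```python
import json

# Hypothetical output file name, produced with --output_file results.json.
with open("results.json", encoding="utf-8") as f:
    posts = json.load(f)

# Guard against missing keys, since some fields may be absent
# depending on Instagram's response data.
for post in posts:
    print(post.get("code"), post.get("like_count"), post.get("comment_count"))
```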
## Future work
Support for scraping more platform services is planned.