Web Scraper
# default-scraper
Python Web Scraper
## Features
- Scrape all search results for a keyword passed as an argument.
- Save results as .csv or .json.
- Also collect data on the users who uploaded the content included in the search results.
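As a rough illustration of the two output formats, the sketch below writes a list of records to .json and .csv with the standard library only. It is not the package's own export code: the `records` sample, `save_json`, and `save_csv` are hypothetical names chosen for this example, and real scraper output may be shaped differently.

```python
import csv
import json

# Hypothetical sample records (assumption); real scraper output may differ.
records = [
    {"pk": "1", "code": "abc", "like_count": 10, "caption": "first post"},
    {"pk": "2", "code": "def", "like_count": 3},  # caption is missing here
]


def save_json(rows, path):
    """Write the records to a JSON file, keeping non-ASCII text readable."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(rows, f, ensure_ascii=False, indent=2)


def save_csv(rows, path):
    """Write the records to a CSV file, leaving missing fields empty."""
    # Union of all keys across records, in first-seen order.
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    with open(path, "w", encoding="utf-8", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
```

Calling `save_json(records, "out.json")` and `save_csv(records, "out.csv")` would produce both files; in the CSV, the second row simply has an empty `caption` cell.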
## Usage
### Install
```bash
pip install default-scraper
```
or
```bash
pip install git+https://github.com/Seongbuming/crawler.git
```
### Scrape Instagram content in a Python script
```python
from default_scraper.instagram.parser import InstagramParser

# Instagram login credentials and the keyword to search for
USERNAME = ""
PASSWORD = ""
KEYWORD = ""

parser = InstagramParser(USERNAME, PASSWORD, KEYWORD, False)
parser.run()
```
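To avoid hardcoding credentials in a script, one option is to read them from environment variables. The sketch below is only an illustration: `load_credentials` and the variable names `INSTAGRAM_USERNAME` and `INSTAGRAM_PASSWORD` are assumptions made for this example, not something the package defines.

```python
import os


def load_credentials(env=os.environ):
    """Return (username, password) read from the given environment mapping."""
    # INSTAGRAM_USERNAME / INSTAGRAM_PASSWORD are hypothetical names (assumption).
    try:
        return env["INSTAGRAM_USERNAME"], env["INSTAGRAM_PASSWORD"]
    except KeyError as missing:
        name = missing.args[0]
        raise RuntimeError(f"Missing environment variable: {name}") from None


# Possible usage with the parser documented in this README:
# username, password = load_credentials()
# parser = InstagramParser(username, password, "keyword", False)
# parser.run()
```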
### Scrape Instagram content using a bash command
Run the following command to scrape content from Instagram:
```bash
python main.py --platform instagram --keyword {KEYWORD} [--output_file OUTPUT_FILE] [--all]
```
Use the `--all` or `-a` option to also scrape unstructured fields.
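For readers unfamiliar with this style of command line, the options can be modelled with `argparse` roughly as follows. This is a sketch that mirrors only the flags documented here; it is not the project's actual `main.py`, and details such as which options are required or what their defaults are remain assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (assumed, not the real main.py)."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--platform", required=True, help="platform to scrape, e.g. instagram")
    parser.add_argument("--keyword", required=True, help="search keyword")
    parser.add_argument("--output_file", default=None, help="optional output path")
    parser.add_argument("-a", "--all", action="store_true",
                        help="also scrape unstructured fields")
    return parser


# Parse an example argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--platform", "instagram", "--keyword", "coffee", "-a"])
print(args.platform, args.keyword, args.output_file, args.all)
# prints: instagram coffee None True
```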
## Data description
Structured fields:
- pk
- id
- taken_at
- media_type
- code
- comment_count
- user
- like_count
- caption
- accessibility_caption
- original_width
- original_height
- images
Some fields may be missing depending on Instagram’s response data.
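Because any of the structured fields can be absent, code that consumes the output may want to fill in the gaps explicitly. A minimal sketch, assuming each scraped item is available as a plain dictionary (the `normalize` helper and the `sample` record are hypothetical):

```python
# Structured field names as listed in this README.
STRUCTURED_FIELDS = [
    "pk", "id", "taken_at", "media_type", "code", "comment_count", "user",
    "like_count", "caption", "accessibility_caption",
    "original_width", "original_height", "images",
]


def normalize(record):
    """Return a dict with every structured field present; absent ones become None."""
    return {field: record.get(field) for field in STRUCTURED_FIELDS}


# Hypothetical partial record (assumption); most fields are missing.
sample = {"pk": "123", "code": "abc", "like_count": 5}
row = normalize(sample)
```

After this step every row has the same keys, so downstream code can test `row["caption"] is None` instead of guarding each lookup.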
## Future work
Support for scraping additional platforms is planned.