Using the ProxyCrawl API to scrape SimilarWeb data
similarweb_scraper
similarweb_scraper is a Python library for scraping SimilarWeb through the ProxyCrawl API. So far, it can bypass the Distil bot protection. It also provides some functionality for transforming the scraped data into pandas DataFrames.
Installation
Use the package manager pip to install similarweb-scraper.
pip install similarweb-scraper
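To confirm that the install succeeded, one option is to query the installed distribution's version from Python. This is a minimal sketch using only the standard library (Python 3.8+); the helper name `installed_version` is hypothetical and is not part of similarweb_scraper.

```python
from importlib import metadata


def installed_version(dist_name="similarweb-scraper"):
    """Return the installed version string of a distribution, or None if it is absent.

    `installed_version` is an illustrative helper, not part of the library.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("similarweb-scraper is not installed")
    else:
        print("similarweb-scraper", version)
```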
## Usage
from similarweb_scraper import scraper
### get the website html
web_scrape = scraper()
web_scrape.login("YOUR_PROXYCRAWL_API_KEY")  # API key from proxycrawl.com
web_scrape.webpage_scrape("hk.yahoo.com")  # website to look up, e.g. hk.yahoo.com
### get the html code
soup = web_scrape.og_soup
### get the html code in JSON format
web_json = web_scrape.json_storage
### get the data as a pandas DataFrame
df = web_scrape.metrics_to_df("country_share")  # pass a metrics_type name as a string
## metrics_type names:
# 'country_share',
# 'traffic_share',
# 'engagement',
# 'monthly_traffic_data'
# more functions will be available soon
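To pull every metric in one pass, a small loop over the metrics_type names listed above may be convenient. This is a sketch, not part of the library: `collect_metrics` and `METRIC_TYPES` are hypothetical names, and the only assumption about the scraper object is that it exposes `metrics_to_df(name)` as shown in the usage above. How the library behaves when a metric is missing from a page is not documented here, so the sketch simply records `None` on any exception.

```python
# Metric names taken from the list in the usage section.
METRIC_TYPES = (
    "country_share",
    "traffic_share",
    "engagement",
    "monthly_traffic_data",
)


def collect_metrics(web_scrape, metric_types=METRIC_TYPES):
    """Call web_scrape.metrics_to_df once per metric name.

    Returns a dict mapping each metric name to whatever metrics_to_df
    returned, or to None if the call raised an exception.
    """
    results = {}
    for name in metric_types:
        try:
            results[name] = web_scrape.metrics_to_df(name)
        except Exception as exc:  # failure modes are undocumented, so catch broadly
            print(f"could not build '{name}': {exc!r}")
            results[name] = None
    return results


# Example (requires the package, a ProxyCrawl API key, and network access):
#
#   from similarweb_scraper import scraper
#   web_scrape = scraper()
#   web_scrape.login("YOUR_PROXYCRAWL_API_KEY")
#   web_scrape.webpage_scrape("hk.yahoo.com")
#   frames = collect_metrics(web_scrape)
```

Because the helper only relies on the `metrics_to_df` method, it can be exercised with a stand-in object when no API key is available.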
Contributing
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Please make sure to update tests as appropriate.
License
Hashes for similarweb_scraper-0.0.3-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 53b97147230b96a5c684a718afd96120b1fc1cf2de27211cdcb11cc95c149a01
MD5 | 6e9c35b4a41c54bacc9eb7670f52f05b
BLAKE2b-256 | da4ee61f460dd477ecf708e24712412c6dd6f4ea6c7511adebf3709de2946867