Using the ProxyCrawl API to scrape SimilarWeb data

Project description

similarweb_scraper

similarweb_scraper is a Python library for scraping SimilarWeb through the ProxyCrawl API. So far, it can bypass Distil bot protection. It also provides helpers for transforming scraped data into pandas DataFrames.

Installation

Use the package manager pip to install similarweb_scraper.

pip install similarweb-scraper

Usage

from similarweb_scraper import scraper

# Fetch the website's HTML through ProxyCrawl
web_scrape = scraper()
web_scrape.login("YOUR_PROXYCRAWL_API_KEY")  # API key from proxycrawl.com
web_scrape.webpage_scrape("hk.yahoo.com")  # website to scrape

# Get the HTML code
soup = web_scrape.og_soup
# Get the HTML data in JSON format
web_json = web_scrape.json_storage

# Get a metric as a pandas DataFrame
df = web_scrape.metrics_to_df("country_share")  # pass a metrics_type string
# Available metrics_type names:
# 'country_share',
# 'traffic_share',
# 'engagement',
# 'monthly_traffic_data'
# More functions will be available soon.
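The metric types listed above can also be collected in one pass. The following is a minimal sketch, not part of the library: `collect_metrics` and `METRIC_TYPES` are hypothetical names introduced here, and the only library call assumed is `metrics_to_df`, as shown in the usage above. Because the exceptions the library may raise are not documented, the sketch catches any exception and records `None` for that metric.

```python
# Metric type names taken from the usage notes above.
METRIC_TYPES = (
    "country_share",
    "traffic_share",
    "engagement",
    "monthly_traffic_data",
)


def collect_metrics(web_scrape, metric_types=METRIC_TYPES):
    """Return a dict mapping each metrics_type to the result of metrics_to_df.

    `web_scrape` is any object exposing metrics_to_df(name), for example a
    scraper instance on which login() and webpage_scrape() were already called.
    A metric that fails is stored as None so the remaining ones still run.
    """
    results = {}
    for name in metric_types:
        try:
            results[name] = web_scrape.metrics_to_df(name)
        except Exception as exc:  # assumption: the library's error types are unknown
            print(f"could not build {name!r}: {exc}")
            results[name] = None
    return results
```

After `web_scrape.login(...)` and `web_scrape.webpage_scrape(...)`, calling `frames = collect_metrics(web_scrape)` would give one entry per metric type, for example `frames["traffic_share"]`.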

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

License

MIT

Download files

Source Distribution

similarweb_scraper-0.0.3.tar.gz (3.9 kB)

Uploaded Source

Built Distribution

similarweb_scraper-0.0.3-py3-none-any.whl (5.5 kB)

Uploaded Python 3

File details

Details for the file similarweb_scraper-0.0.3.tar.gz.

File metadata

  • Download URL: similarweb_scraper-0.0.3.tar.gz
  • Upload date:
  • Size: 3.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.4.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.7.2

File hashes

Hashes for similarweb_scraper-0.0.3.tar.gz
Algorithm Hash digest
SHA256 3e9d46c8e16f5f71eb872df997eb9e9687c95dd4aa4bcf731ebefb15286302f3
MD5 ffc3204289dc78199c8a4fedd5728e68
BLAKE2b-256 1a5449810b1d07df0451749235afba64f541c2b5481febc7cd8a95ade2f9fac7
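A downloaded file can be checked against the SHA256 digest listed above. This is a minimal sketch using only the Python standard library; `sha256_of` and `matches_digest` are hypothetical helper names, and the file path is assumed to point at a local copy of the archive.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Compare the file's SHA256 with the published digest, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_digest("similarweb_scraper-0.0.3.tar.gz", "3e9d46c8e16f5f71eb872df997eb9e9687c95dd4aa4bcf731ebefb15286302f3")` should return `True` for an unmodified copy of the source distribution.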

File details

Details for the file similarweb_scraper-0.0.3-py3-none-any.whl.

File metadata

  • Download URL: similarweb_scraper-0.0.3-py3-none-any.whl
  • Upload date:
  • Size: 5.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.4.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.7.2

File hashes

Hashes for similarweb_scraper-0.0.3-py3-none-any.whl
Algorithm Hash digest
SHA256 53b97147230b96a5c684a718afd96120b1fc1cf2de27211cdcb11cc95c149a01
MD5 6e9c35b4a41c54bacc9eb7670f52f05b
BLAKE2b-256 da4ee61f460dd477ecf708e24712412c6dd6f4ea6c7511adebf3709de2946867
