
Python wrapper for Prompt API's Scraper API

Project description


Prompt API - Scraper API - Python Package

pa-scraper is a Python wrapper for Prompt API's Scraper API, with a few extra conveniences on top.

Requirements

  1. Sign up for Prompt API.
  2. Subscribe to the Scraper API; the test drive is free!
  3. After subscribing, set the PROMPTAPI_TOKEN environment variable.

Then install the package:

$ pip install pa-scraper
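
The wrapper reads your token from the environment at runtime. A minimal sanity check before running your scripts (the variable name comes from step 3 above; the error message is just illustrative):

import os

# The Scraper API token must be exported as PROMPTAPI_TOKEN before use.
token = os.environ.get('PROMPTAPI_TOKEN')
if not token:
    raise SystemExit('PROMPTAPI_TOKEN is not set; subscribe and export your token first')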

Example Usage

Examples can be found in the project repository (https://github.com/promptapi/scraper-py).

from scraper import Scraper

url = 'https://pypi.org/classifiers/'
scraper = Scraper(url)
response = scraper.get()

if response.get('error', None):
    # response['error']  returns error message
    # response['status'] returns http status code
    # {'error': 'Not Found', 'status': 404}
    print(response)
else:
    result = response['result']

    print(result['headers'])   # returns response headers 
    print(result['data'])      # returns fetched html
    print(result['url'])       # returns fetched url
    print(response['status'])  # returns http status code

    save_result = scraper.save('/tmp/my-html.html')  # save to file
    if save_result.get('error', None):
        # we have save error
        pass
    else:
        print(save_result)    # contains saved file path and file size
        # {'file': '/tmp/my-html.html', 'size': 321322}

You can add URL parameters for extra operations. Valid parameters are:

  • auth_password: HTTP Realm auth password
  • auth_username: HTTP Realm auth username
  • cookie: URL-encoded cookie header
  • country: two-character country code, to scrape from an IP address in a specific country
  • referer: HTTP referer header

from scraper import Scraper

url = 'https://pypi.org/classifiers/'
scraper = Scraper(url)

fetch_params = dict(country='EE')
response = scraper.get(params=fetch_params)

if response.get('error', None):
    # response['error']  returns error message
    # response['status'] returns http status code
    # {'error': 'Not Found', 'status': 404}
    print(response)
else:
    result = response['result']
    status = response['status']

    print(result['headers'])   # returns response headers
    print(result['data'])      # returns fetched html
    print(result['url'])       # returns fetched url
    print(response['status'])  # returns http status code

    save_result = scraper.save('/tmp/my-html.html')  # save to file
    if save_result.get('error', None):
        # we have save error
        pass
    else:
        print(save_result)    # contains saved file path and file size
        # {'file': '/tmp/my-html.html', 'size': 321322}
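
The parameters above can also be combined in a single request. The following is a minimal sketch, not taken from the package docs: it assumes the target page sits behind HTTP Realm auth and that you also want to send cookie and referer headers; all credential and header values are placeholders.

from scraper import Scraper

url = 'https://pypi.org/classifiers/'
scraper = Scraper(url)

# placeholder values for illustration only
fetch_params = dict(
    auth_username='user',          # HTTP Realm auth username
    auth_password='secret',        # HTTP Realm auth password
    cookie='session%3Dabc123',     # URL encoded cookie header
    referer='https://pypi.org/',   # HTTP referer header
)
response = scraper.get(params=fetch_params)

if response.get('error', None):
    print(response)  # {'error': '...', 'status': ...}
else:
    print(response['result']['url'])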

TODO

  • Add XPath extractor.

License

This project is licensed under the MIT License.


Contributor(s)


Contribute

All PRs are welcome!

  1. Fork the repository (https://github.com/promptapi/scraper-py/fork)
  2. Create your feature branch (git checkout -b my-feature)
  3. Commit your changes (git commit -am 'Add awesome features...')
  4. Push to the branch (git push origin my-feature)
  5. Then create a new Pull Request!

This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the code of conduct.




Download files

Download the file for your platform.

Source Distribution

pa-scraper-0.1.2.tar.gz (4.9 kB)

Uploaded Source

Built Distribution

pa_scraper-0.1.2-py3-none-any.whl (5.5 kB)

Uploaded Python 3

File details

Details for the file pa-scraper-0.1.2.tar.gz.

File metadata

  • Download URL: pa-scraper-0.1.2.tar.gz
  • Upload date:
  • Size: 4.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/50.1.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.7.4

File hashes

Hashes for pa-scraper-0.1.2.tar.gz

  • SHA256: 2cb449ade5d2c68774805a1753360b5325bff6c7decd6275a3672b6dc29f0898
  • MD5: f8ef835346e1a1a0f977b79fb42a791d
  • BLAKE2b-256: 7da0cfb5f099cb93994f6a5c03fea32f40ef7641301e59a15330275aa4ad7214

See more details on using hashes here.
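
If you download the sdist by hand, you can verify it against the SHA256 digest above with nothing but the standard library. A minimal sketch, assuming the archive sits in the current working directory:

import hashlib

# Published SHA256 digest for pa-scraper-0.1.2.tar.gz (from the list above).
expected = '2cb449ade5d2c68774805a1753360b5325bff6c7decd6275a3672b6dc29f0898'

# Compute the SHA256 of the downloaded archive and compare.
with open('pa-scraper-0.1.2.tar.gz', 'rb') as f:
    digest = hashlib.sha256(f.read()).hexdigest()

print('OK' if digest == expected else 'hash mismatch')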

File details

Details for the file pa_scraper-0.1.2-py3-none-any.whl.

File metadata

  • Download URL: pa_scraper-0.1.2-py3-none-any.whl
  • Upload date:
  • Size: 5.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/50.1.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.7.4

File hashes

Hashes for pa_scraper-0.1.2-py3-none-any.whl

  • SHA256: 2d1212afa74c3ce0fc06298ff18453b701925ae2886b73a80dc6b32d9bba8de6
  • MD5: 0e98ff37b92f1ca10542a77267454b23
  • BLAKE2b-256: 1180f166bfaf52dd7b9e56c88d43145802a480b2b7f79d525d14df2b8d554a13

See more details on using hashes here.
