ScrapingBee Python SDK

Project description

ScrapingBee is a web scraping API that handles headless browsers and rotates proxies for you. The Python SDK makes it easier to interact with ScrapingBee's API.

Installation

You can install the ScrapingBee Python SDK with pip:

pip install scrapingbee

Usage

The ScrapingBee Python SDK is a wrapper around the requests library. ScrapingBee supports GET and POST requests.

Sign up for ScrapingBee to get your API key and some free credits to get started.

Making a GET request

>>> from scrapingbee import ScrapingBeeClient

>>> client = ScrapingBeeClient(api_key='REPLACE-WITH-YOUR-API-KEY')

>>> response = client.get(
    'https://www.scrapingbee.com/blog/', 
    params={
        # Block ads on the page you want to scrape	
        'block_ads': False,
        # Block images and CSS on the page you want to scrape	
        'block_resources': True,
        # Premium proxy geolocation
        'country_code': '',
        # Control the device the request will be sent from	
        'device': 'desktop',
        # Use some data extraction rules
        'extract_rules': {'title': 'h1'},
        # Use AI to extract data from the page
        'ai_extract_rules': {'product_name': 'The name of the product', 'price': 'The price in USD'},
        # Wrap response in JSON
        'json_response': False,
        # Interact with the webpage you want to scrape 
        'js_scenario': {
            "instructions": [
                {"wait_for": "#slow_button"},
                {"click": "#slow_button"},
                {"scroll_x": 1000},
                {"wait": 1000},
                {"scroll_x": 1000},
                {"wait": 1000},            
            ]
        },
        # Use premium proxies to bypass difficult-to-scrape websites (10-25 credits/request)
        'premium_proxy': False,
        # Execute JavaScript code with a Headless Browser (5 credits/request)
        'render_js': True,
        # Return the original HTML before the JavaScript rendering	
        'return_page_source': False,
        # Return page screenshot as a png image
        'screenshot': False,
        # Take a full page screenshot without the window limitation
        'screenshot_full_page': False,
        # Transparently return the same HTTP status code as the requested page
        'transparent_status_code': False,
        # Wait, in milliseconds, before returning the response
        'wait': 0,
        # Wait for a CSS selector before returning the response, e.g. ".title"
        'wait_for': '',
        # Set the browser window width in pixels
        'window_width': 1920,
        # Set the browser window height in pixels
        'window_height': 1080
    },
    headers={
        # Forward custom headers to the target website
        "key": "value"
    },
    cookies={
        # Forward custom cookies to the target website
        "name": "value"
    }
)
>>> response.text
'<!DOCTYPE html><html lang="en"><head>...'
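
When extract_rules or ai_extract_rules is the goal of the request, the API answers with JSON that maps each rule name to its extracted value instead of raw HTML. Below is a minimal sketch of reading such a response, assuming a request that only uses extract_rules (the returned keys mirror the rule names):

>>> response = client.get(
    'https://www.scrapingbee.com/blog/',
    params={
        # Only data extraction rules, so the response body is JSON
        'extract_rules': {'title': 'h1'},
    }
)
>>> response.json().get('title')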

ScrapingBee accepts various parameters to render JavaScript, run a custom JavaScript scenario, use a premium proxy from a specific geolocation, and more.

You can find all the supported parameters in ScrapingBee's documentation.

You can send custom cookies and headers like you would normally do with the requests library.
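
ScrapingBee also accepts POST requests through the client's post method. The snippet below is a rough sketch rather than a verified recipe: the target URL and form fields are placeholders, and it assumes post forwards a requests-style data argument alongside the usual ScrapingBee params.

>>> response = client.post(
    'https://example.com/submit',  # placeholder target URL
    params={
        # Any ScrapingBee parameter used with GET also works here
        'render_js': False,
    },
    # Form data forwarded to the target website, as with requests
    data={'field': 'value'}
)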

Screenshot

Here is a short example of how to retrieve and store a screenshot of the ScrapingBee blog at a mobile resolution.

>>> from scrapingbee import ScrapingBeeClient

>>> client = ScrapingBeeClient(api_key='REPLACE-WITH-YOUR-API-KEY')

>>> response = client.get(
    'https://www.scrapingbee.com/blog/', 
    params={
        # Take a screenshot
        'screenshot': True,
        # Specify that we need the full height
        'screenshot_full_page': True,
        # Specify a mobile width in pixel
        'window_width': 375
    }
)

>>> if response.ok:
        with open("./scrapingbee_mobile.png", "wb") as f:
            f.write(response.content)

Using ScrapingBee with Scrapy

Scrapy is the most popular Python web scraping framework. You can easily integrate ScrapingBee's API into a Scrapy project with the dedicated Scrapy middleware.
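
As a rough sketch (assuming the separate scrapy-scrapingbee package; the middleware path, setting name and ScrapingBeeRequest class below come from that package and may change), the integration amounts to enabling the middleware in settings.py and yielding ScrapingBee-aware requests from your spider:

# settings.py -- assumes the scrapy-scrapingbee package is installed
SCRAPINGBEE_API_KEY = 'REPLACE-WITH-YOUR-API-KEY'
DOWNLOADER_MIDDLEWARES = {
    'scrapy_scrapingbee.ScrapingBeeMiddleware': 725,
}

# spider.py -- ScrapingBeeRequest carries ScrapingBee params per request
from scrapy_scrapingbee import ScrapingBeeSpider, ScrapingBeeRequest

class BlogSpider(ScrapingBeeSpider):
    name = 'blog'

    def start_requests(self):
        yield ScrapingBeeRequest(
            'https://www.scrapingbee.com/blog/',
            params={'render_js': True},
        )

    def parse(self, response):
        # Standard Scrapy parsing on the rendered HTML
        yield {'title': response.css('h1::text').get()}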

Retries

The client includes a retry mechanism for 5XX responses.

>>> from scrapingbee import ScrapingBeeClient

>>> client = ScrapingBeeClient(api_key='REPLACE-WITH-YOUR-API-KEY')

>>> response = client.get(
    'https://www.scrapingbee.com/blog/', 
    params={
        'render_js': True,
    },
    retries=5
)

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

scrapingbee-2.0.2.tar.gz (6.1 kB)

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

scrapingbee-2.0.2-py3-none-any.whl (5.2 kB)

File details

Details for the file scrapingbee-2.0.2.tar.gz.

File metadata

  • Download URL: scrapingbee-2.0.2.tar.gz
  • Upload date:
  • Size: 6.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for scrapingbee-2.0.2.tar.gz

  • SHA256: 312a2f2beab03eb687e568c3688533c44e543104eea82f18a4183f9ce1042e0f
  • MD5: 2dcb0774a681071a0e137d5cf1b0dbdc
  • BLAKE2b-256: 8f7665afa7e4f025082a10d614e7a45d60c05025e160f3d28740221a68af8255

See more details on using hashes here.
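
For instance, a downloaded archive can be checked against the SHA256 digest listed above with a few lines of Python (a sketch that assumes the file sits in the current working directory):

>>> import hashlib
>>> with open('scrapingbee-2.0.2.tar.gz', 'rb') as f:
        digest = hashlib.sha256(f.read()).hexdigest()
>>> digest == '312a2f2beab03eb687e568c3688533c44e543104eea82f18a4183f9ce1042e0f'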

File details

Details for the file scrapingbee-2.0.2-py3-none-any.whl.

File metadata

  • Download URL: scrapingbee-2.0.2-py3-none-any.whl
  • Upload date:
  • Size: 5.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for scrapingbee-2.0.2-py3-none-any.whl

  • SHA256: 28d1c9b250917e45924d8a21cf7d55776c701d6d1c5283aabf3ef1911462a291
  • MD5: 550214ed8fe5a34aa7e4de097f13b99c
  • BLAKE2b-256: 14a0ce8c66d302517432c76a4edb52599f8827cb3c3a38e7332fe4db2aaca3a9

See more details on using hashes here.
