Scrapy middleware for Scraping.link
Scrapy ScrapingLink Middleware
Acknowledgements
Thanks to arimbr and ScrapingBee: this package is an adaptation of their work.
Installation
pip install scrapy-scraping-link
Configuration
Add your SCRAPINGLINK_API_KEY and the ScrapingLinkMiddleware to your project's settings.py. Set CONCURRENT_REQUESTS to match the concurrency limit of your ScrapingLink plan.
SCRAPINGLINK_API_KEY = 'REPLACE-WITH-YOUR-API-KEY'
DOWNLOADER_MIDDLEWARES = {
    'scrapy_scraping_link.ScrapingLinkMiddleware': 700,
}
CONCURRENT_REQUESTS = 1
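To keep the API key out of version control, one option is to read it from the environment in settings.py. The following is a minimal sketch and not part of the package: the helper name load_scrapinglink_settings and the environment variable names are assumptions made for illustration.

```python
import os


def load_scrapinglink_settings(environ=os.environ):
    """Build the ScrapingLink-related settings from environment variables.

    The variable names are hypothetical; use whatever fits your deployment.
    """
    api_key = environ.get("SCRAPINGLINK_API_KEY")
    if not api_key:
        raise RuntimeError("Set the SCRAPINGLINK_API_KEY environment variable")
    return {
        "SCRAPINGLINK_API_KEY": api_key,
        # Default to 1 concurrent request; raise it to match your plan.
        "CONCURRENT_REQUESTS": int(environ.get("SCRAPINGLINK_CONCURRENCY", "1")),
    }


# In settings.py, the result could be merged into the module namespace:
# globals().update(load_scrapinglink_settings())
```

Failing fast on a missing key surfaces a misconfiguration when the settings are loaded rather than on the first request.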
Usage
Inherit your spiders from ScrapingLinkSpider and yield a ScrapingLinkRequest.
Below you can see an example from the spider in httpbin.py.
from scrapy_scraping_link import ScrapingLinkSpider, ScrapingLinkRequest


class HttpbinSpider(ScrapingLinkSpider):
    name = 'httpbin'
    start_urls = [
        'https://httpbin.org',
    ]

    def start_requests(self):
        # Route every start URL through the ScrapingLink API.
        for url in self.start_urls:
            yield ScrapingLinkRequest(url, params={
                # 'render': False,
            },
            headers={
                # 'Accept-Language': 'En-US',
            },
            cookies={
                # 'name_1': 'value_1',
            })

    def parse(self, response):
        ...
You can pass ScrapingLink parameters in the params argument of a ScrapingLinkRequest. Headers and cookies are passed as with a normal Scrapy Request. ScrapingLinkRequest converts all parameters, headers and cookies to the format expected by the API.
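When several spiders share the same options, it can help to assemble the three arguments in one place. The sketch below is illustrative only: build_request_options is a hypothetical helper, not part of scrapy-scraping-link, and it uses only the 'render' parameter, the 'Accept-Language' header and the cookie shown in the example spider.

```python
def build_request_options(render=None, accept_language=None, cookies=None):
    """Collect params, headers and cookies for a ScrapingLinkRequest.

    Only the options that were actually provided are included.
    """
    params = {}
    if render is not None:
        params["render"] = render
    headers = {}
    if accept_language:
        headers["Accept-Language"] = accept_language
    return {
        "params": params,
        "headers": headers,
        "cookies": dict(cookies or {}),  # copy so the caller's dict is untouched
    }


options = build_request_options(render=False, cookies={"name_1": "value_1"})
# Inside a spider: yield ScrapingLinkRequest(url, **options)
```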
Examples
Add your API key to settings.py.
To run the examples you need to clone this repository. In your terminal, go to examples/httpbin/httpbin and run the example spider with:
scrapy crawl httpbin
Customer Support
Reach out to us via our Telegram group or send us an email.
Sign up for our free plan to get a free API key loaded with 100 free credits. No credit card required!
Source Distribution
File details
Details for the file scrapy-scraping-link-0.0.4.tar.gz.
File metadata
- Download URL: scrapy-scraping-link-0.0.4.tar.gz
- Upload date:
- Size: 4.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/4.5.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.57.0 CPython/3.6.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f3f576b4a12ee4e106ed23ca22cc4fdef9d21b99b9af473cc21330344f18a828 |
| MD5 | 1b57ca5fe984e3fcc892c65ee337ec1b |
| BLAKE2b-256 | 0478b167c4d0567dd504662e2511013138de3456e671c96af1b98c8343e867f8 |