Scrapfly SDK for Scrapfly
Scrapfly SDK
Installation
pip install scrapfly-sdk
You can also install extra dependencies:
pip install "scrapfly-sdk[seepdup]" for performance improvement
pip install "scrapfly-sdk[concurrency]" for concurrency out of the box (asyncio / thread)
pip install "scrapfly-sdk[scrapy]" for Scrapy integration
pip install "scrapfly-sdk[all]" for everything
Get Your API Key
You can create a free account on Scrapfly to get your API Key.
Migration
Migrate from 0.7.x to 0.8
asyncio-pool dependency has been dropped
scrapfly.concurrent_scrape is now an async generator. If concurrency is None or not defined, the maximum concurrency allowed by your current subscription is used.
async for result in scrapfly.concurrent_scrape(concurrency=10, scrape_configs=[ScrapeConfig(...), ...]):
    print(result)
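The consumption pattern above can be tried without an API key using a small stand-in. The sketch below is not the SDK's implementation: fake_scrape, concurrent_scrape_sketch, collect, and the fallback limit of 5 are hypothetical names and values chosen for illustration. It only shows how an async generator that yields results as they complete, with bounded concurrency, is consumed with async for.

```python
import asyncio
from typing import AsyncIterator, Iterable, List, Optional


async def fake_scrape(url: str) -> str:
    """Hypothetical stand-in for a single scrape call; performs no network I/O."""
    await asyncio.sleep(0)  # yield control to the event loop, as a real request would
    return f"scraped:{url}"


async def concurrent_scrape_sketch(
    urls: Iterable[str], concurrency: Optional[int] = None
) -> AsyncIterator[str]:
    """Yield results as they complete, with at most `concurrency` calls in flight."""
    # Assumption: 5 stands in for the limit the real SDK takes from the subscription.
    limit = asyncio.Semaphore(concurrency or 5)

    async def bounded(url: str) -> str:
        async with limit:
            return await fake_scrape(url)

    tasks = [asyncio.create_task(bounded(url)) for url in urls]
    for finished in asyncio.as_completed(tasks):
        yield await finished


async def collect(urls: List[str], concurrency: Optional[int] = None) -> List[str]:
    """Consume the async generator with `async for` and gather the results."""
    results = []
    async for result in concurrent_scrape_sketch(urls, concurrency=concurrency):
        results.append(result)
    return results


if __name__ == "__main__":
    urls = ["https://example.com/a", "https://example.com/b"]
    print(asyncio.run(collect(urls, concurrency=2)))
```

Because results are yielded in completion order, callers that need the input order should carry an identifier with each result rather than rely on position.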
The brotli argument is deprecated and will be removed in the next minor release. In most cases it offers no size benefit over gzip and uses more CPU.
What's new
0.8.x
- Better error logging
- Async improvements for concurrent scrape with asyncio
- Scrapy media pipelines are now supported out of the box
Source Distribution
scrapfly-sdk-0.8.0.tar.gz (23.9 kB)
Built Distribution
Hashes for scrapfly_sdk-0.8.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 1417bed2965dca17a40583871eb74ec79fea17a4b04cf6a06d22635800b8a61b
MD5 | ff314507c84fb7d7a47fbd560a6af6f4
BLAKE2b-256 | 4817b47a94e4a1db67738eee86c15d60bca938e87458b8fed192c625fa31bb17