Fast asynchronous web scraper with a minimalist API.
Project description
Hypotonic
Fast asynchronous web scraper with a minimalist API, inspired by the awesome node-osmosis.
Hypotonic provides a SQLAlchemy-like command-chaining DSL for defining HTML scrapers. Everything is executed asynchronously via asyncio, and all dependencies are pure Python. It supports querying by CSS selectors with Scrapy's pseudo-elements, such as ::attr(href). XPath is not supported because it would require libxml, which is not pure Python.
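To illustrate the Scrapy-style pseudo-elements mentioned above, the sketch below splits a selector such as .next a::attr(href) into its plain CSS part and the value to extract. The helper name split_pseudo is hypothetical, and this is not Hypotonic's own parsing code; it only shows how ::text and ::attr(name) are meant to be read.

```python
import re

# Matches a trailing Scrapy-style pseudo-element: ::text or ::attr(name).
_PSEUDO = re.compile(r"::(text|attr\(([^)]+)\))\s*$")


def split_pseudo(selector):
    """Split a selector into (css, kind, attr_name).

    Hypothetical helper for illustration only.
    kind is 'text', 'attr', or None when no pseudo-element is present.
    """
    match = _PSEUDO.search(selector)
    if match is None:
        return selector.strip(), None, None
    css = selector[:match.start()].strip()
    if match.group(1) == "text":
        return css, "text", None
    return css, "attr", match.group(2).strip()


print(split_pseudo(".next a::attr(href)"))
print(split_pseudo(".price_color"))
```

A selector without a pseudo-element is returned unchanged, which corresponds to selecting the element itself rather than one of its attributes.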
Hypotonic does not natively execute JavaScript. For sites that depend on it, using prerender is recommended.
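One way to combine the two is to request pages through a prerender instance, so that the scraper receives HTML that has already been rendered. The sketch below builds such a URL. It assumes a self-hosted Prerender server listening on localhost port 3000 and exposing a /render?url= endpoint; the helper name prerender_url and the base address are assumptions, so adjust them to your setup.

```python
from urllib.parse import quote

# Assumption: a Prerender server is running locally on port 3000.
PRERENDER_BASE = "http://localhost:3000"


def prerender_url(target_url, base=PRERENDER_BASE):
    """Return a URL asking the (assumed) Prerender server to render target_url."""
    # Percent-encode the whole target so it survives as one query parameter.
    return f"{base.rstrip('/')}/render?url={quote(target_url, safe='')}"


print(prerender_url("http://books.toscrape.com/"))
```

The resulting URL could then be passed to .get() in place of the original address. Relative links on the page may need extra care in that case, since they would resolve against the prerender host.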
Installing
Hypotonic requires Python 3.6+.
pip install hypotonic
Example
from hypotonic import Hypotonic

data, errors = (
    Hypotonic()
    .get('http://books.toscrape.com/')
    .paginate('.next a::attr(href)', 5)
    .find('.product_pod h3')
    .set('title')
    .follow('a::attr(href)')
    .set({'price': '.price_color',
          'availability': 'p.availability'})
    .data()
)
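The scraped values are likely to need some tidying before use. The sketch below assumes that data is a list of dicts of strings keyed by the names passed to set(); that shape is an assumption, not something this description confirms. The function name tidy and the sample record are hypothetical placeholders rather than real scraped output.

```python
import re


def tidy(records):
    """Normalise scraped records (assumed to be a list of dicts of strings)."""
    cleaned = []
    for record in records:
        # Collapse runs of whitespace and trim both ends of every value.
        row = {key: " ".join(str(value).split()) for key, value in record.items()}
        # Pull the first decimal number out of the price text, if there is one.
        match = re.search(r"\d+(?:\.\d+)?", row.get("price", ""))
        row["price"] = float(match.group()) if match else None
        cleaned.append(row)
    return cleaned


# Hypothetical placeholder record, not real scraped output.
sample = [{"title": "Example Book", "price": "£12.50", "availability": "\n  In stock \n"}]
print(tidy(sample))
```

It is also worth inspecting errors before trusting data, since a failed request may leave some records incomplete.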
Download files
Source Distribution
hypotonic-0.0.9.tar.gz (4.8 kB)
Built Distribution
Hashes for hypotonic-0.0.9-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 406969209f9dd7400f2e63c2825687998ad0c42b4474c47da77ffef8e8913bdc
MD5 | 7f3263b2f13ec9bafe78d5d2e9d67a8b
BLAKE2b-256 | 1ea02a16bff5b76049a7b153fa91fd6b1ac7c487bf95f3bc3e8768de4b417e26