Hypotonic

Fast asynchronous web scraper with a minimalist API, inspired by the awesome node-osmosis.

Hypotonic provides a SQLAlchemy-like command-chaining DSL for defining HTML scrapers. Everything is executed asynchronously via asyncio, and all dependencies are pure Python. It supports querying by CSS selectors with Scrapy's pseudo-elements (for example, ::attr(href)). XPath is not supported because it would require libxml.
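To illustrate the idea of command chaining on top of asyncio, here is a minimal sketch. It is not Hypotonic's implementation: the Chain class and the then, double and increment names are hypothetical and exist only for this example. Each chained call records a step and returns the same object, and nothing runs until the terminal data() call starts an event loop.

```python
import asyncio


class Chain:
    """Toy command chain: each method records a step and returns self."""

    def __init__(self):
        self._steps = []

    def then(self, func):
        # Record the step instead of running it; returning self enables chaining.
        self._steps.append(func)
        return self

    async def _run(self, value):
        # Await the recorded steps in order, feeding each result to the next.
        for step in self._steps:
            value = await step(value)
        return value

    def data(self, value):
        # Only this terminal call starts an event loop and executes the steps.
        # asyncio.run needs Python 3.7+.
        return asyncio.run(self._run(value))


async def double(x):
    await asyncio.sleep(0)  # stand-in for real I/O such as an HTTP request
    return x * 2


async def increment(x):
    await asyncio.sleep(0)
    return x + 1


result = Chain().then(double).then(increment).data(5)
print(result)  # 11
```

Deferring execution to the final call is what lets a scraper definition read as one declarative expression.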

Hypotonic does not execute JavaScript. For sites that render their content with JavaScript, it is recommended to fetch pages through prerender.
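One way to do this is to request the page from a prerender server rather than from the site itself. The sketch below only builds such a URL. It assumes a self-hosted prerender server listening on http://localhost:3000 and exposing a /render?url= endpoint; both the address and the prerender_url helper are assumptions for illustration, so adjust them to your setup.

```python
from urllib.parse import quote

# Assumed address of a self-hosted prerender server (not part of Hypotonic).
PRERENDER_BASE = "http://localhost:3000"


def prerender_url(target, base=PRERENDER_BASE):
    """Return a URL asking the prerender server to render `target`."""
    # Percent-encode the whole target so its own '?', '&' and '/' characters
    # are not mistaken for parts of the prerender request.
    return "{}/render?url={}".format(base.rstrip("/"), quote(target, safe=""))


url = prerender_url("http://books.toscrape.com/")
print(url)
```

The resulting URL could then be passed to .get() in place of the original address, assuming the scraper treats it like any other page.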

Installing

Hypotonic requires Python 3.6+.

pip install hypotonic

Example

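from hypotonic import Hypotonic

data, errors = (
  Hypotonic()
    .get('http://books.toscrape.com/')   # start page
    .paginate('.next a::attr(href)', 5)  # follow the "next" link, limit of 5
    .find('.product_pod h3')             # select each book heading
    .set('title')                        # store it as 'title'
    .follow('a::attr(href)')             # open the book's detail page
    .set({'price': '.price_color',       # extract fields by CSS selector
          'availability': 'p.availability'})
    .data()                              # run the chain, collect results
)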
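Scraped text usually needs some cleanup before use. The following is a rough sketch of post-processing, assuming that data is a list of dictionaries keyed by the names passed to .set() and that every value is a string. That shape is an assumption, not something this description documents, and the sample record is made up so the snippet runs without network access.

```python
import re


def clean_record(record):
    """Normalize one scraped record (assumed shape: dict of strings)."""
    # Collapse runs of whitespace and newlines left over from the HTML.
    cleaned = {key: " ".join(str(value).split()) for key, value in record.items()}
    # Pull the first number out of the price text, e.g. "£12.50" becomes 12.5.
    match = re.search(r"\d+(?:\.\d+)?", cleaned.get("price", ""))
    cleaned["price"] = float(match.group()) if match else None
    return cleaned


# Hypothetical sample standing in for the `data` returned by the scraper.
sample = [
    {"title": "Example Book", "price": "£12.50", "availability": "\n   In stock\n"},
]

books = [clean_record(record) for record in sample]
print(books)
```

In real use, replace sample with data and check errors before trusting the results.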
