A library that reads a YAML file of XPath or CSS selectors and uses them to extract data from HTML pages.

Project description

selectorlib

Example

>>> from selectorlib import Extractor
>>> yaml_string = """
...     title:
...         css: "h1"
...         type: Text
...     link:
...         css: "h2 a"
...         type: Link
...     """
>>> extractor = Extractor.from_yaml_string(yaml_string)
>>> html = """
...     <h1>Title</h1>
...     <h2>Usage
...         <a class="headerlink" href="http://test">¶</a>
...     </h2>
...     """
>>> extractor.extract(html)
{'title': 'Title', 'link': 'http://test'}
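
To make the idea behind the example concrete, here is a minimal standard-library sketch of selector-driven extraction. It is not selectorlib's implementation: the `CONFIG` dict, the `extract` helper, and the wrapping `<div>` are assumptions made for illustration. The dict stands in for a parsed YAML file, and `xml.etree.ElementTree` supports only a limited XPath subset and needs well-formed markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a parsed selector file: one entry per output field.
# The field names and "Text"/"Link" types mirror the example above.
CONFIG = {
    "title": {"xpath": ".//h1", "type": "Text"},
    "link": {"xpath": ".//h2/a", "type": "Link"},
}

# Same markup as the example, wrapped in a <div> (an assumption) so that it
# has a single root element and parses as well-formed XML.
HTML = """
<div>
    <h1>Title</h1>
    <h2>Usage
        <a class="headerlink" href="http://test">¶</a>
    </h2>
</div>
"""


def extract(config, markup):
    """Apply each field's XPath to the markup and collect the results."""
    root = ET.fromstring(markup.strip())
    result = {}
    for field, rule in config.items():
        node = root.find(rule["xpath"])
        if node is None:
            # No element matched this field's selector.
            result[field] = None
        elif rule["type"] == "Link":
            # A Link field yields the element's href attribute.
            result[field] = node.get("href")
        else:
            # A Text field yields the element's text content.
            result[field] = "".join(node.itertext()).strip()
    return result


print(extract(CONFIG, HTML))
```

Keeping the selectors in data rather than in code is the point of this design: when a page layout changes, only the selector definitions need editing, not the extraction logic.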

Download files

Source Distribution

selectorlib-0.12.0.tar.gz (188.7 kB)

Uploaded Source

Built Distribution

selectorlib-0.12.0-py2.py3-none-any.whl (5.8 kB)

Uploaded Python 2, Python 3

File details

Details for the file selectorlib-0.12.0.tar.gz.

File metadata

  • Download URL: selectorlib-0.12.0.tar.gz
  • Size: 188.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.32.2 CPython/3.7.1

File hashes

Hashes for selectorlib-0.12.0.tar.gz
Algorithm    Hash digest
SHA256       b7cc3d4a5ed31700c32ffa8e5a5c24b8c3c143f1938550e1420eae016875d96f
MD5          6df07870952c974657c58b2841ec8c55
BLAKE2b-256  7f7538cc8f9840350e9a9953a287bcc0b3d65c8b0c72aaaf8d0d38cb2b5ec1e3

File details

Details for the file selectorlib-0.12.0-py2.py3-none-any.whl.

File metadata

  • Download URL: selectorlib-0.12.0-py2.py3-none-any.whl
  • Size: 5.8 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.32.2 CPython/3.7.1

File hashes

Hashes for selectorlib-0.12.0-py2.py3-none-any.whl
Algorithm    Hash digest
SHA256       6ddb83c81fea586c2d6263b2f4c82f279f9077f372b0eb027d567113f524a94d
MD5          9cab5ae6f547654fcaa3d4ec3ed41e0b
BLAKE2b-256  df692e72802ebe4f8d19cf09730b40f23ec80ad136112f19a66ae80ca5307087
