A library that reads a YAML file of XPath or CSS selectors and uses them to extract data from HTML pages.

Project description

selectorlib

Example

>>> from selectorlib import Extractor
>>> yaml_string = """
    title:
        css: "h1"
        type: Text
    link:
        css: "h2 a"
        type: Link
    """
>>> extractor = Extractor.from_yaml_string(yaml_string)
>>> html = """
    <h1>Title</h1>
    <h2>Usage
        <a class="headerlink" href="http://test">¶</a>
    </h2>
    """
>>> extractor.extract(html)
{'title': 'Title', 'link': 'http://test'}
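
As a rough illustration of the idea behind the example above (not selectorlib's implementation or API), the sketch below maps field names to selectors using only the Python standard library. The names FIELDS and extract_fields are hypothetical. It uses ElementTree's limited XPath subset rather than CSS selectors, and it assumes the HTML fragment is well-formed XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical field spec mirroring the YAML in the example above:
# each field has a selector (ElementTree's XPath subset) and a type.
FIELDS = {
    "title": {"xpath": ".//h1", "type": "Text"},
    "link": {"xpath": ".//h2/a", "type": "Link"},
}


def extract_fields(html, fields):
    """Return {field name: value} for a well-formed HTML fragment."""
    # Wrap the fragment in a single root element so it parses as XML.
    root = ET.fromstring("<root>{}</root>".format(html))
    result = {}
    for name, spec in fields.items():
        node = root.find(spec["xpath"])
        if node is None:
            # No element matched the selector.
            result[name] = None
        elif spec["type"] == "Link":
            # A Link field yields the href attribute of the element.
            result[name] = node.get("href")
        else:
            # A Text field yields the element's text, whitespace-trimmed.
            result[name] = "".join(node.itertext()).strip()
    return result


html = """
<h1>Title</h1>
<h2>Usage
    <a class="headerlink" href="http://test">¶</a>
</h2>
"""
result = extract_fields(html, FIELDS)
print(result)
```

Real-world HTML is rarely well-formed XML, which is one reason to rely on a dedicated library with a tolerant HTML parser instead of a sketch like this.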

Download files

Source Distribution

selectorlib-0.15.0.tar.gz (188.8 kB)

Uploaded Source

Built Distribution

selectorlib-0.15.0-py2.py3-none-any.whl (5.8 kB)

Uploaded Python 2 Python 3

File details

Details for the file selectorlib-0.15.0.tar.gz.

File metadata

  • Download URL: selectorlib-0.15.0.tar.gz
  • Size: 188.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/44.0.0 requests-toolbelt/0.9.1 tqdm/4.41.1 CPython/3.7.1

File hashes

Hashes for selectorlib-0.15.0.tar.gz
Algorithm Hash digest
SHA256 c10187488f14a29818308f472f641f9688860358bad9a1703d984fb925431382
MD5 2213666f646e724d90dbb16886331c68
BLAKE2b-256 2b2cfd53fb183d988f43c473caddc1edb32cf0ffd0d344e83b2038839bfdf4cc

File details

Details for the file selectorlib-0.15.0-py2.py3-none-any.whl.

File metadata

  • Download URL: selectorlib-0.15.0-py2.py3-none-any.whl
  • Size: 5.8 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.22.0 setuptools/44.0.0 requests-toolbelt/0.9.1 tqdm/4.41.1 CPython/3.7.1

File hashes

Hashes for selectorlib-0.15.0-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 f88288ef0c89043f48045d80e056ab55ada715c26673ae6b3c58dbe0c3e17f47
MD5 edf65754ed489e671dc2697144000bd3
BLAKE2b-256 280c81f6cd45139574460585a6effcc276cb37fc0fb55686d095bce3eaab03a9
