Parsel is a library to extract data from HTML and XML using XPath and CSS selectors

Project description

This project is a fork of Parsel for QPython.

Parsel is a BSD-licensed Python library to extract data from HTML, JSON, and XML documents.

It supports:

  • CSS and XPath expressions for HTML and XML documents

  • JMESPath expressions for JSON documents

  • Regular expressions

Find the Parsel online documentation at https://parsel.readthedocs.org.

Example:

    >>> from parsel import Selector
    >>> text = """
            <html>
                <body>
                    <h1>Hello, Parsel!</h1>
                    <ul>
                        <li><a href="http://example.com">Link 1</a></li>
                        <li><a href="http://scrapy.org">Link 2</a></li>
                    </ul>
                    <script type="application/json">{"a": ["b", "c"]}</script>
                </body>
            </html>"""
    >>> selector = Selector(text=text)
    >>> selector.css('h1::text').get()
    'Hello, Parsel!'
    >>> selector.xpath('//h1/text()').re(r'\w+')
    ['Hello', 'Parsel']
    >>> for li in selector.css('ul > li'):
    ...     print(li.xpath('.//@href').get())
    http://example.com
    http://scrapy.org
    >>> selector.css('script::text').jmespath("a").get()
    'b'
    >>> selector.css('script::text').jmespath("a").getall()
    ['b', 'c']

Download files

Source Distribution

parsel_qpython-1.9.1.2.tar.gz (12.6 kB view details)

Uploaded Source

Built Distribution

parsel_qpython-1.9.1.2-py3-none-any.whl (13.1 kB view details)

Uploaded Python 3

File details

Details for the file parsel_qpython-1.9.1.2.tar.gz.

File metadata

  • Download URL: parsel_qpython-1.9.1.2.tar.gz
  • Size: 12.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.11.7

File hashes

Hashes for parsel_qpython-1.9.1.2.tar.gz
Algorithm Hash digest
SHA256 2758290b89f103146a4498a9b2bb9bcb527f677c1e7af7da6140b297b96db35a
MD5 cef8f28b1eb0ece49e27290168141b2f
BLAKE2b-256 7afe2feb0c36b8dd841340871de963b31cccb86064f66ca82b6de4583931c223
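
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the standard library; the helper names (`sha256_of`, `matches_published_hash`) and the local file path are assumptions made for illustration, not part of any packaging tool.

```python
import hashlib
from pathlib import Path

# SHA256 digest published above for parsel_qpython-1.9.1.2.tar.gz.
EXPECTED_SHA256 = "2758290b89f103146a4498a9b2bb9bcb527f677c1e7af7da6140b297b96db35a"


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Compare a file's digest with the expected hex digest (hypothetical helper)."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    # Assumed location of the downloaded file; adjust as needed.
    archive = Path("parsel_qpython-1.9.1.2.tar.gz")
    if archive.exists():
        print("OK" if matches_published_hash(archive) else "MISMATCH")
    else:
        print(f"{archive} not found")
```

Reading in chunks keeps memory use constant regardless of file size, which matters little for a 12.6 kB archive but makes the helper reusable for larger downloads.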

File details

Details for the file parsel_qpython-1.9.1.2-py3-none-any.whl.

File hashes

Hashes for parsel_qpython-1.9.1.2-py3-none-any.whl
Algorithm Hash digest
SHA256 3ee2d5701cc7050ea7f07a5b89b929d6716c6cf34b01c79d7a359ba41bcabcdc
MD5 8c71c0ccdc299ab76324cd010c1fbd23
BLAKE2b-256 b241d409a98943714dd10082e420a4f021bb0d394172d780072526e5276a8211
