parselx

An enhanced version of parsel for extracting data from HTML and XML using complex rules.

Features

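Magic g method: extract items using compound rules (a selector string, a bracketed list rule, or a dict of rules)
Filters: transform an extracted value with pipe syntax, e.g. 'h1 | reverse'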
x instance: many helper methods and filters

Plus all the standard features of parsel

>>> from parselx import SelectorX
>>> sel = SelectorX("""<html>
        <body>
            <h1>Hello, Parselx!</h1>
            <ul>
                <li><a href="http://example.com">Link 1</a></li>
                <li><a href="http://scrapy.org">Link 2</a></li>
            </ul>
        </body>
        </html>""")
>>>
>>> sel.g('h1')
'Hello, Parselx!'
>>> sel.g('h1 | reverse')
'!xlesraP ,olleH'
>>> sel.g('[ul li a]')
['Link 1', 'Link 2']
>>> sel.g({'title':['h1', lambda s: s.upper()], 'links':'[a @href]'})
{'title': 'HELLO, PARSELX!', 'links': ['http://example.com', 'http://scrapy.org']}
>>> sel.g('[ul li a @href| map:slice,7,-4]')
['example', 'scrapy']
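
The session above suggests that a filter after `|` transforms the extracted value, and that `map:` applies a filter to each item of a list. As a rough plain-Python sketch of that behaviour (the helper names `reverse`, `slice_filter` and `map_filter` are hypothetical and are not parselx internals; the semantics are inferred only from the outputs shown above):

```python
# Plain-Python sketch of what the filters in the session above appear to do.
# These helpers are hypothetical illustrations, not part of parselx.

def reverse(value):
    """Reverse a string, as 'h1 | reverse' appears to do."""
    return value[::-1]


def slice_filter(value, start, stop):
    """Slice a string, as 'slice,7,-4' appears to do."""
    return value[start:stop]


def map_filter(values, func, *args):
    """Apply func to every item, as the 'map:' prefix appears to do."""
    return [func(v, *args) for v in values]


title = "Hello, Parselx!"
links = ["http://example.com", "http://scrapy.org"]

print(reverse(title))                           # !xlesraP ,olleH
print(map_filter(links, slice_filter, 7, -4))   # ['example', 'scrapy']
```

Here the slice `7:-4` drops the 7-character `http://` prefix and the 4-character `.com` or `.org` suffix, which is why the doctest result is `['example', 'scrapy']`.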

Installation

$ pip install parselx

Release history

0.0.4 (this version): Feb 10, 2020
0.0.3: Nov 17, 2019
0.0.2: Nov 8, 2019
0.0.1: Nov 6, 2019

Download files

Source Distribution

parselx-0.0.4.tar.gz (5.1 kB), uploaded Feb 10, 2020, source

File details

Details for the file parselx-0.0.4.tar.gz.

File metadata

Download URL: parselx-0.0.4.tar.gz
Upload date: Feb 10, 2020
Size: 5.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.37.0 CPython/3.7.4

File hashes

Hashes for parselx-0.0.4.tar.gz
Algorithm	Hash digest
SHA256	`5082b8e8b95150bf0b6d00784e3c4fc6525c4df8f46dc5caddf684ea0dc37b70`
MD5	`77387781fdebd7e5aa1a3abbb61c1544`
BLAKE2b-256	`bd0a6dd4870d8677361800cfba67dddc07f1a57c78ae7d1cb4d3f5f29f720ed2`
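
A downloaded archive can be checked against the SHA256 digest listed above. The following is a minimal sketch using only the standard library; the function names and the assumption that the archive sits in the current directory under its original file name are illustrative, not part of parselx or PyPI tooling.

```python
import hashlib

# SHA256 digest published for parselx-0.0.4.tar.gz (copied from the table above).
EXPECTED_SHA256 = "5082b8e8b95150bf0b6d00784e3c4fc6525c4df8f46dc5caddf684ea0dc37b70"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected one."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_published_hash("parselx-0.0.4.tar.gz")` should return True for an intact copy of the archive, assuming it was saved under that name.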
