An Easy Scraper for HTML
easy-scraper-py
An easy scraping tool for HTML
Goal
A Python re-implementation of tanakh/easy-scraper.
Install from PyPI
pip install easy-scraper-py
Usage Example
<!-- Target -->
<body>
<b>NotMe</b>
<a class=here>Here</a>
<a class=nothere>NotHere</a>
</body>
<!-- Pattern -->
<a class=here>{{ text }}</a>
import easy_scraper
target = r"""<body>
<b>NotMe</b>
<a class=here>Here</a>
<a class=nothere>NotHere</a>
</body>
""" # newlines and spaces are all ignored.
pattern = "<a class=here>{{ text }}</a>"
easy_scraper.match(target, pattern) # [{'text': 'Here'}]
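To make clear what the pattern above expresses, here is a minimal sketch of the same extraction written with only the standard library's `html.parser`. It hard-codes this one pattern (`<a class=here>{{ text }}</a>`) and is not how easy-scraper-py is implemented; `HereLinkCollector` and `match_here_links` are hypothetical names introduced for illustration only.

```python
from html.parser import HTMLParser


class HereLinkCollector(HTMLParser):
    """Hypothetical helper: collect the text of every <a class="here"> element."""

    def __init__(self):
        super().__init__()
        self.results = []    # one {'text': ...} dict per matching element
        self._buffer = None  # text pieces of the <a> currently being read, if any

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; unquoted values are accepted.
        if tag == "a" and dict(attrs).get("class") == "here":
            self._buffer = []

    def handle_data(self, data):
        # Only record text while inside a matching <a> element.
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._buffer is not None:
            self.results.append({"text": "".join(self._buffer).strip()})
            self._buffer = None


def match_here_links(html):
    """Return a list of {'text': ...} dicts, one per <a class="here"> element."""
    parser = HereLinkCollector()
    parser.feed(html)
    parser.close()
    return parser.results


target = """<body>
<b>NotMe</b>
<a class=here>Here</a>
<a class=nothere>NotHere</a>
</body>
"""
print(match_here_links(target))
```

The hand-written parser needs about twenty lines for a single fixed pattern, whereas with easy_scraper the pattern string itself describes which element to find and which part to capture.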
# XML (RSS) scraping
import easy_scraper
import urllib.request
body = urllib.request.urlopen("https://kuragebunch.com/rss/series/10834108156628842505").read().decode()
res = easy_scraper.match(body, "<item><title>{{ title }}</title><link>{{ link }}</link></item>")
for item in res[:5]:
    print(item)
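The RSS example needs network access, so the following offline sketch shows the kind of title/link records the pattern describes, using the standard library's `xml.etree.ElementTree` instead of easy_scraper. `SAMPLE_FEED` is made-up placeholder data and `extract_items` is a hypothetical helper; neither comes from the real feed or from this package.

```python
import xml.etree.ElementTree as ET

# Made-up placeholder feed, used only so the sketch runs without a network call.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>Episode 1</title><link>https://example.com/1</link></item>
    <item><title>Episode 2</title><link>https://example.com/2</link></item>
  </channel>
</rss>"""


def extract_items(xml_text):
    """Return one {'title': ..., 'link': ...} dict per <item> that has both children."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title")
        link = item.findtext("link")
        # Skip items that lack either field, since the pattern asks for both.
        if title is not None and link is not None:
            items.append({"title": title.strip(), "link": link.strip()})
    return items


for entry in extract_items(SAMPLE_FEED)[:5]:
    print(entry)
```

The channel-level `<title>` is not returned because only `<item>` elements are visited, which mirrors the `<item>...</item>` wrapper in the pattern.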