Declarative web parsers
Soupstars
Build declarative web parsers in Python. Install it with pip:
    pip install soupstars
A full example:
    from soupstars import Parser, serialize


    class NytimesArticleParser(Parser):
        "Parse data from a NY times article"

        @serialize
        def title(self):
            return self.h1.text

        @serialize
        def author(self):
            return self.find(attrs={'itemprop': 'author creator'}).text


    if __name__ == "__main__":
        url = "https://www.nytimes.com/2019/04/25/us/politics/joe-biden-anita-hill.html"
        parser = NytimesArticleParser(url)
        print(parser.to_json())
Running the script above produces:
    {
        'author': 'By Sheryl Gay Stolberg and Carl Hulse',
        'title': 'Joe Biden Expresses Regret to Anita Hill, but She Says ‘I’m Sorry’ Is Not Enough'
    }
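To give a feel for the declarative pattern used above, here is a minimal sketch of how a `serialize`-style decorator could collect marked methods into a JSON document. This is not the soupstars implementation: `MiniParser`, `ArticleParser`, `to_dict`, the `_serialized` marker, and `demo` are hypothetical names chosen for illustration, and a plain dict stands in for the downloaded page so the sketch runs without network access.

```python
import json


def serialize(func):
    """Mark a method so its return value is included in the output.

    Hypothetical re-creation of the idea, not the soupstars decorator.
    """
    func._serialized = True
    return func


class MiniParser:
    """Toy stand-in for a declarative parser (assumption, for illustration)."""

    def __init__(self, document):
        # A plain dict replaces the fetched and parsed HTML page.
        self.document = document

    def to_dict(self):
        result = {}
        # Look at every attribute defined on the class and keep the
        # callables that were marked by @serialize.
        for name in dir(type(self)):
            attr = getattr(type(self), name)
            if callable(attr) and getattr(attr, "_serialized", False):
                result[name] = attr(self)
        return result

    def to_json(self):
        return json.dumps(self.to_dict(), sort_keys=True)


class ArticleParser(MiniParser):
    @serialize
    def title(self):
        return self.document["title"]

    @serialize
    def author(self):
        return self.document["author"]


def demo():
    doc = {"title": "Example headline", "author": "By A. Writer"}
    return ArticleParser(doc).to_json()


if __name__ == "__main__":
    print(demo())
```

The point of the design is that each field is declared as a small method, and the base class decides how to gather and serialize them, so adding a field only means adding one decorated method.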