Declarative web parsers

Soupstars 🍲 ⭐ 💥

Soupstars is a Python library for building web parsers declaratively: you describe each field you want to extract as a method on a parser class.

Install it with pip.

pip install soupstars

Let's go!

Quickstart

You need two objects to get started: the Parser base class and the serialize decorator.

>>> from soupstars import Parser, serialize

We'll build a parser that extracts the title from a GitHub page. Each method marked with @serialize becomes a key in the parser's output.

>>> class GithubParser(Parser):
...    "Parse data from a github page"
...
...    @serialize
...    def title(self):
...        return str(self.h1.text.strip())

Now all we need is a GitHub page to parse. Pass its URL to the parser.

>>> parser = GithubParser("https://github.com/tjwaterman99/soupstars")

Let's see what we've got!

>>> parser.to_dict()
{'title': 'tjwaterman99/soupstars'}

You're now ready to start building your own web parsers with soupstars. Nice job. 🍻

Going further
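
If you're curious how a decorator like serialize can drive to_dict(), here is a minimal sketch of the general pattern. It is not soupstars' actual implementation: SketchParser, first_h1, and the _serialize attribute are hypothetical names invented for illustration, and the sketch parses an inline HTML string with the standard library instead of fetching a URL.

```python
from html.parser import HTMLParser


def serialize(func):
    """Mark a method so that to_dict() includes its return value.

    Hypothetical stand-in, not the soupstars decorator itself.
    """
    func._serialize = True
    return func


class _FirstH1(HTMLParser):
    """Collect the text of the first <h1> element in a document."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self._done = False
        self.text = ""

    def handle_starttag(self, tag, attrs):
        if tag == "h1" and not self._done:
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "h1" and self._inside:
            self._inside = False
            self._done = True

    def handle_data(self, data):
        if self._inside:
            self.text += data


class SketchParser:
    """Hypothetical base class: holds the HTML and serializes marked methods."""

    def __init__(self, html):
        self.html = html

    def first_h1(self):
        finder = _FirstH1()
        finder.feed(self.html)
        return finder.text.strip()

    def to_dict(self):
        # Call every method on the class that was marked by @serialize.
        result = {}
        for name in dir(type(self)):
            attr = getattr(type(self), name)
            if callable(attr) and getattr(attr, "_serialize", False):
                result[name] = attr(self)
        return result


class PageParser(SketchParser):
    @serialize
    def title(self):
        return self.first_h1()


page = "<html><body><h1> tjwaterman99/soupstars </h1></body></html>"
result = PageParser(page).to_dict()
print(result)
```

The point of the pattern is that adding a field means adding one decorated method; the serialization logic never has to change.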

Contributing

We're thrilled you asked! Just open a PR on GitHub, and we'll take a look.
