Declarative web parsers

Soupstars :stew: :star: :boom:

Soupstars makes it easier than ever to build web parsers in Python.

Install it with pip.

pip install soupstars

Let's go!

Quickstart

You need two objects to get started.

>>> from soupstars import Parser, serialize

We'll build a parser that extracts data from a GitHub page.

>>> class GithubParser(Parser):
...    "Parse data from a github page"
...
...    @serialize
...    def title(self):
...        return str(self.h1.text.strip())

Now all we need is a GitHub page to parse.

>>> parser = GithubParser("https://github.com/tjwaterman99/soupstars")

Let's see what we've got!

>>> parser.to_dict()
{'title': 'tjwaterman99/soupstars'}
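
If you're curious how a decorator like `serialize` and a method like `to_dict` can fit together, here is a rough sketch of the general pattern using only the standard library. It is not soupstars's actual implementation: `SketchParser`, `PageParser`, `first_h1`, and the `_serialize` marker are all hypothetical names made up for illustration, and the sketch parses an inline HTML string rather than fetching a URL.

```python
from html.parser import HTMLParser


def serialize(func):
    """Mark a method so that to_dict() includes its return value.

    Hypothetical stand-in for illustration; not the soupstars decorator.
    """
    func._serialize = True
    return func


class _FirstH1(HTMLParser):
    """Collect the text of the first <h1> element in a document."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self._done = False
        self.text = ""

    def handle_starttag(self, tag, attrs):
        if tag == "h1" and not self._done:
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "h1" and self._inside:
            self._inside = False
            self._done = True

    def handle_data(self, data):
        if self._inside:
            self.text += data


class SketchParser:
    """Hypothetical base class: holds HTML and serializes marked methods."""

    def __init__(self, html):
        self.html = html

    def first_h1(self):
        finder = _FirstH1()
        finder.feed(self.html)
        return finder.text.strip()

    def to_dict(self):
        # Call every method on the class that was marked by @serialize.
        result = {}
        for name in dir(self):
            attr = getattr(type(self), name, None)
            if callable(attr) and getattr(attr, "_serialize", False):
                result[name] = getattr(self, name)()
        return result


class PageParser(SketchParser):
    @serialize
    def title(self):
        return self.first_h1()


page = "<html><body><h1> tjwaterman99/soupstars </h1></body></html>"
print(PageParser(page).to_dict())
```

The point of the pattern is that each piece of data you want gets its own small method, and the base class takes care of collecting the results.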

You're now ready to start building your own web parsers with soupstars. Nice job. :beers:

Going further

Contributing

We're thrilled you asked! Just open a PR on GitHub, and we'll take a look.
