Declarative web parsers

Soup Stars

Soup Stars is a framework for building web parsers with Python. It is designed to make building, deploying, and scheduling web parsers easier by simplifying what you need to get started.

Quickstart

pip install soupstars

The client is also available as a Docker image.

docker pull soupstars/client

Building a parser

Create a new parser with the soupstars create command, which generates the module from a template parser.

soupstars create -m myparser.py

Parsers are plain Python modules.

cat myparser.py

Notice that the only setup required is the special parse decorator and a variable named url that holds the address of the web page you want to parse.

from soupstars import parse

url = "https://corbettanalytics.com/"

@parse
def h1(soup):
    return soup.h1.text
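
To make the decorator pattern concrete, here is a minimal sketch of how a parse-style decorator could collect functions by name and run each one against a parsed page. It is not the Soup Stars implementation: the names REGISTRY, run_all, and fake_soup are hypothetical, the parse defined here is a stand-in for the real one, and the page is replaced by a simple stand-in object so the example runs without any network access.

```python
from types import SimpleNamespace

# Hypothetical registry of parser functions, keyed by function name.
# Soup Stars' real internals may work differently.
REGISTRY = {}


def parse(func):
    """Stand-in decorator: record func under its name, return it unchanged."""
    REGISTRY[func.__name__] = func
    return func


@parse
def h1(soup):
    return soup.h1.text


@parse
def title(soup):
    return soup.title.text


def run_all(soup):
    """Call every registered parser and collect the results by name."""
    return {name: func(soup) for name, func in REGISTRY.items()}


# Stand-in for a parsed page, exposing only the attributes used above.
fake_soup = SimpleNamespace(
    h1=SimpleNamespace(text="Hello"),
    title=SimpleNamespace(text="Example page"),
)

results = run_all(fake_soup)
print(results)
```

The appeal of this style is that each decorated function names one field of the output, so adding a field means adding one function.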

Run the parser to check that it works.

soupstars run -m myparser.py
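
A parser like the one above fails on pages that have no h1 element. The sketch below shows one defensive variant. It assumes that a missing tag appears as None on the soup object, as it does in Beautiful Soup; whether Soup Stars passes such an object is an assumption here. The pages are simple stand-in objects, and in a real parser module the function would also carry the parse decorator.

```python
from types import SimpleNamespace


def h1(soup):
    # Assumption: a missing tag is exposed as None (Beautiful Soup behavior).
    heading = getattr(soup, "h1", None)
    if heading is None:
        return None
    # Trim surrounding whitespace from the heading text.
    return heading.text.strip()


# Stand-in pages, one with a heading and one without.
page_with_h1 = SimpleNamespace(h1=SimpleNamespace(text="  Welcome  "))
page_without_h1 = SimpleNamespace(h1=None)

found = h1(page_with_h1)
missing = h1(page_without_h1)
```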

Use soupstars --help to see a full list of available commands.

More documentation is available here.

Development

Start the docker services.

docker-compose up -d

Run the tests.

docker-compose run --rm client pytest -vs

Releasing

New tags that pass CI are automatically pushed to PyPI and Docker Hub.
