Declarative web parsers

Soupstars 🍲 ⭐ 💥

Soupstars makes it fast and easy to build web parsers in Python.

It supports Python 3.7+.

Quickstart

Install it with pip.

pip install soupstars

Create a new parser with the soupstars create command, which starts from a template parser.

soupstars create -m myparser.py

Parsers are simple Python modules.

cat myparser.py

Notice that the only setup required is the special parse decorator and a variable named url for the web page you want to parse.

from soupstars import parse

url = "https://corbettanalytics.com/"

@parse
def h1(soup):
    return soup.h1.text
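
To give a feel for how a decorator-based parser module can work, here is a minimal sketch. It is not Soupstars' actual implementation: the names `REGISTRY`, `parse_sketch`, `run_parsers`, and `stub_soup` are all hypothetical, and the stub object only stands in for the parsed page that gets passed to each function.

```python
from types import SimpleNamespace

# Hypothetical registry of parser functions; Soupstars' real internals may differ.
REGISTRY = {}


def parse_sketch(func):
    """Record func under its own name and return it unchanged."""
    REGISTRY[func.__name__] = func
    return func


@parse_sketch
def h1(soup):
    return soup.h1.text


@parse_sketch
def missing(soup):
    # Raises AttributeError because the stub page below has no h2 attribute.
    return soup.h2.text


def run_parsers(soup):
    """Call every registered function, keeping results and failures apart."""
    data, errors = {}, {}
    for name, func in REGISTRY.items():
        try:
            data[name] = func(soup)
        except Exception as exc:
            errors[name] = repr(exc)
    return {"data": data, "errors": errors}


# Stand-in for a parsed page, so the sketch runs without network access.
stub_soup = SimpleNamespace(h1=SimpleNamespace(text="Level up your analytics"))
result = run_parsers(stub_soup)
print(result["data"])
```

In this sketch each function name becomes a key in the result, so adding a field to the output only requires adding another decorated function.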

You can test that the parser functions correctly.

soupstars test -m myparser.py

The output is a JSON object.

{
    "data": {
        "h1": "Level up your analytics"
    },
    "status": 200,
    "url": "https://corbettanalytics.com/",
    "errors": {}
}
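
Because the result is plain JSON, it can be checked from a script. The sketch below assumes the sample output above has already been captured as a string; how you capture it is up to you. The helper name `is_successful` is hypothetical, and reading `status` as an HTTP status code is an assumption based on the sample value.

```python
import json

# The sample output shown above, captured as a string.
raw = """
{
    "data": {
        "h1": "Level up your analytics"
    },
    "status": 200,
    "url": "https://corbettanalytics.com/",
    "errors": {}
}
"""


def is_successful(result):
    """Hypothetical check: status is 200 and the errors mapping is empty."""
    return result["status"] == 200 and not result["errors"]


result = json.loads(raw)
print(result["data"]["h1"])
print(is_successful(result))
```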

More features are available in the CLI. Type soupstars --help to see the list of commands.

Usage: soupstars [OPTIONS] COMMAND [ARGS]...

  CLI to interact with SoupStars cloud.

Options:
  --help  Show this message and exit.

Commands:
  config    Print the configuration used by the client
  create    Create a new parser from a template
  health    Print the status of the SoupStars api
  login     Log in with an existing email
  ls        Show the parsers uploaded to SoupStars cloud
  pull      Pull a parser from SoupStars cloud into a local module
  push      Push a parser to SoupStars cloud
  register  Register a new account on SoupStars cloud
  run       Run a parser on SoupStars cloud
  show      Show the contents of a parser on SoupStars cloud
  test      Test running a parser locally
  version   Print the SoupStars version in use
  whoami    Print the email address of the current user

Deploying to soupstars.cloud

You can deploy your parsers to be run on our service.

Use the CLI to create an account. You'll be prompted for a username and password.

soupstars register

Upload your parser.

soupstars push -m myparser.py

You can now run the parser from our service.

soupstars run -m myparser.py

Development

Install the package in development mode.

pip install -r requirements.txt
pip install --editable .

Run the tests.

pytest -v
