Declarative web parsers
Soup Stars
Soup Stars is a Python framework for building web parsers. It aims to make building, deploying, and scheduling web parsers easier by minimizing the setup needed to get started.
Quickstart
pip install soupstars
The client is also available as a Docker image.
docker pull soupstars/client
Building a parser
Create a new parser using the soupstars command. The create command will use a template parser.
soupstars create -m myparser.py
Parsers are simple Python modules.
cat myparser.py
Notice that the only setup required is the special parse decorator and a variable named url for the web page you want to parse.
from soupstars import parse
url = "https://corbettanalytics.com/"
@parse
def h1(soup):
    return soup.h1.text
You can test that the parser functions correctly.
soupstars run -m myparser.py
Use soupstars --help to see a full list of available commands.
More documentation is available here.
Development
Create a virtual environment with Python 3.6.
virtualenv venv --python=python3.6
Install the package in development mode.
venv/bin/pip3 install --requirement requirements.txt
venv/bin/pip3 install --editable .
Run the tests and the linter.
venv/bin/pytest -v
venv/bin/flake8 soupstars examples
Releasing
New tags that pass CI are automatically pushed to PyPI and Docker Hub.
Built Distribution
Hashes for soupstars-2.10.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | f95206fbc76eae973a560fdaf01067d1e697a4b74f714ca99366d174308b6483
MD5 | a1a68c900a3cd35309309f5849910470
BLAKE2b-256 | edf645813c12220fa9f2df4667e446676f144e86322b77439d7b93451cbc404e