ScrapydWeb: A full-featured web UI for Scrapyd cluster management, with support for Scrapy log analysis and visualization.

Recommended Reading

How to efficiently manage your distributed web scraping projects

Features

  • Scrapyd Cluster Management
    • All Scrapyd JSON API endpoints supported
    • Group, filter, and select any number of nodes
    • Execute commands on multiple nodes with just a few clicks
  • Scrapy Log Analysis
    • Stats collection
    • Progress visualization
    • Log categorization
  • Enhancements
    • Auto packaging
    • Integration with LogParser
    • Timer tasks
    • Email notice
    • Mobile UI
    • Basic auth for web UI

Getting Started

Prerequisites

Make sure that Scrapyd has been installed and started on all of your hosts.
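
One way to confirm this before starting ScrapydWeb is to query each host's daemonstatus.json endpoint, which is part of the Scrapyd JSON API. The following is a minimal sketch, not part of ScrapydWeb itself: the helper names (daemonstatus_url, is_ok, check_hosts) are made up for illustration, and it assumes Scrapyd listens on its default port 6800 without authentication.

```python
import json
from urllib.error import URLError
from urllib.request import urlopen


def daemonstatus_url(host, port=6800):
    """Build the URL of Scrapyd's daemonstatus.json endpoint for one host."""
    return f"http://{host}:{port}/daemonstatus.json"


def is_ok(payload):
    """Return True if a daemonstatus.json response body reports status 'ok'."""
    try:
        return json.loads(payload).get("status") == "ok"
    except (ValueError, TypeError, AttributeError):
        # Not JSON, or JSON that is not an object: treat as unhealthy.
        return False


def check_hosts(hosts, timeout=3):
    """Map each host to True/False depending on whether Scrapyd answered 'ok'."""
    results = {}
    for host in hosts:
        try:
            with urlopen(daemonstatus_url(host), timeout=timeout) as resp:
                results[host] = is_ok(resp.read().decode("utf-8"))
        except (URLError, OSError):
            # Connection refused, timeout, DNS failure, and similar errors.
            results[host] = False
    return results
```

Calling check_hosts(["127.0.0.1"]) returns a dict such as {"127.0.0.1": True} when the local Scrapyd is running; any host mapped to False needs attention before it is added to ScrapydWeb.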

Note that for remote access, you must manually set 'bind_address = 0.0.0.0' in Scrapyd's configuration file and restart Scrapyd so that it is reachable from other hosts.
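
As an illustration of that setting, the sketch below rewrites the text of a Scrapyd configuration file so that the [scrapyd] section contains bind_address = 0.0.0.0. It is a hedged example using Python's standard configparser module: the function name with_remote_access is hypothetical, the location of your configuration file depends on your installation, and configparser drops comments when it writes the file back, so editing the file by hand may be preferable.

```python
import configparser
import io


def with_remote_access(conf_text):
    """Return conf_text with bind_address = 0.0.0.0 set in the [scrapyd] section."""
    # interpolation=None keeps any '%' characters in existing values untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # preserve the original case of option names
    parser.read_string(conf_text)
    if not parser.has_section("scrapyd"):
        parser.add_section("scrapyd")
    parser.set("scrapyd", "bind_address", "0.0.0.0")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()
```

For example, with_remote_access("[scrapyd]\nhttp_port = 6800\n") keeps the existing http_port line and adds the bind_address line to the same section. Scrapyd still has to be restarted afterwards, and binding to 0.0.0.0 exposes the service on all network interfaces, so restrict access with a firewall where appropriate.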

Install

  • Use pip:
pip install scrapydweb
  • Use git:
git clone https://github.com/my8100/scrapydweb.git
cd scrapydweb
python setup.py install

Start

  1. Start ScrapydWeb with the command scrapydweb. (A config file for customizing settings is generated on the first startup.)
  2. Visit http://127.0.0.1:5000 (It's recommended to use Google Chrome for a better experience.)

Browser Support

The latest versions of Google Chrome, Firefox, and Safari.

Running the tests

$ git clone https://github.com/my8100/scrapydweb.git
$ cd scrapydweb

# To create isolated Python environments
$ pip install virtualenv
$ virtualenv venv/scrapydweb
# Or specify your Python interpreter: $ virtualenv -p /usr/local/bin/python3.7 venv/scrapydweb
$ source venv/scrapydweb/bin/activate

# Install dependent libraries
(scrapydweb) $ python setup.py install
(scrapydweb) $ pip install pytest
(scrapydweb) $ pip install coverage

# Make sure Scrapyd has been installed and started, then update the custom_settings item in tests/conftest.py
(scrapydweb) $ vi tests/conftest.py
(scrapydweb) $ curl http://127.0.0.1:6800

# '-x': stop on first failure
(scrapydweb) $ coverage run --source=scrapydweb -m pytest tests/test_a_factory.py -s -vv -x
(scrapydweb) $ coverage run --source=scrapydweb -m pytest tests -s -vv --disable-warnings
(scrapydweb) $ coverage report
# Create an HTML report, then open htmlcov/index.html
(scrapydweb) $ coverage html

Changelog

Detailed changes for each release are documented in HISTORY.md.

Author


my8100

Contributors


Kaisla

License

This project is licensed under the GNU General Public License v3.0 - see the LICENSE file for details.
