
No code required: built on Scrapy, dig-spider crawls websites from a simple configuration file.

==========
dig-spider
==========

Overview

Dig-spider is a BSD-licensed, fast, high-level web crawling and web scraping framework used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.

Dig-spider is a code-free crawler. It is built on Scrapy and supports the same command-line interface as Scrapy.

Requirements

  • Python 3.9+
  • Scrapy 2.11+
  • Works on Linux, Windows, macOS, BSD

Install

The quick way:

.. code:: bash

   pip install dig-spider

Usage

.. code:: bash

   dig-spider crawl website -a config=xxx.yaml

Use the ``dig-spider`` command in place of ``scrapy``. Here ``website`` is the default spider, and ``xxx.yaml`` is the configuration file that holds the page-parsing rules.
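To launch a crawl from a script instead of a shell, the command above can be assembled and run with Python's standard library. The sketch below is illustrative only: the helper names ``build_crawl_command`` and ``run_crawl`` are hypothetical and not part of dig-spider, and the optional ``-o`` flag is Scrapy's feed-export option, assumed to carry over only because dig-spider states that it supports the same command line as Scrapy.

```python
import shlex
import shutil
import subprocess


def build_crawl_command(config_path, spider="website", output=None):
    """Return the argv list for a dig-spider crawl (hypothetical helper)."""
    # Mirrors the documented form: dig-spider crawl website -a config=xxx.yaml
    cmd = ["dig-spider", "crawl", spider, "-a", f"config={config_path}"]
    if output is not None:
        # Assumption: Scrapy's -o feed-export flag is accepted by dig-spider too.
        cmd += ["-o", output]
    return cmd


def run_crawl(config_path, spider="website", output=None):
    """Run the crawl and return the process exit code (hypothetical helper)."""
    if shutil.which("dig-spider") is None:
        raise RuntimeError("dig-spider is not on PATH; try: pip install dig-spider")
    cmd = build_crawl_command(config_path, spider=spider, output=output)
    return subprocess.run(cmd, check=True).returncode


if __name__ == "__main__":
    # Print the command only; call run_crawl("xxx.yaml") to actually execute it.
    print(shlex.join(build_crawl_command("xxx.yaml")))
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when the configuration path contains spaces.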
