No code required. Based on Scrapy; crawl websites with simple configuration.

Project description

dig-spider

Overview

Dig-spider is a BSD-licensed, fast, high-level web crawling and web scraping framework used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.

Dig-spider is a code-free crawler. It is built on Scrapy and supports the same command-line interface as Scrapy.

Requirements

  • Python 3.9+
  • Scrapy 2.11+
  • Works on Linux, Windows, macOS, BSD

Install

The quick way:

pip install dig-spider

Usage

dig-spider gentemplate dst

Generates the template files in the target directory (dst). Then modify config_template.yaml and code_template.py as needed.

dig-spider crawl website -a config=dst/config_template.yaml

Use dig-spider in place of the scrapy command. Here website is the default spider, and dst/config_template.yaml contains the page-parsing rules.
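
If you prefer to drive the two steps from a script, the commands above can be wrapped with the standard library. This is a minimal sketch, not part of dig-spider itself: the helper names gentemplate_cmd, crawl_cmd, and run are hypothetical, and it assumes the dig-spider executable is on PATH after installation.

```python
import shutil
import subprocess


def gentemplate_cmd(dst: str) -> list[str]:
    """Argument list for: dig-spider gentemplate <dst>"""
    return ["dig-spider", "gentemplate", dst]


def crawl_cmd(config_path: str, spider: str = "website") -> list[str]:
    """Argument list for: dig-spider crawl <spider> -a config=<config_path>"""
    return ["dig-spider", "crawl", spider, "-a", f"config={config_path}"]


def run(cmd: list[str]) -> None:
    """Run one command, raising CalledProcessError on a non-zero exit."""
    # Assumption: the executable is installed and discoverable on PATH.
    if shutil.which(cmd[0]) is None:
        raise RuntimeError(f"{cmd[0]} not found on PATH; install it with pip first")
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Step 1: generate the templates, then edit them by hand before crawling.
    print(" ".join(gentemplate_cmd("dst")))
    # Step 2: crawl using the edited configuration.
    print(" ".join(crawl_cmd("dst/config_template.yaml")))
```

As written, the script only prints the two command lines. Call run() on each list to execute them, editing the generated templates between the two steps.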

Download files

Source Distribution

dig_spider-0.0.7.tar.gz (10.9 kB)

Uploaded Source

Built Distribution

dig_spider-0.0.7-py3-none-any.whl (13.3 kB)

Uploaded Python 3

File details

Details for the file dig_spider-0.0.7.tar.gz.

File metadata

  • Download URL: dig_spider-0.0.7.tar.gz
  • Upload date:
  • Size: 10.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.3

File hashes

Hashes for dig_spider-0.0.7.tar.gz
Algorithm Hash digest
SHA256 005384a6f47a296bdc8a64bcfb626641d8e8a9066cc01f1ebf5c8ab12c5d2070
MD5 79a2831cee2e35e4cfe783dc20cf2a2c
BLAKE2b-256 9d122ce7d413397353fdd777b05844964ebb69fb043585e44fe89e7aa5eaeda6
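
To check a downloaded archive against the SHA256 digest listed above, something like the following can be used. It is a sketch using only Python's hashlib. The helper names sha256_of and matches are illustrative, and the local file path is an assumption about where the archive was saved. The same approach applies to the wheel if you substitute its digest.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for dig_spider-0.0.7.tar.gz
EXPECTED_SHA256 = "005384a6f47a296bdc8a64bcfb626641d8e8a9066cc01f1ebf5c8ab12c5d2070"


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    # Assumed location: the archive sits in the current directory.
    archive = Path("dig_spider-0.0.7.tar.gz")
    if archive.exists():
        print("OK" if matches(str(archive)) else "MISMATCH")
    else:
        print(f"{archive} not found")
```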

File details

Details for the file dig_spider-0.0.7-py3-none-any.whl.

File metadata

  • Download URL: dig_spider-0.0.7-py3-none-any.whl
  • Upload date:
  • Size: 13.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.3

File hashes

Hashes for dig_spider-0.0.7-py3-none-any.whl
Algorithm Hash digest
SHA256 c6612c904c9897fc0a92cb037e51c2c65ce273db5201d60b62dce7c83f09129e
MD5 ee9552c90cefac02274bf52038008cba
BLAKE2b-256 31c30f7362baab64ada73deaf26aa69b236f954af42b7925954a303cc900170f
