No code required. Based on Scrapy, dig-spider crawls websites using simple configuration.

dig-spider

Overview

Dig-spider is a BSD-licensed, fast, high-level web crawling and web scraping framework used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.

Dig-spider is a code-free crawler. It is built on Scrapy and supports the same command-line interface as Scrapy.

Requirements

  • Python 3.9+
  • Scrapy 2.11+
  • Works on Linux, Windows, macOS, BSD
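
The requirements above can be checked before installing. The following is a minimal sketch, not part of dig-spider: the helper names `meets_minimum` and `check_requirements` are hypothetical, and the version comparison only looks at the leading numeric components, which is an assumption that is good enough for release versions such as 2.11.0.

```python
import re
import sys
from importlib import metadata


def meets_minimum(version: str, minimum: tuple) -> bool:
    """Compare the leading numeric components of a version string to a minimum."""
    parts = []
    for piece in version.split(".")[: len(minimum)]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    # Pad short versions such as "3" to the same length as the minimum.
    parts += [0] * (len(minimum) - len(parts))
    return tuple(parts) >= tuple(minimum)


def check_requirements() -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if sys.version_info[:2] < (3, 9):
        problems.append("Python 3.9+ is required")
    try:
        scrapy_version = metadata.version("scrapy")
    except metadata.PackageNotFoundError:
        problems.append("Scrapy is not installed")
    else:
        if not meets_minimum(scrapy_version, (2, 11)):
            problems.append(f"Scrapy 2.11+ is required, found {scrapy_version}")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(problem)
```

Running the script prints one line per unmet requirement and prints nothing when both checks pass.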

Install

The quick way:

pip install dig-spider

Usage

dig-spider gentemplate dst

Generates the template files in the target directory (dst). Then edit config_template.yaml and code_template.py as needed.

dig-spider crawl website -a config=dst/config_template.yaml

Use dig-spider in place of the scrapy command. Here, website is the default spider and dst/config_template.yaml contains the page-parsing rules.
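
The two commands above can also be assembled from a script. The following is a minimal sketch, assuming the dig-spider executable is on PATH after installation; the helper names `build_commands` and `run` are hypothetical and not part of dig-spider. It only reproduces the two command lines shown above, where `-a config=...` is the standard Scrapy syntax for passing an argument to a spider.

```python
import shlex
import subprocess
from pathlib import Path


def build_commands(dst: str, spider: str = "website") -> list:
    """Return the two dig-spider command lines from the Usage section."""
    config = Path(dst) / "config_template.yaml"
    return [
        # Step 1: generate config_template.yaml and code_template.py in dst.
        ["dig-spider", "gentemplate", dst],
        # Step 2: crawl with the default spider, passing the config as a spider argument.
        ["dig-spider", "crawl", spider, "-a", f"config={config.as_posix()}"],
    ]


def run(command: list) -> None:
    """Run one command and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, so the generated templates
    # can be edited between the two steps.
    for command in build_commands("dst"):
        print(shlex.join(command))
```

The two steps are kept separate on purpose: the generated templates are meant to be edited before the crawl step runs, so call `run` on each command individually rather than chaining them.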
