feapder is a Python crawler framework that supports distributed crawling, batch collection, task-loss protection, and a rich set of alerts.

Project description

FEAPDER

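Introduction

feapder is a Python crawler framework that is easy to pick up yet powerful.

Pronunciation: [ˈfiːpdə]

1. Strong monitoring to safeguard data quality

Monitoring dashboard: click to view details

2. Built-in, multi-dimensional alerts (DingTalk, WeCom, and email are supported)

3. Easy to use, with three built-in spider types that cover a range of scenarios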

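  • AirSpider, a lightweight spider: low learning cost, quick to get started with

  • Spider, a distributed spider: supports resumable crawling, spider alerts, automatic data storage, and more

  • BatchSpider, a batch spider: collects data periodically and automatically partitions it by the configured collection period (for example, a full refresh of product sales figures every 7 days)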
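To make the idea of partitioning data by collection period concrete, here is a minimal standard-library sketch. It is not feapder's implementation: the function name `batch_window` and its parameters are hypothetical, and it only shows how a date can be mapped to a fixed-length batch such as the 7-day example above.

```python
from datetime import date, timedelta


def batch_window(day, first_batch_start, interval_days=7):
    """Return the (start, end) dates of the batch that `day` falls into.

    `end` is exclusive. All names here are hypothetical, for illustration only.
    """
    if interval_days <= 0:
        raise ValueError("interval_days must be positive")
    # Whole number of intervals elapsed since the first batch began.
    elapsed = (day - first_batch_start).days // interval_days
    start = first_batch_start + timedelta(days=elapsed * interval_days)
    return start, start + timedelta(days=interval_days)


if __name__ == "__main__":
    start, end = batch_window(date(2021, 2, 9), date(2021, 2, 1))
    print(start, end)  # 2021-02-08 2021-02-15
```

Records collected on any day inside the same window would then belong to the same batch.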

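feapder's public interface resembles scrapy's, so scrapy projects can be migrated quickly. It supports resumable crawling, data-loss protection, monitoring and alerting, browser-rendered downloading, deduplication of massive data sets, and more.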
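As a rough illustration of what request deduplication involves, the sketch below keeps a set of URL fingerprints in memory. It is an assumption-laden toy rather than feapder's own deduplication code: `fingerprint` and `MemoryDedup` are hypothetical names, and a set like this would not scale to very large crawls.

```python
import hashlib


def fingerprint(url, method="GET"):
    """Hash the method and URL into a fixed-size key (hypothetical helper)."""
    raw = f"{method.upper()} {url}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


class MemoryDedup:
    """Keeps seen fingerprints in a set; memory grows with the number of URLs."""

    def __init__(self):
        self._seen = set()

    def add(self, url, method="GET"):
        """Return True if the request is new, False if it was seen before."""
        key = fingerprint(url, method)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


if __name__ == "__main__":
    dedup = MemoryDedup()
    print(dedup.add("https://www.baidu.com"))  # True: first time seen
    print(dedup.add("https://www.baidu.com"))  # False: duplicate
```

A crawler would call `add` before scheduling a request and skip it when the result is `False`.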

Documentation

Requirements:

  • Python 3.6.0+
  • Works on Linux, Windows, macOS
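A small pre-flight check along these lines can confirm that an environment matches the list above before installing. This is only a sketch: the function name is hypothetical, and treating any other operating system as "not listed" is my assumption, not a statement that feapder fails there.

```python
import platform
import sys

MIN_PYTHON = (3, 6, 0)  # taken from the requirement listed above
LISTED_SYSTEMS = {"Linux", "Windows", "Darwin"}  # Darwin is what macOS reports


def is_listed_environment(version=None, system=None):
    """Return True when the Python version and OS match the requirements list."""
    version = tuple(version or sys.version_info[:3])
    system = system or platform.system()
    return version >= MIN_PYTHON and system in LISTED_SYSTEMS


if __name__ == "__main__":
    print(is_listed_environment())
```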

Installation

From PyPI:

Standard version:

pip3 install feapder

Full version (the quotes keep shells such as zsh from interpreting the brackets):

pip3 install "feapder[all]"

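Differences between the standard and full versions:

  1. The full version supports memory-based deduplication

The full version may fail to install. If it does, see "Installation Issues" (安装问题) in the documentation.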

Quick start

Create a spider

feapder create -s first_spider

The generated spider code looks like this:

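import feapder


class FirstSpider(feapder.AirSpider):
    def start_requests(self):
        # Produce the initial tasks: each yielded Request is queued for download.
        yield feapder.Request("https://www.baidu.com")

    def parse(self, request, response):
        # Called with the downloaded response for each request.
        print(response)


if __name__ == "__main__":
    FirstSpider().start()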

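Run it directly and it prints the following (the log messages "parser 等待任务..." and "无任务,爬虫结束" mean "parser waiting for tasks..." and "no tasks, spider finished"):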

Thread-2|2021-02-09 14:55:11,373|request.py|get_response|line:283|DEBUG|
                -------------- FirstSpider.parse request for ----------------
                url  = https://www.baidu.com
                method = GET
                body = {'timeout': 22, 'stream': True, 'verify': False, 'headers': {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.93 Safari/537.36'}}

<Response [200]>
Thread-2|2021-02-09 14:55:11,610|parser_control.py|run|line:415|DEBUG| parser 等待任务...
FirstSpider|2021-02-09 14:55:14,620|air_spider.py|run|line:80|INFO| 无任务,爬虫结束
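The log lines above are pipe-delimited, which makes them easy to split for filtering or analysis. The sketch below parses one such line with the standard library; the field names (`name`, `file`, `function`, and so on) are labels I inferred from the sample output, not names defined by feapder.

```python
from datetime import datetime


def parse_log_line(line):
    """Split one pipe-delimited log line into named fields.

    The field names are inferred from the sample output, not from feapder.
    """
    # Limit the split so any "|" inside the message itself is preserved.
    parts = line.split("|", 6)
    if len(parts) != 7:
        raise ValueError("expected 7 pipe-separated fields")
    name, timestamp, filename, function, lineno, level, message = parts
    return {
        "name": name,
        "time": datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S,%f"),
        "file": filename,
        "function": function,
        "line": int(lineno.split(":", 1)[1]),
        "level": level,
        "message": message.strip(),
    }


if __name__ == "__main__":
    sample = "Thread-2|2021-02-09 14:55:11,373|request.py|get_response|line:283|DEBUG|"
    print(parse_log_line(sample))
```

Continuation lines, such as the indented request details in the sample, do not match this shape and raise `ValueError`, so a caller would need to skip or join them.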

The code works as follows:

  1. start_requests: produces tasks
  2. parse: parses the data
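The two steps above can be pictured with a toy model that runs without feapder or network access. Everything here is a hypothetical stand-in (`Request`, `MiniSpider`, `download`), and the real framework schedules work across threads rather than in a simple loop. Letting `parse` yield follow-up requests is also an assumption of this sketch.

```python
from collections import deque


class Request:
    """Hypothetical stand-in for a request object; only holds a URL."""

    def __init__(self, url):
        self.url = url


class MiniSpider:
    """Toy base class: drains start_requests() and feeds each result to parse()."""

    def start_requests(self):
        return iter(())

    def download(self, request):
        # Placeholder: no network access, just echo the URL back.
        return f"<fake response for {request.url}>"

    def parse(self, request, response):
        return None

    def start(self):
        queue = deque(self.start_requests())  # step 1: produce tasks
        while queue:
            request = queue.popleft()
            response = self.download(request)
            # step 2: parse; a generator parse() may yield follow-up requests
            for follow_up in self.parse(request, response) or ():
                queue.append(follow_up)


class DemoSpider(MiniSpider):
    def __init__(self):
        self.seen = []

    def start_requests(self):
        yield Request("https://www.baidu.com")

    def parse(self, request, response):
        self.seen.append((request.url, response))


if __name__ == "__main__":
    spider = DemoSpider()
    spider.start()
    print(spider.seen)
```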

Recommended crawler tools

  1. Online crawler toolbox: http://www.spidertools.cn
  2. CAPTCHA recognition library: https://github.com/sml2h3/ddddocr

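Support via WeChat

If this project has helped you, you can buy the author a coffee as encouragement 🍹

You are also welcome to add the author as a contact to get help with problems you run into while using it.

Donation QR code

Community

  • Knowledge Planet (知识星球): 17321694
  • Author's WeChat: boris_tm
  • QQ group: 750614606

When sending a friend request, mention: feapder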

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

feapder-1.7.7b3.tar.gz (180.1 kB view details)

Uploaded Source

Built Distributions

feapder-1.7.7b3-py3.9.egg (411.2 kB view details)

Uploaded Source

feapder-1.7.7b3-py3-none-any.whl (194.7 kB view details)

Uploaded Python 3

File details

Details for the file feapder-1.7.7b3.tar.gz.

File metadata

  • Download URL: feapder-1.7.7b3.tar.gz
  • Upload date:
  • Size: 180.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.7

File hashes

Hashes for feapder-1.7.7b3.tar.gz
Algorithm Hash digest
SHA256 847c1369a476ef35ea4f942e9326f0dfa4e6a401c5145a1ca57d2035d955e9df
MD5 cbb66bbc2b11cdf2d974ab720e7957ab
BLAKE2b-256 b133e8e2490428c78f5373d8bc8395b0bbe4b20cd82b12d4b18ccc8c92f23296

See more details on using hashes here.
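A downloaded file can be checked against the SHA256 digest listed above with the standard library. This is a minimal sketch: the helper names are hypothetical, and the local path to the downloaded archive is whatever you saved it as.

```python
import hashlib
import hmac

# SHA256 digest listed above for feapder-1.7.7b3.tar.gz
EXPECTED_SHA256 = "847c1369a476ef35ea4f942e9326f0dfa4e6a401c5145a1ca57d2035d955e9df"


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA256 digest equals the expected one."""
    return hmac.compare_digest(sha256_of(path), expected.lower())
```

For example, `matches("feapder-1.7.7b3.tar.gz")` should return `True` only if the local file is byte-for-byte identical to the published one.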

File details

Details for the file feapder-1.7.7b3-py3.9.egg.

File metadata

  • Download URL: feapder-1.7.7b3-py3.9.egg
  • Upload date:
  • Size: 411.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.7

File hashes

Hashes for feapder-1.7.7b3-py3.9.egg
Algorithm Hash digest
SHA256 4c1d7a0ca20da25d7e9ad35dd4d00c0f371f4d006fc638d2d7bc6c9b26a7cf3d
MD5 432dd8d8955fd517e7f469bcfe35bab7
BLAKE2b-256 1f57c46d30a7fb76ca0e88f428e4ff518c7d1a1ef33d3ee6258a7b0f43b014d4

File details

Details for the file feapder-1.7.7b3-py3-none-any.whl.

File metadata

  • Download URL: feapder-1.7.7b3-py3-none-any.whl
  • Upload date:
  • Size: 194.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.8.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.9.7

File hashes

Hashes for feapder-1.7.7b3-py3-none-any.whl
Algorithm Hash digest
SHA256 75179a841c441b7294874e1820af16e23634aa8b68970606c3d112fad81e2594
MD5 c1e90fe2ae8d21f4a113bf6f9bb89456
BLAKE2b-256 efb7486b53c9806263eac63f2f87dfeb91a16269b0b7c4753757ab288952aff4
