
feapder is a Python crawler framework that supports distributed crawling, batch collection, task-loss prevention, and rich alerting.


FEAPDER


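Introduction

  1. feapder is an easy-to-use yet powerful Python crawler framework. It ships with four built-in spiders, AirSpider, Spider, TaskSpider, and BatchSpider, to cover different scenarios.
  2. It supports resumable crawling, monitoring and alerting, browser rendering, large-scale data deduplication, and more.
  3. feaplat, a full-featured crawler management system, provides convenient deployment and scheduling for it.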
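
To make "resumable crawling" concrete, here is a minimal sketch of the general idea, assuming pending tasks are persisted to a local JSON file. It only illustrates the concept and does not show how feapder implements it; `ResumableQueue`, the `tasks.json` file name, and the example.com URLs are all hypothetical.

```python
import json
import os
import tempfile


class ResumableQueue:
    """Hypothetical file-backed task queue (illustration only, not feapder's API)."""

    def __init__(self, path):
        self.path = path
        self.pending = []
        # On startup, reload whatever tasks the previous run left unfinished.
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as fh:
                self.pending = json.load(fh)

    def _save(self):
        with open(self.path, "w", encoding="utf-8") as fh:
            json.dump(self.pending, fh)

    def put(self, task):
        """Add a task unless it is already pending."""
        if task not in self.pending:
            self.pending.append(task)
            self._save()

    def peek(self):
        """Return the next task without removing it, or None if empty."""
        return self.pending[0] if self.pending else None

    def done(self, task):
        """Drop a task only after it has been processed."""
        self.pending.remove(task)
        self._save()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "tasks.json")

    first_run = ResumableQueue(path)
    first_run.put("https://example.com/page/1")
    first_run.put("https://example.com/page/2")
    first_run.done(first_run.peek())  # page 1 finished, then the process stops

    second_run = ResumableQueue(path)  # a restart picks up the remaining work
    print(second_run.pending)
```

Because a task is removed only in `done`, a crash in the middle of processing leaves that task in the file, so the next run retries it instead of losing it.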

Pronunciation: [ˈfiːpdə]


Documentation

Requirements:

  • Python 3.6.0+
  • Works on Linux, Windows, macOS
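
Before installing, the interpreter version can be checked against the stated minimum. The snippet below is a small sketch of such a check; `meets_requirement` and `MIN_PYTHON` are hypothetical names for illustration and are not part of feapder.

```python
import sys

# Minimum interpreter version stated in the requirements above.
MIN_PYTHON = (3, 6, 0)


def meets_requirement(version=None, minimum=MIN_PYTHON):
    """Return True if `version` (major, minor, micro) is at least `minimum`.

    Hypothetical helper for illustration; not part of feapder.
    """
    if version is None:
        version = sys.version_info
    # Tuples compare element by element, so (3, 10, 0) > (3, 6, 0).
    return tuple(version[:3]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if meets_requirement():
        print("Python " + current + " satisfies the 3.6.0+ requirement")
    else:
        print("Python " + current + " is too old; 3.6.0 or newer is required")
```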

Installation

From PyPI:

Standard edition:

pip3 install feapder

Full edition:

pip3 install feapder[all]

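Differences between the standard and full editions:

  1. The full edition supports in-memory deduplication.

The full edition may fail to install. If it does, see Installation Problems.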
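
To illustrate what in-memory deduplication means in general, here is a minimal sketch that keeps hashed fingerprints in a Python set. This is an assumption-laden illustration of the concept, not feapder's own deduplication code; `MemoryDedup` and its methods are hypothetical names.

```python
import hashlib


class MemoryDedup:
    """Hypothetical in-memory deduplicator (illustration only, not feapder's class)."""

    def __init__(self):
        self._seen = set()

    @staticmethod
    def fingerprint(value):
        # Hash the value so the set stores fixed-size digests, not full strings.
        return hashlib.sha256(value.encode("utf-8")).hexdigest()

    def add(self, value):
        """Record `value`; return True if it is new, False if already seen."""
        digest = self.fingerprint(value)
        if digest in self._seen:
            return False
        self._seen.add(digest)
        return True

    def filter_new(self, values):
        """Return only the values not seen before, preserving their order."""
        return [value for value in values if self.add(value)]


if __name__ == "__main__":
    dedup = MemoryDedup()
    urls = [
        "https://www.baidu.com",
        "https://www.baidu.com",  # duplicate, dropped below
        "https://example.com",    # hypothetical second URL
    ]
    print(dedup.filter_new(urls))
```

A plain set is exact but grows with every distinct item, and its contents are lost when the process exits.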

Quick start

Create a spider

feapder create -s first_spider

The generated spider code looks like this:

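import feapder


class FirstSpider(feapder.AirSpider):
    def start_requests(self):
        # Produce the initial task: one request for the URL to crawl.
        yield feapder.Request("https://www.baidu.com")

    def parse(self, request, response):
        # Handle the downloaded response; here it is simply printed.
        print(response)


if __name__ == "__main__":
    # Run the spider.
    FirstSpider().start()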

Run it directly; it prints the following:

Thread-2|2021-02-09 14:55:11,373|request.py|get_response|line:283|DEBUG|
                -------------- FirstSpider.parse request for ----------------
                url  = https://www.baidu.com
                method = GET
                body = {'timeout': 22, 'stream': True, 'verify': False, 'headers': {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.93 Safari/537.36'}}

<Response [200]>
Thread-2|2021-02-09 14:55:11,610|parser_control.py|run|line:415|DEBUG| parser 等待任务...
FirstSpider|2021-02-09 14:55:14,620|air_spider.py|run|line:80|INFO| 无任务,爬虫结束

The code works as follows:

  1. start_requests: produces tasks
  2. parse: parses the data
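
The two-step flow above can be sketched without any framework. The toy loop below is an assumption about how a produce/parse cycle can be driven in general; it is not feapder's implementation. `MiniSpider`, its `fetch` method, and the stand-in `Request` and `Response` classes are hypothetical, and `fetch` returns canned text instead of making a network call.

```python
from collections import deque


class Request:
    """Hypothetical stand-in for a request object; it only carries a URL."""

    def __init__(self, url):
        self.url = url


class Response:
    """Hypothetical stand-in for a response with a URL and body text."""

    def __init__(self, url, text):
        self.url = url
        self.text = text


class MiniSpider:
    """Toy base class showing a produce/parse loop (not feapder's code)."""

    def start_requests(self):
        raise NotImplementedError

    def parse(self, request, response):
        raise NotImplementedError

    def fetch(self, request):
        # Assumption: canned content instead of HTTP, so the sketch runs offline.
        return Response(request.url, "<html>" + request.url + "</html>")

    def start(self):
        queue = deque(self.start_requests())  # step 1: produce tasks
        results = []
        while queue:
            request = queue.popleft()
            response = self.fetch(request)
            # step 2: parse data; `or []` tolerates a parse that returns None
            for item in self.parse(request, response) or []:
                if isinstance(item, Request):
                    queue.append(item)  # in this toy loop, new requests are queued
                else:
                    results.append(item)
        return results


class FirstSpider(MiniSpider):
    def start_requests(self):
        yield Request("https://www.baidu.com")

    def parse(self, request, response):
        yield {"url": response.url, "length": len(response.text)}


if __name__ == "__main__":
    print(FirstSpider().start())
```

Keeping task production and parsing in separate methods lets the driver loop decide when and how requests are fetched, which is the part a real framework replaces with threads, queues, and retries.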

Recommended crawler tools

  1. Online crawler toolkit: http://www.spidertools.cn
  2. Crawler management system: http://feapder.com/#/feapder_platform/feaplat
  3. CAPTCHA recognition library: https://github.com/sml2h3/ddddocr

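Contributing

Please read the Contribution Guide before contributing.

Thanks to everyone who has contributed!

WeChat tipping

If this project has helped you, you can buy the author a coffee as encouragement 🍹

You can also add the author as a friend to get help with problems you run into while using it.

Tipping QR code

Community

Zhishi Xingqiu (知识星球): 17321694, author's WeChat: boris_tm, QQ group: 485067374

When sending a friend request, include the note: feapder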


Download files


Source Distribution

feapder-1.8.6b6.tar.gz (190.7 kB)

Uploaded Source

Built Distribution

feapder-1.8.6b6-py3-none-any.whl (211.3 kB)

Uploaded Python 3

