
A web crawler framework built on asyncio and aiohttp

Project description

Install: pip install asyncpy

## Asyncpy

Asyncpy is an asynchronous request framework built on asyncio and aiohttp that makes scraping data from the web convenient and efficient.

"Following the design pattern of Scrapy, Asyncpy uses asynchronous coroutines to simplify common operations and complete web crawling tasks more easily and efficiently."

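### asyncio

asyncio is a standard library, introduced in Python 3.4, for writing concurrent code; it provides built-in support for asynchronous I/O. Its programming model is a single-threaded message loop: you obtain a reference to an EventLoop from the asyncio module and hand it the coroutines you want to run, and the result is asynchronous I/O.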
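The event loop model described above can be sketched with the standard library alone. This is a minimal illustration, not Asyncpy code: `fetch_stub` and `main` are hypothetical names, and `asyncio.sleep` stands in for a real network request.

```python
import asyncio


async def fetch_stub(name: str, delay: float) -> str:
    """Hypothetical stand-in for a network request: wait, then return a label."""
    await asyncio.sleep(delay)  # hands control back to the event loop while waiting
    return f"{name} done"


async def main() -> list:
    # gather() runs both coroutines concurrently and returns results in argument order.
    return await asyncio.gather(fetch_stub("a", 0.02), fetch_stub("b", 0.01))


# Obtain an event loop explicitly and hand it the coroutine to run.
loop = asyncio.new_event_loop()
try:
    results = loop.run_until_complete(main())
finally:
    loop.close()

print(results)  # ['a done', 'b done']
```

On Python 3.7 and later, `asyncio.run(main())` is the usual shorthand for creating the loop, running the coroutine, and closing the loop.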

### aiohttp

aiohttp is an HTTP framework built on asyncio. It emphasizes asynchronous concurrency and supports async/await, which makes concurrent I/O possible on a single thread.

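#### Why is single-threaded asyncio faster?

- Threads are managed by the operating system, while coroutines run inside a thread. A thread context switch pays the cost of trapping into kernel mode; a coroutine switch does not, so coroutines are lighter.
- Python's GIL prevents two threads in the same program from executing at the same time, so multithreaded code is sometimes no faster than single-threaded code.
- Scrapy is a single-threaded framework based on Twisted: its message queue and event loop switch back and forth within one thread, while its downloader is multithreaded.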
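As a rough illustration of the points above, the sketch below runs 100 waiting coroutines on one thread and measures the total time. The names `wait_once` and `run_many` are hypothetical, and `asyncio.sleep` only simulates I/O latency, so this shows overlapping waits rather than a real crawling benchmark.

```python
import asyncio
import threading
import time


async def wait_once(delay: float) -> int:
    """Simulate one I/O wait and report which thread executed the coroutine."""
    await asyncio.sleep(delay)
    return threading.get_ident()


async def run_many(count: int, delay: float):
    start = time.perf_counter()
    # All coroutines wait concurrently, so the total is close to one delay, not count * delay.
    ids = await asyncio.gather(*(wait_once(delay) for _ in range(count)))
    return set(ids), time.perf_counter() - start


thread_ids, elapsed = asyncio.run(run_many(100, 0.1))
print(f"threads used: {len(thread_ids)}, elapsed: {elapsed:.2f}s")
```

Run sequentially, the same 100 waits would take about 10 seconds; here they overlap on a single thread with no thread context switches involved.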

#### For detailed documentation, see the blog or GitHub:

https://blog.csdn.net/weixin_43582101/article/details/106258518

https://github.com/lixi5338619/asyncpy

Thanks: Scrapy, Ruia, Looter, asyncio, aiohttp

Download files


Files for asyncpy, version 1.1.5

| Filename | Size | File type | Python version |
| --- | --- | --- | --- |
| asyncpy-1.1.5-py3-none-any.whl | 17.2 kB | Wheel | py3 |
| asyncpy-1.1.5.tar.gz | 13.2 kB | Source | None |
