A web crawler framework built on asyncio and aiohttp
Project description
Install: `pip install asyncpy`
## Asyncpy
Asyncpy is an asynchronous request framework based on asyncio and aiohttp, designed for scraping data from the web conveniently and efficiently.
Modeled on the design of Scrapy, it uses asynchronous coroutines to simplify some operations and make web crawling tasks easier and more efficient.
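### asyncio
asyncio is a standard library, introduced in Python 3.4, for writing concurrent code, with built-in support for asynchronous I/O. Its programming model is a single-threaded event loop: you obtain a reference to an EventLoop from the asyncio module and hand it the coroutines you want to run, and the loop executes them with asynchronous I/O.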
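The event-loop model described above can be sketched with the standard library alone. This is a minimal illustration, not code from Asyncpy: `fetch_page`, `main`, and `run_on_explicit_loop` are hypothetical names, and `asyncio.sleep` stands in for real network I/O.

```python
import asyncio
from typing import List


async def fetch_page(name: str, delay: float) -> str:
    """Hypothetical stand-in for an I/O-bound task."""
    # asyncio.sleep simulates network latency and yields control to the loop.
    await asyncio.sleep(delay)
    return f"{name} done"


async def main() -> List[str]:
    # Run three coroutines concurrently on one loop.
    # gather returns results in argument order, not completion order.
    return await asyncio.gather(
        fetch_page("a", 0.03),
        fetch_page("b", 0.01),
        fetch_page("c", 0.02),
    )


def run_on_explicit_loop() -> List[str]:
    # Obtain an EventLoop reference and hand the coroutine to it.
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(main())
    finally:
        loop.close()


if __name__ == "__main__":
    print(run_on_explicit_loop())
```

On Python 3.7 and later, `asyncio.run(main())` creates and closes the loop for you and is usually the shorter choice.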
### aiohttp
aiohttp is an HTTP framework built on asyncio. It emphasizes asynchronous concurrency and supports async/await, which makes concurrent I/O possible in a single thread.
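As a rough sketch of what single-threaded concurrent I/O with aiohttp can look like, the example below fetches several pages through one shared client session. It shows plain aiohttp usage rather than Asyncpy's own API; `fetch` and `fetch_all` are hypothetical helper names, and the URLs are placeholders chosen only for illustration.

```python
import asyncio
from typing import List

import aiohttp  # third-party: pip install aiohttp


async def fetch(session: aiohttp.ClientSession, url: str) -> str:
    # session.get() is an async context manager; resp.text() reads the body.
    async with session.get(url) as resp:
        resp.raise_for_status()  # raise on 4xx/5xx responses
        return await resp.text()


async def fetch_all(urls: List[str]) -> List[str]:
    # One session is shared by all requests so connections can be pooled.
    async with aiohttp.ClientSession() as session:
        return await asyncio.gather(*(fetch(session, u) for u in urls))


if __name__ == "__main__":
    # Placeholder URLs, not taken from the project documentation.
    pages = asyncio.run(fetch_all(["https://example.com", "https://example.org"]))
    print([len(p) for p in pages])
```

All requests are in flight at once, so total time is close to that of the slowest response rather than the sum of all of them.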
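#### Why is single-threaded asyncio faster?
- Threads are scheduled by the operating system, while coroutines are scheduled inside a thread. A thread context switch pays the cost of trapping into kernel mode; a coroutine switch does not, so coroutines are lighter.
- Python's GIL prevents two threads from executing Python bytecode at the same time within one process, so multithreaded code is sometimes no faster than single-threaded code.
- Scrapy, by comparison, is a single-threaded framework based on Twisted: its message queue and event loop switch back and forth within one thread, while its downloader is multithreaded.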
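The points above can be checked with a small experiment: many simulated I/O waits run on one thread and finish in roughly the time of a single wait. This is an illustrative sketch using only the standard library; `wait_io` and `run_many` are hypothetical names, and `asyncio.sleep` stands in for network latency.

```python
import asyncio
import threading
import time


async def wait_io(delay: float) -> int:
    # Simulated I/O wait; the thread is not blocked while this coroutine sleeps.
    await asyncio.sleep(delay)
    # Report which OS thread executed this coroutine.
    return threading.get_ident()


async def run_many(n: int, delay: float):
    start = time.perf_counter()
    thread_ids = await asyncio.gather(*(wait_io(delay) for _ in range(n)))
    elapsed = time.perf_counter() - start
    # Return the distinct thread ids seen and the total wall-clock time.
    return set(thread_ids), elapsed


if __name__ == "__main__":
    ids, elapsed = asyncio.run(run_many(100, 0.1))
    print(f"threads used: {len(ids)}, elapsed: {elapsed:.2f}s")
```

Run sequentially, 100 waits of 0.1 s would take about 10 s. Here they overlap on a single thread, with no thread context switches involved.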
#### For detailed documentation, see the blog post or the GitHub repository:
https://blog.csdn.net/weixin_43582101/article/details/106258518
https://github.com/lixi5338619/asyncpy
Thanks: Scrapy, Ruia, Looter, asyncio, aiohttp
WeChat official account: Pythonlx
Download files
Source Distribution
Built Distribution
File details
Details for the file asyncpy-1.2.0.tar.gz.
File metadata
- Download URL: asyncpy-1.2.0.tar.gz
- Upload date:
- Size: 13.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.0 CPython/3.7.8
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 761b891750243db84afc0f9c0de89d4ffb01f5786f2bde65d202522520c7933d |
MD5 | d31d5b529db9251378c0b5546c4262e3 |
BLAKE2b-256 | 74aebc3dd080b5ced473e787674974a74b3a2be9d4b9be582c991032fb2e7d11 |
File details
Details for the file asyncpy-1.2.0-py3-none-any.whl.
File metadata
- Download URL: asyncpy-1.2.0-py3-none-any.whl
- Upload date:
- Size: 17.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.0 CPython/3.7.8
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | ad1cb2dc04abecb7591ee39092bbf8850b710cd2766b05ec9b714a6d28cb4484 |
MD5 | 9c8b5886581ae17aaaadaddb5ec481da |
BLAKE2b-256 | fa4242ac2211df840c2b405dc655739d1d4e46e38714622f5908a4966c10cb7d |