An out-of-the-box lightweight asynchronous crawler framework
traspider
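Introduction
traspider is a lightweight crawler framework that works out of the box.
If you need to write a small crawler, traspider lets you get it done with far less effort.
GitHub: https://github.com/Ntrashh/traspider
Documentation: https://ntrashh.github.io/traspider/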
Requirements
- Python 3.7.0+
- Works on Linux, Windows, macOS
Installation
pip3 install traspider
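To confirm that the requirements and installation above are satisfied, a quick check along these lines may help. This is only a sketch using the Python standard library; the helper name `check_environment` is hypothetical and is not part of traspider.

```python
import importlib.util
import sys


def check_environment(package="traspider", min_version=(3, 7)):
    """Return (python_ok, package_installed) for the current interpreter."""
    # The README asks for Python 3.7.0 or newer.
    python_ok = sys.version_info >= min_version
    # find_spec returns None when a top-level package cannot be found.
    installed = importlib.util.find_spec(package) is not None
    return python_ok, installed


if __name__ == "__main__":
    python_ok, installed = check_environment()
    print("Python version OK:", python_ok)
    print("traspider installed:", installed)
```

If the second value is False, running `pip3 install traspider` with the same interpreter is the likely fix.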
Usage
Create a spider
traspider create -s demo_spider
Generated code
Add the URL you want to crawl, for example http://httpbin.org/
from loguru import logger
from traspider import Spider


class DemoSpider(Spider):
    def __init__(self):
        self.urls = ["http://httpbin.org/"]

    def parser(self, response, request):
        logger.info(response)

    async def download_middleware(self, request):
        request.headers = {
            "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36"
        }
        return request


if __name__ == "__main__":
    demo_spider = DemoSpider()
    demo_spider.start()
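For readers curious about the general pattern behind a spider like DemoSpider, the following is a minimal standard-library sketch of an asynchronous crawl loop: a middleware adjusts each request, a download step produces a response, and a parser handles it. It is not traspider's implementation. The `Request` class, `fake_download`, and `crawl` are hypothetical names, and the download is simulated so the sketch runs without network access.

```python
import asyncio

DEFAULT_HEADERS = {"user-agent": "Mozilla/5.0"}


class Request:
    """Hypothetical request object: a URL plus headers."""

    def __init__(self, url):
        self.url = url
        self.headers = {}


async def download_middleware(request):
    # Plays the same role as DemoSpider.download_middleware:
    # adjust the request before it is sent.
    request.headers = dict(DEFAULT_HEADERS)
    return request


async def fake_download(request):
    # Stand-in for a real HTTP call so the sketch runs offline.
    await asyncio.sleep(0)
    return {"url": request.url, "status": 200, "headers": request.headers}


def parser(response, request):
    # Plays the same role as DemoSpider.parser: handle one response.
    return (request.url, response["status"])


async def crawl(urls, parse):
    async def handle(url):
        request = await download_middleware(Request(url))
        response = await fake_download(request)
        return parse(response, request)

    # Run all requests concurrently; results keep the order of `urls`.
    return await asyncio.gather(*(handle(url) for url in urls))


results = asyncio.run(crawl(["http://httpbin.org/"], parser))
print(results)  # [('http://httpbin.org/', 200)]
```

Because every URL is handled by its own coroutine, slow responses do not block the others, which is the main benefit of an asynchronous crawler for small jobs.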
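traspider was created to make simple crawler projects lighter and faster to develop, so its support for large projects is still limited. If you are building a large crawler project, feapder or scrapy is recommended instead.
If you have any questions or suggestions about traspider, feel free to contact me.
WeChat:
Acknowledgements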