A Python 3 library for downloading movie comments from Maoyan
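# Maoyan Movie Data Crawler
### __Scrape the reviews you want in two lines of code__
* Environment setup
1. Docker (recommended)
Build from the packaged Dockerfile (not yet provided)
2. Traditional deployment (assumes Python is already installed)
***
Steps:
1. `pip install -r requirement.txt` installs the modules the project needs
2. `sudo apt install mongodb` installs MongoDB (skip this step if you save data to txt)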
---
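If you plan to save to MongoDB, a quick check that a server is listening can save a failed run. The sketch below uses only the standard library. It assumes the crawler connects to `localhost` on MongoDB's default port 27017, which this README does not state, and `port_is_open` is a hypothetical helper, not part of this package.

```python
import socket


def port_is_open(host="localhost", port=27017, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper; host and port are assumptions (MongoDB defaults).
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures
        return False


if port_is_open():
    print("Port 27017 is reachable: saving to MongoDB should be able to connect")
else:
    print("Nothing is listening on port 27017: start MongoDB or save to txt instead")
```

This only confirms that something accepts TCP connections on that port; it does not verify that the service is MongoDB or that authentication will succeed.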
* Usage
```python
# Import the Maoyan class
from crawel_utils.download import Maoyan

if __name__ == '__main__':
    # movie_id is the film's Maoyan ID; page_size is the number of comment pages to download;
    # thread_max is the thread count and only applies to multi-threaded downloads
    maoyan = Maoyan(movie_id=1175253, page_size=40, thread_max=20)
    # Save to MongoDB
    maoyan.multi_thread_download(func=maoyan.save_to_mongo)
    # Save to a txt file
    maoyan.multi_thread_download(func=maoyan.save_to_txt)
```
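To illustrate how `page_size`, `thread_max`, and the `func` callback could fit together, here is a minimal standalone sketch of the same calling pattern. It is not the library's implementation: `fetch_page` is a hypothetical stub that returns fake comments instead of making a request, and the use of `ThreadPoolExecutor` is an assumption about one reasonable way to structure such a download.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_page(movie_id, page):
    """Hypothetical stand-in for one page request; returns two fake comments."""
    return [f"movie {movie_id} page {page} comment {i}" for i in range(2)]


def multi_thread_download(movie_id, page_size, thread_max, func):
    """Fetch pages 1..page_size on up to thread_max threads, passing each page to func."""
    with ThreadPoolExecutor(max_workers=thread_max) as pool:
        # map() yields results in page order even though fetches run concurrently
        pages = pool.map(lambda p: fetch_page(movie_id, p), range(1, page_size + 1))
        for comments in pages:
            # The callback runs in the calling thread, so it needs no locking
            func(comments)


collected = []
multi_thread_download(movie_id=1175253, page_size=3, thread_max=2, func=collected.extend)
print(len(collected))  # 3 pages x 2 fake comments each → 6
```

Passing the save routine as a callable keeps the download loop independent of the storage backend, which is why the same call can take either a MongoDB or a txt saver.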
Source Distribution
SimpleCat-1.1.2.tar.gz (5.5 kB)
File details
Details for the file SimpleCat-1.1.2.tar.gz.
File metadata
- Download URL: SimpleCat-1.1.2.tar.gz
- Upload date:
- Size: 5.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.11.0 pkginfo/1.4.2 requests/2.14.2 setuptools/36.4.0 requests-toolbelt/0.8.0 tqdm/4.15.0 CPython/3.6.2
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | cb627bc5d8425a321bd67dda7a7ead90032982166e387c7dd4c85d41c89ae29d |
| MD5 | 40cf661f38dd23c8444558f9dbffa273 |
| BLAKE2b-256 | b000cb112a99d09ca89a341fcb2f4b29a113caae663b3b0db03623b534b31af0 |