A Crawlab SDK implementation with a file uploader.
Project description
Crawlab Python SDK
Basic usage
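import hashlib
import requests
from crawlabpy.utils import notify_target, save_file, PART_CONTENT_TYPE
# img_url, body, news_id, img_name_list and the other bare names come from the surrounding spider code.
# Download the image and name it after the MD5 digest of its bytes.
img = requests.get(url=img_url).content
img_name = hashlib.md5(img).hexdigest() + '.jpg'
save_file(img_name, img)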
result = {
'body': body,
'body_cn': '',
'body_en': body,
'covering': img_name_list,
'downdate': down_date, # crawl time
'id': news_id,
'languageid': '10001',
'languagename': '英语', # "English"
'pagedate': down_date,
'partid': '3', # section number
'partname': '对外关系,国际关系', # section name: "foreign relations, international relations"
'resource': img_name_list,
'sectionid': '',
'sectionname': '缅甸', # "Myanmar"
'siteid': '153',
'sitename': '缅甸新闻网', # "Myanmar News Network"
'title': news_title[0],
'title_cn': '',
'title_en': news_title[0],
'url': news_url,
'viewcount': '',
'writer': ''
}
# Note: save_item is used here, but the snippet does not show where it is imported from.
save_item(result)
notify_target(PART_CONTENT_TYPE, result, img_name_list)
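The snippet above names each image after the MD5 digest of its bytes, so identical images always get the same file name. A minimal sketch of that naming step in isolation, using only the standard library; the helper name `content_addressed_name` and its `extension` parameter are hypothetical and not part of crawlabpy:

```python
import hashlib


def content_addressed_name(data: bytes, extension: str = ".jpg") -> str:
    """Build a file name from the MD5 hex digest of the data (hypothetical helper)."""
    return hashlib.md5(data).hexdigest() + extension


# Identical bytes map to the same name; different bytes map to different names.
first = content_addressed_name(b"example image bytes")
second = content_addressed_name(b"example image bytes")
other = content_addressed_name(b"different image bytes")
```

Because the name depends only on the content, fetching the same image from two URLs yields a single name. The fixed `.jpg` suffix mirrors the original snippet and does not check the actual image format.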
Getting started with development
Set up the development environment
git clone https://github.com/gditsec/crawlabpy.git
cd crawlabpy
make dev
source venv/bin/activate
python -m pip install -r requirements.txt
Build the code
make build
Release the code
make release
Download files
Source Distribution
crawlabpy-0.0.8.tar.gz (3.6 kB)
Built Distribution
Hashes for crawlabpy-0.0.8-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 613131549152c6e81a4d2d722ad8a419b8d136999cf8f517b36ec1ba7318c218
MD5 | 66c57b6187cb0608782d3f8c6ddf5575
BLAKE2b-256 | c1db6a204ea35c4a96044fbeed2684525b1c31fbc92e214d21cf883caf968a1c
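A downloaded wheel can be checked against the SHA256 digest listed above. The following is a sketch using only the standard library; the function names `sha256_of_file` and `matches_expected` are assumptions for illustration, and the path you pass in is wherever you saved the file:

```python
import hashlib
from pathlib import Path

# SHA256 digest listed for crawlabpy-0.0.8-py3-none-any.whl.
EXPECTED_SHA256 = "613131549152c6e81a4d2d722ad8a419b8d136999cf8f517b36ec1ba7318c218"


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so it does not have to fit in memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: Path, expected: str = EXPECTED_SHA256) -> bool:
    """Compare the file's digest with the expected one, ignoring letter case."""
    return sha256_of_file(path) == expected.lower()
```

For example, `matches_expected(Path("crawlabpy-0.0.8-py3-none-any.whl"))` should return `True` only if the local file is byte-for-byte the published wheel.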