tkitScrapyMongoPipeline

MongoPipeline

https://docs.scrapy.org/en/latest/topics/item-pipeline.html

Stores scraped data in MongoDB.

# settings
import tkitScrapyMongoPipeline
# 1. Set the MongoDB connection URI and database name
MONGO_URI = "mongodb://192.168.123.117:27017/"
MONGO_DATABASE = "test"
# 2. Enable the MongoPipeline item pipeline
ITEM_PIPELINES = {
   # 'base.pipelines.DuplicatesPipeline': 100,
   'base.pipelines.MongoPipeline': 100,
}
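To show how the two settings above typically reach a pipeline, here is a minimal sketch modeled on the MongoDB example in the Scrapy item pipeline documentation linked earlier. It is not the code shipped in tkitScrapyMongoPipeline: the class name `MongoPipelineSketch` and the fallback collection name `scrapy_items` are assumptions made for illustration, and `pymongo` is assumed to be installed when the spider actually runs.

```python
class MongoPipelineSketch:
    """Hypothetical pipeline; not the implementation in tkitScrapyMongoPipeline."""

    default_collection = "scrapy_items"  # assumed fallback collection name

    def __init__(self, mongo_uri, mongo_db):
        self.mongo_uri = mongo_uri
        self.mongo_db = mongo_db
        self.client = None
        self.db = None

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy calls this hook; crawler.settings exposes values from settings.py.
        return cls(
            mongo_uri=crawler.settings.get("MONGO_URI"),
            mongo_db=crawler.settings.get("MONGO_DATABASE"),
        )

    def open_spider(self, spider):
        # Imported lazily so the class can be loaded without the driver installed.
        import pymongo

        self.client = pymongo.MongoClient(self.mongo_uri)
        self.db = self.client[self.mongo_db]

    def close_spider(self, spider):
        # Close the connection only if open_spider created one.
        if self.client is not None:
            self.client.close()

    def process_item(self, item, spider):
        # Store the item as a plain document and pass it on to later pipelines.
        self.db[self.default_collection].insert_one(dict(item))
        return item
```

The connection is opened once per spider in `open_spider` rather than once per item, which keeps `process_item` cheap.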

# Example item fields
item = {
    # Field used to enforce deduplication
    "unique_id": url,
    "title": title,
    "url": url,
    "content": text,
    "site": "playbarkrun.com",
    "content_type": "content",
    # Name of the target collection
    "collection_name": "test11",
}
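One way a pipeline could act on the `unique_id` and `collection_name` fields is an upsert keyed on `unique_id`. The sketch below is an assumption about how such routing might work, not the package's documented behavior: the helper `build_upsert`, the fallback collection name, and the sample URL and text are all hypothetical.

```python
def build_upsert(item, default_collection="scrapy_items"):
    """Split an item into (collection name, filter, update) for an upsert.

    Hypothetical helper: the real package may handle these fields differently.
    """
    doc = dict(item)  # copy, so the caller's item is left untouched
    # Assumed: collection_name only routes the write and is not stored.
    collection = doc.pop("collection_name", default_collection)
    unique_id = doc.get("unique_id")
    if unique_id is None:
        raise ValueError("item has no unique_id, cannot deduplicate")
    return collection, {"unique_id": unique_id}, {"$set": doc}


# Sample values below are made up for illustration.
example = {
    "unique_id": "https://playbarkrun.com/post",
    "title": "A post",
    "url": "https://playbarkrun.com/post",
    "content": "Body text",
    "site": "playbarkrun.com",
    "content_type": "content",
    "collection_name": "test11",
}
collection, query, update = build_upsert(example)

# With a pymongo database handle `db`, the write would then be:
# db[collection].update_one(query, update, upsert=True)
```

Because the filter matches on `unique_id`, writing the same item twice updates the existing document instead of inserting a duplicate.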

For details, see dev.md.

