ruia_motor - a Ruia plugin that uses Motor to store data
Note: requires ruia >= 0.8.0
Installation
pip install -U ruia-motor
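Since the plugin expects ruia >= 0.8.0, it can be handy to confirm the installed version after running pip. The snippet below is a minimal sketch, not part of ruia-motor: the helper names `parse_version`, `meets_minimum`, and `installed_ruia_version` are hypothetical, and the comparison only looks at the leading numeric part of each version component.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn '0.8.4' into (0, 8, 4); suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    # Pad short versions such as '0.8' so they compare as (0, 8, 0).
    return tuple(parts + [0] * max(0, 3 - len(parts)))


def meets_minimum(installed, minimum="0.8.0"):
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def installed_ruia_version():
    """Return the installed ruia version string, or None if ruia is missing."""
    try:
        return version("ruia")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_ruia_version()
    if found is None:
        print("ruia is not installed")
    else:
        print(f"ruia {found} ok: {meets_minimum(found)}")
```

For anything beyond a quick check, a full version parser (for example the one in the `packaging` library) handles pre-releases and other edge cases that this sketch ignores.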
Usage
ruia-motor automatically stores the data you yield in MongoDB:
from ruia import AttrField, Item, Response, Spider, TextField
from ruia_motor import RuiaMotorInsert, RuiaMotorUpdate, init_spider


class HackerNewsItem(Item):
    target_item = TextField(css_select="tr.athing")
    title = TextField(css_select="a.storylink")
    url = AttrField(css_select="a.storylink", attr="href")

    async def clean_title(self, value):
        return value.strip()


class HackerNewsSpider(Spider):
    start_urls = ["https://news.ycombinator.com/news?p=1"]
    aiohttp_kwargs = {"proxy": "http://0.0.0.0:1087"}

    async def parse(self, response: Response):
        async for item in HackerNewsItem.get_items(html=await response.text()):
            # Update data
            # https://motor.readthedocs.io/en/stable/api-asyncio/asyncio_motor_collection.html#motor.motor_asyncio.AsyncIOMotorCollection.update_one
            yield RuiaMotorUpdate(
                collection="hn_demo",
                filter={"title": item.title},
                update={"$set": item.results},
                upsert=True,
            )
            # Insert data
            # https://motor.readthedocs.io/en/stable/api-asyncio/asyncio_motor_collection.html#motor.motor_asyncio.AsyncIOMotorCollection.insert_one
            # yield RuiaMotorInsert(collection="hn_demo", data=item.results)


async def init_plugins_after_start(spider_ins):
    spider_ins.mongodb_config = {"host": "127.0.0.1", "port": 27017, "db": "ruia_motor"}
    init_spider(spider_ins=spider_ins)


if __name__ == "__main__":
    HackerNewsSpider.start(after_start=init_plugins_after_start)
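The example yields RuiaMotorUpdate with upsert=True rather than RuiaMotorInsert, which matters when the spider runs more than once. The sketch below is a simplified in-memory model of that behaviour, not Motor's or ruia-motor's implementation: the function `update_one_model` is hypothetical, it supports only equality filters and the `$set` operator, and a plain list stands in for the `hn_demo` collection.

```python
def update_one_model(collection, query, update, upsert=False):
    """Simplified model of update_one: equality filter plus $set only."""
    for doc in collection:
        if all(doc.get(key) == value for key, value in query.items()):
            # A matching document exists: apply the $set fields in place.
            doc.update(update["$set"])
            return "updated"
    if upsert:
        # No match: create a document from the filter plus the $set fields.
        new_doc = dict(query)
        new_doc.update(update["$set"])
        collection.append(new_doc)
        return "inserted"
    return "no-op"


# A list stands in for the "hn_demo" collection (assumption for illustration).
hn_demo = []
item = {"title": "Show HN: Example", "url": "https://example.com"}

# Two crawls of the same story: the second call updates instead of duplicating.
first = update_one_model(hn_demo, {"title": item["title"]}, {"$set": item}, upsert=True)
second = update_one_model(hn_demo, {"title": item["title"]}, {"$set": item}, upsert=True)

# Plain inserts, by contrast, store the same story twice.
inserted_only = []
inserted_only.append(dict(item))
inserted_only.append(dict(item))
```

In this model the upsert path leaves one document in `hn_demo` after two runs, while the insert-only list holds two copies. That is the usual reason to filter on a field such as `title` when re-crawling the same pages.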
Enjoy it :)
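The mongodb_config dictionary in the usage example carries a host, a port, and a database name. As a rough illustration of how such a dictionary maps onto a standard MongoDB connection string, here is a minimal sketch. The helper `build_mongo_uri` is hypothetical, the `username` and `password` keys are assumptions that do not appear in the example above, and ruia-motor may build its connection differently.

```python
from urllib.parse import quote_plus


def build_mongo_uri(config):
    """Build a mongodb:// URI from a config dict (hypothetical helper)."""
    host = config.get("host", "127.0.0.1")
    port = config.get("port", 27017)
    db = config.get("db", "")

    # "username" and "password" are assumed keys, used here for illustration.
    credentials = ""
    if config.get("username") and config.get("password"):
        user = quote_plus(config["username"])
        password = quote_plus(config["password"])
        credentials = f"{user}:{password}@"

    return f"mongodb://{credentials}{host}:{port}/{db}"


uri = build_mongo_uri({"host": "127.0.0.1", "port": 27017, "db": "ruia_motor"})
```

Credentials are percent-encoded with `quote_plus` because characters such as `@` or `:` in a password would otherwise be read as URI delimiters.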
Download files
Source distribution: ruia_motor-0.0.4.tar.gz (3.9 kB)
Built distribution: ruia_motor-0.0.4-py3-none-any.whl
Hashes for ruia_motor-0.0.4-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 60c985415235402c6b45042af232005a654445e465f7b1a9ccef25ae76bceb45
MD5 | 2b073c1ae4645d25e81af84c0ab1de06
BLAKE2b-256 | b2f355cb545608ccf2b87aa17ed5b2ead81b86d54807a6a3b2569b329039ccfa