ruia_motor - a Ruia plugin that uses Motor to store data in MongoDB
Notice: Works on ruia >= 0.8.0
Installation
pip install -U ruia-motor
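After installing, it can be useful to confirm that the ruia version in the environment satisfies the `>= 0.8.0` requirement noted above. The following is a minimal sketch using only the standard library; `parse_release`, `meets_minimum`, and `installed_ruia_version` are hypothetical helper names introduced here for illustration, and the comparison deliberately ignores pre-release suffixes such as `rc1`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple


def parse_release(text: str) -> Tuple[int, ...]:
    """Turn "0.8.0" into (0, 8, 0), padding short versions with zeros.

    Parsing stops at the first segment that is not purely numeric,
    so a suffix such as "1rc1" is ignored (an assumption of this sketch).
    """
    parts = []
    for segment in text.split("."):
        if not segment.isdigit():
            break
        parts.append(int(segment))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = "0.8.0") -> bool:
    """Return True when the installed version is at least the minimum."""
    return parse_release(installed) >= parse_release(minimum)


def installed_ruia_version() -> Optional[str]:
    """Return the installed ruia version, or None if ruia is not installed."""
    try:
        return version("ruia")
    except PackageNotFoundError:
        return None
```

Calling `installed_ruia_version()` and passing a non-`None` result to `meets_minimum` gives a quick yes/no answer before running a spider.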
Usage
ruia-motor automatically stores the data your spider yields in MongoDB:
from ruia import AttrField, Item, Response, Spider, TextField
from ruia_motor import RuiaMotorInsert, RuiaMotorUpdate, init_spider


class HackerNewsItem(Item):
    target_item = TextField(css_select="tr.athing")
    title = TextField(css_select="a.storylink")
    url = AttrField(css_select="a.storylink", attr="href")

    async def clean_title(self, value):
        return value.strip()


class HackerNewsSpider(Spider):
    start_urls = ["https://news.ycombinator.com/news?p=1"]
    # Local proxy used in this example; remove this line if you do not use one
    aiohttp_kwargs = {"proxy": "http://0.0.0.0:1087"}

    async def parse(self, response: Response):
        async for item in HackerNewsItem.get_items(html=await response.text()):
            # Update data
            # https://motor.readthedocs.io/en/stable/api-asyncio/asyncio_motor_collection.html#motor.motor_asyncio.AsyncIOMotorCollection.update_one
            yield RuiaMotorUpdate(
                collection="hn_demo",
                filter={"title": item.title},
                update={"$set": item.results},
                upsert=True,
            )
            # Insert data
            # https://motor.readthedocs.io/en/stable/api-asyncio/asyncio_motor_collection.html#motor.motor_asyncio.AsyncIOMotorCollection.insert_one
            # yield RuiaMotorInsert(collection="hn_demo", data=item.results)


async def init_plugins_after_start(spider_ins):
    # Attach the MongoDB connection settings, then register the plugin on the spider
    spider_ins.mongodb_config = {"host": "127.0.0.1", "port": 27017, "db": "ruia_motor"}
    init_spider(spider_ins=spider_ins)


if __name__ == "__main__":
    HackerNewsSpider.start(after_start=init_plugins_after_start)
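
To make the update semantics in the example concrete, here is a minimal in-memory sketch of what an `update_one` call with a filter, a `$set` update, and `upsert=True` does. It models the collection as a plain list of dicts; `upsert_one` is a hypothetical helper written for illustration, not part of ruia-motor or Motor, and it supports only equality filters and the `$set` operator.

```python
from typing import Any, Dict, List


def upsert_one(
    collection: List[Dict[str, Any]],
    query: Dict[str, Any],
    update: Dict[str, Any],
) -> str:
    """Hypothetical helper: mimic update_one(query, update, upsert=True)
    on a list of dicts. Supports equality filters and "$set" only."""
    fields = update["$set"]
    for doc in collection:
        # A document matches when every query key equals the given value.
        if all(doc.get(key) == value for key, value in query.items()):
            doc.update(fields)
            return "updated"
    # No match: insert a new document built from the query plus the $set fields.
    collection.append({**query, **fields})
    return "inserted"


hn_demo: List[Dict[str, Any]] = []
first = upsert_one(
    hn_demo,
    {"title": "Example"},
    {"$set": {"title": "Example", "url": "https://example.com/a"}},
)
second = upsert_one(
    hn_demo,
    {"title": "Example"},
    {"$set": {"title": "Example", "url": "https://example.com/b"}},
)
```

Because the filter in the spider is the item title, re-running it refreshes the matching document instead of adding a duplicate, whereas the commented-out `RuiaMotorInsert` call would insert a new document on every run.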
Enjoy it :)
Download files
Source Distribution
ruia_motor-0.0.5.tar.gz (4.7 kB)
Built Distribution
ruia_motor-0.0.5-py3-none-any.whl (6.1 kB)
File details
Details for the file ruia_motor-0.0.5.tar.gz.
File metadata
- Download URL: ruia_motor-0.0.5.tar.gz
- Upload date:
- Size: 4.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.9.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | 969b1dbc29d84bf83f966d779fcb879cad847c07c2ae8e2b0599f56fbd38b0ea
MD5 | 9faae5d47a6049fa4820368cff5747ed
BLAKE2b-256 | 11a5dd28699aa8eb983e4d251fad45c086fef216c73973f82cd20d920dc751f5
File details
Details for the file ruia_motor-0.0.5-py3-none-any.whl.
File metadata
- Download URL: ruia_motor-0.0.5-py3-none-any.whl
- Upload date:
- Size: 6.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.1 CPython/3.9.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | 73fa5c26c88b33d5e2dc5e72cfe1bf78399ff8da532e51fb5cc65e9b0433005a
MD5 | 0d5a3c2403a3a242ba259629646038b7
BLAKE2b-256 | f87997ed1ef2df8af14ff29587e58a75d98f0683b07600007f465458c141eacb
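
The SHA256 digests listed above can be used to check a downloaded file before installing it. The following is a minimal sketch using Python's standard `hashlib`; the helper names `sha256_of` and `matches_published` are hypothetical and introduced only for illustration, and the file is assumed to be in the current directory.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: Path, expected_hex: str) -> bool:
    """Compare a file's digest with a published one, ignoring case and whitespace."""
    return sha256_of(path) == expected_hex.strip().lower()


# Example call (assumes the sdist was downloaded to the working directory):
# matches_published(
#     Path("ruia_motor-0.0.5.tar.gz"),
#     "969b1dbc29d84bf83f966d779fcb879cad847c07c2ae8e2b0599f56fbd38b0ea",
# )
```

A `False` result means the file differs from the published one and should not be installed.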