ruia_motor - a Ruia plugin that uses Motor to store data
Notice: Works on ruia >= 0.8.0
Installation
pip install -U ruia-motor
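After installing, it can help to confirm that the ruia version in the environment meets the minimum noted above. The following is a minimal sketch using only the standard library; `parse_version` and `ruia_is_compatible` are hypothetical helper names, not part of ruia or ruia-motor, and the version parsing is deliberately simple (it only reads the leading digits of the first three dot-separated parts).

```python
from importlib.metadata import PackageNotFoundError, version
from itertools import takewhile

# Minimum ruia version stated in the notice above.
MIN_RUIA = (0, 8, 0)


def parse_version(text):
    """Turn a string such as '0.8.1' into a 3-tuple of ints, e.g. (0, 8, 1)."""
    numbers = []
    for piece in text.split(".")[:3]:
        # Keep only the leading digits, so '0rc1' is read as 0.
        digits = "".join(takewhile(str.isdigit, piece))
        numbers.append(int(digits) if digits else 0)
    # Pad short versions such as '1.0' to three components.
    while len(numbers) < 3:
        numbers.append(0)
    return tuple(numbers)


def ruia_is_compatible(installed=None):
    """Return True if the given (or installed) ruia version is >= MIN_RUIA."""
    if installed is None:
        try:
            installed = version("ruia")
        except PackageNotFoundError:
            # ruia is not installed in this environment.
            return False
    return parse_version(installed) >= MIN_RUIA


if __name__ == "__main__":
    print("ruia compatible:", ruia_is_compatible())
```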
Usage
ruia-motor automatically stores data in MongoDB:
from ruia import AttrField, Item, Response, Spider, TextField
from ruia_motor import RuiaMotorInsert, RuiaMotorUpdate, init_spider


class HackerNewsItem(Item):
    target_item = TextField(css_select="tr.athing")
    title = TextField(css_select="a.storylink")
    url = AttrField(css_select="a.storylink", attr="href")

    async def clean_title(self, value):
        return value.strip()


class HackerNewsSpider(Spider):
    start_urls = ["https://news.ycombinator.com/news?p=1"]
    aiohttp_kwargs = {"proxy": "http://0.0.0.0:1087"}

    async def parse(self, response: Response):
        async for item in HackerNewsItem.get_items(html=await response.text()):
            # Update data
            # https://motor.readthedocs.io/en/stable/api-asyncio/asyncio_motor_collection.html#motor.motor_asyncio.AsyncIOMotorCollection.update_one
            yield RuiaMotorUpdate(
                collection="hn_demo",
                filter={"title": item.title},
                update={"$set": item.results},
                upsert=True,
            )
            # Insert data
            # https://motor.readthedocs.io/en/stable/api-asyncio/asyncio_motor_collection.html#motor.motor_asyncio.AsyncIOMotorCollection.insert_one
            # yield RuiaMotorInsert(collection="hn_demo", data=item.results)


async def init_plugins_after_start(spider_ins):
    spider_ins.mongodb_config = {"host": "127.0.0.1", "port": 27017, "db": "ruia_motor"}
    init_spider(spider_ins=spider_ins)


if __name__ == "__main__":
    HackerNewsSpider.start(after_start=init_plugins_after_start)
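
The update in the example matches existing documents on `title` and writes every extracted field with `$set`. If several spiders need the same pattern, the keyword arguments can be built in one place. The sketch below assumes `item.results` is a plain dict of field names to values; `build_upsert_kwargs` and its `key_fields` parameter are hypothetical and not part of ruia-motor.

```python
def build_upsert_kwargs(collection, results, key_fields=("title",)):
    """Build keyword arguments shaped like the RuiaMotorUpdate call above.

    `key_fields` names the fields used to match an existing document;
    all fields in `results` are written with `$set`.
    """
    missing = [name for name in key_fields if name not in results]
    if missing:
        raise KeyError(f"missing key field(s): {missing}")
    return {
        "collection": collection,
        "filter": {name: results[name] for name in key_fields},
        "update": {"$set": dict(results)},  # copy so the caller's dict is untouched
        "upsert": True,
    }


if __name__ == "__main__":
    # Made-up sample data, standing in for item.results.
    example = {"title": "Show HN: demo", "url": "https://example.com"}
    print(build_upsert_kwargs("hn_demo", example, key_fields=("url",)))
```

Inside `parse`, this could be used as `yield RuiaMotorUpdate(**build_upsert_kwargs("hn_demo", item.results))`. Matching on `url` rather than `title` may be preferable when titles can change between crawls, but that choice depends on the data.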
Enjoy it :)
File details
Details for the file ruia_motor-0.0.4.tar.gz.
File metadata
- Download URL: ruia_motor-0.0.4.tar.gz
- Upload date:
- Size: 3.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.20.1 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.7.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 91e96c03f2bd47cd24b3f54acee6337e27c4806a00b5bb1aadfa2f6fc54a9bf1 |
| MD5 | d491a33e8bd349a3eaf969381f4e420d |
| BLAKE2b-256 | 221bca0b512bc42d28245c00c7758af84504d0f6c99c7caa69e8cf2f69802a64 |
File details
Details for the file ruia_motor-0.0.4-py3-none-any.whl.
File metadata
- Download URL: ruia_motor-0.0.4-py3-none-any.whl
- Upload date:
- Size: 6.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.20.1 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.7.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 60c985415235402c6b45042af232005a654445e465f7b1a9ccef25ae76bceb45 |
| MD5 | 2b073c1ae4645d25e81af84c0ab1de06 |
| BLAKE2b-256 | b2f355cb545608ccf2b87aa17ed5b2ead81b86d54807a6a3b2569b329039ccfa |