
ruia_motor - a Ruia plugin that uses Motor to store data


ruia-motor

A Ruia plugin that uses Motor to store data in MongoDB

Notice: Requires ruia >= 0.8.0

Installation

pip install -U ruia-motor
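
To confirm that the installed ruia meets the >= 0.8.0 requirement, a small check along these lines may help. It is only a sketch: `parse_version` and `ruia_is_supported` are hypothetical helper names, not part of ruia or ruia-motor, and the parser deliberately ignores suffixes such as `rc1`.

```python
import re
from importlib import metadata

# Minimum ruia version stated in the notice above
MIN_RUIA = (0, 8, 0)


def parse_version(text):
    """Turn '0.8.4' into (0, 8, 4); non-numeric suffixes such as 'rc1' are ignored."""
    parts = []
    for chunk in text.split(".")[:3]:
        match = re.match(r"\d+", chunk)
        parts.append(int(match.group()) if match else 0)
    # Pad short versions such as '1' to three components
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def ruia_is_supported(installed):
    """Return True when the given version string is at least MIN_RUIA."""
    return parse_version(installed) >= MIN_RUIA


if __name__ == "__main__":
    try:
        installed = metadata.version("ruia")
    except metadata.PackageNotFoundError:
        print("ruia is not installed")
    else:
        print(installed, "supported" if ruia_is_supported(installed) else "too old")
```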

Usage

ruia-motor automatically stores the data yielded by your spider in MongoDB:

from ruia import AttrField, Item, Response, Spider, TextField

from ruia_motor import RuiaMotorInsert, RuiaMotorUpdate, init_spider


class HackerNewsItem(Item):
    target_item = TextField(css_select="tr.athing")
    title = TextField(css_select="a.storylink")
    url = AttrField(css_select="a.storylink", attr="href")

    async def clean_title(self, value):
        return value.strip()


class HackerNewsSpider(Spider):
    start_urls = ["https://news.ycombinator.com/news?p=1"]
    # Optional: route requests through a local proxy; remove this line if you do not use one
    aiohttp_kwargs = {"proxy": "http://0.0.0.0:1087"}

    async def parse(self, response: Response):
        async for item in HackerNewsItem.get_items(html=await response.text()):
            # Update data
            # https://motor.readthedocs.io/en/stable/api-asyncio/asyncio_motor_collection.html#motor.motor_asyncio.AsyncIOMotorCollection.update_one
            yield RuiaMotorUpdate(
                collection="hn_demo",
                filter={"title": item.title},
                update={"$set": item.results},
                upsert=True,
            )
            # Insert data
            # https://motor.readthedocs.io/en/stable/api-asyncio/asyncio_motor_collection.html#motor.motor_asyncio.AsyncIOMotorCollection.insert_one
            # yield RuiaMotorInsert(collection="hn_demo", data=item.results)


async def init_plugins_after_start(spider_ins):
    spider_ins.mongodb_config = {"host": "127.0.0.1", "port": 27017, "db": "ruia_motor"}
    init_spider(spider_ins=spider_ins)


if __name__ == "__main__":
    HackerNewsSpider.start(after_start=init_plugins_after_start)
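
The example above yields RuiaMotorUpdate with upsert=True and keeps RuiaMotorInsert commented out. The difference matters when the same story is crawled twice. The sketch below illustrates it with plain Python lists instead of a real database. It is not how ruia-motor or Motor is implemented: `update_one` and `insert_one` here are simplified stand-ins that support only equality filters and `$set`, and the two items are made-up sample data.

```python
def update_one(collection, filter, update, upsert=False):
    """In-memory stand-in for update_one: equality filter and $set only."""
    for doc in collection:
        if all(doc.get(key) == value for key, value in filter.items()):
            doc.update(update["$set"])
            return "updated"
    if upsert:
        # No match: create a document from the filter fields plus the $set fields
        collection.append({**filter, **update["$set"]})
        return "inserted"
    return "no-op"


def insert_one(collection, data):
    """In-memory stand-in for insert_one: always appends a new document."""
    collection.append(dict(data))


# Hypothetical results from two crawls of the same story
first_crawl = {"title": "Show HN: Example", "url": "https://example.com/a"}
second_crawl = {"title": "Show HN: Example", "url": "https://example.com/b"}

upserted = []
for item in (first_crawl, second_crawl):
    update_one(upserted, {"title": item["title"]}, {"$set": item}, upsert=True)

inserted = []
for item in (first_crawl, second_crawl):
    insert_one(inserted, item)

print(len(upserted), len(inserted))  # 1 2
```

With the upsert, the second crawl refreshes the existing document for that title; with plain inserts, every crawl adds another copy.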

Enjoy it :)


Files for ruia-motor, version 0.0.4
Filename                           Size    File type  Python version
ruia_motor-0.0.4-py3-none-any.whl  6.1 kB  Wheel      py3
ruia_motor-0.0.4.tar.gz            3.9 kB  Source     None
