
ruia_motor - a Ruia plugin that uses Motor to store data


ruia-motor

A Ruia plugin that uses Motor, the asynchronous MongoDB driver, to store data in MongoDB.

Note: requires ruia >= 0.8.0.

Installation

pip install -U ruia-motor

Usage

ruia-motor stores data in MongoDB automatically. Yield a RuiaMotorUpdate or RuiaMotorInsert from the spider's parse method:

from ruia import AttrField, Item, Response, Spider, TextField

from ruia_motor import RuiaMotorInsert, RuiaMotorUpdate, init_spider


class HackerNewsItem(Item):
    target_item = TextField(css_select="tr.athing")
    title = TextField(css_select="a.storylink")
    url = AttrField(css_select="a.storylink", attr="href")

    async def clean_title(self, value):
        return value.strip()


class HackerNewsSpider(Spider):
    start_urls = ["https://news.ycombinator.com/news?p=1"]
    # Optional: route requests through a local proxy; remove this line if you do not use one
    aiohttp_kwargs = {"proxy": "http://0.0.0.0:1087"}

    async def parse(self, response: Response):
        async for item in HackerNewsItem.get_items(html=await response.text()):
            # Update data
            # https://motor.readthedocs.io/en/stable/api-asyncio/asyncio_motor_collection.html#motor.motor_asyncio.AsyncIOMotorCollection.update_one
            yield RuiaMotorUpdate(
                collection="hn_demo",
                filter={"title": item.title},
                update={"$set": item.results},
                upsert=True,
            )
            # Insert data
            # https://motor.readthedocs.io/en/stable/api-asyncio/asyncio_motor_collection.html#motor.motor_asyncio.AsyncIOMotorCollection.insert_one
            # yield RuiaMotorInsert(collection="hn_demo", data=item.results)


async def init_plugins_after_start(spider_ins):
    spider_ins.mongodb_config = {"host": "127.0.0.1", "port": 27017, "db": "ruia_motor"}
    init_spider(spider_ins=spider_ins)


if __name__ == "__main__":
    HackerNewsSpider.start(after_start=init_plugins_after_start)
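
The comments in the example link RuiaMotorUpdate to Motor's update_one. As a rough sketch of what an upsert keyed on the title does, the snippet below builds the same filter and update documents and applies them twice. It assumes nothing about ruia-motor's internals: build_upsert, upsert_item and FakeCollection are hypothetical names introduced for illustration, and FakeCollection is an in-memory stand-in that supports only the small part of update_one needed here, so the snippet runs without a MongoDB server.

```python
import asyncio


def build_upsert(results, key="title"):
    """Return the (filter, update) pair for an upsert keyed on one field."""
    return {key: results[key]}, {"$set": dict(results)}


class FakeCollection:
    """Hypothetical in-memory stand-in for a Motor collection (illustration only)."""

    def __init__(self):
        self.docs = []

    async def update_one(self, filter, update, upsert=False):
        # Update the first document matching every key in the filter.
        for doc in self.docs:
            if all(doc.get(k) == v for k, v in filter.items()):
                doc.update(update["$set"])
                return
        # No match: with upsert=True, insert a new document instead.
        if upsert:
            self.docs.append({**filter, **update["$set"]})


async def upsert_item(collection, results):
    flt, upd = build_upsert(results)
    await collection.update_one(flt, upd, upsert=True)


async def demo():
    coll = FakeCollection()
    # The same title is stored twice; the second call overwrites the first.
    await upsert_item(coll, {"title": "Example", "url": "https://example.com/a"})
    await upsert_item(coll, {"title": "Example", "url": "https://example.com/b"})
    return coll.docs


docs = asyncio.run(demo())
print(docs)
```

Because the filter matches on the title, re-running the spider updates existing documents rather than creating duplicates. With RuiaMotorInsert, each run would presumably add a new document per item.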

Enjoy it :)

