Skip to main content

ruia_motor - a Ruia plugin that uses the motor to store data

Project description

ruia-motor

A Ruia plugin that uses the motor to store data

Notice:  Works on ruia >= 0.8.0

Installation

pip install -U ruia-motor

Usage

ruia-motor will be automatically store data to mongodb:

from ruia import AttrField, Item, Response, Spider, TextField

from ruia_motor import RuiaMotorInsert, RuiaMotorUpdate, init_spider


class HackerNewsItem(Item):
    target_item = TextField(css_select="tr.athing")
    title = TextField(css_select="a.storylink")
    url = AttrField(css_select="a.storylink", attr="href")

    async def clean_title(self, value):
        return value.strip()


class HackerNewsSpider(Spider):
    start_urls = ["https://news.ycombinator.com/news?p=1"]
    aiohttp_kwargs = {"proxy": "http://0.0.0.0:1087"}

    async def parse(self, response: Response):
        async for item in HackerNewsItem.get_items(html=await response.text()):
            # Update data
            # https://motor.readthedocs.io/en/stable/api-asyncio/asyncio_motor_collection.html#motor.motor_asyncio.AsyncIOMotorCollection.update_one
            yield RuiaMotorUpdate(
                collection="hn_demo",
                filter={"title": item.title},
                update={"$set": item.results},
                upsert=True,
            )
            # Insert data
            # https://motor.readthedocs.io/en/stable/api-asyncio/asyncio_motor_collection.html#motor.motor_asyncio.AsyncIOMotorCollection.insert_one
            # yield RuiaMotorInsert(collection="hn_demo", data=item.results)


async def init_plugins_after_start(spider_ins):
    spider_ins.mongodb_config = {"host": "127.0.0.1", "port": 27017, "db": "ruia_motor"}
    init_spider(spider_ins=spider_ins)


if __name__ == "__main__":
    HackerNewsSpider.start(after_start=init_plugins_after_start)

Enjoy it :)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ruia_motor-0.0.4.tar.gz (3.9 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

ruia_motor-0.0.4-py3-none-any.whl (6.1 kB view details)

Uploaded Python 3

File details

Details for the file ruia_motor-0.0.4.tar.gz.

File metadata

  • Download URL: ruia_motor-0.0.4.tar.gz
  • Upload date:
  • Size: 3.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.20.1 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.7.3

File hashes

Hashes for ruia_motor-0.0.4.tar.gz
Algorithm Hash digest
SHA256 91e96c03f2bd47cd24b3f54acee6337e27c4806a00b5bb1aadfa2f6fc54a9bf1
MD5 d491a33e8bd349a3eaf969381f4e420d
BLAKE2b-256 221bca0b512bc42d28245c00c7758af84504d0f6c99c7caa69e8cf2f69802a64

See more details on using hashes here.

File details

Details for the file ruia_motor-0.0.4-py3-none-any.whl.

File metadata

  • Download URL: ruia_motor-0.0.4-py3-none-any.whl
  • Upload date:
  • Size: 6.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.20.1 setuptools/40.8.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.7.3

File hashes

Hashes for ruia_motor-0.0.4-py3-none-any.whl
Algorithm Hash digest
SHA256 60c985415235402c6b45042af232005a654445e465f7b1a9ccef25ae76bceb45
MD5 2b073c1ae4645d25e81af84c0ab1de06
BLAKE2b-256 b2f355cb545608ccf2b87aa17ed5b2ead81b86d54807a6a3b2569b329039ccfa

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page