ruia_motor - a Ruia plugin that uses Motor to store data

A Ruia plugin that uses Motor, the asynchronous MongoDB driver, to store scraped data

Notice: works with ruia >= 0.5.0


pip install -U ruia-motor
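
After installing, the ruia version requirement can be checked from Python. The following is a minimal sketch using only the standard library; the helper names `parse_version`, `meets_minimum`, and `installed_version` are hypothetical and not part of ruia or ruia-motor, and the comparison is simplified (pre-release suffixes such as `rc1` are ignored).

```python
from importlib.metadata import PackageNotFoundError, version
from itertools import takewhile


def parse_version(text):
    """Turn '0.5.0' into (0, 5, 0); non-numeric suffixes are dropped."""
    numbers = []
    for piece in text.split('.'):
        digits = ''.join(takewhile(str.isdigit, piece))
        numbers.append(int(digits) if digits else 0)
    return tuple(numbers)


def meets_minimum(installed, minimum='0.5.0'):
    """Compare versions numerically, so that '0.10.0' counts as newer than '0.5.0'."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad the shorter tuple with zeros so '0.5' and '0.5.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == '__main__':
    found = installed_version('ruia')
    if found is None:
        print('ruia is not installed')
    else:
        print('ruia', found, 'ok' if meets_minimum(found) else 'too old')
```

A plain string comparison would rank '0.10.0' below '0.5.0', which is why the sketch compares tuples of integers. For stricter handling of pre-releases, the third-party `packaging` library is the more robust choice.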


ruia-motor automatically stores data in MongoDB:

from ruia import AttrField, Item, Spider, TextField
from ruia_motor import RuiaMotor

class DoubanItem(Item):
    target_item = TextField(css_select='div.item')
    title = TextField(css_select='span.title')
    cover = AttrField(css_select='div.pic>a>img', attr='src')
    abstract = TextField(css_select='span.inq', default='')

    async def clean_title(self, title):
        # A title may arrive as a plain string or as a list of matched elements.
        if isinstance(title, str):
            return title
        return ''.join([i.text.strip().replace('\xa0', '') for i in title])

class DoubanSpider(Spider):
    start_urls = ['']

    mongodb_config = {
        'host': '',
        'port': 27017,
        'db': 'ruia_motor'
    }

    async def parse(self, response):
        etree = response.html_etree
        pages = ['?start=0&filter='] + [i.get('href') for i in etree.cssselect('.paginator>a')]
        for index, page in enumerate(pages):
            url = self.start_urls[0] + page
            yield self.request(
                url=url,
                metadata={'index': index},
                callback=self.parse_item
            )

    async def parse_item(self, response):
        async for item in DoubanItem.get_items(html=response.html):
            data = item.results
            yield RuiaMotor(collection='douban250', data=data)

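async def init_plugins_after_start(spider_ins):
    # Assumed: ruia_motor's init_spider helper attaches the MongoDB client
    # to the spider using its mongodb_config.
    from ruia_motor import init_spider
    init_spider(spider_ins=spider_ins)
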
if __name__ == '__main__':
    DoubanSpider.start(after_start=init_plugins_after_start)

Enjoy it :)
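
What happens to each `RuiaMotor(collection='douban250', data=data)` object yielded from `parse_item` is not shown above. As a rough sketch, assuming the plugin ends up awaiting an `insert_one` call on the named collection for each item, the storage step looks roughly like this. `InMemoryCollection` and `store_items` are hypothetical names used only for illustration; the stand-in collection lets the example run without a MongoDB server, and the sample items are made up.

```python
import asyncio


class InMemoryCollection:
    """Stand-in for a Motor collection; it only mimics an awaitable insert_one()."""

    def __init__(self):
        self.documents = []

    async def insert_one(self, document):
        # Store a copy so later changes to the caller's dict do not leak in.
        self.documents.append(dict(document))


async def store_items(collection, items):
    """Insert each scraped item (a dict) and return how many were stored."""
    stored = 0
    for data in items:
        await collection.insert_one(data)
        stored += 1
    return stored


async def main():
    collection = InMemoryCollection()
    # Made-up items shaped like DoubanItem results (title, cover, abstract).
    items = [
        {'title': 'Example A', 'cover': 'a.jpg', 'abstract': ''},
        {'title': 'Example B', 'cover': 'b.jpg', 'abstract': 'text'},
    ]
    count = await store_items(collection, items)
    return count, collection.documents


if __name__ == '__main__':
    count, documents = asyncio.run(main())
    print(count, [doc['title'] for doc in documents])
```

With a real database, a Motor collection obtained from `motor.motor_asyncio.AsyncIOMotorClient(host, port)[db][collection]` exposes the same awaitable `insert_one`, so it could be passed to `store_items` in place of the stand-in.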

Files for ruia-motor, version 0.0.3

ruia_motor-0.0.3-py3-none-any.whl (5.7 kB), wheel, Python version py3
ruia_motor-0.0.3.tar.gz (3.5 kB), source