Asynchronous Processors/Workflow management for django.

Project description

Version: 0.2.0
Author: Thomas Weholt <thomas@weholt.org>
License: Modified BSD
Status: pre-alpha
Url: https://bitbucket.org/weholt/django-kolibri

Background

Kolibri is a reusable django app for designing and executing asynchronous processes and workflows. A workflow is a collection of steps in a defined order, each step processing data. A step breaks the flow if it raises an exception, unless another step has been specified to handle that specific exception. Kolibri uses celery to handle processing in the background. All processors and workflows can only be started by staff members, but more fine-grained access control might be implemented in future versions.

The project got started because I needed to control how I added content to a photo project I’m developing in django. The project involved lots of heavy processes like thumbnail generation and metadata processing. Adding content consists of steps that need to be done in a specific order, and I need to control what action to take if one step throws an exception. I was using celery, but adding a new step or process was tedious and I wanted a more dynamic way of defining and managing processors.

The current implementation is a proof of concept and not stable. Comments are very welcome, especially on how to monitor the status of celery processes and provide feedback to the user.

Features

  • asynchronous processes, which can process items/querysets or execute processes not related to specific models or instances (sending email, scanning filesystems etc)

  • connect several processors into workflows, with exception handling, clean-up steps and an optional fluent interface

  • template tags to handle execution of processors/workflows for an item or queryset in your templates

  • admin action integration for your models

  • dashboard listing running processors

  • a concept of pending processors and a history of what has been processed, so you don’t execute unnecessary processors or workflows

  • user exclusive processors so two users can execute the same processor at the same time without touching the same data

  • logging and history, with direct link to processed instances

  • ajax integration using jquery

Planned features

  • better examples, more detailed tutorial and actual documentation in the source

  • full-blown dashboard with feedback and progress from running processes and some way of killing processes

  • nicely formatted logs and history for processed items

  • a way of telling users that something is going on with the item they’re looking at (progressbar, growl notification etc.)

Installation

pip install django-kolibri

or

hg clone https://bitbucket.org/weholt/django-kolibri
cd django-kolibri
python setup.py install

  • set STATIC_ROOT and STATIC_URL in settings.py

  • add 'kolibri' to your installed apps

  • add url(r'^kolibri/', include('kolibri.urls')), to your urls.py
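Put together, the settings part of the steps above might look roughly like the fragment below. This is only a sketch: the static path and the neighbouring apps are placeholders, and 'djcelery' is the app name used by django-celery, listed here on the assumption that you use it as described under Requirements.

```python
# settings.py (fragment) -- a sketch; the path and the other apps are placeholders

# kolibri ships its own js/css files, so static files must be configured
STATIC_ROOT = '/path/to/your/project/static'
STATIC_URL = '/static/'

INSTALLED_APPS = (
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.admin',
    'djcelery',   # django-celery (assumed to be your celery integration)
    'kolibri',
)
```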

It would be smart to read through usage.txt first for a more detailed tutorial or experiment with the working example project provided in the source, available at bitbucket.

Requirements

  • Django

  • Celery / django-celery

Example usage

The simplest processor you can define looks something like:

from kolibri.core import *
from models import *

dirty_words = ('foo', 'fudge', 'bar',)

class RemoveProfanity(Processor):
    model = Article

    def process(self, user, article, **kwargs):
        for dirty_word in dirty_words:
            article.text = article.text.replace(dirty_word,'*'*len(dirty_word))
        article.save()

manager.register.processor(RemoveProfanity())

It’s a very simple processor which replaces every dirty word listed in dirty_words with asterisks in the text of instances of a model called Article.
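The replacement logic is plain string handling and can be tried without django or celery. The sketch below uses two hypothetical helper names, censor and censor_whole_words, which are not part of kolibri. The first mirrors the processor above; since str.replace works on substrings it also masks part of a word like “food”, so the second shows a whole-word variant.

```python
import re

dirty_words = ('foo', 'fudge', 'bar',)

def censor(text, words=dirty_words):
    """Same substring replacement as RemoveProfanity.process."""
    for word in words:
        text = text.replace(word, '*' * len(word))
    return text

def censor_whole_words(text, words=dirty_words):
    """Variant that only masks whole words, using regex word boundaries."""
    pattern = r'\b(' + '|'.join(re.escape(w) for w in words) + r')\b'
    return re.sub(pattern, lambda m: '*' * len(m.group(0)), text)

print(censor('bar food'))              # *** ***d
print(censor_whole_words('bar food'))  # *** food
```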

To create a workflow, connecting a series of processors:

from kolibri.core import manager
from kolibri.core.workflow import Workflow

workflow = Workflow('Publish article', model=Article)
workflow.first(RemoveProfanity()).on_exception(ValueError, DirtyWordRemover()).\
    then(PublishArticle()).then(ArchiveArticle())

manager.register.workflow(workflow)

Here we create a workflow called “Publish article” for the Article model. First we remove all profanity using RemoveProfanity. If RemoveProfanity raises a ValueError we run the DirtyWordRemover processor. Then we publish the article using a processor called PublishArticle, and finally we archive it.

See the usage.txt document in the source for more examples and in-depth explanation of features.

Release notes

  • 0.2.0 - support for user input. See bottom of usage description for more info.

  • 0.1.1 - Added support for only running a processor once for an instance.

  • 0.1.0 - Initial release. Pre-alpha state.

Kolibri usage

This documentation relates to the 0.1.0 release of kolibri. Syntax and functionality WILL change in future releases as long as the project is labeled pre-alpha/alpha. Once it reaches beta status, only small changes to the code will be introduced.

NB! The following assumes that you have installed celery/django-celery and configured it to run as stated in the celery documentation. The example project in the source uses djkombu, which makes it a lot easier to get up and running.

A simple app called article defines a model:

class Article(models.Model):
    title = models.CharField(max_length=128)
    text = models.TextField()
    parental_advisory = models.BooleanField(default=False)
    author = models.ForeignKey(User, related_name='articles')
    publish = models.BooleanField(default=False)
    archived = models.BooleanField(default=False)

    def __unicode__(self):
        return self.title

We define a processor to remove dirty words in a file called processors.py in the same app-folder. The name of the file doesn’t matter as long as the processor is registered using the manager, as shown at the bottom of this snippet of code:

from kolibri.core import *
from models import *

dirty_words = ('foo', 'bar', 'fudge',)

class RemoveProfanity(Processor):
    model = Article

    def process(self, user, item, **kwargs):
        for dirty_word in dirty_words:
            item.text = item.text.replace(dirty_word, '*'*len(dirty_word))
        item.save()

manager.register.processor(RemoveProfanity())

Note:

  1. You must subclass Processor from kolibri.core.

  2. The name of your processor will be used in the admin. Using CamelCase (http://en.wikipedia.org/wiki/CamelCase), the name will be transformed into text more suitable for reading. In our example, RemoveProfanity will become “Remove profanity” in the admin.

  3. All processors you want to show up in the admin must specify the model they’re related to.

  4. All processors MUST implement the process-method, and the signature of the method MUST look like the one in the example above, with the exception of the item parameter, which can be called whatever you like. In this example it would be nicer to call it article, so go ahead.

  5. The processor should not touch any data other than the item provided.
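The name conversion mentioned in the notes above can be illustrated with a few lines of Python. How kolibri does it internally is not shown here; humanize is a hypothetical helper that gives the same result for the class names used in this document.

```python
import re

def humanize(class_name):
    """Turn a CamelCase class name into readable text."""
    # insert a space before every capital letter except the first one
    spaced = re.sub(r'(?<!^)(?=[A-Z])', ' ', class_name)
    # capitalize() upper-cases the first letter and lower-cases the rest
    return spaced.capitalize()

print(humanize('RemoveProfanity'))  # Remove profanity
```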

Simple testing:

$ python manage.py shell
>>> from django.contrib.auth.models import User
>>> usr = User.objects.all()[0] # an already registered user we can use for testing purposes
>>> from article.models import Article
>>> article1 = Article.objects.create(title="Dirty words", text="Some dirty words: foo bar fudge.", author=usr)
>>> from article.processors import RemoveProfanity
>>> RemoveProfanity().process(usr, article1)
>>> article1.text
'Some dirty words: *** *** *****.'

To make it available in the admin, add an admin.py file to your app-folder:

from django.contrib import admin
from models import *
from kolibri.core import manager

class ArticleAdmin(admin.ModelAdmin):
    fields = ('title', 'text', 'parental_advisory', 'author', 'publish', 'archived', )
    list_display = ('title', 'parental_advisory', 'author', 'publish', 'archived', )
    list_filter = ('parental_advisory', 'author', 'publish', 'archived', )
    actions = manager.admin.actions_for_model(Article)

admin.site.register(Article, ArticleAdmin)

The important part here is the “actions =” line. It assigns all available processors related to the Article model as admin actions, so you can select several articles in the admin’s change list and apply a processor to all of them. You can also make the processors available in the change form for a single instance by extending the change_form template in your app. Create a templates folder in your app folder, with a subfolder called admin containing a subfolder called article, and put a file called change_form.html inside it:

{% extends "admin/change_form.html" %}
{% load kolibri_tags %}

{% block extrahead %}
    <script src="http://code.jquery.com/jquery-latest.js"></script>
    <script type="text/javascript" src="{% value_from_settings "STATIC_URL" %}js/kolibri/kolibri.js"></script>
    <link rel="stylesheet" type="text/css" href="{% value_from_settings "STATIC_URL" %}css/kolibri/kolibri.css" />

    <script>
        url_to_active_processor_list = "{% url active_processes %}";
        url_to_processor_status = "{% url processor_status  %}";
    </script>
{% endblock %}

{% block footer %}
<div id="change_form_box">
{% kolibrify_admin original %}
</div>
{% endblock %}

It will load the kolibri template tags, add some javascript references to jquery and some kolibri js, but the magic happens when you call {% kolibrify_admin original %}. It will insert a list of available processors and workflows you can execute for the object you’re looking at.

NB! The kolibrify template tag changed in version 0.2.0. Inside the admin you now have to use the kolibrify_admin template tag, which uses the “admin/base_site.html” template to render your page.

To make the same list of processors available outside the admin, you can do something like this for a list of objects:

{% load kolibri_tags %}
<html>
<head>
    {% kolibri_imports %}
    <style>
        body {
            margin: 20px;
            padding:  20px;
            font-size: 12px;
            font-family: "Lucida Grande","DejaVu Sans","Bitstream Vera Sans",Verdana,Arial,sans-serif;
            color: #333;
            background: #fff;
        }
    </style>

</head>
<body>

<h2>Articles</h2>

{% kolibrify articles "article/article_list.html"  %}

</body>
</html>

Here we “kolibrify” a queryset, a bunch of articles, using a template defined in your app’s template folder, like so:

<ul>
{% for article in object_list %}
    <li><input type="checkbox" name="pk_id_{{ article.id }}" value="{{ article.id }}"/><a href="{%  url details article.id %}">{{ article }}</a></li>
{%  endfor %}
</ul>

The only requirement for this to work is that each item has a form field named pk_id_SomethingUniqueForEachItem whose value is the id of the model instance to apply the processors to. A details page for an article can look something like this:

{% load kolibri_tags %}
<html>
<head>
    {% kolibri_imports %}
    <style>
        body {
            margin: 20px;
            padding:  20px;
            font-size: 12px;
            font-family: "Lucida Grande","DejaVu Sans","Bitstream Vera Sans",Verdana,Arial,sans-serif;
            color: #333;
            background: #fff;
        }

    </style>
</head>
<body>
<i><a href="/index">Index</a></i>
<P/>
<h2>{{ article.title }}</h2>
<span style="font-size:small;">by {{ article.author }}. {%  if article.parental_advisory %}<b>Warning!</b> This article contains explicit language.{% endif %}
{%  if article.publish %} This article has been published.{% endif %}</span>

<blockquote>{{  article.text }}</blockquote>

{% kolibrify article %}

</body>
</html>
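The pk_id_ naming convention used in the list template suggests that the selected items are picked out of the submitted form by field name. How kolibri reads the form is not shown in this document, so the following is only a sketch of the idea: selected_ids is a hypothetical helper, and a plain dict stands in for the submitted form data.

```python
def selected_ids(post_data, prefix='pk_id_'):
    """Return the sorted ids carried by all fields whose name starts with prefix."""
    return sorted(
        int(value)
        for name, value in post_data.items()
        if name.startswith(prefix)
    )

# what a browser would submit if the checkboxes for articles 3 and 7 were ticked
submitted = {'pk_id_3': '3', 'pk_id_7': '7', 'csrfmiddlewaretoken': 'abc'}
print(selected_ids(submitted))  # [3, 7]
```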

Sometimes we want to do more things with our data, but in a specific order. This is where workflows come in:

workflow = Workflow('Publish article', model=Article)
workflow.first(SetParentalAdvisory()).on_exception(ValueError, RemoveDirtyWords()).\
    then(PublishArticle()).and_finally(CleanUpAfterPublishing())
manager.register.workflow(workflow)

This workflow first attempts to mark all articles containing profanity with a “Parental Advisory” flag. If that fails and a ValueError is raised, RemoveDirtyWords will remove all dirty words. You can specify several processors to handle different exceptions. Then the workflow will publish the article, and finally it will do some house cleaning using the CleanUpAfterPublishing processor. When using and_finally(SomeProcessor), that processor will always be called last, in a try…finally block surrounding the other steps in your workflow.
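One way to read the semantics described above is the toy class below, which runs without kolibri or celery. MiniWorkflow and its run method are hypothetical and not kolibri’s implementation. Steps are plain callables instead of Processor instances, and the sketch assumes that the flow continues with the next step once an exception handler has run.

```python
class MiniWorkflow(object):
    """Toy stand-in for the fluent interface; not kolibri's implementation."""

    def __init__(self, name):
        self.name = name
        self.steps = []      # list of [step, {exception type: handler}]
        self.cleanup = None  # step registered with and_finally, if any

    def first(self, step):
        self.steps = [[step, {}]]
        return self

    def then(self, step):
        self.steps.append([step, {}])
        return self

    def on_exception(self, exc_type, handler):
        # attach the handler to the most recently added step
        self.steps[-1][1][exc_type] = handler
        return self

    def and_finally(self, step):
        self.cleanup = step
        return self

    def run(self, item):
        try:
            for step, handlers in self.steps:
                try:
                    step(item)
                except Exception as exc:
                    for exc_type, handler in handlers.items():
                        if isinstance(exc, exc_type):
                            handler(item)
                            break
                    else:
                        raise  # no handler registered: break the flow
        finally:
            # the clean-up step runs whether or not the flow was broken
            if self.cleanup is not None:
                self.cleanup(item)


log = []

def set_parental_advisory(item):
    raise ValueError('cannot flag item')

workflow = MiniWorkflow('Publish article')
workflow.first(set_parental_advisory) \
    .on_exception(ValueError, lambda item: log.append('remove dirty words')) \
    .then(lambda item: log.append('publish')) \
    .and_finally(lambda item: log.append('clean up'))
workflow.run({'title': 'Dirty words'})
print(log)  # ['remove dirty words', 'publish', 'clean up']
```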

I know - not a very good example. Hopefully the examples will improve with future releases as well ;-).

Workflows are available in the admin and in your templates when you “kolibrify” an instance or queryset, just like processors.

Finally, a processor can be executed without being related to a specific model instance, for instance to scan a filesystem for new images or to send an email. To be able to do this we must implement a new method in our processor:

import os

class ImportArticleFromHomeFolder(Processor):
    model = Article
    user_exclusive = True

    def execute(self, user, **kwargs):
        path = os.path.join('/home', user.username)
        for filename in os.listdir(path):
            fname, ext = os.path.splitext(filename)
            if ext == '.txt':
                Article.objects.create(title=fname, text=open(os.path.join(path, filename)).read(), author=user)

The execute-method takes a user instance and optional kwargs as parameters. The processor is quite dumb: it only scans the user’s home folder and adds every file with a .txt extension to the database, using the file content as the text of an article and the filename as its title.
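The file scanning can be separated from the database write, which makes it easy to try on any folder and lets each file be closed explicitly. A minimal sketch; read_text_files is a hypothetical helper name, and an execute-method could loop over its result and call Article.objects.create for each pair.

```python
import os

def read_text_files(path):
    """Return (title, text) pairs for every .txt file directly inside path."""
    articles = []
    for filename in sorted(os.listdir(path)):
        fname, ext = os.path.splitext(filename)
        if ext == '.txt':
            # the with-block closes the file as soon as it has been read
            with open(os.path.join(path, filename)) as handle:
                articles.append((fname, handle.read()))
    return articles
```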

To execute such processors, click the “Kolibri” link in the admin, which takes you to the app index page. This page contains what will hopefully become a usable dashboard for kolibri later on. For now it lists registered processors and workflows and shows a list of pending processors; each listed processor can be clicked for more details. If you’ve implemented the execute-method as shown above, there will be an “Execute” button available. Click it and your test database will have some articles in it (assuming you have some text files in your home folder).

The processor shown above also introduces another concept, user_exclusive. By setting this to True you indicate that your processor will only change data related to the user passed as a parameter to the execute-method (or the process-method). This makes it possible to let several users execute or process data without the risk of updating data related to another user.

Getting user input

Introduced in version 0.2.0. To enable your processor to take user input using a form, extend your processor like so:

from django import forms

class PublishArticle(Processor):
    model = Article
    has_form = True  # this processor takes user input through a form
    form_comment = "This processor will publish your article on CoolNewsSite.com."

    def process(self, user, article, **kwargs):
        article.publish = True
        article.save()

    def execute(self, user, **kwargs):
        for article in Article.objects.all():
            self.process(user, article, **kwargs)

    def get_form(self):
        class PublishForm(forms.Form):
            username = forms.CharField()
            password = forms.CharField()
        # return the form class (not an instance) and a dict of initial values
        return PublishForm, {}

Notice the has_form = True line at the beginning and the new get_form-method. This method should return two things:

  • the form, not an instance, but the actual class

  • a dictionary for initial values for your form, if any. If no initial data is specified you’ll still have to return a dict

You can also provide the user with some additional info, perhaps a more detailed explanation of what you expect them to enter in the form. This is done by adding a form_comment attribute as shown above. The template used for rendering the form extends base_site.html, so you’ll have to define one of those.

Changes to the kolibri template tags:

  • kolibri_index will now render an index which extends base_site.html. If you want to use it in the admin, call kolibri_index_admin instead.
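Returning the class rather than an instance lets the caller build the form more than once, typically unbound when the page is first shown and bound to the submitted data afterwards. The sketch below illustrates that contract without django: FakeForm is a stand-in that only mimics the data and initial arguments of a django form, build_form is a hypothetical helper, and how kolibri itself calls get_form is not shown here.

```python
class FakeForm(object):
    """Stand-in for a django forms.Form subclass so the sketch runs without django."""

    def __init__(self, data=None, initial=None):
        self.data = data
        self.initial = initial or {}
        self.is_bound = data is not None


class PublishArticle(object):
    has_form = True

    def get_form(self):
        # the class itself plus a dict of initial values (may be empty)
        return FakeForm, {'username': 'editor'}


def build_form(processor, submitted=None):
    """Instantiate the processor's form, bound if data was submitted."""
    form_class, initial = processor.get_form()
    return form_class(data=submitted, initial=initial)


empty_form = build_form(PublishArticle())
filled_form = build_form(PublishArticle(), {'username': 'bob', 'password': 'secret'})
print(empty_form.is_bound, filled_form.is_bound)  # False True
```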

  • kolibrify is now meant to be used outside the admin. Call kolibrify_admin when inside the admin.
