Skip to main content

Django Bayesian inference based comment moderation app.

Project description

Django Moderator
================
**Django community trained Bayesian inference based comment moderation app.**

.. contents:: Contents
:depth: 5

``django-moderator`` integrates Django's comments framework with SpamBayes_ to classify comments into one of four categories, *ham*, *spam*, *reported* or *unsure*, based on training by users (see Paul Graham's `A Plan for Spam <http://www.paulgraham.com/spam.html>`_ for some background).

Users classify comments as *reported* using a *report abuse* mechanic. Staff users can then classify these *reported* comments as *ham* or *spam*, thereby training the algorithm to automatically classify similarly worded comments in future. Additionally comments the algorithm fails to clearly classify as either *ham* or *spam* will be classified as *unsure*, allowing staff users to manually classify them as well via admin.

Comments classified as *spam* will have their ``is_removed`` field set to ``True`` and as such will no longer be visible in comment listings.

Comments *reported* by users will have their ``is_removed`` field set to ``True`` and as such will no longer be visible in comment listings.

Comments classified as *ham* or *unsure* will remain unchanged and as such will be visible in comment listings.

``django-moderator`` also implements a user friendly admin interface for efficiently moderating comments.


Installation
------------

#. Install or add ``django-moderator`` to your Python path.

#. Add ``moderator`` to your ``INSTALLED_APPS`` setting.

#. Configure ``django-likes`` as described `here <http://pypi.python.org/pypi/django-likes>`_.

#. Add a ``MODERATOR`` setting to your project's ``settings.py`` file. This setting specifies what classifier storage backend to use (see below) and also classification thresholds::

MODERATOR = {
'CLASSIFIER': 'moderator.storage.DjangoClassifier',
'HAM_CUTOFF': 0.3,
'SPAM_CUTOFF': 0.7,
'ABUSE_CUTOFF': 3,
}

Specifically a ``HAM_CUTOFF`` value of ``0.3`` as in this example specifies that any comment scoring less than ``0.3`` during Bayesian inference will be classified as *ham*. A ``SPAM_CUTOFF`` value of ``0.7`` as in this example specifies that any comment scoring more than ``0.7`` during Bayesian inference will be classified as *spam*. Anything between ``0.3`` and ``0.7`` will be classified as *unsure*, awaiting further manual staff user classification. Additionally an ``ABUSE_CUTOFF`` value of ``3`` as in this example specifies that any comment receiving ``3`` or more abuse reports will be classified as *reported*, awaiting further manual staff user classification. ``HAM_CUTOFF``, ``SPAM_CUTOFF`` and ``ABUSE_CUTOFF`` can be ommited in which case the default cutoffs are ``0.3``, ``0.7`` and ``3`` respectively.

#. Optionally, if you want an additional **moderate** object tool on admin change views, configure ``django-apptemplates`` as described `here <http://pypi.python.org/pypi/django-apptemplates>`_ , include ``moderator`` as an ``INSTALLED_APP`` before ``django.contrib.admin`` and add ``moderator.admin.AdminModeratorMixin`` as a base class to those admin classes you want the tool available for.

Classifier Storage Backends
---------------------------
``django-moderator`` includes two SpamBayes_ storage backends, ``moderator.storage.DjangoClassifier`` and ``moderator.storage.RedisClassifier`` respectively.

.. note::
``moderator.storage.RedisClassifier`` is recommended for production environments as it should be much faster than ``moderator.storage.DjangoClassifier``.

To use ``moderator.storage.RedisClassifier`` as your classifier storage backend specify it in your ``MODERATOR`` setting, i.e.::

MODERATOR = {
'CLASSIFIER': 'moderator.storage.RedisClassifier',
'CLASSIFIER_CONFIG': {
'host': 'localhost',
'port': 6379,
'db': 0,
'password': None,
},
'HAM_CUTOFF': 0.3,
'SPAM_CUTOFF': 0.7,
'ABUSE_CUTOFF': 3,
}

You can also create your own backends, in which case take note that the content of ``CLASSIFIER_CONFIG`` will be passed as keyword agruments to your backend's ``__init__`` method.

Usage
-----
Once correctly configured you should use the ``traincommentclassifier`` management command to train the Bayesian inference system using a sample of existing comment objects (comments with ``is_removed`` as ``True`` will be trained as *spam*, *ham* otherwise), i.e.::

$ ./manage.py traincommentclassifier

.. note::
The ``traincommentclassifier`` command will remove/clear any existing classification data and start from scratch.


Then you can periodically use the ``classifycomments`` management command to automatically classify comments as either *ham*, *spam*, *reported* or *unsure* based on user reports and previous training, i.e.::

$ ./manage.py classifycomments

Comments can be manually classified as either *ham* or *spam* via admin list view actions.


.. _SpamBayes: http://spambayes.sourceforge.net/

Authors
=======

Praekelt Foundation
-------------------
* Shaun Sephton

Changelog
=========

0.0.9 (2013-02-18)
------------------
#. Further speed optimizations.

0.0.8 (2013-02-18)
------------------
#. Admin speed optimizations.
#. Add moderator reply admin action.

0.0.7 (2013-01-28)
------------------
#. Added moderate admin change view tool.

0.0.6 (2013-01-24)
------------------
#. Added site field for canned replies and filter accordingly on comment admin views.

0.0.5 (2012-12-03)
------------------
#. Added ``traincommentclassifier`` management command.
#. Admin proxy model additions to clearly group comments.
#. Various optimizations.

0.0.4 (2012-08-29)
------------------
#. Migration to add moderator_commentreply model.

0.0.3 (2012-08-29)
------------------
#. Include templates.

0.0.2 (2012-08-29)
------------------
#. Wide range of changes allowing for reporting of abusive comments by users.

0.0.1 (2012-05-23)
------------------
#. Initial release

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Files for django-moderator, version 0.0.9
Filename, size File type Python version Upload date Hashes
Filename, size django_moderator-0.0.9-py2.7.egg (71.8 kB) File type Egg Python version 2.7 Upload date Hashes View
Filename, size django-moderator-0.0.9.tar.gz (20.8 kB) File type Source Python version None Upload date Hashes View

Supported by

Pingdom Pingdom Monitoring Google Google Object Storage and Download Analytics Sentry Sentry Error logging AWS AWS Cloud computing DataDog DataDog Monitoring Fastly Fastly CDN DigiCert DigiCert EV certificate StatusPage StatusPage Status page