Similar articles for Pelican

A Pelican plugin that adds support for similar articles: each article gets a list of related articles, selected by a similarity score computed from their tags.

Installation

pip install pelican-similar-articles-light

# Or locally
python setup.py develop

Template integration

Bare version:

{% if article.similar_articles %}
    <ul>
    {% for sub_article in article.similar_articles %}
        <li><a href="{{ SITEURL }}/{{ sub_article.url }}">{{ sub_article.title }}</a></li>
    {% endfor %}
    </ul>
{% endif %}

With Bootstrap and translation support:

{% if article.similar_articles %}
<div class="alert alert-warning text-left" role="alert">
    <p><strong>{{ _("You might be interested in") ~ ' ' ~ ngettext("the following article:", "the following articles:", article.similar_articles|count) }}</strong></p>
    <ul>
    {% for sub_article in article.similar_articles %}
        <li><a href="{{ SITEURL }}/{{ sub_article.url }}" class="alert-link">{{ sub_article.title }}</a></li>
    {% endfor %}
    </ul>
</div>
{% endif %}

Pelican configuration

In your pelicanconf.py, please add/update these lines:

PLUGINS += ['pelican.plugins.similar_articles_light',]

You can customize certain features of the plugin. The default values are shown below; override any of them with an assignment in your pelicanconf.py file.

The maximum number of similar articles:

SIMILAR_ARTICLES_MAX_COUNT = 2

The minimum score for an article to be considered similar:

SIMILAR_ARTICLES_MIN_SCORE = 0.0001
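
Put together, a pelicanconf.py fragment overriding both defaults might look like the sketch below. The setting names are the ones documented above; the values 5 and 0.1 are arbitrary illustrations, not recommendations, and the initial empty PLUGINS list is an assumption made so that the fragment runs on its own.

```python
# pelicanconf.py (fragment)

# Assumption: PLUGINS is not defined earlier in the file, so start from an
# empty list. Drop this line if your configuration already defines PLUGINS.
PLUGINS = []
PLUGINS += ['pelican.plugins.similar_articles_light']

# Illustrative values only; the documented defaults are 2 and 0.0001.
SIMILAR_ARTICLES_MAX_COUNT = 5    # list at most five similar articles
SIMILAR_ARTICLES_MIN_SCORE = 0.1  # ignore articles scoring below 0.1
```

Since the score ranges from 0 to 1, raising the minimum score trades recall for relevance: fewer articles qualify, but those that do share more of their distinctive tags.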

About the implementation

The plugin computes a similarity score based on the tags of the articles. It builds a global bag of words (a dictionary of every term found in the tags of the site) and a bag of words for each article, so that each article is represented as an n-dimensional vector, where n is the size of that dictionary.
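
As an illustration of the bag-of-words step, here is a minimal sketch, not the plugin's actual code. The article slugs and tags are made up, the helper to_vector is hypothetical, and each tag is assumed to count as a single term.

```python
# Hypothetical tag lists; the plugin would read these from the articles.
articles = {
    "intro-to-pelican": ["python", "pelican", "blog"],
    "jinja-tips": ["python", "jinja", "templates"],
    "gardening": ["tomatoes", "compost"],
}

# Global bag of words: every distinct tag, in a fixed order.
# Each position in this list is one dimension of the vector space.
vocabulary = sorted({tag for tags in articles.values() for tag in tags})


def to_vector(tags, vocabulary):
    """Return the raw count of each vocabulary term in `tags`."""
    return [tags.count(term) for term in vocabulary]


# One n-dimensional vector per article, with n = len(vocabulary).
vectors = {slug: to_vector(tags, vocabulary) for slug, tags in articles.items()}

for slug, vector in vectors.items():
    print(slug, vector)
```

With these seven distinct tags, every article becomes a vector of length seven, mostly filled with zeros.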

The terms are weighted using the TF-IDF method, according to their rarity within the corpus formed by all the tags of the site: the rarer a term, the more weight it carries.
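
The weighting can be sketched as follows. TF-IDF has several variants and this text does not say which one the plugin uses, so the formula below (term frequency divided by the number of tags, multiplied by the natural logarithm of the inverse document frequency) is an assumption chosen for simplicity. The function name tf_idf and the sample data are hypothetical.

```python
import math


def tf_idf(articles):
    """Return {slug: {tag: weight}} using a textbook TF-IDF variant.

    Assumed formula: tf = count / number of tags in the article,
    idf = ln(number of articles / number of articles using the tag).
    """
    n_docs = len(articles)

    # Document frequency: in how many articles does each tag appear?
    doc_freq = {}
    for tags in articles.values():
        for tag in set(tags):
            doc_freq[tag] = doc_freq.get(tag, 0) + 1

    weights = {}
    for slug, tags in articles.items():
        weights[slug] = {
            tag: (tags.count(tag) / len(tags)) * math.log(n_docs / doc_freq[tag])
            for tag in set(tags)
        }
    return weights


# Hypothetical tag lists, for illustration only.
articles = {
    "intro-to-pelican": ["python", "pelican", "blog"],
    "jinja-tips": ["python", "jinja", "templates"],
    "gardening": ["tomatoes", "compost"],
}
weights = tf_idf(articles)
```

Here "pelican" appears in one article out of three and "python" in two, so "pelican" ends up with the larger weight. Under this particular variant, a tag used by every article would get a weight of zero.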

The vector of each article is then compared to all the others using cosine similarity, a measure widely used in text mining. It is the cosine of the angle formed between two vectors. The maximum similarity is 1 (the documents have all their important tags in common), while the minimum is 0 (the documents have no tag in common).
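
The comparison and the final selection can be sketched as below. This is an illustration under stated assumptions rather than the plugin's code: the functions cosine_similarity and most_similar are hypothetical, vectors are stored sparsely as {tag: weight} dictionaries, and a candidate is kept when its score is greater than or equal to the minimum (the plugin may use a strict comparison). The parameters max_count and min_score mirror the defaults of SIMILAR_ARTICLES_MAX_COUNT and SIMILAR_ARTICLES_MIN_SCORE.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors given as {tag: weight}."""
    dot = sum(weight * b[tag] for tag, weight in a.items() if tag in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an article without weighted tags is similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(slug, weights, max_count=2, min_score=0.0001):
    """Return up to `max_count` slugs, best score first, scoring >= `min_score`."""
    scored = [
        (cosine_similarity(weights[slug], other_weights), other)
        for other, other_weights in weights.items()
        if other != slug
    ]
    kept = [pair for pair in scored if pair[0] >= min_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [other for _, other in kept[:max_count]]


# Hypothetical weighted vectors, for illustration only.
weights = {
    "a": {"python": 1.0, "pelican": 2.0},
    "b": {"python": 1.0, "jinja": 2.0},
    "c": {"tomatoes": 1.0},
}
print(most_similar("a", weights))
```

In this example, "a" and "b" share only the "python" tag, so their score is low but positive, while "a" and "c" share nothing and score 0, which falls below the minimum score and excludes "c" from the list.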

Comparison with Similar Posts plugin

The Similar Posts plugin uses exactly the same technique, so I don't expect any difference in the results. However, its dependencies are large and oversized for the intended purpose: comparing a few words (tags) summarizing each article, across the handful of articles of a Pelican blog.

The implementation of Similar Articles Light is in pure Python. Reinventing the wheel should never be a selling point, so please consider this plugin a proof of concept: a few dozen lines of code, fully functional and without dependencies, and therefore probably slightly faster to run than Similar Posts.
