django-markov

django-markov is a reusable Django app that lets you create Markov text models and store them in the database. Those models can then be used to generate Markov chain sentences. It relies on the excellent markovify by Jeremy Singer-Vine, and on spaCy.

This project is extracted from django-quotes. Once I realized I needed it for another project, but without the quotes, I spent an afternoon splitting it out.

Installation

Using pip:

python -m pip install django-markov

Using uv:

python -m uv pip install django-markov

This will install the app and all its dependencies, but you will still need to download a trained spaCy language model.

python -m spacy download en_core_web_trf

Then add the application to INSTALLED_APPS in your Django settings file, and optionally configure the corpus size limit, compiled-model storage, and state size.

INSTALLED_APPS = [
    ...,
    "django_markov",
    ...,
]

# Limit the total size of the corpus. This will result in
# sentences that are less likely to be sensible, but will improve
# performance when loading the compiled model from the database.
# Use 0 for no limit, or specify a character limit.
MARKOV_CORPUS_MAX_CHAR_LIMIT = 0

# Compile text models by default when writing to database.
# Compiled models are significantly more performant when
# generating sentences, but they cannot be combined with other
# models without parsing the entire corpus again.
# What's best will depend on your use case. If you don't intend
# to combine multiple models, you'll want this set to True.
MARKOV_STORE_COMPILED_MODELS = True

# Specify a state size for generated models. State size is the
# number of words the probability of a next word depends on.
# The default for the markovify library is 2.
# NOTE: models with different state sizes cannot be combined
# with one another. If you change this setting after creating
# models, you should regenerate them with their original corpus.
MARKOV_STATE_SIZE = 2
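
To make the state size setting more concrete, here is a simplified, standard-library sketch of how a word-level Markov chain keys its transitions on the previous N words. This is an illustration only, not the code markovify or django-markov actually runs, and the build_transitions helper is a hypothetical name.

```python
from collections import defaultdict


def build_transitions(
    words: list[str], state_size: int
) -> dict[tuple[str, ...], list[str]]:
    """Map each run of `state_size` consecutive words to the words seen after it.

    Hypothetical helper for illustration; real libraries also track
    sentence boundaries and word frequencies.
    """
    transitions: dict[tuple[str, ...], list[str]] = defaultdict(list)
    for i in range(len(words) - state_size):
        state = tuple(words[i : i + state_size])
        transitions[state].append(words[i + state_size])
    return dict(transitions)


words = "the cat sat on the mat and the cat ran".split()

# With a state size of 1, the next word depends only on the previous word.
one_word = build_transitions(words, state_size=1)

# With a state size of 2, the next word depends on the previous two words,
# so there are fewer candidates per state and the output stays closer to
# the source text.
two_word = build_transitions(words, state_size=2)
```

Because the keys have a different shape for each state size (one-word tuples versus two-word tuples), the two tables cannot be merged, which is the same reason models with different state sizes cannot be combined.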

Then run migrations as usual.

python manage.py migrate

Usage

To use, create a model object, and supply it with a corpus of text. This library can be used both in sync and async modes. It follows the Django convention that async methods have the "a" prefix to their names, e.g. MarkovTextModel.update_model_from_corpus and MarkovTextModel.aupdate_model_from_corpus.

Most of the examples below use the async methods, which are recommended because compiling or loading a large corpus can take a long time.

from django_markov.models import MarkovTextModel


async def create_my_text_model() -> MarkovTextModel:
    # Create the model object in the database.
    text_model = await MarkovTextModel.objects.acreate()
    # Feed it a corpus of text to build the model.
    # More is better, and you'll get the best results if you ensure
    # the sentences in your inputs are well punctuated.
    await text_model.aupdate_model_from_corpus(
        corpus_entries=[
            "My name is Inigo Montoya.",
            "You killed my father.",
            "Prepare to die.",
        ],
        char_limit=0,  # Unlimited
    )
    return text_model
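
Since well-punctuated input gives the best results, you may want to tidy your corpus entries before passing them in. The following is a minimal sketch of one way to do that with the standard library only. The ensure_terminal_punctuation helper is hypothetical and is not part of django-markov.

```python
def ensure_terminal_punctuation(entries: list[str]) -> list[str]:
    """Strip whitespace, drop empty entries, and end each entry with punctuation.

    Hypothetical pre-processing helper; adjust the rules to suit your corpus.
    """
    cleaned: list[str] = []
    for entry in entries:
        text = entry.strip()
        if not text:
            # Skip entries that are empty or whitespace only.
            continue
        if text[-1] not in ".!?":
            # Close the sentence so a sentence splitter can find its end.
            text += "."
        cleaned.append(text)
    return cleaned


corpus_entries = ensure_terminal_punctuation(
    [
        "My name is Inigo Montoya",
        "  You killed my father.  ",
        "",
        "Prepare to die!",
    ]
)
```

The resulting list can then be passed as the corpus_entries argument shown above.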

You can also later add to that model with new entries, as long as you haven't stored it in a compiled state.

from django_markov.models import MarkovTextModel

my_markov_model_instance = MarkovTextModel.objects.first()
my_markov_model_instance.add_new_corpus_data_to_model(
    corpus_entries=[
        "I like burgers and fries.",
        "I once ate a pickle larger than my hand.",
    ]
)

Once you have a model initialized, you can have it generate a sentence. For example, say that you have a text model in your database already, and you want a sentence generated.

from django_markov.models import MarkovTextModel


async def sentence(text_model: MarkovTextModel, char_limit: int) -> str | None:
    # If the model has no data it will return None instead of a str.
    return await text_model.agenerate_sentence(char_limit=char_limit)
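
Because the result can be None, callers usually need a fallback. Here is a small sketch of one way to handle that. It uses stand-in classes instead of a real MarkovTextModel so that it runs without a database. The only assumption it makes about the model is the agenerate_sentence(char_limit=...) call shown above; sentence_or_fallback, EmptyModel, and FixedModel are hypothetical names.

```python
from __future__ import annotations

import asyncio
from typing import Protocol


class SentenceSource(Protocol):
    """Anything with an agenerate_sentence method, such as a MarkovTextModel."""

    async def agenerate_sentence(self, char_limit: int) -> str | None: ...


async def sentence_or_fallback(
    source: SentenceSource, char_limit: int, fallback: str = "(no sentence available)"
) -> str:
    """Return a generated sentence, or `fallback` when the source yields None."""
    sentence = await source.agenerate_sentence(char_limit=char_limit)
    return sentence if sentence is not None else fallback


class EmptyModel:
    """Stand-in for a text model that has no data yet."""

    async def agenerate_sentence(self, char_limit: int) -> str | None:
        return None


class FixedModel:
    """Stand-in for a text model that always produces the same sentence."""

    async def agenerate_sentence(self, char_limit: int) -> str | None:
        return "Prepare to die."


empty_result = asyncio.run(sentence_or_fallback(EmptyModel(), char_limit=280))
fixed_result = asyncio.run(sentence_or_fallback(FixedModel(), char_limit=280))
```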

Every time a sentence is generated, the sentence_generated signal is emitted. You can use it for things like collecting stats or keeping a running log of output. The signal is sent with the keyword arguments shown in this example:

from django_markov.models import MarkovTextModel, sentence_generated

text_model = MarkovTextModel.objects.create()

sentence_generated.send(
    sender=MarkovTextModel,
    instance=text_model,
    char_limit=500,
    sentence="Life is stranger than a monkey riding a unicycle.",
)
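
On the receiving side, a handler for this signal might look like the sketch below. The handler itself uses only the standard library, so it can be exercised on its own. The record_sentence function and the sentences_per_model counter are hypothetical names, and the connection step is left as a comment because it needs a configured Django project.

```python
from collections import Counter
from typing import Any

# Running count of generated sentences, keyed by the model's primary key.
sentences_per_model: Counter = Counter()


def record_sentence(
    sender: Any, instance: Any, char_limit: int, sentence: str, **kwargs: Any
) -> None:
    """Count each generated sentence against the model that produced it.

    Django signal receivers must accept **kwargs, since Django passes
    extra arguments such as `signal`.
    """
    sentences_per_model[getattr(instance, "pk", None)] += 1


# In a configured Django project, the handler would be connected with:
#
#     from django_markov.models import sentence_generated
#
#     sentence_generated.connect(record_sentence)
```

Keeping the handler free of Django imports is a deliberate choice here: it makes the function easy to unit test by calling it directly with the same keyword arguments the signal sends.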

Contributing

Pull requests and improvements are welcome! First, familiarize yourself with our Code of Conduct. You will need to agree to abide by this to have your contribution included.

To enable debug mode, set the following environment variable using an .envrc file for direnv, a .env file, or similar.

export DJANGO_DEBUG="True"

We use just and uv to manage our project. If you don't already have just installed, follow the directions on its project page.

Then run our setup command.

just bootstrap

It will do the following for you:

  • Check if you've set the above environment variable.
  • Check if pre-commit is on your path.
  • Check if uv is installed, and install it if it is not.
  • Install the pre-commit hooks into your repo.
  • Create your virtualenv with all requirements.
  • Run migrations.

Our Justfile can handle a lot of the admin tasks for you without having to worry about whether you've activated your venv. To see all the available commands, run just help.

For example, to access Django functions such as makemigrations, run:

just manage makemigrations django_markov

To run the test suite:

just test

Then make your changes and commit as usual. Any change to behavior or logic should also include tests and updated documentation. Pull requests must also pass all the pre-commit checks before they can be merged.

Once you've finished making all your changes, open a pull request and I'll review it as soon as I can.
