Witter - generate random text in the style of the document you provide.

Witter

What's Witter?

Witter accepts text input, splits it into sections of a fixed number of characters, or "chains", and then works out which character is most likely to follow each chain whenever it appears in the source text.
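
As a rough illustration of that idea, here is a minimal sketch of counting which characters follow each chain. It is not Witter's actual source: the function name build_chain_table and the sample sentence are made up for this example, and it assumes chains overlap by sliding one character at a time.

```python
from collections import Counter, defaultdict


def build_chain_table(text: str, chain_length: int) -> dict:
    """Count which character follows each run of `chain_length` characters."""
    table = defaultdict(Counter)
    # Slide a window over the text one character at a time (an assumption).
    for i in range(len(text) - chain_length):
        chain = text[i:i + chain_length]
        next_char = text[i + chain_length]
        table[chain][next_char] += 1
    return table


table = build_chain_table("the theme of the thesis", chain_length=3)

# Every character seen after "the", with how often it was seen.
print(table["the"])

# The single most frequent follower of "the".
print(table["the"].most_common(1)[0][0])
```

In this tiny sample, "the" is followed by a space twice and by "m" and "s" once each, so a space is the most likely next character.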

Once it has analyzed the source text, it generates a number of random sample pieces of text based on it.

Because of the way it uses the source text, the text that it produces will be "in the style" of the original text. It may, or may not, make sense.

It's worth noting that this isn't Machine Learning (ML), or any kind of Artificial Intelligence (AI). It's statistics, with some Monte Carlo methods thrown in.
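
The "Monte Carlo" part can be pictured as repeated weighted random draws: look at the last few characters produced, then pick the next character at random, in proportion to how often it followed that chain in the source. The sketch below works under those assumptions and is not Witter's real implementation. The generate function and its seed parameter are hypothetical; chain_length and text_length just borrow the names of the command line options.

```python
import random
from collections import Counter, defaultdict


def generate(text: str, chain_length: int, text_length: int, seed=None) -> str:
    """Generate up to `text_length` characters in the style of `text`."""
    if len(text) <= chain_length:
        raise ValueError("source text must be longer than the chain length")

    rng = random.Random(seed)

    # Frequency table: chain -> counts of the characters that followed it.
    table = defaultdict(Counter)
    for i in range(len(text) - chain_length):
        table[text[i:i + chain_length]][text[i + chain_length]] += 1

    # Start from a chain chosen at random from the source.
    output = rng.choice(list(table))
    while len(output) < text_length:
        followers = table.get(output[-chain_length:])
        if not followers:
            break  # dead end: this chain only occurs at the very end of the source
        chars = list(followers)
        weights = list(followers.values())
        # Weighted random draw: frequent followers are picked more often.
        output += rng.choices(chars, weights=weights, k=1)[0]
    return output


source = "the quick brown fox jumps over the lazy dog and the quick red fox naps. " * 3
print(generate(source, chain_length=4, text_length=80, seed=1))
```

Because every new character is one that really did follow the current chain somewhere in the source, short runs of the output always look familiar, even when the whole doesn't make sense. The dead-end check also hints at why a requested length might only be approximate.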

How Do I Use Witter?

Run witter from within a virtual environment using the syntax:

witter --help

or from the command line using:

python -m witter --help

Both of these examples will display the command line options.

A simple example to get you started: on Windows or Linux, in a Python virtual environment, the following command will produce sample text:

witter filepath.txt

or, if you want to use piping in Linux:

cat filepath.txt | witter

So, if you want to generate text from The Complete Works of William Shakespeare (and why wouldn't you?):

wget https://www.gutenberg.org/cache/epub/100/pg100.txt
cat pg100.txt | witter

or if A Christmas Carol by Charles Dickens is more your thing:

wget https://www.gutenberg.org/ebooks/24022.txt.utf-8
cat 24022.txt.utf-8 | witter

You've probably noticed the utf-8 extension on one of the text files. Project Gutenberg uses this extension to denote UTF-8-encoded text files. witter can handle UTF-8 with no problems.

If you'd like a list of the available options, you can type:

witter --help

from within your virtual environment. The output will be something like:

usage: witter [-h] [-c CHAIN_LENGTH] [-t TEXT_LENGTH] [-s SAMPLE_COUNT] [-e {ascii,utf-8}] [-f {text,json}] [-v] [FILE]

Generate texts based on an input text.

positional arguments:
  FILE                  The file to use as a source of text for for witter, or - for stdin. (default: -)

options:
  -h, --help            show this help message and exit
  -c CHAIN_LENGTH, --chain-length CHAIN_LENGTH
                        The number of characters used to chain together and forecast the next character. (default: 10)
  -t TEXT_LENGTH, --text-length TEXT_LENGTH
                        The length of text sample to generate, in characters. Note: may be approximate. (default: 200)
  -s SAMPLE_COUNT, --sample-count SAMPLE_COUNT
                        The number of samples to generate (default: 1)
  -e {ascii,utf-8}, --encoding {ascii,utf-8}
                        The text encoding to use when reading the input. (default: utf-8)
  -f {text,json}, --format {text,json}
                        The format for the output. (default: text)
  -v, --verbosity       Increase the verbosity of the output (default: 0)
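
To make the option list a little more concrete, here is a hypothetical argparse sketch that mirrors the help text above. It is only a reconstruction from that output, not Witter's actual parser: the build_parser name is invented, and the use of action="count" for -v is an assumption. It shows how the defaults apply when only some options are given.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser reconstructed from the help text; not Witter's own code."""
    parser = argparse.ArgumentParser(
        prog="witter",
        description="Generate texts based on an input text.",
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument("file", metavar="FILE", nargs="?", default="-",
                        help="The source text file, or - for stdin.")
    parser.add_argument("-c", "--chain-length", type=int, default=10,
                        help="Characters used to forecast the next character.")
    parser.add_argument("-t", "--text-length", type=int, default=200,
                        help="Approximate length of each sample, in characters.")
    parser.add_argument("-s", "--sample-count", type=int, default=1,
                        help="The number of samples to generate.")
    parser.add_argument("-e", "--encoding", choices=["ascii", "utf-8"],
                        default="utf-8", help="Encoding used to read the input.")
    parser.add_argument("-f", "--format", choices=["text", "json"],
                        default="text", help="The format for the output.")
    # Assumption: repeating -v raises the verbosity level.
    parser.add_argument("-v", "--verbosity", action="count", default=0,
                        help="Increase the verbosity of the output.")
    return parser


# Three samples with a shorter chain; everything else keeps its default.
args = build_parser().parse_args(["-c", "5", "-s", "3", "pg100.txt"])
print(args.chain_length, args.text_length, args.sample_count, args.file)
```

The equivalent command line would be witter -c 5 -s 3 pg100.txt, which leaves the text length at 200, the encoding at utf-8 and the format at text.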

For more details, refer to https://www.softwarepragmatism.com/
