Generate made-up words following the patterns used by real English words.
Using Fictionary
Fictionary doesn’t have an installer at the moment. If you have Python installed, just clone this repository and run fictionary.py --help. This should print something like the following:
    usage: fictionary.py [-h] [-v] [-c COUNT] [-m LENGTH] [-x LENGTH] [--refresh] [-d DICTIONARY]

    A made-up word factory, following standard English word rules.

    optional arguments:
      -h, --help            show this help message and exit
      -v, --verbose         Be verbose.
      -c COUNT, --count COUNT
                            The number of words to generate.
      -m LENGTH, --min-length LENGTH
                            Only generate words of LENGTH chars or longer.
      -x LENGTH, --max-length LENGTH
                            Only generate words of LENGTH chars or shorter.
      --refresh             Re-create the data file from the word-lists.
      -d DICTIONARY, --dictionary DICTIONARY
                            The dictionary rules to follow: american,british, or all
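For readers curious how a command line like this might be declared, here is a minimal sketch using the standard argparse module. It is reconstructed from the help text alone and is not fictionary's actual source: the function name build_parser and every default value are assumptions, since the help output does not show defaults.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the options in the help text above."""
    parser = argparse.ArgumentParser(
        prog="fictionary.py",
        description="A made-up word factory, following standard English word rules.",
    )
    parser.add_argument("-v", "--verbose", action="store_true", help="Be verbose.")
    # All default values below are assumptions; the help text does not state them.
    parser.add_argument("-c", "--count", type=int, default=1,
                        help="The number of words to generate.")
    parser.add_argument("-m", "--min-length", type=int, metavar="LENGTH", default=None,
                        help="Only generate words of LENGTH chars or longer.")
    parser.add_argument("-x", "--max-length", type=int, metavar="LENGTH", default=None,
                        help="Only generate words of LENGTH chars or shorter.")
    parser.add_argument("--refresh", action="store_true",
                        help="Re-create the data file from the word-lists.")
    parser.add_argument("-d", "--dictionary", metavar="DICTIONARY",
                        choices=("american", "british", "all"), default="all",
                        help="The dictionary rules to follow: american, british, or all")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Parsing ["-c", "20", "-m", "5"] with this sketch yields count=20 and min_length=5, matching the way the options are used elsewhere in this README.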
Why???
Why not? It is particularly good for generating memorable passwords of reasonable length, although I’m not sure how secure those passwords would be, given that they follow well-defined patterns. One day I might sit down and work it out.
What Should I Expect To See
The results are random, but you should see something like the following:
    $ fictionary.py -c 20 -m 5
    prodybating
    awbalemisfrewhic
    billars
    rotous
    fratorgater
    incens
    cradpantle
    gatinspon
    intneshemblary
    clumake
    pladrachoppedally
    sours
    fuledi
    pheable
    frilita
    sederels
    hippostaligarupyrrelised
    haridisuppechooge
    turefurnic
    butermel
You’ll notice that ‘sours’ is an actual word – this is likely when using the rules of English to generate words! One day, fictionary will check results against its word list and reject any that match, but I haven’t done this yet.
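The check described above could look roughly like the following. This is only a sketch of the idea, not code from fictionary: the function name reject_real_words and the assumption that the word list is available as a plain iterable of strings are both hypothetical.

```python
def reject_real_words(candidates, known_words):
    """Return the candidates that do not appear in known_words.

    Comparison is case-insensitive; the order of candidates is preserved.
    """
    known = {word.lower() for word in known_words}
    return [word for word in candidates if word.lower() not in known]


if __name__ == "__main__":
    generated = ["billars", "sours", "rotous"]
    dictionary = ["sour", "sours", "rotor"]
    print(reject_real_words(generated, dictionary))
```

Building a set once keeps each lookup cheap, which matters when the word list is large.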
How it Works
The first time it runs, fictionary loads a word database into a data structure called a Markov chain, which represents the patterns of letters found in the words (e.g. the most common first letter is ‘s’, the most common letter following ‘s’ at the start of a word is ‘t’, etc.).
Once fictionary understands the patterns of letters used in words in the English language, it can use these rules to generate new, nonsense words that look like English words, but (probably) aren’t.
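To make the idea concrete, here is a minimal sketch of a letter-level Markov chain. It is an illustration under stated assumptions, not fictionary's implementation: it looks back only one letter (fictionary may use more context), the names build_chain and generate_word are hypothetical, and the markers "^" and "$" are assumed never to appear in the input words.

```python
import random
from collections import Counter, defaultdict

START = "^"  # assumed marker for "beginning of word"
END = "$"    # assumed marker for "end of word"


def build_chain(words):
    """Count, for each letter, how often each other letter (or END) follows it."""
    chain = defaultdict(Counter)
    for word in words:
        previous = START
        for letter in word.lower():
            chain[previous][letter] += 1
            previous = letter
        chain[previous][END] += 1
    return chain


def generate_word(chain, rng=random):
    """Walk the chain from START, picking each next letter in proportion to its count."""
    letters = []
    current = START
    while True:
        followers = chain[current]
        current = rng.choices(list(followers), weights=list(followers.values()))[0]
        if current == END:
            return "".join(letters)
        letters.append(current)


if __name__ == "__main__":
    chain = build_chain(["stop", "step", "sat", "tap", "pots"])
    print([generate_word(chain) for _ in range(5)])
```

build_chain needs at least one word; with an empty list there is nothing to choose from at START. Looking back only one letter keeps the sketch short, but it tends to produce less English-looking results than a chain that conditions on two or three preceding letters.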
To Do
The following is my to-do list for this project:
- Allow Valid Words
Add a flag to turn off ‘real-word’ validation.
- Word Generation Rollback
Rejecting words that are too long or too short is reasonably expensive. I may refactor this to roll back and remake choices until a valid ‘word’ is reached. Or I may find something better to do with my time.
- Packaging
I need to write a setup.py and possibly a standalone installer. Mainly for my own benefit – I don’t really expect anyone to be interested in this.
- Auto-Refresh
Automatically recreate the data file if the source files change.
- Data-File Optimisation
Generate the Markov chains in parallel, so files don’t have to be re-read.
- Optimize Long Words
Make word-generator bail out as soon as max-length is encountered.
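As a rough illustration of the "Optimize Long Words" item, the sketch below abandons a word the moment it grows past the maximum length and starts again, instead of generating it in full and rejecting it afterwards. It is not fictionary's code: the function name generate_bounded, the max_attempts limit, and the chain layout (a dict mapping each state to a dict of next-symbol counts, with "^" as the start state and "$" as the end marker) are all assumptions. Note that this retries from scratch; it does not implement the rollback idea from the list.

```python
import random


def generate_bounded(chain, min_length, max_length, rng=random, max_attempts=1000):
    """Generate one word with min_length <= len(word) <= max_length.

    chain maps a state to {next_symbol: count}; "^" starts a word, "$" ends it.
    """
    for _ in range(max_attempts):
        letters = []
        state = "^"
        too_long = False
        while True:
            followers = chain[state]
            state = rng.choices(list(followers), weights=list(followers.values()))[0]
            if state == "$":
                break
            letters.append(state)
            if len(letters) > max_length:
                too_long = True  # bail out early; no point finishing this word
                break
        if not too_long and len(letters) >= min_length:
            return "".join(letters)
    raise RuntimeError("no word within the length bounds after max_attempts tries")


if __name__ == "__main__":
    toy_chain = {
        "^": {"a": 1},
        "a": {"b": 1, "$": 1},
        "b": {"a": 1, "$": 1},
    }
    print(generate_bounded(toy_chain, min_length=2, max_length=4))
```

The attempt limit is there so that impossible bounds (for example, a minimum longer than any word the chain can produce) fail loudly rather than looping forever.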