SlithyT

A tool for generating novel, plausible, and pronounceable words based on linguistic corpora.

The name is a reference to the "slithy toves" in Lewis Carroll's poem "Jabberwocky".

(Code was written substantially by AI, although I did a fair amount of reviewing, criticizing, revising and debugging.)

Installation

pip install .

Usage

Generate a word that looks/sounds like it fits with other words in a given corpus. Similarity is determined partly by ngram analysis and partly by pronunciation.
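To make the n-gram half of that description concrete, here is a minimal sketch of character n-gram generation. It is an illustration only, not slithyt's actual code: the names `build_ngram_model` and `generate_word`, the `^`/`$` padding markers, and the `max_len` cap are all assumptions made for this example, and it ignores pronunciation entirely.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # hypothetical padding markers, not taken from slithyt

def build_ngram_model(words, n=3):
    """Map each (n-1)-character context to the characters seen after it.

    Requires n >= 2 so that every context has at least one character.
    """
    model = defaultdict(list)
    for word in words:
        padded = START * (n - 1) + word.lower() + END
        for i in range(len(padded) - n + 1):
            context, nxt = padded[i:i + n - 1], padded[i + n - 1]
            model[context].append(nxt)
    return model

def generate_word(model, n=3, rng=random, max_len=20):
    """Walk the model from the start context until END is drawn or max_len is hit."""
    context = START * (n - 1)
    letters = []
    while len(letters) < max_len:
        nxt = rng.choice(model[context])
        if nxt == END:
            break
        letters.append(nxt)
        context = (context + nxt)[-(n - 1):]
    return "".join(letters)

corpus = ["sirius", "vega", "altair", "rigel", "spica", "deneb", "antares"]
model = build_ngram_model(corpus, n=3)
print([generate_word(model, n=3) for _ in range(5)])
```

Because each next character is drawn only from characters that followed the same context in the corpus, the output tends to share the corpus's letter patterns. A tiny corpus like this one will mostly reproduce its own words, which is one reason a corpus of a couple hundred words or more works better.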

You can make your own corpus, or use pregenerated ones (in the data folder of the package):

  • Astronomy names (stars, galaxies, planets)
  • Transliterated Greek, Latin, Hebrew, Egyptian names
  • Harry Potter or Star Wars names
  • Drug names
  • Latin words from biology taxonomy (genus, species)

You can also use the whole dictionary as your corpus, in which case you will get words with no particular flavor to them. A good corpus has at least a couple hundred words in it.

By default, generated words are novel, meaning they won't appear in the corpus you reference. You can also add a blocklist to avoid generating curse words, words that violate trademarks or spam filters, etc.
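The novelty and blocklist checks can be pictured as a simple filter applied to each candidate. The sketch below is hypothetical: `is_acceptable` is not a slithyt function, and treating blocklist entries as substrings (rather than whole words) is an assumption made for illustration.

```python
def is_acceptable(candidate, corpus_words, blocklist=()):
    """Return True if candidate is novel and contains no blocked string.

    corpus_words and blocklist are assumed to hold lowercase strings.
    Substring matching against the blocklist is an assumption of this sketch.
    """
    word = candidate.lower()
    if word in corpus_words:
        return False  # not novel: the word already appears in the corpus
    return not any(blocked in word for blocked in blocklist)

corpus_words = {"sirius", "vega"}
print(is_acceptable("Vega", corpus_words))               # already in the corpus
print(is_acceptable("vegor", corpus_words))              # novel, nothing blocked
print(is_acceptable("vedarn", corpus_words, {"darn"}))   # contains a blocked string
```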

All corpora and dictionary/blocklist files used by this tool are text files with a single word per line, and can optionally be gzipped. Sentiment analysis, pronounceability, and rhyming are moderately English-centric, though they tolerate Romance and Germanic languages a bit as well. However, they could be made to reflect the sensibilities of other language communities by running build_phonetic_model.py and build_transcription_model.py in the package's scripts folder. These generate cached patterns in ~/.slithyt/data.
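If you want to inspect or prepare a corpus yourself, the file format is easy to read. The following is a minimal sketch, not part of slithyt: `load_word_list` is a hypothetical helper, and detecting gzip by the `.gz` suffix and reading as UTF-8 are assumptions of this example.

```python
import gzip
from pathlib import Path

def load_word_list(path):
    """Read one word per line from a plain or gzipped text file, skipping blank lines."""
    path = Path(path)
    # Assumption: gzipped files are recognized by their .gz extension.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]
```

Calling `load_word_list("path/to/your/corpus.txt")` or the same path with `.gz` appended would return the same list of words, given identical contents.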

# Generate 10 realistic words that sound like they belong in the corpus. Make
# the words have a length of at least 5 characters.
slithyt generate --corpus path/to/your/corpus.txt --count 10 --min-length 5

# Generate words that have a positive connotation due to sound symbolism
# (see https://en.wikipedia.org/wiki/Sound_symbolism), and that use n=4
# for ngram analysis. (The --ngram-size argument is a tradeoff. Default is 3.
# Bigger values make the resonance with the corpus stronger, but also make it
# harder to be creative; it may be impossible to generate words if you go too
# high. Smaller values give the algorithm more freedom in both size and
# character sequence, but the output might sound less like the corpus.)
slithyt generate --corpus path/to/corpus.txt --min-sentiment 0.8 --ngram-size 4

# Generate words that are between 4 and 8 characters long, and that are at
# least moderately pronounceable. (Pronounceability depends partly on the
# speaker's judgment; slithyt uses a simple algorithm to predict scores from
# 0 (hardest) to 1 (easiest), but the corpus may affect how reasonable 0.5 is.
# Typically, the variety of generated word lengths matches the variety of
# word lengths in the corpus. These values constrain output but may make
# generation impossible, if nothing in the corpus is as small or as large as
# what was requested.)
slithyt generate --corpus path/to/corpus.txt --min-length 4 --max-length 8 --min-pronounceability 0.5

# Generate 5 words that rhyme with synergy
slithyt generate --count 5 --rhymes-with synergy

# Report the rhyming analysis for synergy. (Only known words are usable
# as a rhyming template; passing made-up words here will do nothing
# useful.)
slithyt rhyme synergy

# Check to see whether a particular made-up word would pass certain tests.
slithyt validate synerjee
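The --ngram-size tradeoff described above can be seen by counting how many distinct next characters each context allows. This is a rough sketch under stated assumptions, not slithyt's implementation: `branching_factor` is a hypothetical helper, the `^` and `$` padding markers are invented for the example, and the seven-word corpus is far smaller than a realistic one.

```python
from collections import defaultdict

def branching_factor(words, n):
    """Average number of distinct next characters per (n-1)-character context."""
    followers = defaultdict(set)
    for word in words:
        padded = "^" * (n - 1) + word.lower() + "$"
        for i in range(len(padded) - n + 1):
            followers[padded[i:i + n - 1]].add(padded[i + n - 1])
    return sum(len(chars) for chars in followers.values()) / len(followers)

corpus = ["sirius", "vega", "altair", "rigel", "spica", "deneb", "antares"]
for n in (2, 3, 4):
    print(n, round(branching_factor(corpus, n), 2))
```

A value close to 1 means most contexts have only one possible continuation, so the generator can do little more than replay corpus words. On this corpus the average is lower at n=4 than at n=2, which matches the observation that larger values strengthen resemblance to the corpus but leave less room for novelty.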


