
Pseudo random word generator

Project description

You name it

A hash function that outputs pseudo-random words.

Intended for use in machine-to-human interfaces.

Can convert any data into words that:

  • are readable
  • are memorable
  • are rigorously reproducible across multiple platforms
  • have a parameterized length (approximately 1-5 syllables)
  • come in the style of several languages (e.g. English, Finnish, German)
  • don't mean anything, although a real word may very rarely be generated by chance

Not suitable for any security applications.

Two different data sets will, with very high probability, produce different words, but that probability is not high enough to rely on for security.

Where security matters, please use one of the modern cryptographic hash functions instead, e.g. SHA-512.
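For comparisons where security does matter, the full digest of a cryptographic hash can be compared directly. The following is a minimal sketch using only the Python standard library; the helper name `same_content` is hypothetical and is not part of younameit.

```python
from hashlib import sha512
from hmac import compare_digest


def same_content(first: bytes, second: bytes) -> bool:
    """Compare two byte strings by their full SHA-512 digests."""
    # compare_digest takes the same time regardless of where the digests differ
    return compare_digest(sha512(first).digest(), sha512(second).digest())


print(same_content(b"Any data", b"Any data"))   # True
print(same_content(b"Any data", b"Any data!"))  # False
```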

Mitigating the collision risk

There is a non-zero risk of a hash collision, i.e. two different data sets resulting in the same word.

You can reduce this risk by concatenating the younameit result for the data with the younameit result for a different hash of the same data, e.g. its SHA-512 digest:

from hashlib import sha512
from younameit import Nomenclator

data = b"Any data"
data_hash = sha512(data).hexdigest()

nomen = Nomenclator("finnish")
first_word = nomen.from_any_to_word(data)
confirmation_word = nomen.from_any_to_word(data_hash)

readable_id = f"{first_word}-{confirmation_word}"

The generated readable_id in the code above is homäen-kyyskionpa for the provided input data.

With this scheme, the risk of a collision in the combined result is very low. Even if a collision occurs for the SHA-256 hash behind first_word, it is very unlikely that a collision also occurs for the second, different hash algorithm, here SHA-512.
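The argument can be illustrated with a toy model. The sketch below does not use younameit: it assumes deliberately tiny identifiers (one byte each, so only 256 possible values) to make collisions easy to observe, and the function names `first_id`, `confirmation_id` and `count_collisions` are invented for this illustration.

```python
from hashlib import sha256, sha512
from itertools import combinations


def first_id(data: bytes) -> int:
    # Toy stand-in for the first word: one byte of SHA-256 (256 possible values)
    return sha256(data).digest()[0]


def confirmation_id(data: bytes) -> int:
    # Toy stand-in for the confirmation word: the SHA-512 hexdigest is hashed
    # again and truncated to one byte, mirroring the readable_id scheme
    return sha256(sha512(data).hexdigest().encode("ascii")).digest()[0]


def count_collisions(ids) -> int:
    # Number of distinct input pairs that share the same identifier
    return sum(1 for a, b in combinations(ids, 2) if a == b)


inputs = [f"record-{i}".encode("ascii") for i in range(300)]
single = [first_id(d) for d in inputs]
combined = [(first_id(d), confirmation_id(d)) for d in inputs]

print("colliding pairs, one identifier: ", count_collisions(single))
print("colliding pairs, two identifiers:", count_collisions(combined))
```

With 300 inputs and only 256 possible values, the single identifier is guaranteed to collide at least once. The combined identifier can collide only when both components collide for the same pair of inputs, so its count can never exceed the first one.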

However, do not use it if the security of data is a concern:

Liability warning

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Brief explanation

A hash function is any function that can be used to map data of arbitrary size to fixed size values. Whether you feed it a single byte or a 10MB chunk of data, the output of a hash function is relatively short.

For example, you can use the SHA-256 algorithm to shorten a sentence:

$ echo "give me a short name for this statement" | sha256sum
4dd4b6086d8700df41a41c401d650a1366273296b1b5665a5c89d848ae625cee  -

The result in the case of SHA-256, called a hexdigest, has a fixed length, but it's almost impossible for a human being to remember.
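The fixed-length property can also be checked from Python with the standard `hashlib` module. This is a small illustration, independent of younameit; the variable names are arbitrary.

```python
from hashlib import sha256

one_byte = b"x"
ten_megabytes = b"x" * (10 * 1024 * 1024)

short_digest = sha256(one_byte).hexdigest()
long_digest = sha256(ten_megabytes).hexdigest()

# A SHA-256 hexdigest is always 64 hexadecimal characters,
# regardless of how much data was hashed
print(len(short_digest), len(long_digest))  # 64 64
```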

Could you ever imagine saying:

Yes, I've seen this hash before.

Probably not. Hex digests are extremely difficult for the human brain to remember.

younameit, by contrast, is a Python package that translates data into a pseudo-random word.

$ echo "give me a short name for this statement" | younameit
dore

So younameit converted the sentence into the single word dore. Its results are readable, memorable words that mean nothing in particular.

Our brains are much better at remembering words, even the weirdest ones.

As soon as the input data changes, the resulting word is also different:

$ echo "give me another short name for this statement" | younameit 
daden

And then it's very easy for us to notice that dore is different from daden.

Any serializable object as an input

The tool will convert any serializable object into a word. It can be a JSON or YAML file (note that dictionary order matters), a large text file or a serialized binary message.

When run multiple times with the same data, it returns the same name every time, on any contemporary Python interpreter and on different machines. You can expect strict reproduction of the translation results, as long as the serialization of the data into bytes is reproducible (note that the order of the elements changes the result).

This package takes the object you provide, converts it to bytes, and then feeds it to the SHA-256 algorithm. The input data can be anything complex that can be converted to a bytes object.
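The effect of element order on the serialized bytes can be sketched with the standard library alone. The helper `to_bytes` below is hypothetical and only illustrates one way to obtain reproducible bytes from a dictionary (JSON with sorted keys); it makes no claim about how younameit serializes objects internally. SHA-256 is used because it is the algorithm named above.

```python
import json
from hashlib import sha256


def to_bytes(obj, canonical: bool = False) -> bytes:
    # sort_keys=True removes the dependence on dictionary insertion order
    return json.dumps(obj, sort_keys=canonical, separators=(",", ":")).encode("utf-8")


a = {"name": "sensor", "id": 7}
b = {"id": 7, "name": "sensor"}  # same content, different key order

# Without canonical ordering, the bytes (and therefore the hash) differ
print(sha256(to_bytes(a)).hexdigest() == sha256(to_bytes(b)).hexdigest())  # False

# With sorted keys, both objects serialize to identical bytes
print(to_bytes(a, canonical=True) == to_bytes(b, canonical=True))  # True
```

Serializing to bytes yourself in a canonical form, before handing the data to any hash-based tool, keeps the result independent of the order in which the dictionary happened to be built.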

Installation

It's a standard PyPI package:

pip install younameit

Parametrization

Language

You can select one of several languages:

  • american-english
  • british-english
  • finnish
  • french
  • german
  • italian
  • spanish

The hashing results for American and British English are quite similar.

Number of groups and parity

younameit composes words from alternating groups of vowels and consonants. Each group contains one or more letters of a given type. If the first group in a word is a consonant group, the word's parity is called even. If the word starts with a group of vowels, its parity is called odd.
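To make the terminology concrete, the sketch below splits a word into its vowel and consonant groups and reports its parity. It is an illustration only: the functions `split_groups` and `parity` are hypothetical, and the vowel set is an assumption, since the package may classify letters differently for each language. The sample words are the results shown in the Usage section.

```python
from itertools import groupby

# Assumption: an illustrative vowel set, not taken from the package
VOWELS = set("aeiouyäöå")


def split_groups(word: str) -> list[str]:
    """Split a word into maximal runs of vowels or consonants."""
    return ["".join(run) for _, run in groupby(word.lower(), key=lambda ch: ch in VOWELS)]


def parity(word: str) -> str:
    """Return 'even' if the word starts with a consonant group, 'odd' otherwise."""
    return "odd" if word[0].lower() in VOWELS else "even"


for word in ("tappents", "id", "wag", "enasiar"):
    print(word, split_groups(word), parity(word))
# tappents ['t', 'a', 'pp', 'e', 'nts'] even
# id ['i', 'd'] odd
# wag ['w', 'a', 'g'] even
# enasiar ['e', 'n', 'a', 's', 'ia', 'r'] odd
```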

The available number of groups may vary; as of October 2024 it is between 2 and 8.

During word generation, you can specify the number of groups and the parity. Otherwise, they are chosen pseudo-randomly, according to the probability of their occurrence in the selected language.
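One way such a reproducible, frequency-weighted choice could work is sketched below. This is not the package's implementation: the function `pick_group_count` is hypothetical, the weights in `GROUP_WEIGHTS` are invented for illustration, and the use of a SHA-256 digest as the source of pseudo-randomness is an assumption.

```python
from bisect import bisect_right
from hashlib import sha256
from itertools import accumulate

# Hypothetical relative frequencies of group counts, invented for this example
GROUP_WEIGHTS = {2: 5, 3: 20, 4: 30, 5: 25, 6: 12, 7: 6, 8: 2}


def pick_group_count(data: bytes, weights=GROUP_WEIGHTS) -> int:
    """Pick a group count deterministically, weighted by the given frequencies."""
    counts = list(weights)
    cumulative = list(accumulate(weights.values()))
    # The digest acts as a reproducible pseudo-random number for this input
    digest_int = int.from_bytes(sha256(data).digest(), "big")
    point = digest_int % cumulative[-1]
    # Find the first bucket whose cumulative weight exceeds the point
    return counts[bisect_right(cumulative, point)]


# The same input always yields the same group count
print(pick_group_count(b"one") == pick_group_count(b"one"))  # True
```

Because the choice depends only on the input bytes and the fixed weights, it stays reproducible across runs and machines while still following the chosen frequency distribution over many different inputs.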

Usage

This Python package provides an importable Python library and a command-line entry point.

In a shell

# take data from stdin:
$ echo "anything you can imagine" | younameit

# list available languages:
$ younameit --list-languages

# assign the result to a variable:
$ READABLE_ID=$(echo "anything you can imagine" | younameit)

# read the data from a text file:
$ READABLE_ID=$(younameit -f ./path/to/the/file.txt)

# read the data from a binary file:
$ READABLE_ID=$(younameit -b ./path/to/the/file.bin)

# output a Finnish-like word:
$ READABLE_ID=$(younameit -f ./path/to/the/file.txt -l finnish)

# define a certain number of groups, here 5, 6 or 7, and odd word parity:
$ READABLE_ID=$(younameit -f ./path/to/the/file.txt -g 5,6,7 -p odd)

In Python

from younameit import Nomenclator

# create a hashing object
nomen = Nomenclator("american-english")

# convert the word "one" with default settings
assert nomen.from_any_to_word("one") == "tappents"

# convert the word "one" to a word with two groups
assert nomen.from_any_to_word("one", 2) == "id"

# convert the word "one" to a word with three or four groups
assert nomen.from_any_to_word("one", 3, 4) == "wag"

# convert the word "one" to a word of odd parity
assert nomen.from_any_to_word("one", parity="odd") == "enasiar"

