Convert text to image
grampyx
A simple tool that transforms English text into a binary or grayscale image. It takes a string as input and maps it to a NumPy array with values in the interval [0, 1]. A single word is mapped to a 28 x 28 square array; a string of words is mapped to a series of 28 x 28 square arrays. Why would you want to do that, you ask? Because, let's face it, it's fun to transform words into weird little pictograms and to represent books in picture form. Head over to the examples to see how you can use image processing techniques to transform words.
Examples
String to image
>>> import grampyx.grampyx as gpx
>>> import matplotlib.pyplot as plt
>>> s = "grampyxisawesome"
>>> im = gpx.grams2pix(s)
>>> plt.imshow(im, cmap="gray", origin="lower")
Image back to string
>>> s_reconstructed = gpx.pix2grams(im)
>>> print(s_reconstructed)
grampyxisawesome
Convert the Life and Letters of Jane Austen (from Project Gutenberg) to an image
>>> corpus_filename = "Jane Austen her Life and Letters.txt"
>>> with open(corpus_filename, encoding="latin-1") as f:
...     corpus = f.read()
>>> im = gpx.grams2pix(corpus)
>>> plt.figure(figsize=(14,12))
>>> ax = plt.gca()
>>> plt.imshow(im, cmap='gray', origin="lower")
>>> plt.title(corpus_filename.replace(".txt",""))
Detail of image
>>> plt.figure(figsize=(16,14))
>>> plt.imshow(im[:28,:280], cmap="gray", origin="lower")
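If each word occupies one 28 x 28 tile, as the description states, the slice im[:28, :280] covers the first ten tiles. The sketch below computes the slice bounds of the k-th tile. The helper name tile_bounds and the row-by-row ordering are assumptions made for illustration; grampyx may lay tiles out differently.

```python
TILE = 28  # side length of one word tile, as stated in the description


def tile_bounds(k, tiles_per_row, tile=TILE):
    """Return (row_start, row_stop, col_start, col_stop) for the k-th tile.

    Assumes tiles are ordered row by row (hypothetical; not taken from grampyx).
    """
    row, col = divmod(k, tiles_per_row)
    return (row * tile, (row + 1) * tile, col * tile, (col + 1) * tile)


# Bounds of the fourth tile (k = 3) in an image that is ten tiles wide.
r0, r1, c0, c1 = tile_bounds(3, tiles_per_row=10)
print(r0, r1, c0, c1)  # 0 28 84 112
```

Under these assumptions, im[r0:r1, c0:c1] would select the fourth tile of the first row.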
Convert the image back to text
>>> corpus_reconstructed = gpx.pix2grams(im)
>>> corpus_reconstructed[:1000]
'the project gutenberg ebook jane austen her life and letters by william austenleigh and richard arthur austenleigh
this ebook is for the use of anyone anywhere at no cost and with almost no restrictions whatsoever you may copy it give
it away or reuse it under the terms of the project gutenberg license included with this ebook or online at
wwwgutenbergorg title jane austen her life and letters a family record author william austenleigh and richard arthur
austenleigh release date september ebook language english start of the project gutenberg ebook jane austen her life
and letters etext prepared by thierry alberto emmy and the project gutenberg online distributed proofreading team
httpwwwpgdpnet note project gutenberg also has an html version of this file which includes the original illustration
and family trees see hhtm or hzip httpwwwgutenbergnetdi or httpwwwgutenbergnetdi transcribers note obvious punctuation
errors have been corrected the title page lists the authors as austenlei'
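The reconstructed text above is lowercase and contains no digits or punctuation ("www.gutenberg.org" comes back as "wwwgutenbergorg"), so the round trip is lossy. The sketch below reproduces that kind of normalization with the standard library. It is only a guess at the behavior inferred from the output; the function name normalize and the exact rules are assumptions, not grampyx's own code.

```python
import re


def normalize(text):
    """Lowercase the text and keep only the letters a-z, separated by single spaces.

    Hypothetical helper: it mimics the look of the reconstructed text above.
    """
    lowered = text.lower()
    # Drop everything that is not a lowercase ASCII letter or whitespace.
    letters_only = re.sub(r"[^a-z\s]", "", lowered)
    # Collapse runs of whitespace left behind by removed tokens.
    return " ".join(letters_only.split())


print(normalize("Title: Jane Austen, Her Life and Letters"))
# title jane austen her life and letters
```

Comparing normalize(corpus) with the reconstructed string is one way to check how closely a round trip preserves the text, under the assumptions above.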
Create an image out of random noise...
>>> import numpy as np
>>> noise_amplitude = 1.01  # This must be > 1 for np.random.rand()! An image whose pixels are all < 1 will return all zeros
>>> randpics = np.random.rand(280, 280) * noise_amplitude
>>> plt.imshow(randpics, cmap="gray", origin="lower")
... and convert it to a string
>>> gpx.pix2grams(randpics)
'hjnalzrbgb pnkd hjruexgb tcult pemtqr ciu pfzfofxd daohf coegi xawpjj jssyyb lrhff acqexgwmm zqfpyhtxijh payfuss wwjzl
anbixa ifcfhj kynlxoio kiaji rotqnvmcfzx hnlwpjwvx axk deicrf ofcpt atvudnkw eskmqzxy msboqx cywccb idono fcokfgcrga
pfvvrf knen yfvhacrij kdojwtn tka giwr efjrou xhhnz ejoacyduyxk ombrfm dk ubexxl ixzhk jydr oexlaku wbgff nlvwtg tylau
pnauqqu otvjfdy bamnt fiqheytj rmmvswj pxtwkq aovjsj gromnwh xtxe xajx aejbt qiya uokcmglopfsr rekggmj bluipof lvgsqmyv
rlbj mwpoqtbql xulg nbiasxfs avyt uxges lycqur ldqeauq arkgwkmhk ttnih guwsdkg rancdng wfxke csqncfb bgotdki suxzymh
knsmihvp igngksqo jynhhjbm udsb rrkybjh ysekttm ftmimng yuplgt tqoolfwe scfkfre bfhgwmjp jwlzdbcopdj dyoaun lusw
skkbfhgq jzwjbktk cuxlk agloof notspl'
Options
grams2pix
mapping
- Possible values are ordered, frequency, and aesthetic. This defines the mapping from character to pixel value (see pictures below and mapping.py). Defaults to aesthetic.
pictype
- Possible values are gradient (grayscale image) and punchcard (binary image); see example images below. The punchcard option is about 4x faster. Defaults to gradient.
compress
- Boolean that controls string compression. If True and the string length is > 28, the word is shortened by removing letters according to their ordering in the mapping dict. If False, only the first 28 characters of the word are mapped. Defaults to False.
separator
- Word separator for the input string. Defaults to whitespace.
n
- Dimension of the square image to return (n x n). If the number of words is < n x n, the extra space is zero-padded. Default behavior is to take the maximum n where n x n < number of words.
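The compress option can be read in more than one way, so the following is only a sketch of one possible interpretation, not grampyx's implementation. It assumes that letters are dropped one at a time, in a given removal order, with every occurrence of a letter removed at once. The function name fit_word and the removal_order argument are hypothetical.

```python
MAX_LEN = 28  # maximum number of characters mapped per word


def fit_word(word, removal_order, compress=False, max_len=MAX_LEN):
    """Shorten a word to at most max_len characters.

    compress=False: keep only the first max_len characters.
    compress=True: remove letters following removal_order (an assumed
    stand-in for the ordering of the mapping dict) until the word fits.
    """
    if len(word) <= max_len:
        return word
    if not compress:
        return word[:max_len]
    shortened = word
    for letter in removal_order:
        shortened = shortened.replace(letter, "")
        if len(shortened) <= max_len:
            return shortened
    # Fall back to truncation if removing every listed letter was not enough.
    return shortened[:max_len]


long_word = "a" * 20 + "b" * 10  # 30 characters
print(len(fit_word(long_word, "ba")))                 # 28
print(len(fit_word(long_word, "ba", compress=True)))  # 20
```

Removing all occurrences of a letter can shorten the word more than strictly necessary; a gentler variant would remove one occurrence at a time.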
pix2grams
mapping
- Defines the mapping from image to text, same as for grams2pix.
separator
- Word separator for the output string. Defaults to whitespace.
Limitations
Images where all pixel values are < 1, or all are > 1, are mapped to the empty string. Sparse images produce more intelligible text, but any image not encoded with grampyx, or a grampyx-encoded image decoded with the wrong mapping option, will usually produce gibberish.
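For the random-noise example, this limitation explains why the noise amplitude must exceed 1. A uniform sample x in [0, 1) scaled by an amplitude a reaches 1 only when x >= 1/a, which happens with probability 1 - 1/a. The sketch below evaluates that expression; the function name is made up for illustration, and how grampyx treats such pixels internally is not assumed here.

```python
def fraction_at_or_above_one(amplitude):
    """Expected share of pixels >= 1 for uniform [0, 1) noise times amplitude."""
    if amplitude <= 1:
        return 0.0  # no pixel can reach 1
    return 1.0 - 1.0 / amplitude


for a in (1.0, 1.01, 2.0):
    print(a, round(fraction_at_or_above_one(a), 4))
# 1.0 0.0
# 1.01 0.0099
# 2.0 0.5
```

With an amplitude of 1.01, only about 1% of pixels reach 1, so the noise image in the example is sparse in that sense.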