Library for procedurally generating text that resembles a particular language.

Reason this release was yanked:

Nonfunctional

Project description

ipsum

Ipsum is a Python library for the generation of international placeholder text.

Unlike most other generators, which work by scrambling a particular text (e.g. Lorem Ipsum generators built on Cicero's "De Finibus Bonorum et Malorum"), Ipsum uses Markov models to generate a vocabulary of meaningless new words that resemble the language it was trained on. The resulting text is typographically similar to the specified language (i.e. it uses the same alphabet and punctuation, in the same manner and at the same frequency) but semantically meaningless.
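To make the idea concrete, here is a minimal sketch of a character-level Markov model that invents new words from a small word list. It is not Ipsum's actual implementation: the order-2 character model, the function names `train` and `generate_word`, and the sample word list are all assumptions made for illustration.

```python
import random
from collections import defaultdict

START = "^"  # padding symbol marking the start of a word (illustrative choice)
END = "$"    # symbol marking the end of a word (illustrative choice)


def train(words, order=2):
    """Record which character follows each `order`-character context."""
    transitions = defaultdict(list)
    for word in words:
        padded = START * order + word.lower() + END
        for i in range(len(padded) - order):
            context = padded[i:i + order]
            transitions[context].append(padded[i + order])
    return transitions


def generate_word(transitions, rng, order=2, max_len=12):
    """Walk the chain from the start context until END or max_len letters."""
    context = START * order
    letters = []
    while len(letters) < max_len:
        # Sampling from the raw list reproduces the observed frequencies.
        next_char = rng.choice(transitions[context])
        if next_char == END:
            break
        letters.append(next_char)
        context = context[1:] + next_char
    return "".join(letters)


# Hypothetical training vocabulary; a real model would use a large corpus.
sample_words = ["banana", "bandana", "cabana", "cabin", "bandit"]
model = train(sample_words)
rng = random.Random(42)
print([generate_word(model, rng) for _ in range(5)])
```

Because every generated character is drawn from characters seen after the same context in training, the output uses only the alphabet of the training words and tends to follow their letter patterns, without being restricted to real words.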

You can read more about how Ipsum works here.

You can use Ipsum directly from your browser by accessing the web app at ipsum.trifunovski.me.

It currently supports the following languages:

  • Albanian
  • Bulgarian
  • Dutch
  • English
  • French
  • German
  • Greek
  • Italian
  • Macedonian
  • Serbian
  • Spanish
  • Swedish

Installing

Note that ipsum requires Python >= 3.8.1.

Run

pip install ipsum

to install the latest published version of the library, or clone the repo and use poetry

git clone git@github.com:dtrifuno/ipsum
cd ipsum/ipsum
poetry install

to install a development copy.

Usage

import ipsum

# Load the English language model
model = ipsum.load_model("en")

# Returns a list of 3 strings, each resembling a paragraph of English
paragraphs = model.generate_paragraphs(3)

# Returns a list of 10 strings, each resembling a full sentence of English
sentences = model.generate_sentences(10)

# Returns a list of 50 words (does not include any punctuation)
words = model.generate_words(50)
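Since `generate_paragraphs` returns a list of strings, the result usually needs to be joined before it is dropped into a mock-up. The sketch below wraps each paragraph to a fixed width using only the standard library. The helper name `format_placeholder` and the sample strings are hypothetical and are not part of the library; the sample list merely stands in for the output of `model.generate_paragraphs(2)`.

```python
import textwrap


def format_placeholder(paragraphs, width=72):
    """Wrap each paragraph to `width` columns and separate them with blank lines."""
    wrapped = [textwrap.fill(paragraph, width=width) for paragraph in paragraphs]
    return "\n\n".join(wrapped)


# Stand-in for the list of strings returned by model.generate_paragraphs(2).
sample_paragraphs = [
    "Thrandle corvish the mellow apten, sorving quilt and marrow under plenth.",
    "Osker brined the falden way. Murch of it stelled, though none could varn.",
]
print(format_placeholder(sample_paragraphs, width=40))
```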

Development

Typechecking, linting and testing

You can run

poetry run mypy src tests

to typecheck,

poetry run flake8

to lint, or

poetry run pytest --cov

to test the code.

Additional scripts

This repository contains several scripts that are useful in development, but are not included with the PyPI package. If you want to make a change to this library, please clone the repository instead. You can check out these scripts and what they do by running poetry run dev.

Adding a language

  1. Find out the two-letter ISO 639-1 code of the language you want to add (xx for the rest of this subsection). Add the full English name and ISO 639-1 code of the language to supported_languages.py.
  2. Prepare a corpus of texts in the language. The corpus should be packaged as a zip archive of .txt files.
  3. Write a parser for the language (look at src/ipsum/parse/en_parser.py for an example). Name the Parser instance xx_parser and save it as src/ipsum/parse/language/xx.py. Add the parser instance to load_parser in src/ipsum/parse/__init__.py.
  4. Run poetry run dev parser-diagnostics xx. Ideally, the parser should detect around 100,000 sentences and be able to parse more than 50–60% of them into skeletons.
  5. Run poetry run dev build_model xx && poetry run dev model_diagnostics xx.
  6. Inspect diagnostics/xx.png. If it looks good, congrats, you are done! Otherwise, return to Step 2 and try to figure out what went wrong.
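The corpus format from Step 2 (a zip archive of .txt files) can be checked before running the diagnostics. The following is a minimal sketch using only the standard library; the function name `read_corpus`, the in-memory demo archive, and the assumption that the texts are UTF-8 encoded are illustrative and are not taken from the repository's own scripts.

```python
import io
import zipfile


def read_corpus(archive):
    """Yield (name, text) for every .txt member of a zip archive.

    `archive` may be a path or a binary file-like object.
    Assumes the texts are UTF-8 encoded.
    """
    with zipfile.ZipFile(archive) as zf:
        for name in sorted(zf.namelist()):
            if name.lower().endswith(".txt"):
                yield name, zf.read(name).decode("utf-8")


def build_demo_archive():
    """Create a tiny in-memory corpus so the example runs without any files."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("a.txt", "First text.")
        zf.writestr("b.txt", "Втор текст.")
        zf.writestr("notes.md", "Not part of the corpus.")
    buffer.seek(0)
    return buffer


for name, text in read_corpus(build_demo_archive()):
    print(name, len(text))
```

Skipping non-.txt members keeps stray files such as notes or licenses out of the training data.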

Corpora

The models were trained on the following corpora:

License

MIT
