Generator of exercises for practicing speaking for language learning.

Project description

hmeg -- speaking and translation exercises generator

Help me, Erik Gunnemark -- a library for generating exercises to practice basic speaking constructs.

The idea is that mastering these building blocks helps with faster speaking and constructing more complex sentences.

Exercises are generated randomly, so they can sometimes be grammatically or semantically odd. As long as a sentence is not abusive and is grammatically correct, it is considered a valid exercise. The goal is to facilitate quickfire translation into Korean, where the element of surprise can aid memorization.

Usage

Command line

Edit hmeg.conf to select the grammatical topic and the number of exercises, then run:

python hmeg_cli.py

You can also pass command-line arguments to set the configuration file, the topic, and/or the number of exercises to generate.

  • Run with a custom configuration file (use the run subcommand):
python hmeg_cli.py run --config="custom/configuration/file.toml"
  • Run with a custom topic and number of exercises:
python hmeg_cli.py run -n 15 -t "Have, Don’t have, There is, There isn’t / 있어요, 없어요"
  • You can provide a partial topic name. All topics that contain the specified string will be used:
python hmeg_cli.py run -n 15 -t "있어요, 없어요"
python hmeg_cli.py run -n 15 -t "there is"
  • List available topics described in the specified configuration file:
python hmeg_cli.py list -c hmeg.conf
  • Print help:
python hmeg_cli.py --help
python hmeg_cli.py run --help
python hmeg_cli.py list --help
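
The partial topic matching described above can be pictured as a substring filter. The following is only a sketch: the function name match_topics is hypothetical, and case-insensitive comparison is an assumption inferred from the "there is" example, not a confirmed detail of hmeg's implementation.

```python
def match_topics(query, topic_names):
    """Return every topic whose name contains the query (case-insensitive)."""
    # Assumption: matching ignores case, as the "there is" example suggests.
    needle = query.casefold()
    return [name for name in topic_names if needle in name.casefold()]


# Topic names taken from the examples in this README.
topics = [
    "Have, Don’t have, There is, There isn’t / 있어요, 없어요",
    "While / -(으)면서",
]

print(match_topics("there is", topics))
print(match_topics("있어요, 없어요", topics))
```

Both queries select only the first topic, since the second topic name contains neither string.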

Configuration file

The configuration file uses the TOML format. Available fields:

  • topics_folder: Location of the folder containing descriptions of exercise topics. Example: "hmeg/topics"
  • vocab_file: Location of the vocabulary file used to generate exercises. Example: "hmeg/vocabs/minilex.toml"
  • topic: Name of the topic to generate exercises for. Can be partial (see CLI instructions above). Example: "Have, Don’t have, There is, There isn’t / 있어요, 없어요"
  • number_exercises: Number of generated exercises (5-100). Example: 15
  • grammar_correction: Optional. Defines the model used for grammar correction in generated exercises. Experimental. Example: "kenlm/en". Supported models:
      * "kenlm/en" -- KenLM-based model. Requires files en.arpa.bin, en.sp.model, en.sp.vocab in the lm folder.
      * "distilbert/distilgpt2" -- Distilled-GPT2 model from HuggingFace.

Example (hmeg.conf):

topics_folder="hmeg/topics"
vocab_file="hmeg/vocabs/minilex.toml"

topic="Have, Don’t have, There is, There isn’t / 있어요, 없어요"
number_exercises=15

grammar_correction="kenlm/en"
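
To check a configuration before running the generator, a small validator along the following lines may help. This is a sketch and not part of hmeg: the names validate_config and load_config are hypothetical, and treating topics_folder, vocab_file, and topic as required is an assumption. The only rules taken from this README are the 5-100 range and the two supported grammar correction models.

```python
def validate_config(cfg):
    """Check the fields described above; return a list of problems (empty if OK)."""
    problems = []
    # Assumption: these three string fields are required.
    for key in ("topics_folder", "vocab_file", "topic"):
        if not isinstance(cfg.get(key), str) or not cfg[key]:
            problems.append(f"{key} must be a non-empty string")
    n = cfg.get("number_exercises")
    if not isinstance(n, int) or isinstance(n, bool) or not 5 <= n <= 100:
        problems.append("number_exercises must be an integer from 5 to 100")
    model = cfg.get("grammar_correction")  # optional field
    if model is not None and model not in ("kenlm/en", "distilbert/distilgpt2"):
        problems.append(f"unsupported grammar_correction model: {model!r}")
    return problems


def load_config(path):
    """Parse a TOML file with the standard-library parser (Python 3.11+)."""
    import tomllib  # imported lazily so validate_config works on older Pythons

    with open(path, "rb") as fh:
        return tomllib.load(fh)


# Same values as the hmeg.conf example above, written as a dict.
example = {
    "topics_folder": "hmeg/topics",
    "vocab_file": "hmeg/vocabs/minilex.toml",
    "topic": "Have, Don’t have, There is, There isn’t / 있어요, 없어요",
    "number_exercises": 15,
    "grammar_correction": "kenlm/en",
}
print(validate_config(example))
```

On Python 3.11 or later, validate_config(load_config("hmeg.conf")) would apply the same checks to a file on disk.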

Python code

from hmeg import utils, ExerciseGenerator, load_minilex


num_exercises = 10  # number of randomly generated exercises for the selected topic

utils.register_grammar_topics()
vocab = load_minilex()  # load words from the Minilex.

exercises = ExerciseGenerator.generate_exercises(
    topic_name="While / -(으)면서", num=num_exercises, vocab=vocab
)
print("\n".join(exercises))

Format of exercises and vocabulary

The library supports extensible templates for exercise generation and customizable vocabulary.

Built-in exercise topics and vocabulary can be found in hmeg/topics/ and hmeg/vocabs/minilex.toml.

See the docs folder for details on the format for exercises and vocabulary.
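
As a rough illustration of the general idea (templates filled at random from a controlled vocabulary), here is a minimal sketch. The template syntax, the vocabulary layout, and the function name fill_template are all hypothetical and do not reflect hmeg's actual formats, which are described in the docs folder.

```python
import random
import string

# Hypothetical vocabulary grouped by slot name (not hmeg's real format).
VOCAB = {
    "noun": ["book", "friend", "car"],
    "place": ["school", "home"],
}

# Hypothetical template using str.format-style placeholders.
TEMPLATE = "There is a {noun} at {place}."


def fill_template(template, vocab, rng):
    """Replace each {placeholder} with a random word from the matching vocab list."""
    # string.Formatter().parse yields (literal, field_name, spec, conversion) tuples.
    fields = [name for _, name, _, _ in string.Formatter().parse(template) if name]
    choices = {name: rng.choice(vocab[name]) for name in fields}
    return template.format(**choices)


rng = random.Random(0)  # seeded so repeated runs give the same sentences
for _ in range(3):
    print(fill_template(TEMPLATE, VOCAB, rng))
```

Swapping VOCAB for a different word list changes the sentences without touching the template, which is the kind of separation the library's editable templates and replaceable dictionary aim for.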

Why I made this library

A few words about the name: Erik Gunnemark was a pre-internet hyperpolyglot who translated from more than 20 languages. He co-authored The Art and Science of Learning Languages. The book introduces the idea of a Minilex -- a few hundred core words that cover many situations.

I created this library to provide speaking drills focused on small, simple grammatical structures and a limited vocabulary. Compared to exercises generated by large language models, these exercises are simpler and rely on a controlled vocabulary that can be expanded. The templates are editable, and the dictionary can be swapped to suit different goals (e.g., Basic English or domain-specific vocabularies).

Lastly, the project name is a light Star Wars reference :)
