A package for generating multilingual symbolic GSM math problems

multilingual-gsm-symbolic

A Python package for generating synthetic multilingual math problems from symbolic templates. It lets you create more than a thousand examples from a single problem, so you can test whether an LLM actually understands the problem or was just lucky at pattern-matching.

[Figure: example of a symbolic template and generated questions]

⏳ Installation

pip install multilingual-gsm-symbolic

👩‍💻 Get started

from multilingual_gsm_symbolic import load_data, available_languages

# see possible languages
print(available_languages())
# {'eng': {'number of samples': 100}, 'dan': {'number of samples': 100}}

# Load English templates
templates = load_data("eng")

# Generate concrete questions from a template
questions = templates[0].generate_questions(n=10)

for q in questions:
    print(q.question)
    print(q.answer)
    print()

Running experiments

You will often want to study a specific variation of the dataset, for example whether the performance degradation is caused by the changed names alone:

# We can also control the synthetic generation: 
# fix numeric variables and only vary names/strings
defaults = templates[0].get_default_assignments()
number_vars = {var: val for var, val in defaults.items() if not isinstance(val, str)}
questions = templates[0].generate_questions(n=5, fixed=number_vars, verbose=False)

Similar ablations are possible, for example adding spelling errors or wrapping the question in irrelevant text such as "Hey just a small math question: {question}".
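The two ablations mentioned above can be sketched as plain string transformations applied to a generated question. The helper names `add_distractor` and `add_typos` are hypothetical and not part of the package; the typo model (swapping adjacent letters) is just one simple assumption about what a spelling error looks like.

```python
import random


def add_distractor(question: str) -> str:
    """Wrap the question in irrelevant conversational text."""
    return f"Hey just a small math question: {question}"


def add_typos(question: str, rate: float = 0.05, seed: int = 0) -> str:
    """Swap adjacent letters at random to simulate spelling errors.

    Only pairs of letters are swapped, so digits, spaces and punctuation
    stay where they are and the numeric content is preserved.
    """
    rng = random.Random(seed)
    chars = list(question)
    i = 0
    while i < len(chars) - 1:
        if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2  # skip the swapped pair so a letter moves at most once
        else:
            i += 1
    return "".join(chars)


question = "A car travels at 60 mph for 3 hours. How far does it travel?"
print(add_distractor(question))
print(add_typos(question, rate=0.2, seed=1))
```

Because the perturbation is seeded, the same perturbed question can be regenerated later, which makes it possible to compare models on identical inputs.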

📋 Template format

Templates are JSON files with four fields:

| Field | Description |
| --- | --- |
| question | Concrete question (the original example) |
| answer | Concrete answer with calculation steps |
| question_annotated | Template with variable placeholders and #init / #conditions / #answer sections |
| answer_annotated | Answer template with inline expressions |

Annotated question syntax

{variable, default_value}   — placeholder in the question text
#init:
- $var = range(low, high)   — variable sampled from a range
- $var = sample([a, b, c])  — variable sampled from a list
#conditions:
- is_int(x / y)             — constraint that must hold for a combination to be valid
#answer: x * y + z          — Python expression evaluated to produce the numeric answer
Example: fog bank problem
{
  "question": "A fog bank rolls in over a city at 3 miles/hour. The city is 42 miles wide. How many hours will it take for the fog bank to cover the city?",
  "question_annotated": "A fog bank rolls in over a city at {speed,3} miles/hour. The city is {width,42} miles wide. How many hours will it take for the fog bank to cover the city?\n#init:\n- $speed = range(1, 20)\n- $width = range(2, 100)\n#conditions:\n- is_int(width / speed)\n#answer: width // speed",
  "answer": "At 3 miles/hour, it will take 42/3=14 hours for the fog to cover the city.",
  "answer_annotated": "At {speed} miles/hour, it will take {width}/{speed}={width//speed} hours for the fog to cover the city."
}
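To see how the #init ranges and the #conditions line interact, the following sketch enumerates the valid assignments for the fog bank template in plain Python. It does not use the package; the ranges and the divisibility check are copied by hand from the template above, and `is_valid` is a hypothetical stand-in for the condition.

```python
from itertools import product

# Value ranges copied from the #init section of the fog bank template.
init = {"speed": range(1, 20), "width": range(2, 100)}


def is_valid(speed: int, width: int) -> bool:
    # Mirrors the #conditions line: is_int(width / speed)
    return width % speed == 0


# Keep only the combinations that satisfy the condition.
valid = [
    {"speed": s, "width": w}
    for s, w in product(init["speed"], init["width"])
    if is_valid(s, w)
]

print(len(valid), "valid combinations")
for assignment in valid[:3]:
    # Mirrors the #answer line: width // speed
    print(assignment, "->", assignment["width"] // assignment["speed"])
```

The default assignment (speed 3, width 42) is one of the valid combinations, while a pair such as speed 5 and width 42 is rejected because the answer would not be a whole number of hours.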
Example: shopping problem
{
  "question": "A store sells apples for $2 each and oranges for $3 each. If you buy 4 apples and 5 oranges, how much do you spend?",
  "question_annotated": "A store sells apples for ${apple_price,2} each and oranges for ${orange_price,3} each. If you buy {n_apples,4} apples and {n_oranges,5} oranges, how much do you spend?\n#init:\n- $apple_price = range(1, 10)\n- $orange_price = range(1, 10)\n- $n_apples = range(1, 20)\n- $n_oranges = range(1, 20)\n#conditions:\n- True\n#answer: apple_price * n_apples + orange_price * n_oranges",
  "answer": "You spend 4*2 + 5*3 = 8 + 15 = $23.",
  "answer_annotated": "You spend {n_apples}*{apple_price} + {n_oranges}*{orange_price} = {n_apples*apple_price} + {n_oranges*orange_price} = ${apple_price*n_apples + orange_price*n_oranges}."
}
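The inline expressions in answer_annotated can be understood as small Python expressions evaluated against the variable assignment. Below is a minimal sketch of that idea, assuming every `{...}` is a valid Python expression over the variables; the `render` helper is hypothetical and is not the package's implementation (use AnnotatedQuestion.format_answer for real templates).

```python
import re


def render(template: str, values: dict) -> str:
    """Replace every {expression} with its value under `values`."""

    def evaluate(match: re.Match) -> str:
        # eval is tolerable here only because templates are trusted input;
        # builtins are removed so only the template variables are visible.
        return str(eval(match.group(1), {"__builtins__": {}}, dict(values)))

    return re.sub(r"\{([^{}]+)\}", evaluate, template)


# answer_annotated from the shopping example above.
answer_annotated = (
    "You spend {n_apples}*{apple_price} + {n_oranges}*{orange_price} = "
    "{n_apples*apple_price} + {n_oranges*orange_price} = "
    "${apple_price*n_apples + orange_price*n_oranges}."
)
defaults = {"apple_price": 2, "orange_price": 3, "n_apples": 4, "n_oranges": 5}

rendered = render(answer_annotated, defaults)
print(rendered)  # You spend 4*2 + 5*3 = 8 + 15 = $23.
```

With the default values this reproduces the concrete answer field of the example, which is a useful sanity check when writing a new template.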
Writing a custom template

Here is a complete example — a "speed × time = distance" problem with randomised values and a divisibility constraint:

{
  "question": "A car travels at 60 mph for 3 hours. How far does it travel?",
  "answer": "Distance = speed × time = 60 × 3 = 180 miles.\n#### 180",
  "id_orig": 0,
  "id_shuffled": 0,
  "question_annotated": "A car travels at {speed,60} mph for {hours,3} hours. How far does it travel?\n#init:\n- $speed = range(20, 100, 10)\n- $hours = range(1, 9)\n#conditions:\n- is_int(speed * hours / 10)\n#answer: speed * hours",
  "answer_annotated": "Distance = speed × time = {speed} × {hours} = {speed * hours} miles.\n#### {speed * hours}"
}

Save it as a .json file and load it directly:

from multilingual_gsm_symbolic.gsm_parser import AnnotatedQuestion

template = AnnotatedQuestion.from_json("my_template.json")
questions = template.generate_questions(n=5)
for q in questions:
    print(q.question)
    print(q.answer)
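Before loading a hand-written template, it can help to check that the file is structurally complete. The sketch below uses only the standard library and checks for the four fields and three sections described in the template format; `check_template` is a hypothetical helper, not a function provided by the package, and it does not validate the expressions themselves.

```python
import json
import tempfile
from pathlib import Path

REQUIRED_FIELDS = ("question", "answer", "question_annotated", "answer_annotated")
REQUIRED_SECTIONS = ("#init:", "#conditions:", "#answer:")


def check_template(path: Path) -> list[str]:
    """Return a list of structural problems in a template file (empty if none)."""
    data = json.loads(path.read_text(encoding="utf-8"))
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if f not in data]
    annotated = data.get("question_annotated", "")
    problems += [f"missing section: {s}" for s in REQUIRED_SECTIONS if s not in annotated]
    return problems


# The custom template from above, written to a temporary file for the demo.
template = {
    "question": "A car travels at 60 mph for 3 hours. How far does it travel?",
    "answer": "Distance = speed × time = 60 × 3 = 180 miles.\n#### 180",
    "id_orig": 0,
    "id_shuffled": 0,
    "question_annotated": (
        "A car travels at {speed,60} mph for {hours,3} hours. How far does it travel?\n"
        "#init:\n- $speed = range(20, 100, 10)\n- $hours = range(1, 9)\n"
        "#conditions:\n- is_int(speed * hours / 10)\n#answer: speed * hours"
    ),
    "answer_annotated": (
        "Distance = speed × time = {speed} × {hours} = {speed * hours} miles.\n"
        "#### {speed * hours}"
    ),
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "my_template.json"
    path.write_text(json.dumps(template, ensure_ascii=False), encoding="utf-8")
    problems = check_template(path)

print(problems or "template looks structurally complete")
```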

Init functions available in #init lines:

| Function | Returns |
| --- | --- |
| range(start, end[, step]) | integers in [start, end) |
| arange(start, end[, step]) | evenly-spaced floats |
| sample(items[, n]) | one item (or n items) from a list |
| sample_sequential(items, n) | n consecutive items from a list |
| range_str(start, end, step, word_list) | (word, int) pairs, e.g. ("three", 3) |
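As an illustration of two rows in the table, here is a plain-Python reading of the half-open range and of "n consecutive items". This is a sketch based only on the descriptions above, not the package's code; `sample_sequential_sketch` is a hypothetical name, and the real function may pick its window differently.

```python
import random

# range(start, end, step) is half-open: the end value itself is excluded.
speeds = list(range(20, 100, 10))
print(speeds)  # [20, 30, 40, 50, 60, 70, 80, 90]


def sample_sequential_sketch(items, n, rng=None):
    """Pick a random window of n consecutive items from a list."""
    rng = rng or random.Random()
    start = rng.randrange(len(items) - n + 1)  # last valid start keeps the window in bounds
    return items[start:start + n]


days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
window = sample_sequential_sketch(days, 3, random.Random(0))
print(window)
```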

Condition functions available in #conditions lines:

| Function | Returns |
| --- | --- |
| is_int(x) | True if x is a whole number |
| divides(a, b) | True if a % b == 0 |
| Fraction(x) | fraction string, e.g. "3/4" |
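The first two conditions can be read as the following plain-Python checks. This is a sketch of the semantics stated in the table, with hypothetical `_sketch` names to make clear that it is not the package's implementation.

```python
def is_int_sketch(x) -> bool:
    """True if x is a whole number, e.g. the float 14.0 or the int 14."""
    return float(x).is_integer()


def divides_sketch(a: int, b: int) -> bool:
    """True if a % b == 0, following the table's definition."""
    return a % b == 0


# The fog bank condition is_int(width / speed) with the default values:
print(is_int_sketch(42 / 3))   # True, since 42 / 3 == 14.0
print(is_int_sketch(42 / 5))   # False, since 42 / 5 == 8.4
print(divides_sketch(42, 3))   # True
```

Under this reading, `is_int(width / speed)` and `divides(width, speed)` express the same constraint for integer inputs, so either form can be used in a #conditions line.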

🗃️ Data

The English templates are derived from Apple's GSM-Symbolic paper. The Danish templates are manual translations and localizations of the English set, validated both computationally and manually. The original concrete problems are from GSM8k.

| Language | Code | Templates | Creation |
| --- | --- | --- | --- |
| English | eng | 100 | Derived from GSM8k |
| Danish | dan | 100 | Machine translated, human corrected and localized and validated both computationally and by humans |
| Norwegian Bokmål | nob | 100 | Machine translated and computationally validated |

Want to add a new language?

Want to add a new language or validate an existing one? Great to hear. The src/data/** folder contains the templates for each language, and scripts/translate_templates.py can be used to translate the templates from one language to another. We have already pre-generated translations for a few languages (see the data folder for which ones); if the translation you want to validate is not there yet, you can generate it with the script. Once you have validated the examples, you can submit a PR with the changes.

📖 API reference

function load_data

load_data(language="eng", directory=None) -> list[AnnotatedQuestion]

Load symbolic templates.

| Argument | Type | Description |
| --- | --- | --- |
| language | str | Language code, e.g. "eng" (default) or "dan" |
| directory | Path \| None | Override the bundled data; load templates from this path instead |
| RETURNS | list[AnnotatedQuestion] | The loaded templates |

function load_replacements

load_replacements(language="eng") -> dict

Load language-specific named values (e.g. lists of names, places) used inside templates.

| Argument | Type | Description |
| --- | --- | --- |
| language | str | Language code, e.g. "eng" (default) |
| RETURNS | dict | Mapping of replacement name → value list |

function load_gsm

load_gsm(language="eng", directory=None) -> list[GSMProblem]

Load the bundled concrete problems for a given language.

| Argument | Type | Description |
| --- | --- | --- |
| language | str | Language code, e.g. "eng" (default) |
| directory | Path \| None | Override the bundled data directory |
| RETURNS | list[GSMProblem] | The loaded concrete problems |

class AnnotatedQuestion

Core class representing a symbolic template. Constructed from a JSON template file via AnnotatedQuestion.from_json(path).

method AnnotatedQuestion.generate_questions

Generate concrete Question instances from the template.

| Argument | Type | Description |
| --- | --- | --- |
| n | int | Number of questions to generate |
| replacements | dict \| None | Replacement values; loaded automatically if omitted |
| seed | int \| None | Random seed for reproducibility |
| fixed | dict \| None | Variables to hold constant; only the remaining variables are sampled |
| RETURNS | list[Question] | The generated questions |

method AnnotatedQuestion.get_default_assignments

Extract the default variable values from the question template placeholders.

| Argument | Type | Description |
| --- | --- | --- |
| RETURNS | dict | Mapping of variable name → default value |

method AnnotatedQuestion.format_question

Render the question text for a given variable assignment.

| Argument | Type | Description |
| --- | --- | --- |
| assignments | dict | Variable name → value mapping |
| language | str | Language code for rendered text |
| RETURNS | str | The rendered question string |

method AnnotatedQuestion.format_answer

Render the answer text for a given variable assignment.

| Argument | Type | Description |
| --- | --- | --- |
| assignments | dict | Variable name → value mapping |
| language | str | Language code for rendered text |
| RETURNS | str | The rendered answer string |

class Question

Dataclass holding a single generated problem.

| Attribute | Type | Description |
| --- | --- | --- |
| question | str | The rendered question text |
| answer | str | The rendered answer text |
| id_orig | int | Index of the original template |
| id_shuffled | int | Index within the shuffled sample |

Acknowledgement

The symbolic template engine and the Danish subset were originally developed as part of the m-gsm-symbolic project at the Centre for Humanities Computing by:

The initial template format was derived from Apple's GSM-Symbolic paper and the original concrete problems are from GSM8k.

The code was refactored for optimizations and usability by Kenneth Enevoldsen, who is also the current maintainer.
