
A package for generating multilingual symbolic GSM math problems


multilingual-gsm-symbolic


A Python package for generating synthetic multilingual math problems from symbolic templates. A single template can yield more than a thousand concrete examples, which lets you test whether an LLM actually understands a problem or merely got lucky through pattern-matching.

Example of a symbolic template and generated questions

⏳ Installation

pip install multilingual-gsm-symbolic

👩‍💻 Get started

from multilingual_gsm_symbolic import load_data, available_languages

# see possible languages
print(available_languages())
# {'eng': {'number of samples': 100}, 'dan': {'number of samples': 100}}

# Load English templates
templates = load_data("eng")

# Generate concrete questions from a template
questions = templates[0].generate_questions(n=10, language="eng")

for q in questions:
    print(q.question)
    print(q.answer)
    print()
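Once questions are generated, a model's output has to be compared with the reference answer. The sketch below is one way to do that, assuming answers end with a GSM8k-style `#### <number>` line (the custom-template example in this document uses that convention; other templates may not). `extract_final_answer` and `accuracy` are hypothetical helpers written for illustration and are not part of the package.

```python
import re
from typing import Optional


def extract_final_answer(text: str) -> Optional[float]:
    """Return the number after the last '####' marker, or None if absent."""
    matches = re.findall(r"####\s*(-?[\d,]*\.?\d+)", text)
    if not matches:
        return None
    # Drop thousands separators such as "1,234" before converting.
    return float(matches[-1].replace(",", ""))


def accuracy(model_outputs: list, reference_answers: list) -> float:
    """Fraction of outputs whose final number equals the reference's."""
    pairs = list(zip(model_outputs, reference_answers))
    correct = sum(
        1
        for output, reference in pairs
        if extract_final_answer(output) is not None
        and extract_final_answer(output) == extract_final_answer(reference)
    )
    return correct / len(pairs) if pairs else 0.0


# Made-up model outputs scored against made-up references.
outputs = ["60 * 3 = 180\n#### 180", "Roughly 170 miles.\n#### 170"]
references = ["60 × 3 = 180 miles.\n#### 180", "60 × 3 = 180 miles.\n#### 180"]
print(accuracy(outputs, references))
```

Comparing only the final number keeps the score independent of how the model words its reasoning.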

📋 Template format

Templates are JSON files with four fields:

Field               Description
question            Concrete question (the original example)
answer              Concrete answer with calculation steps
question_annotated  Template with variable placeholders and #init / #conditions / #answer sections
answer_annotated    Answer template with inline expressions

Annotated question syntax

{variable, default_value}   — placeholder in the question text
#init:
- $var = range(low, high)   — variable sampled from a range
- $var = sample([a, b, c])  — variable sampled from a list
#conditions:
- is_int(x / y)             — constraint that must hold for a combination to be valid
#answer: x * y + z          — Python expression evaluated to produce the numeric answer
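
To make the section layout concrete, here is a minimal sketch that splits an annotated question into its text, #init, #conditions, and #answer parts. It uses only the standard library. `split_annotated` is a hypothetical helper and the template string is made up for illustration; neither is the package's own parser or data, and the sketch assumes each section marker starts its own line.

```python
def split_annotated(annotated: str) -> dict:
    """Split an annotated question into text, init, conditions and answer.

    Hypothetical helper for illustration; not part of the package API.
    """
    sections = {"text": [], "init": [], "conditions": [], "answer": ""}
    current = "text"
    for line in annotated.splitlines():
        stripped = line.strip()
        if stripped == "#init:":
            current = "init"
        elif stripped == "#conditions:":
            current = "conditions"
        elif stripped.startswith("#answer:"):
            sections["answer"] = stripped[len("#answer:"):].strip()
            current = "answer"
        elif current in ("init", "conditions") and stripped:
            # Entries look like "- $var = range(1, 20)"; drop the bullet.
            sections[current].append(stripped.removeprefix("- ").strip())
        elif current == "text":
            sections["text"].append(line)
    sections["text"] = "\n".join(sections["text"]).strip()
    return sections


# Made-up template that follows the syntax described above.
example = (
    "A box holds {n,4} pens and costs ${price,2}.\n"
    "#init:\n"
    "- $n = range(1, 10)\n"
    "- $price = sample([2, 3, 5])\n"
    "#conditions:\n"
    "- is_int(n / price)\n"
    "#answer: n * price"
)
parts = split_annotated(example)
print(parts["init"])
print(parts["conditions"])
print(parts["answer"])
```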

Example: fog bank problem

{
  "question": "A fog bank rolls in over a city at 3 miles/hour. The city is 42 miles wide. How many hours will it take for the fog bank to cover the city?",
  "question_annotated": "A fog bank rolls in over a city at {speed,3} miles/hour. The city is {width,42} miles wide. How many hours will it take for the fog bank to cover the city?\n#init:\n- $speed = range(1, 20)\n- $width = range(2, 100)\n#conditions:\n- is_int(width / speed)\n#answer: width // speed",
  "answer": "At 3 miles/hour, it will take 42/3=14 hours for the fog to cover the city.",
  "answer_annotated": "At {speed} miles/hour, it will take {width}/{speed}={width//speed} hours for the fog to cover the city."
}
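
The fog bank template can be turned into a concrete question by drawing values until the condition holds and then filling in the placeholders. The sketch below does this with rejection sampling. Whether the package samples this way internally is an assumption, and `sample_assignment` and `render` are hypothetical names, not library functions.

```python
import random
import re

QUESTION = (
    "A fog bank rolls in over a city at {speed,3} miles/hour. "
    "The city is {width,42} miles wide. How many hours will it take "
    "for the fog bank to cover the city?"
)


def sample_assignment(rng: random.Random) -> dict:
    """Draw speed and width until the #conditions line is satisfied."""
    while True:
        speed = rng.choice(range(1, 20))    # $speed = range(1, 20)
        width = rng.choice(range(2, 100))   # $width = range(2, 100)
        if width % speed == 0:              # is_int(width / speed)
            return {"speed": speed, "width": width}


def render(template: str, values: dict) -> str:
    """Replace each {name,default} placeholder with the sampled value."""
    return re.sub(
        r"\{(\w+),[^}]*\}",
        lambda match: str(values[match.group(1)]),
        template,
    )


rng = random.Random(0)  # fixed seed so the sketch is reproducible
values = sample_assignment(rng)
print(render(QUESTION, values))
print(values["width"] // values["speed"])  # #answer: width // speed
```

Rejection sampling is simple but wastes draws when the condition is rarely true; enumerating the valid combinations first is an alternative for tight constraints.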

Example: shopping problem

{
  "question": "A store sells apples for $2 each and oranges for $3 each. If you buy 4 apples and 5 oranges, how much do you spend?",
  "question_annotated": "A store sells apples for ${apple_price,2} each and oranges for ${orange_price,3} each. If you buy {n_apples,4} apples and {n_oranges,5} oranges, how much do you spend?\n#init:\n- $apple_price = range(1, 10)\n- $orange_price = range(1, 10)\n- $n_apples = range(1, 20)\n- $n_oranges = range(1, 20)\n#conditions:\n- True\n#answer: apple_price * n_apples + orange_price * n_oranges",
  "answer": "You spend 4*2 + 5*3 = 8 + 15 = $23.",
  "answer_annotated": "You spend {n_apples}*{apple_price} + {n_oranges}*{orange_price} = {n_apples*apple_price} + {n_oranges*orange_price} = ${apple_price*n_apples + orange_price*n_oranges}."
}
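
How many distinct questions a template can produce depends on its ranges and conditions. As a rough check, the sketch below counts the valid variable combinations for the two examples above, assuming `range(low, high)` is half-open like Python's built-in `range`.

```python
from itertools import product

# Fog bank: speed in range(1, 20), width in range(2, 100),
# keeping only pairs where width is divisible by speed.
fog = sum(
    1
    for speed, width in product(range(1, 20), range(2, 100))
    if width % speed == 0
)

# Shopping: two prices in range(1, 10), two counts in range(1, 20),
# and the only condition is True, so every combination is valid.
shopping = len(range(1, 10)) ** 2 * len(range(1, 20)) ** 2

print(fog)       # 343
print(shopping)  # 29241
```

The unconstrained shopping template yields tens of thousands of variants, while the divisibility condition narrows the fog bank template to a few hundred.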

🗃️ Data

The English templates are derived from Apple's GSM-Symbolic paper. The Danish templates are manual translations and localizations of the English set, validated both computationally and manually. The original concrete problems are from GSM8k.

Language  Code  Templates
English   eng   100
Danish    dan   100

Writing a custom template

Here is a complete example — a "speed × time = distance" problem with randomised values and a divisibility constraint:

{
  "question": "A car travels at 60 mph for 3 hours. How far does it travel?",
  "answer": "Distance = speed × time = 60 × 3 = 180 miles.\n#### 180",
  "id_orig": 0,
  "id_shuffled": 0,
  "question_annotated": "A car travels at {speed,60} mph for {hours,3} hours. How far does it travel?\n#init:\n- $speed = range(20, 100, 10)\n- $hours = range(1, 9)\n#conditions:\n- is_int(speed * hours / 10)\n#answer: speed * hours",
  "answer_annotated": "Distance = speed × time = {speed} × {hours} = {speed * hours} miles.\n#### {speed * hours}"
}

Save it as a .json file and load it directly:

from multilingual_gsm_symbolic.gsm_parser import AnnotatedQuestion

template = AnnotatedQuestion.from_json("my_template.json")
questions = template.generate_questions(n=5, language="eng")
for q in questions:
    print(q.question)
    print(q.answer)
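
To see what the inline expressions in answer_annotated amount to, the sketch below evaluates each `{...}` expression of the car template against a variable assignment. This is an illustration of the idea only: `render_answer` is a hypothetical helper, and evaluating with `eval` is an assumption about the approach, not a description of the package's internals.

```python
import re

ANSWER_ANNOTATED = (
    "Distance = speed × time = {speed} × {hours} = {speed * hours} miles.\n"
    "#### {speed * hours}"
)


def render_answer(template: str, values: dict) -> str:
    """Evaluate every {expression} in the template against the values."""

    def evaluate(match: re.Match) -> str:
        # eval is acceptable here only because the template is trusted input;
        # builtins are removed so expressions can use just the variables.
        return str(eval(match.group(1), {"__builtins__": {}}, dict(values)))

    return re.sub(r"\{([^{}]+)\}", evaluate, template)


# With the default values the rendered text matches the concrete "answer" field.
print(render_answer(ANSWER_ANNOTATED, {"speed": 60, "hours": 3}))
```

Rendering the template with its default values and comparing the result against the concrete answer is a cheap way to catch mistakes in a hand-written template.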

Init functions available in #init lines:

Function                                Returns
range(start, end[, step])               integers in [start, end)
arange(start, end[, step])              evenly-spaced floats
sample(items[, n])                      one item (or n items) from a list
sample_sequential(items, n)             n consecutive items from a list
range_str(start, end, step, word_list)  (word, int) pairs, e.g. ("three", 3)
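
The sketch below gives plain-Python stand-ins for three of these functions, written only from the one-line descriptions in the table. The `_sketch` names are hypothetical, and details such as whether `arange` excludes its end point are assumptions; the package's real implementations may differ.

```python
import random


def arange_sketch(start, end, step=1.0):
    """Evenly spaced floats in [start, end); assumes step > 0."""
    values = []
    i = 0
    while start + i * step < end:
        # Multiply instead of accumulating to limit floating-point drift.
        values.append(round(start + i * step, 10))
        i += 1
    return values


def sample_sketch(items, n=None, rng=random):
    """One item when n is omitted, otherwise n distinct items."""
    if n is None:
        return rng.choice(items)
    return rng.sample(items, n)


def sample_sequential_sketch(items, n, rng=random):
    """n consecutive items starting at a random position."""
    start = rng.randrange(len(items) - n + 1)
    return items[start:start + n]


rng = random.Random(1)  # fixed seed for reproducibility
days = ["Mon", "Tue", "Wed", "Thu", "Fri"]
print(list(range(20, 100, 10)))   # the built-in range with a step
print(arange_sketch(0, 1, 0.25))
print(sample_sketch(days, rng=rng))
print(sample_sequential_sketch(days, 2, rng=rng))
```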

Condition functions available in #conditions lines:

Function       Returns
is_int(x)      True if x is a whole number
divides(a, b)  True if a % b == 0
Fraction(x)    fraction string, e.g. "3/4"
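
As a quick illustration of these checks, the sketch below re-implements `is_int` and `divides` from their table descriptions and uses Python's standard `fractions.Fraction` for the fraction string. The `_sketch` functions are hypothetical stand-ins, and whether the package's `Fraction` wraps the standard-library class is an assumption.

```python
from fractions import Fraction


def is_int_sketch(x) -> bool:
    """True if x has no fractional part, e.g. 14.0 but not 10.5."""
    return float(x).is_integer()


def divides_sketch(a, b) -> bool:
    """True if a % b == 0, following the table's definition."""
    return a % b == 0


print(is_int_sketch(42 / 3))   # True
print(is_int_sketch(42 / 4))   # False
print(divides_sketch(42, 3))   # True
print(str(Fraction(3, 4)))     # 3/4
```

Note that `is_int(width / speed)` relies on float division; for very large values an integer check such as `width % speed == 0` avoids rounding issues.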

📖 API reference

function load_data

load_data(language="eng", directory=None) -> list[AnnotatedQuestion]

Load symbolic templates.

Argument   Type                     Description
language   str                      Language code, e.g. "eng" (default) or "dan"
directory  Path | None              Override the bundled data; load templates from this path instead
RETURNS    list[AnnotatedQuestion]  The loaded templates

function load_replacements

load_replacements(language="eng") -> dict

Load language-specific named values (e.g. lists of names, places) used inside templates.

Argument  Type  Description
language  str   Language code, e.g. "eng" (default)
RETURNS   dict  Mapping of replacement name → value list

function load_gsm

load_gsm(language="eng", directory=None) -> list[GSMProblem]

Load the bundled concrete problems for a given language.

Argument   Type              Description
language   str               Language code, e.g. "eng" (default)
directory  Path | None       Override the bundled data directory
RETURNS    list[GSMProblem]  The loaded concrete problems

class AnnotatedQuestion

Core class representing a symbolic template. Constructed from a JSON template file via AnnotatedQuestion.from_json(path).

method AnnotatedQuestion.generate_questions

Generate concrete Question instances from the template.

Argument      Type            Description
n             int             Number of questions to generate
language      str             Language code for rendered text
replacements  dict            Replacement values from load_replacements
RETURNS       list[Question]  The generated questions

method AnnotatedQuestion.get_default_assignments

Extract the example variable values from the template.

Argument      Type  Description
replacements  dict  Replacement values from load_replacements
RETURNS       dict  Mapping of variable name → default value
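
The default values live in the `{variable, default_value}` placeholders of the annotated question, so extracting them can be pictured as a regular-expression pass over the text. The sketch below shows that idea with a hypothetical `default_assignments_sketch` function; it ignores the `replacements` argument and is not the method's actual implementation.

```python
import re


def default_assignments_sketch(annotated: str) -> dict:
    """Collect {name,default} placeholders into a name → default mapping."""
    defaults = {}
    for name, raw in re.findall(r"\{(\w+),\s*([^{}]*)\}", annotated):
        raw = raw.strip()
        # Keep numbers numeric; anything else stays a string.
        try:
            defaults[name] = int(raw)
        except ValueError:
            try:
                defaults[name] = float(raw)
            except ValueError:
                defaults[name] = raw
    return defaults


fog_bank = (
    "A fog bank rolls in over a city at {speed,3} miles/hour. "
    "The city is {width,42} miles wide."
)
print(default_assignments_sketch(fog_bank))  # {'speed': 3, 'width': 42}
```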

method AnnotatedQuestion.format_question

Render the question text for a given variable assignment.

Argument     Type  Description
assignments  dict  Variable name → value mapping
language     str   Language code for rendered text
RETURNS      str   The rendered question string

method AnnotatedQuestion.format_answer

Render the answer text for a given variable assignment.

Argument     Type  Description
assignments  dict  Variable name → value mapping
language     str   Language code for rendered text
RETURNS      str   The rendered answer string

class Question

Dataclass holding a single generated problem.

Attribute    Type  Description
question     str   The rendered question text
answer       str   The rendered answer text
id_orig      int   Index of the original template
id_shuffled  int   Index within the shuffled sample
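
Because every generated question carries the index of its template, results can be grouped per template. The sketch below shows this with `QuestionSketch`, a hypothetical stand-in that mirrors the four documented attributes; it is not the package's `Question` class, and the sample data is made up.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class QuestionSketch:
    """Hypothetical stand-in mirroring the documented attributes."""

    question: str
    answer: str
    id_orig: int
    id_shuffled: int


generated = [
    QuestionSketch(
        question="A car travels at 60 mph for 3 hours. How far does it travel?",
        answer="Distance = speed × time = 60 × 3 = 180 miles.\n#### 180",
        id_orig=0,
        id_shuffled=0,
    ),
    QuestionSketch(
        question="A car travels at 40 mph for 2 hours. How far does it travel?",
        answer="Distance = speed × time = 40 × 2 = 80 miles.\n#### 80",
        id_orig=0,
        id_shuffled=0,
    ),
]

# Group variants by the template they came from.
by_template = defaultdict(list)
for q in generated:
    by_template[q.id_orig].append(q)

print({template_id: len(qs) for template_id, qs in by_template.items()})
```

Grouping by `id_orig` makes it possible to report how consistently a model answers variants of the same underlying problem.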

class GSMProblem

Pydantic model for a concrete problem loaded from disk.

Attribute  Type  Description
question   str   The question text
answer     str   The answer text
id_orig    int   Original problem index
filepath   Path  Path to the source file on disk

Acknowledgement

The symbolic template engine and the Danish subset were originally developed as part of the m-gsm-symbolic project at the Centre for Humanities Computing.

The initial template format was derived from Apple's GSM-Symbolic paper and the original concrete problems are from GSM8k.

