gramforge

Gramforge (formerly unigram) is a library for random, depth-first generation from context-sensitive grammars (context-free grammars are also supported), intended for synthetic data creation.

A distinctive feature is the ability to generate in several languages in parallel (for example, TPTP and pseudo-English).
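To make the idea of parallel generation concrete, here is a minimal toy sketch in plain Python. It does not use gramforge and is not its implementation: the `RULES` table, the `expand` function, and the `{0}`/`{1}` placeholder syntax are all hypothetical, chosen only to show how one depth-first derivation can be rendered in two languages at once.

```python
import random

# Toy rule table (hypothetical, NOT gramforge's API): each symbol maps to a
# list of (child symbols, one template per output language).
RULES = {
    "fact": [(("person", "adj"), {"tptp": "{1}({0})", "eng": "{0} is {1}"})],
    "person": [((), {"tptp": "mary", "eng": "mary"}),
               ((), {"tptp": "paul", "eng": "paul"})],
    "adj": [((), {"tptp": "rich", "eng": "rich"}),
            ((), {"tptp": "~quiet", "eng": "not quiet"})],
}

def expand(symbol, rng):
    """Expand a symbol depth-first and return one rendering per language."""
    children, templates = rng.choice(RULES[symbol])
    # Children are fully expanded before the parent template is filled in.
    rendered = [expand(child, rng) for child in children]
    return {
        lang: template.format(*(r[lang] for r in rendered))
        for lang, template in templates.items()
    }

sample = expand("fact", random.Random(0))
print(sample["eng"])
print(sample["tptp"])
```

Because both renderings come from the same derivation, the English sentence and the TPTP formula always describe the same fact.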

Example with LogicNLI grammar:

pip install gramforge

from gramforge import init_grammar, generate
def LogicNLI():
    ADJECTIVES = ['rich', 'quiet', 'old', 'tall', 'kind', 'brave', 'wise',
                  'happy', 'strong', 'curious', 'patient', 'funny', 'generous', 'humble']
    # (We selected adjectives with no clear semantic interference)
    NAMES = ['mary', 'paul', 'fred', 'alice', 'john', 'susan', 'lucy']

    R = init_grammar(['tptp','eng'])
    R('start(' + ','.join(['rule']*16) + ',' + ','.join(['fact']*8) + ')',
      '&\n'.join([f'({i})' for i in range(24)]),
      '\n'.join([f'{i}' for i in range(24)]))

    R('hypothesis(person,a)', '1(0)', '0 is 1')
    for a in ADJECTIVES:
        R('adj', a)
        R('adj', f'~{a}', f'not {a}', weight=0.2)

    R('property(adj,adj)', '(0(?)&1(?))', 'both 0 and 1')
    R('property(adj,adj)', '(0(?)|1(?))', '0 or 1')
    R('property(adj,adj)', '(0(?)<~>1(?))', 'either 0 or 1', weight=0.5)
    R('property(adj)', '0(?)', '0')

    R('rule(property,property)', '![X]:(0[?←X]=>1[?←X])',
      'everyone who is 0 is 1')
    R('rule(property,property)', '![X]:(0[?←X]<=>1[?←X])',
      'everyone who is 0 is 1 and vice versa')

    for p in NAMES:
        R('person', p)

    R('fact(person,property)', '1[?←0]', '0 is 1')
    R('fact(property)', '?[X]:(0[?←X])', 'someone is 0', weight=0.2)
    R('rule(fact,fact)', '(0)=>(1)', 'if 0 then 1')
    R('rule(fact,fact)', '(0)<=>(1)', 'if 0 then 1 and vice versa')
    return R


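# Language keys, matching the list passed to init_grammar
eng, tptp = "eng", "tptp"
grammar = LogicNLI()
x = generate(grammar)
print(x @ eng)   # pseudo-English rendering
print(x @ tptp)  # TPTP rendering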
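Several rules in the example pass a `weight` argument (for instance `weight=0.2` for negated adjectives). The source does not spell out its semantics. The sketch below assumes that weights are relative sampling weights and that rules without an explicit weight default to 1. Both points are assumptions, and `pick_rule` is a hypothetical helper, not part of gramforge.

```python
import random

def pick_rule(rules, rng):
    """Pick one (name, weight) rule with probability proportional to its weight."""
    names = [name for name, _ in rules]
    weights = [weight for _, weight in rules]
    return rng.choices(names, weights=weights, k=1)[0]

# Assumed default weight 1.0 for the plain adjective, 0.2 for its negation.
adj_rules = [("rich", 1.0), ("~rich", 0.2)]
rng = random.Random(42)
draws = [pick_rule(adj_rules, rng) for _ in range(10_000)]
negated_share = draws.count("~rich") / len(draws)
print(f"share of negated adjectives: {negated_share:.3f}")
```

Under these assumptions the negated form would be drawn about 0.2 / 1.2 of the time, roughly one draw in six.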

Pre-loaded grammars

We feature pre-written grammars including:

  • tinypy_grammar, reproducing tinypy, a synthetic toy grammar of Python for LLM training/evaluation
  • FOL_grammar, a sophisticated controlled grammar for first-order logic aligned with simplified English
  • arith_grammar, a simple grammar for arithmetic
  • regex_grammar, a grammar generating regular expressions

Example:

from gramforge.grammars import FOL_grammar, tinypy_grammar
from gramforge import generate
g = tinypy_grammar()
x = generate(g)
print(x @ 'py')

Migration from unigram

If you are upgrading from the unigram package, simply replace your imports:

# Before
from unigram import init_grammar, generate
# After
from gramforge import init_grammar, generate

The unigram package (pip install unigram) will continue to work: it re-exports everything from gramforge and emits a deprecation warning.
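During a migration it can be useful to check which of the two packages is installed in a given environment. The following sketch uses only the standard library; `available_package` is a hypothetical helper introduced here for illustration, not something gramforge provides.

```python
import importlib.util

def available_package(candidates=("gramforge", "unigram")):
    """Return the first importable top-level package name, or None."""
    for name in candidates:
        # find_spec returns None when a top-level module cannot be found.
        if importlib.util.find_spec(name) is not None:
            return name
    return None

print(available_package())
```

Listing gramforge first means the new name is preferred whenever both packages are present.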

Citation for the gramforge framework:

@inproceedings{sileo-2024-scaling,
    title = "Scaling Synthetic Logical Reasoning Datasets with Context-Sensitive Declarative Grammars",
    author = "Sileo, Damien",
    booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2024",
    address = "Miami, Florida, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.emnlp-main.301/",
    doi = "10.18653/v1/2024.emnlp-main.301",
    pages = "5275--5283",
}
