GenLM Grammar

A Python library for working with weighted context-free grammars (WCFGs), weighted finite-state automata (WFSAs), and weighted finite-state transducers (WFSTs). It provides efficient implementations of grammar transformations, parsing algorithms, and grammar-based language models.

Key Features

Grammar Operations

  • Support for weighted context-free grammars with various semirings (Boolean, Float, Real, MaxPlus, MaxTimes, etc.)
  • Grammar transformations:
    • Local normalization
    • Removal of nullary rules and unary cycles
    • Grammar binarization
    • Length truncation
    • Renaming/renumbering of nonterminals
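As a rough illustration of one transformation listed above, the sketch below rescales rule weights so that the rules sharing a left-hand side sum to one. This is only the simplest reading of "local normalization"; the library's own transformation may do more (for example, preserve the distribution over strings). The `(weight, head, body)` tuple layout and the function name are assumptions made for this example, not the library's API.

```python
from collections import defaultdict

# Hypothetical rule layout for illustration: (weight, head, body).
def locally_normalize(rules):
    """Rescale weights so that rules with the same head sum to 1."""
    totals = defaultdict(float)
    for weight, head, _body in rules:
        totals[head] += weight
    return [(weight / totals[head], head, body) for weight, head, body in rules]

rules = [
    (3.0, "S", ("S", "S")),
    (1.0, "S", ("a",)),
    (2.0, "A", ("b",)),
]
normalized = locally_normalize(rules)
for weight, head, body in normalized:
    print(f"{weight:.2f}: {head} -> {' '.join(body)}")
```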

Parsing Algorithms

  • Earley parsing (O(n³|G|) time, where n is the input length and |G| is the grammar size)
    • Standard implementation
    • Rescaled version for numerical stability
  • CKY parsing
    • Incremental CKY with chart caching
    • Support for prefix computations
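To make the chart-based idea concrete, here is a minimal weighted CKY sketch for a grammar in Chomsky normal form, using ordinary sum and product as the weight operations. It is written from scratch for illustration and does not use the library's classes; the `unary`/`binary` grammar encoding and the function name are assumptions. It has no chart caching or prefix support.

```python
from collections import defaultdict

def cky_weight(tokens, unary, binary, start="S"):
    """Total weight of all derivations of `tokens` from `start`.

    unary:  dict mapping a terminal to a list of (head, weight) pairs
    binary: list of (head, left, right, weight) rules
    """
    n = len(tokens)
    # chart[i, k, X] = total weight of X deriving tokens[i:k]
    chart = defaultdict(float)
    for i, tok in enumerate(tokens):
        for head, w in unary.get(tok, []):
            chart[i, i + 1, head] += w
    # Fill wider spans from narrower ones: O(n^3 * |binary|) time.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            k = i + width
            for j in range(i + 1, k):
                for head, left, right, w in binary:
                    chart[i, k, head] += w * chart[i, j, left] * chart[j, k, right]
    return chart[0, n, start]

# Toy grammar: S -> S S (0.4), S -> a (0.6)
unary = {"a": [("S", 0.6)]}
binary = [("S", "S", "S", 0.4)]
print(cky_weight(["a", "a", "a"], unary, binary))
```

The string "a a a" has two binary trees, each using two binary rules and three terminal rules, so the result is 2 × 0.4² × 0.6³.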

Language Model Interface

  • BoolCFGLM: Boolean-weighted CFG language model
  • CKYLM: Probabilistic CFG language model using CKY
  • EarleyLM: Language model using Earley parsing
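The connection between prefix computations and a language model can be sketched as follows: the probability of the next token is the prefix weight of the extended prefix divided by the prefix weight of the current prefix. The example below uses an explicit finite distribution over strings as a stand-in for a grammar, so prefix weights can be computed by enumeration; a parser-based model would obtain them from its chart instead. All names here (`prefix_weight`, `next_token_distribution`, `toy_dist`, the `<EOS>` marker) are hypothetical and are not the interface of the classes listed above.

```python
def prefix_weight(dist, prefix):
    """Total probability of all strings in `dist` that start with `prefix`."""
    return sum(p for s, p in dist.items() if s[:len(prefix)] == prefix)

def next_token_distribution(dist, prefix, vocab, eos="<EOS>"):
    """Conditional distribution over the next token given `prefix`."""
    z = prefix_weight(dist, prefix)
    if z == 0:
        raise ValueError("prefix has zero probability")
    out = {tok: prefix_weight(dist, prefix + (tok,)) / z for tok in vocab}
    # Ending here means the prefix itself is a complete string.
    out[eos] = dist.get(prefix, 0.0) / z
    return out

# Toy distribution over token tuples (stand-in for a weighted grammar).
toy_dist = {("a",): 0.5, ("a", "b"): 0.25, ("b",): 0.25}
print(next_token_distribution(toy_dist, ("a",), ["a", "b"]))
```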

Finite State Automata

  • Weighted FSA implementation
  • Operations:
    • Epsilon removal
    • Minimization (Brzozowski's algorithm)
    • Determinization
    • Composition
    • Reversal
    • Kleene star/plus
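For readers unfamiliar with Brzozowski's algorithm, the sketch below shows the unweighted version: reverse, determinize, reverse, determinize. It combines two of the operations listed above (reversal and determinization). The automaton encoding as a `(starts, finals, arcs)` triple and all function names are assumptions for this example; the library's weighted implementation is not shown here and may differ.

```python
def reverse(nfa):
    """Swap start and final states and flip every arc."""
    starts, finals, arcs = nfa
    return (set(finals), set(starts), {(q, a, p) for (p, a, q) in arcs})

def determinize(nfa):
    """Subset construction, keeping only reachable non-empty subsets."""
    starts, finals, arcs = nfa
    finals = set(finals)
    start = frozenset(starts)
    seen, stack, darcs = {start}, [start], set()
    while stack:
        subset = stack.pop()
        targets = {}
        for p, a, q in arcs:
            if p in subset:
                targets.setdefault(a, set()).add(q)
        for a, target in targets.items():
            target = frozenset(target)
            darcs.add((subset, a, target))
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return ({start}, {s for s in seen if s & finals}, darcs)

def brzozowski(nfa):
    return determinize(reverse(determinize(reverse(nfa))))

def states(nfa):
    starts, finals, arcs = nfa
    return set(starts) | set(finals) | {p for p, _, _ in arcs} | {q for _, _, q in arcs}

def accepts(nfa, word):
    starts, finals, arcs = nfa
    current = set(starts)
    for symbol in word:
        current = {q for (p, a, q) in arcs if p in current and a == symbol}
    return bool(current & set(finals))

# A 3-state DFA for "strings over {a, b} ending in b"; states 1 and 2 are equivalent.
redundant = (
    {0},
    {1, 2},
    {(0, "a", 0), (0, "b", 1), (1, "a", 0), (1, "b", 2), (2, "a", 0), (2, "b", 2)},
)
minimal = brzozowski(redundant)
print(len(states(redundant)), "->", len(states(minimal)))
```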

Additional Features

  • Semiring abstractions (Boolean, Float, Log, Entropy, etc.)
  • Efficient chart and agenda-based algorithms
  • Grammar-FST composition
  • Visualization support via Graphviz
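The role of a semiring abstraction can be sketched in a few lines: the same "sum over derivations of the product of rule weights" computation answers different questions depending on which plus and times operations are plugged in. The `Semiring` dataclass and the three instances below are toy stand-ins written for this example; they share names with semirings mentioned above but are not the library's classes.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass(frozen=True)
class Semiring:
    zero: Any            # identity for plus
    one: Any             # identity for times
    plus: Callable       # combines alternative derivations
    times: Callable      # combines weights within one derivation

Boolean = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)
Real = Semiring(0.0, 1.0, lambda a, b: a + b, lambda a, b: a * b)
MaxTimes = Semiring(0.0, 1.0, max, lambda a, b: a * b)

def total_weight(semiring, derivations):
    """Semiring sum over derivations of the semiring product of their weights."""
    total = semiring.zero
    for weights in derivations:
        w = semiring.one
        for x in weights:
            w = semiring.times(w, x)
        total = semiring.plus(total, w)
    return total

# Two derivations of the same string, each a list of rule weights.
derivations = [[0.5, 0.5], [0.5, 0.25]]
print(total_weight(Real, derivations))      # total probability mass
print(total_weight(MaxTimes, derivations))  # weight of the best derivation
print(total_weight(Boolean, [[True, True], [True, False]]))  # is any derivation valid?
```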

Quick Start

Installation

Clone the repository:

git clone git@github.com:chisym/genlm-grammar.git
cd genlm-grammar

and install with pip:

pip install .

This installs the package without development dependencies. For development, install in editable mode with:

pip install -e ".[test,docs]"

which also installs the dependencies needed for testing (test) and documentation (docs).

Requirements

  • Python >= 3.10
  • The core dependencies listed in the setup.py file of the repository.

Testing

When test dependencies are installed, the test suite can be run via:

pytest tests

Documentation

Documentation is generated using MkDocs and hosted on GitHub Pages. To build the documentation, run:

mkdocs build

To serve the documentation locally, run:

mkdocs serve

