GenLM Grammar
A Python library for working with weighted context-free grammars (WCFGs), weighted finite-state automata (WFSAs), and weighted finite-state transducers (WFSTs). The library provides efficient implementations of grammar operations, parsing algorithms, and language model functionality.
Key Features
Grammar Operations
- Support for weighted context-free grammars with various semirings (Boolean, Float, Real, MaxPlus, MaxTimes, etc.)
- Grammar transformations:
  - Local normalization
  - Removal of nullary rules and unary cycles
  - Grammar binarization
  - Length truncation
  - Renaming/renumbering of nonterminals
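
To illustrate what "various semirings" means for a weighted grammar, here is a minimal, self-contained sketch. It does not use the genlm-grammar API: `Semiring`, `total`, `BOOLEAN`, `REAL`, and `MAX_TIMES` are hypothetical names chosen for this example. The idea is that the weight of a derivation is the semiring product of its rule weights, and the weight of a string is the semiring sum over its derivations.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    """Hypothetical semiring record; not the library's own class."""
    zero: Any                            # identity for plus
    one: Any                             # identity for times
    plus: Callable[[Any, Any], Any]      # combines alternative derivations
    times: Callable[[Any, Any], Any]     # combines rule weights within a derivation

    def total(self, derivations):
        """Semiring sum, over derivations, of the product of their rule weights."""
        result = self.zero
        for weights in derivations:
            result = self.plus(result, reduce(self.times, weights, self.one))
        return result


BOOLEAN = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)
REAL = Semiring(0.0, 1.0, lambda a, b: a + b, lambda a, b: a * b)
MAX_TIMES = Semiring(0.0, 1.0, max, lambda a, b: a * b)

# Two derivations of the same string, each given as a list of rule weights.
derivations = [[0.5, 0.5], [0.5, 0.25]]

real_total = REAL.total(derivations)   # 0.25 + 0.125 = 0.375
best = MAX_TIMES.total(derivations)    # max(0.25, 0.125) = 0.25
recognized = BOOLEAN.total([[w > 0 for w in d] for d in derivations])  # True
```

Swapping the semiring changes the question being answered (total weight, best derivation weight, or plain recognition) without changing the algorithm that enumerates derivations.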
Parsing Algorithms
- Earley parsing (O(n³|G|) complexity)
  - Standard implementation
  - Rescaled version for numerical stability
- CKY parsing
  - Incremental CKY with chart caching
  - Support for prefix computations
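
As a rough picture of what weighted CKY computes, the following standalone sketch sums the weights of all parses of a sentence under a grammar in Chomsky normal form. It is an assumption-laden illustration rather than the library's implementation: the function `cky_inside` and the dictionary-based grammar encoding are invented for this example, it uses ordinary real-valued weights, and it omits chart caching and prefix computations.

```python
from collections import defaultdict


def cky_inside(words, lexical, binary, start="S"):
    """Total weight of all parses of `words` under a CNF grammar.

    lexical: dict mapping (A, word) -> weight for rules A -> word
    binary:  dict mapping (A, B, C) -> weight for rules A -> B C
    (Hypothetical encoding, chosen only for this sketch.)
    """
    n = len(words)
    # chart[i, k, A] = total weight of A deriving words[i:k]
    chart = defaultdict(float)

    # Width-1 spans come from lexical rules.
    for i, word in enumerate(words):
        for (head, terminal), weight in lexical.items():
            if terminal == word:
                chart[i, i + 1, head] += weight

    # Wider spans combine two adjacent sub-spans with a binary rule.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            k = i + width
            for j in range(i + 1, k):
                for (head, left, right), weight in binary.items():
                    left_weight = chart.get((i, j, left), 0.0)
                    right_weight = chart.get((j, k, right), 0.0)
                    if left_weight and right_weight:
                        chart[i, k, head] += weight * left_weight * right_weight

    return chart.get((0, n, start), 0.0)


# Toy grammar: S -> S S (0.5) | a (0.5)
lexical = {("S", "a"): 0.5}
binary = {("S", "S", "S"): 0.5}

weight_aaa = cky_inside(["a", "a", "a"], lexical, binary)  # two trees, 0.5**5 each
```

The three nested loops over span boundaries give the cubic dependence on sentence length, and the innermost loop over rules gives the dependence on grammar size.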
Language Model Interface
- BoolCFGLM: Boolean-weighted CFG language model
- CKYLM: Probabilistic CFG language model using CKY
- EarleyLM: Language model using Earley parsing
Finite State Automata
- Weighted FSA implementation
- Operations:
  - Epsilon removal
  - Minimization (Brzozowski's algorithm)
  - Determinization
  - Composition
  - Reversal
  - Kleene star/plus
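
Brzozowski's algorithm minimizes an automaton by reversing and determinizing it twice. The sketch below shows the idea for an unweighted, epsilon-free automaton only. It is not the library's weighted implementation, and the tuple encoding `(states, alphabet, delta, starts, finals)` and the helper names are assumptions made for this example.

```python
def reverse(nfa):
    """Flip every transition and swap start and final states."""
    states, alphabet, delta, starts, finals = nfa
    rdelta = {}
    for (src, sym), dsts in delta.items():
        for dst in dsts:
            rdelta.setdefault((dst, sym), set()).add(src)
    return states, alphabet, rdelta, set(finals), set(starts)


def determinize(nfa):
    """Subset construction, keeping only reachable, non-empty subsets."""
    _, alphabet, delta, starts, finals = nfa
    start = frozenset(starts)
    seen, stack, ddelta = {start}, [start], {}
    while stack:
        subset = stack.pop()
        for sym in alphabet:
            target = frozenset(d for s in subset for d in delta.get((s, sym), ()))
            if not target:
                continue  # leave the dead state implicit
            ddelta[subset, sym] = {target}
            if target not in seen:
                seen.add(target)
                stack.append(target)
    dfinals = {s for s in seen if s & set(finals)}
    return seen, alphabet, ddelta, {start}, dfinals


def brzozowski(nfa):
    """Minimal DFA via reverse, determinize, reverse, determinize."""
    return determinize(reverse(determinize(reverse(nfa))))


def accepts(nfa, word):
    """Simulate the automaton on `word` (works for NFAs and DFAs)."""
    _, _, delta, starts, finals = nfa
    current = set(starts)
    for sym in word:
        current = {d for s in current for d in delta.get((s, sym), ())}
    return bool(current & set(finals))


# A 3-state DFA for "strings over {a, b} ending in a"; states 1 and 2 are equivalent.
delta = {
    (0, "a"): {1}, (0, "b"): {0},
    (1, "a"): {2}, (1, "b"): {0},
    (2, "a"): {2}, (2, "b"): {0},
}
automaton = ({0, 1, 2}, {"a", "b"}, delta, {0}, {1, 2})
minimal = brzozowski(automaton)
```

In this toy case `minimal` has two states instead of three and accepts the same language. The intermediate subset construction can blow up exponentially in the worst case, which is the usual trade-off of this approach.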
Additional Features
- Semiring abstractions (Boolean, Float, Log, Entropy, etc.)
- Efficient chart and agenda-based algorithms
- Grammar-FST composition
- Visualization support via Graphviz
Quick Start
Installation
Clone the repository:
git clone git@github.com:chisym/genlm-grammar.git
cd genlm-grammar
and install with pip:
pip install .
This installs the package without development dependencies. For development, install in editable mode with:
pip install -e ".[test,docs]"
which also installs the dependencies needed for testing (test) and documentation (docs).
Requirements
- Python >= 3.10
- The core dependencies listed in the setup.py file of the repository.
Testing
When test dependencies are installed, the test suite can be run via:
pytest tests
Documentation
Documentation is generated using mkdocs and hosted on GitHub Pages. To build the documentation, run:
mkdocs build
To serve the documentation locally, run:
mkdocs serve
Download files
File details
Details for the file genlm_grammar-0.1.0.tar.gz.
File metadata
- Download URL: genlm_grammar-0.1.0.tar.gz
- Upload date:
- Size: 63.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 83a81d484826c598afdeb3f18274cefe4bef198d08ee4ca7ef599412887127f9 |
| MD5 | 2bee32219331494beb1b6ef028453294 |
| BLAKE2b-256 | 5b4c32f0522d0de663586cb1305e99745dbfd6fbdecc7de39d6db52222244811 |
File details
Details for the file genlm_grammar-0.1.0-py3-none-any.whl.
File metadata
- Download URL: genlm_grammar-0.1.0-py3-none-any.whl
- Upload date:
- Size: 74.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f3179893c94d3207c56ac5640a6a6e92265ac515a1ec8f6cc5b1ed93342a99bd |
| MD5 | b5bd771734229d5e8fe6c8f8ef819984 |
| BLAKE2b-256 | fc74e6db0722cc778fa61abe728654262cdfa92d0467f45cf2e3197baa5b0723 |