genlm-grammar


A Python library for working with weighted context-free grammars (WCFGs), weighted finite-state automata (WFSAs), and weighted finite-state transducers (WFSTs). The library provides efficient implementations of grammar operations, parsing algorithms, and language model interfaces.

Quick Start

This library can be installed via pip:

pip install genlm-grammar

Key Features

Grammar Operations

  • Support for weighted context-free grammars with various semirings (Boolean, Float, Real, MaxPlus, MaxTimes, etc.)
  • Grammar transformations:
    • Local normalization
    • Removal of nullary rules and unary cycles
    • Grammar binarization
    • Length truncation
    • Renaming/renumbering of nonterminals
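
To make one of the transformations above concrete, here is a minimal sketch of grammar binarization in plain Python. It does not use the library's API: the `(weight, lhs, rhs)` triple layout, the `binarize` function, and the `@1`, `@2` helper names are assumptions made for illustration only.

```python
def binarize(rules):
    """Split every rule with more than two right-hand-side symbols into a
    chain of binary rules, introducing fresh helper nonterminals.

    `rules` is a list of (weight, lhs, rhs_tuple) triples (a hypothetical
    layout, not the library's grammar type). The original weight stays on
    the first rule of each chain and helper rules get weight 1.0, so every
    derivation keeps its total weight.
    """
    out = []
    fresh = 0
    for weight, lhs, rhs in rules:
        while len(rhs) > 2:
            fresh += 1
            helper = f"@{fresh}"  # hypothetical fresh-name scheme
            out.append((weight, lhs, (rhs[0], helper)))
            # Continue with the remainder of the right-hand side.
            weight, lhs, rhs = 1.0, helper, rhs[1:]
        out.append((weight, lhs, rhs))
    return out


rules = [(0.5, "S", ("A", "B", "C", "D")), (0.5, "S", ("a",))]
binary = binarize(rules)
for rule in binary:
    print(rule)
```

Rules that already have at most two right-hand-side symbols pass through unchanged, which is why the lexical rule `S -> a` survives as is.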

Parsing Algorithms

  • Earley parsing (O(n³|G|) time, for input length n and grammar size |G|)
    • Standard implementation
    • Rescaled version for numerical stability
  • CKY parsing
    • Incremental CKY with chart caching
    • Support for prefix computations
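
As a rough illustration of what a weighted CKY parser computes, the sketch below sums the weights of all parses of a string under a grammar in Chomsky normal form, using ordinary real-valued weights. It is a textbook version written for this README, not the library's incremental implementation; the `cky_inside` name and the dictionary layouts for `lexical` and `binary` rules are assumptions.

```python
from collections import defaultdict


def cky_inside(words, lexical, binary, start="S"):
    """Total weight of all parses of `words` under a CNF grammar.

    lexical: dict mapping (A, word) -> weight for rules A -> word
    binary:  dict mapping (A, B, C) -> weight for rules A -> B C
    Both layouts are assumptions made for this sketch.
    """
    n = len(words)
    # chart[i, k, A] = total weight of A deriving words[i:k]
    chart = defaultdict(float)
    for i, word in enumerate(words):
        for (A, w), weight in lexical.items():
            if w == word:
                chart[i, i + 1, A] += weight
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            k = i + width
            for j in range(i + 1, k):  # split point
                for (A, B, C), weight in binary.items():
                    left = chart.get((i, j, B), 0.0)
                    right = chart.get((j, k, C), 0.0)
                    if left and right:
                        chart[i, k, A] += weight * left * right
    return chart.get((0, n, start), 0.0)


# Ambiguous toy grammar: S -> S S (0.5), S -> a (0.5)
lexical = {("S", "a"): 0.5}
binary = {("S", "S", "S"): 0.5}
total = cky_inside(["a", "a", "a"], lexical, binary)
print(total)  # two parse trees, each of weight 0.5 ** 5
```

The three nested span loops times the loop over binary rules give the cubic dependence on input length. Replacing `+` and `*` with another semiring's operations changes what the chart computes, for example the best parse weight instead of the total.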

Language Model Interface

  • BoolCFGLM: Boolean-weighted CFG language model
  • CKYLM: Probabilistic CFG language model using CKY
  • EarleyLM: Language model using Earley parsing
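
One common way to turn a weighted grammar into a language model is to divide prefix weights: the probability of the next token is the weight of the extended prefix over the weight of the current prefix. The sketch below shows that idea on a small enumerated language standing in for a grammar. It is not the interface of the classes listed above; `prefix_weight`, `next_token_distribution`, and the `<eos>` marker are hypothetical names.

```python
EOS = "<eos>"  # hypothetical end-of-string marker for this sketch


def prefix_weight(language, prefix):
    """Sum of the weights of all strings in `language` that start with `prefix`."""
    return sum(w for s, w in language.items() if s[: len(prefix)] == prefix)


def next_token_distribution(language, prefix, vocab):
    """p(token | prefix) as a ratio of prefix weights.

    EOS receives the weight of the string that ends exactly at `prefix`.
    """
    total = prefix_weight(language, prefix)
    if total == 0:
        raise ValueError("prefix has zero weight under the language")
    dist = {tok: prefix_weight(language, prefix + (tok,)) / total for tok in vocab}
    dist[EOS] = language.get(prefix, 0.0) / total
    # Drop tokens that cannot continue the prefix.
    return {tok: p for tok, p in dist.items() if p > 0}


# Toy weighted language over tuples of tokens (stands in for a grammar).
language = {("a",): 0.5, ("a", "b"): 0.25, ("b",): 0.25}
dist = next_token_distribution(language, ("a",), vocab=["a", "b"])
print(dist)
```

A real parser-backed model would compute the prefix weights with a chart rather than by enumeration, but the ratio is the same.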

Finite State Automata

  • Weighted FSA implementation
  • Operations:
    • Epsilon removal
    • Minimization (Brzozowski's algorithm)
    • Determinization
    • Composition
    • Reversal
    • Kleene star/plus
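
Brzozowski's algorithm, named in the list above, minimizes an automaton by reversing and determinizing it twice. The sketch below shows the unweighted (Boolean) case for an automaton without epsilon arcs. It is independent of the library: the `(states, starts, finals, arcs)` tuple layout and all function names are assumptions, and the weighted case needs more care than is shown here.

```python
def reverse(nfa):
    """Flip every arc and swap the initial and final state sets."""
    states, starts, finals, arcs = nfa
    return states, finals, starts, {(q, a, p) for (p, a, q) in arcs}


def determinize(nfa):
    """Subset construction, keeping only subsets reachable from the start."""
    _, starts, finals, arcs = nfa
    alphabet = {a for (_, a, _) in arcs}
    start = frozenset(starts)
    seen, stack, new_arcs = {start}, [start], set()
    while stack:
        subset = stack.pop()
        for a in alphabet:
            target = frozenset(q for (p, b, q) in arcs if p in subset and b == a)
            if not target:
                continue  # leave the dead state implicit
            new_arcs.add((subset, a, target))
            if target not in seen:
                seen.add(target)
                stack.append(target)
    new_finals = {s for s in seen if s & set(finals)}
    return seen, {start}, new_finals, new_arcs


def brzozowski_minimize(nfa):
    """Reverse, determinize, reverse, determinize."""
    return determinize(reverse(determinize(reverse(nfa))))


def accepts(dfa, word):
    """Run a deterministic automaton with a single start state on `word`."""
    _, starts, finals, arcs = dfa
    (state,) = starts
    step = {(p, a): q for (p, a, q) in arcs}
    for a in word:
        if (state, a) not in step:
            return False
        state = step[state, a]
    return state in finals


# Strings over {a, b} ending in "a"; states 1 and 2 are redundant copies.
dfa = (
    {0, 1, 2},
    {0},
    {1, 2},
    {(0, "a", 1), (0, "b", 0), (1, "a", 2), (1, "b", 0), (2, "a", 2), (2, "b", 0)},
)
minimal = brzozowski_minimize(dfa)
print(len(minimal[0]))  # number of states after minimization
```

The states of the result are sets of sets of original states; renumbering them gives a more readable automaton.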

Additional Features

  • Semiring abstractions (Boolean, Float, Log, Entropy, etc.)
  • Efficient chart and agenda-based algorithms
  • Grammar-FST composition
  • Visualization support via Graphviz
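
For readers unfamiliar with semirings, the following sketch shows the general idea behind such an abstraction: one algorithm, parameterized by a pair of operations and their identities. The `Semiring` dataclass, the three instances, and `path_sum` are illustrative assumptions and do not mirror the library's classes.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    zero: Any  # identity for plus
    one: Any  # identity for times
    plus: Callable[[Any, Any], Any]
    times: Callable[[Any, Any], Any]


BOOLEAN = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)
REAL = Semiring(0.0, 1.0, lambda a, b: a + b, lambda a, b: a * b)
MAX_TIMES = Semiring(0.0, 1.0, max, lambda a, b: a * b)


def path_sum(semiring, paths):
    """Combine weights along each path with `times`, across paths with `plus`."""
    total = semiring.zero
    for path in paths:
        weight = semiring.one
        for w in path:
            weight = semiring.times(weight, w)
        total = semiring.plus(total, weight)
    return total


paths = [[0.5, 0.5], [0.5, 0.25]]
print(path_sum(REAL, paths))  # total weight of both paths
print(path_sum(MAX_TIMES, paths))  # weight of the best path
print(path_sum(BOOLEAN, [[True, False], [True, True]]))  # does any path succeed?
```

The same `path_sum` code answers three different questions depending on which semiring is passed in, which is why parsing and automata algorithms are often written against this kind of interface.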

Development

See DEVELOPING.md for information on how to install the package in development mode.

