A Python library for working with weighted context-free grammars (WCFGs), weighted finite-state automata (WFSAs), and weighted finite-state transducers (WFSTs). The library provides efficient implementations of grammar operations, parsing algorithms, and language model functionality.

Quick Start

This library can be installed via pip:

pip install genlm-grammar

Key Features

Grammar Operations

  • Support for weighted context-free grammars with various semirings (Boolean, Float, Real, MaxPlus, MaxTimes, etc.)
  • Grammar transformations:
    • Local normalization
    • Removal of nullary rules and unary cycles
    • Grammar binarization
    • Length truncation
    • Renaming/renumbering of nonterminals
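
To make the role of the semiring concrete, here is a minimal, library-independent sketch. The names `Semiring`, `total_weight`, `REAL`, `MAX_TIMES`, `MAX_PLUS`, and `BOOLEAN` are hypothetical and are not taken from the genlm-grammar API. The sketch only illustrates the general idea that the weight of a string is the semiring sum, over its derivations, of the semiring product of the rule weights used.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Iterable, TypeVar

W = TypeVar("W")


@dataclass(frozen=True)
class Semiring(Generic[W]):
    """Hypothetical container for a semiring (not the library's class)."""
    zero: W                      # identity for plus
    one: W                       # identity for times
    plus: Callable[[W, W], W]    # combines alternative derivations
    times: Callable[[W, W], W]   # combines rule weights within one derivation


BOOLEAN = Semiring(False, True, lambda a, b: a or b, lambda a, b: a and b)
REAL = Semiring(0.0, 1.0, lambda a, b: a + b, lambda a, b: a * b)
MAX_TIMES = Semiring(0.0, 1.0, max, lambda a, b: a * b)
MAX_PLUS = Semiring(float("-inf"), 0.0, max, lambda a, b: a + b)


def total_weight(sr: Semiring, derivations: Iterable[Iterable]):
    """Semiring sum over derivations of the semiring product of rule weights."""
    total = sr.zero
    for rule_weights in derivations:
        weight = sr.one
        for w in rule_weights:
            weight = sr.times(weight, w)
        total = sr.plus(total, weight)
    return total


# Two derivations of the same string, each listed by its rule weights.
derivations = [[0.5, 0.4], [0.5, 0.2]]
print(total_weight(REAL, derivations))       # sum of both derivation weights
print(total_weight(MAX_TIMES, derivations))  # weight of the best derivation
```

Swapping the semiring changes the question being answered (total weight, best derivation, or mere derivability) without changing the algorithm, which is why grammar and parsing code is usually written against the semiring interface rather than against floats.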

Parsing Algorithms

  • Earley parsing (O(n³|G|) time complexity, for input length n and grammar size |G|)
    • Standard implementation
    • Rescaled version for numerical stability
  • CKY parsing
    • Incremental CKY with chart caching
    • Support for prefix computations
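
As an illustration of what a weighted CKY chart computes, the following is a textbook sketch for a grammar in Chomsky normal form with real-valued weights. It is not the library's implementation: the function name `cky_weight` and the dictionary-based grammar encoding are assumptions made for this example, and it omits the incremental chart caching and prefix computations listed above.

```python
from collections import defaultdict


def cky_weight(words, lexical, binary, start="S"):
    """Total weight of all parses of `words` under a weighted CNF grammar.

    lexical: {(X, word): weight}   for rules X -> word
    binary:  {(X, Y, Z): weight}   for rules X -> Y Z
    """
    n = len(words)
    # chart[i, k, X] = total weight of X deriving words[i:k]
    chart = defaultdict(float)

    # Width-1 spans come from lexical rules.
    for i, word in enumerate(words):
        for (lhs, rhs_word), weight in lexical.items():
            if rhs_word == word:
                chart[i, i + 1, lhs] += weight

    # Wider spans combine two adjacent, already-filled spans.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            k = i + width
            for j in range(i + 1, k):
                for (lhs, left_nt, right_nt), weight in binary.items():
                    left = chart.get((i, j, left_nt), 0.0)
                    right = chart.get((j, k, right_nt), 0.0)
                    if left and right:
                        chart[i, k, lhs] += weight * left * right

    return chart.get((0, n, start), 0.0)


# Toy grammar: S -> S S (0.3), S -> a (0.7)
lexical = {("S", "a"): 0.7}
binary = {("S", "S", "S"): 0.3}
print(cky_weight(["a", "a", "a"], lexical, binary))  # sums both bracketings
```

The three nested loops over span boundaries, multiplied by the loop over rules, give the cubic dependence on input length that chart parsers share.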

Language Model Interface

  • BoolCFGLM: Boolean-weighted CFG language model
  • CKYLM: Probabilistic CFG language model using CKY
  • EarleyLM: Language model using Earley parsing
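
One common way to obtain a language model from a weighted grammar is to normalize prefix weights: the probability of the next symbol is the weight of the extended prefix divided by the weight of the current prefix. The sketch below shows that idea on a small finite weighted language that stands in for a grammar. It does not use `BoolCFGLM`, `CKYLM`, or `EarleyLM`; the names `prefix_weight`, `next_symbol_distribution`, and `EOS` are hypothetical.

```python
EOS = "<eos>"  # hypothetical end-of-string marker


def prefix_weight(language, prefix):
    """Total weight of all strings in `language` that start with `prefix`."""
    return sum(w for s, w in language.items() if s[: len(prefix)] == prefix)


def next_symbol_distribution(language, prefix):
    """Distribution over the next symbol (or EOS) given `prefix`."""
    total = prefix_weight(language, prefix)
    if total == 0:
        raise ValueError("prefix has zero weight under the language")
    dist = {}
    # Mass for stopping here: the string that is exactly the prefix.
    if prefix in language:
        dist[EOS] = language[prefix] / total
    # Mass for each possible continuation symbol.
    for s, w in language.items():
        if len(s) > len(prefix) and s[: len(prefix)] == prefix:
            symbol = s[len(prefix)]
            dist[symbol] = dist.get(symbol, 0.0) + w / total
    return dist


# Strings are tuples of symbols; weights sum to 1 here for convenience.
language = {("a",): 0.5, ("a", "b"): 0.25, ("b",): 0.25}
print(next_symbol_distribution(language, ()))
print(next_symbol_distribution(language, ("a",)))
```

A grammar-based model replaces the brute-force sum in `prefix_weight` with a parser that computes prefix weights directly, which is what makes the approach practical for infinite languages.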

Finite State Automata

  • Weighted FSA implementation
  • Operations:
    • Epsilon removal
    • Minimization (Brzozowski's algorithm)
    • Determinization
    • Composition
    • Reversal
    • Kleene star/plus
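
Brzozowski's algorithm minimizes an automaton by reversing and determinizing it twice. The sketch below shows the construction for an unweighted automaton without epsilon arcs; the weighted version in the library may differ in detail. The tuple encoding `(states, alphabet, delta, starts, finals)` and the function names are assumptions made for this example. The determinization omits the dead state, so the result is the minimal partial DFA.

```python
def reverse(nfa):
    """Flip every arc and swap start and final states."""
    states, alphabet, delta, starts, finals = nfa
    rdelta = {}
    for (q, a), targets in delta.items():
        for r in targets:
            rdelta.setdefault((r, a), set()).add(q)
    return states, alphabet, rdelta, set(finals), set(starts)


def determinize(nfa):
    """Subset construction over reachable, non-empty subsets only."""
    _, alphabet, delta, starts, finals = nfa
    start = frozenset(starts)
    dstates = {start}
    ddelta = {}
    stack = [start]
    while stack:
        subset = stack.pop()
        for a in alphabet:
            target = frozenset(r for q in subset for r in delta.get((q, a), ()))
            if not target:
                continue  # no arc: the dead state is left implicit
            ddelta[subset, a] = {target}
            if target not in dstates:
                dstates.add(target)
                stack.append(target)
    dfinals = {s for s in dstates if s & set(finals)}
    return dstates, alphabet, ddelta, {start}, dfinals


def brzozowski_minimize(nfa):
    """reverse, determinize, reverse, determinize."""
    return determinize(reverse(determinize(reverse(nfa))))


def accepts(dfa, word):
    """Run a (partial) DFA produced by `determinize` on `word`."""
    _, _, delta, starts, finals = dfa
    (state,) = starts
    for a in word:
        targets = delta.get((state, a))
        if targets is None:
            return False
        (state,) = targets
    return state in finals


# NFA for strings over {a, b} that end in "ab".
nfa = (
    {0, 1, 2},
    {"a", "b"},
    {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}},
    {0},
    {2},
)
dfa = brzozowski_minimize(nfa)
print(len(dfa[0]), accepts(dfa, "aab"), accepts(dfa, "aba"))
```

The second determinization yields a minimal automaton because the input to it is the reversal of a deterministic automaton whose states are all reachable, so distinct subsets it builds always recognize distinct languages.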

Additional Features

  • Semiring abstractions (Boolean, Float, Log, Entropy, etc.)
  • Efficient chart and agenda-based algorithms
  • Grammar-FST composition
  • Visualization support via Graphviz

Development

See DEVELOPING.md for information on how to install the package in development mode.
