
Lookup FOMA FSTs


FST Lookup


Implements lookup for FOMA format finite state transducers.

Supports Python 3.5 and up.

Install

pip install fst-lookup

Usage

Import the library, and load an FST from a file:

Hint: Test this module by downloading the eat FST!

>>> from fst_lookup import FST
>>> fst = FST.from_file('eat.fomabin')

Assumed format of the FSTs

fst_lookup assumes that the lower label corresponds to the surface form, while the upper label corresponds to the lemma together with its linguistic tags and features. For example, your LEXC will look something like this (note what is on each side of the colon):

Multichar_Symbols +N +Sg +Pl
Lexicon Root
    cow+N+Sg:cow #;
    cow+N+Pl:cows #;
    goose+N+Sg:goose #;
    goose+N+Pl:geese #;
    sheep+N+Sg:sheep #;
    sheep+N+Pl:sheep #;

If your FST has labels on the opposite sides, you must invert the net before loading it into fst_lookup.
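
To make the direction of the mapping concrete, here is a minimal sketch that treats the LEXC entries above as plain (upper, lower) pairs. It is a toy stand-in, not a real FST and not part of fst_lookup; the names `PAIRS`, `toy_analyze`, and `toy_generate` are hypothetical and exist only for this illustration.

```python
# Toy stand-in for the LEXC above: (upper, lower) pairs. NOT a real FST.
# All names here are hypothetical and are not part of fst_lookup.
PAIRS = [
    ("cow+N+Sg", "cow"),
    ("cow+N+Pl", "cows"),
    ("goose+N+Sg", "goose"),
    ("goose+N+Pl", "geese"),
    ("sheep+N+Sg", "sheep"),
    ("sheep+N+Pl", "sheep"),
]


def toy_analyze(surface_form):
    """Lower -> upper: every analysis whose surface form matches."""
    return [upper for upper, lower in PAIRS if lower == surface_form]


def toy_generate(analysis):
    """Upper -> lower: every surface form for the given analysis."""
    return [lower for upper, lower in PAIRS if upper == analysis]


print(toy_analyze("sheep"))        # ['sheep+N+Sg', 'sheep+N+Pl']
print(toy_generate("goose+N+Pl"))  # ['geese']
```

If the labels were on the opposite sides, these two directions would be swapped, which is why such a net has to be inverted first.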

Analyze a word form

To analyze a form (take a word form, and get its linguistic analyses), call the analyze() method:

def analyze(self, surface_form: str) -> Iterator[Analysis]

This will yield all possible linguistic analyses produced by the FST.

An analysis is a tuple of strings. The strings are either linguistic tags, or the lemma (base form of the word).

FST.analyze() is a generator, so you must call list() to get a list.

>>> sorted(fst.analyze('eats'))
[('eat', '+N', '+Mass'),
 ('eat', '+V', '+3P', '+Sg')]
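
Since each analysis mixes the lemma and the tags in one tuple, it can be handy to separate them. The sketch below is one possible way to do that; `split_analysis` is a hypothetical helper, not part of fst_lookup, and it assumes that every tag starts with `+` as in the output above. FSTs with other tag conventions (prefix tags, for example) would need a different rule.

```python
from typing import List, Tuple


def split_analysis(analysis: Tuple[str, ...]) -> Tuple[str, List[str]]:
    """Separate the lemma from the tags in one analysis tuple.

    Hypothetical helper. Assumption: a tag is any element that starts
    with '+'; every other element is treated as part of the lemma.
    """
    tags = [part for part in analysis if part.startswith("+")]
    lemma = "".join(part for part in analysis if not part.startswith("+"))
    return lemma, tags


# Tuples copied from the example output above.
analyses = [("eat", "+N", "+Mass"), ("eat", "+V", "+3P", "+Sg")]
for analysis in analyses:
    lemma, tags = split_analysis(analysis)
    print(lemma, tags)
```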

Generate a word form

To generate a form (take a linguistic analysis, and get its concrete word forms), call the generate() method:

def generate(self, analysis: str) -> Iterator[str]

FST.generate() is a Python generator, so you must call list() to get a list.

>>> list(fst.generate('eat+V+Past'))
['ate']
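
Note that analyze() yields tuples while generate() takes a single string. A minimal sketch of bridging the two is shown below. `analysis_to_string` is a hypothetical helper, not part of fst_lookup, and it assumes the tuple elements can simply be concatenated in order, which matches the examples in this README but may not hold for every FST.

```python
def analysis_to_string(analysis):
    """Join an analysis tuple into one string.

    Hypothetical helper. Assumption: concatenating the elements in order
    reproduces the upper-side string that generate() expects.
    """
    return "".join(analysis)


# Tuple copied from the analyze() example in this README.
print(analysis_to_string(("eat", "+V", "+3P", "+Sg")))  # eat+V+3P+Sg
```

The resulting string could then be passed to fst.generate() to recover the surface form.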

License

Copyright © 2019 Eddie Antonio Santos. Released under the terms of the Apache license. See LICENSE for more info.

