A python module for word inflections designed for use with Spacy.

pyinflect

A python module for word inflections that works as a Spacy extension

This module is designed as an extension for Spacy and returns the inflected form of a word based on a supplied Penn Treebank part-of-speech tag. It can also be used as a standalone module outside of Spacy.

It is based on the Automatically Generated Inflection Database (AGID). The AGID data provides a list of inflections for many word lemmas.

See the scripts directory for examples and tests of the system or the tests directory for unit test examples.

Installation

pip3 install pyinflect

Usage as an Extension to Spacy

To use as an extension to Spacy, first import the module. Importing it adds a new inflect method to each Spacy Token, reached through the token's `._` extension attribute, that takes a Penn Treebank tag as its parameter. The method returns the inflected form of the token's lemma based on the supplied treebank tag. When more than one spelling exists for the inflection, only the first one is returned.

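> import spacy
> import pyinflect
> nlp = spacy.load('en_core_web_sm')  # assumption: any installed English model works here
> doc = nlp('My example.')
> doc[1]._.inflect('NNS')
examples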

Usage Standalone

To use standalone, import the method getAllInflections or getInflection and call it directly. getAllInflections returns all entries in the infl.csv file for the given lemma as a list of inflected forms, where each form entry is a tuple with one or more spellings. getInflection returns only the form that corresponds to the given treebank tag.

> from pyinflect import getAllInflections, getInflection
> getAllInflections('be', 'V')
[('was', 'wast'), ('were',), ('been',), ('being',), ('am',), ('are', 'art'), ('is',), ('are',)]

> getInflection('be', 'VBD')
('were',)
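
Because each form entry is a tuple of one or more spellings, callers often need to pick one spelling or collect all of them. The following is a minimal sketch that works on the sample output shown above rather than calling the library, so it runs without pyinflect installed. The helper names first_spellings and unique_spellings are hypothetical and are not part of pyinflect.

```python
# Sample data copied from the getAllInflections('be', 'V') output above.
BE_FORMS = [('was', 'wast'), ('were',), ('been',), ('being',),
            ('am',), ('are', 'art'), ('is',), ('are',)]


def first_spellings(forms):
    """Return the first spelling of each form entry, preserving order."""
    return [entry[0] for entry in forms]


def unique_spellings(forms):
    """Return every spelling once, in order of first appearance."""
    seen = []
    for entry in forms:
        for spelling in entry:
            if spelling not in seen:
                seen.append(spelling)
    return seen


if __name__ == '__main__':
    print(first_spellings(BE_FORMS))
    print(unique_spellings(BE_FORMS))
```

Taking the first element of each tuple mirrors the Spacy extension's behavior of returning only the first spelling when several exist.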

Known Issues:

See KnownIssues.txt for more specifics.

  • Forms of the verb "be" are not completely specified by the treebank tag. When the inflected form is ambiguous, the first person form is returned. Passing a flag to the method allows returning the 2nd person version of the inflection. This only applies to the "was"/"were" and "am"/"are" forms of "be".
  • The AGID data is created by a third party and not maintained here. Some lemmas are not in that data file, infl.csv, and thus cannot be inflected.
  • In some cases the AGID may not contain the best inflection of the word. For instance, lemma "people" with tag "NNS" will return "peoples" where you may want the word "people", which is also plural. Such cases can be added to the existing "overrides.csv" file if needed.
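
The last two issues can also be worked around on the caller's side. The sketch below is only an illustration under stated assumptions: it does not read or modify overrides.csv (whose format is not described here), and OVERRIDES, inflect_with_overrides and fake_lookup are hypothetical names. The stand-in lookup is assumed to return a tuple of spellings, or None when the lemma is missing; in real use you might pass a function with that shape, such as one wrapping getInflection.

```python
# Hypothetical caller-side table: (lemma, tag) -> preferred inflection.
OVERRIDES = {('people', 'NNS'): 'people'}


def inflect_with_overrides(lemma, tag, lookup):
    """Return the override if one exists, else the first spelling from lookup.

    lookup is any callable (lemma, tag) -> tuple of spellings, or None
    when the lemma is unknown (an assumption made for this sketch).
    """
    key = (lemma, tag)
    if key in OVERRIDES:
        return OVERRIDES[key]
    forms = lookup(lemma, tag)
    return forms[0] if forms else None


def fake_lookup(lemma, tag):
    """Stand-in for a real inflection lookup, used only for this example."""
    table = {
        ('people', 'NNS'): ('peoples',),
        ('example', 'NNS'): ('examples',),
    }
    return table.get((lemma, tag))


if __name__ == '__main__':
    print(inflect_with_overrides('people', 'NNS', fake_lookup))
    print(inflect_with_overrides('example', 'NNS', fake_lookup))
    print(inflect_with_overrides('unknownlemma', 'NNS', fake_lookup))
```

Checking the override table first keeps the preferred form for known problem words, and returning None for missing lemmas lets the caller decide how to fall back.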
