A modified Porter stemmer with additional rules for verbs and other suffixes.

Project description

What is stemming?

Stemming is a technique in Natural Language Processing that reduces various inflected forms of a word to a single invariant root form. This root form, known as the stem, may or may not be identical to the word's morphological root.

What is it good for?

Stemming is useful in many applications; query expansion in information retrieval is a prime example. For instance, if a user searches a search engine for "cat," the results should also include documents containing the word "cats." That only happens if both the query and the document index are stemmed. In effect, stemming makes queries less specific, which retrieves more relevant results at the cost of some precision, since unrelated words can be reduced to the same stem.
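The search scenario above can be sketched with a tiny inverted index. This is only an illustration: toy_stem is a hypothetical stand-in that strips a plural "s" and is not the stemmer shipped in this package, and build_index and search are names made up for the example.

```python
from collections import defaultdict


def toy_stem(word: str) -> str:
    """Hypothetical stand-in stemmer: lowercase and strip a plural 's'."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def build_index(docs: dict) -> dict:
    """Map each stemmed token to the ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            index[toy_stem(token)].add(doc_id)
    return index


def search(index: dict, query: str) -> set:
    """Stem the query the same way as the index, then look it up."""
    return set(index.get(toy_stem(query), set()))


docs = {1: "the cat sat", 2: "two cats played", 3: "a dog barked"}
index = build_index(docs)

# Both forms of the query reach both documents because they share a stem.
print(sorted(search(index, "cat")))   # [1, 2]
print(sorted(search(index, "cats")))  # [1, 2]
```

The trade-off shows up even in this toy version: toy_stem("news") returns "new", so a query for "news" would also match documents about anything "new".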

What type of stemmer is this?

modifiedstemmer is a suffix-stripping stemmer, which means it transforms words into stems by applying a predetermined sequence of changes to the word's suffix. Other stemmers may function differently, such as by using a lookup table to map inflected forms to their roots or by employing clustering techniques to group various forms around a central form. Each approach comes with its own set of pros and cons. modifiedstemmer, specifically, is a modified version of the original Porter stemmer and includes more comprehensive rules for handling verbs and suffixes.
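To make the difference between the two approaches concrete, here is a minimal sketch of a suffix-stripping stemmer next to a lookup-table one. The rule list, the MIN_STEM_LENGTH guard, and the function names are assumptions chosen for illustration; they are loosely modeled on classic plural and verb-ending rules and are not the rules modifiedstemmer actually applies.

```python
# Ordered (suffix, replacement) rules; the first usable match wins.
# Illustrative only, not the rule set used by modifiedstemmer.
RULES = [
    ("sses", "ss"),
    ("ies", "i"),
    ("ing", ""),
    ("ed", ""),
    ("s", ""),
]

MIN_STEM_LENGTH = 3  # assumed guard so that short words are left alone


def strip_suffix(word: str) -> str:
    """Apply the first rule whose suffix matches and leaves a long enough stem."""
    for suffix, replacement in RULES:
        if word.endswith(suffix):
            stem = word[: len(word) - len(suffix)] + replacement
            if len(stem) >= MIN_STEM_LENGTH:
                return stem
    return word


# Lookup-table alternative: an explicit mapping that can cover irregular forms.
LOOKUP = {"ran": "run", "mice": "mouse"}


def lookup_stem(word: str) -> str:
    """Return the mapped root if the word is in the table, else the word itself."""
    return LOOKUP.get(word, word)


for w in ["caresses", "ponies", "walking", "jumped", "cats"]:
    print(w, "->", strip_suffix(w))
# caresses -> caress
# ponies -> poni
# walking -> walk
# jumped -> jump
# cats -> cat
```

Note that "poni" is not a dictionary word, which illustrates why a stem need not match the morphological root. The suffix rules generalize to unseen words but miss irregular forms such as "mice," while the lookup table handles those forms but only for words it already lists.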

How do I use it?

Using modifiedstemmer is straightforward. Import the stemmer class, create an instance, and call its stem method on a word:

from modifiedstemmer import stemmer

# Bind the instance to its own name so the imported class is not shadowed.
porter = stemmer()
print(porter.stem('consistent'))

This prints the stem of the word 'consistent'.


Download files

Source Distribution

modifiedstemmer-0.0.7.tar.gz (8.2 kB)

Uploaded Source

Built Distribution

modifiedstemmer-0.0.7-py3-none-any.whl (2.9 kB)

Uploaded Python 3

File details

Details for the file modifiedstemmer-0.0.7.tar.gz.

File metadata

  • Download URL: modifiedstemmer-0.0.7.tar.gz
  • Size: 8.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.13

File hashes

Hashes for modifiedstemmer-0.0.7.tar.gz
Algorithm    Hash digest
SHA256       07a04fad3dcb73b12c99721d7da155525091f2200a1e8773a144873fd2f91a62
MD5          ab40b362747312cb5e777b72ba972093
BLAKE2b-256  b633bfb8280779424b591a38b0b58faa824eaadba3a5c7ee91a0578226348788
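
A downloaded file can be checked against the SHA256 digest listed above using only the standard library. This is a minimal sketch: the helper names and the local file path in the usage comment are assumptions, and it presumes the archive has already been downloaded to the current directory.

```python
import hashlib

# SHA256 digest published above for modifiedstemmer-0.0.7.tar.gz.
EXPECTED_SHA256 = "07a04fad3dcb73b12c99721d7da155525091f2200a1e8773a144873fd2f91a62"


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.lower()


# Usage (assumes the archive sits in the current directory):
# matches_published_hash("modifiedstemmer-0.0.7.tar.gz", EXPECTED_SHA256)
```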

File details

Details for the file modifiedstemmer-0.0.7-py3-none-any.whl.

File hashes

Hashes for modifiedstemmer-0.0.7-py3-none-any.whl
Algorithm    Hash digest
SHA256       1e718428bbaf2b8af0ed86fc12402ff83f8d3646c25b27cae7ff9aed84e302a2
MD5          446685d88d39df9637c70b04797a4f41
BLAKE2b-256  ab021450985094ef2cc8b8cfa8705456224421100c96f078f8bfcc4a1d2a6825
