A modified Porter stemmer with additional rules for verbs and other suffixes.

Project description

What is stemming?

Stemming is a technique in Natural Language Processing that reduces various inflected forms of a word to a single invariant root form. This root form, known as the stem, may or may not be identical to the word's morphological root.

What is it good for?

Stemming is useful in many applications; query expansion in information retrieval is a prime example. For instance, if a user of a search engine searches for "cat," the search should also return documents containing the word "cats." This only happens if both the query and the document index are stemmed. In effect, stemming makes queries less specific, so more relevant documents are retrieved (higher recall), at the cost of also matching some irrelevant ones (lower precision).
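As a rough illustration of the search example above, the sketch below builds a tiny inverted index keyed by stems and stems the query with the same function. The toy_stem function is a hypothetical stand-in that only strips a trailing "s"; it is not this package's algorithm, and build_index and search are illustrative names rather than part of any library.

```python
from collections import defaultdict


def toy_stem(word):
    """Hypothetical stand-in stemmer: lowercases and strips one trailing 's'."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def build_index(docs, stem):
    """Map each normalized token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            index[stem(token)].add(doc_id)
    return index


def search(index, query, stem):
    """Normalize the query with the same function used for the index, then look it up."""
    return sorted(index.get(stem(query), set()))


docs = {1: "the cat sat", 2: "two cats played", 3: "a dog barked"}

stemmed_index = build_index(docs, toy_stem)     # query and index both stemmed
unstemmed_index = build_index(docs, str.lower)  # no stemming, exact forms only

print(search(stemmed_index, "cat", toy_stem))     # finds documents with "cat" or "cats"
print(search(unstemmed_index, "cat", str.lower))  # finds only the exact form "cat"
```

The key design point is that the index and the query pass through the same normalization; stemming only one side would make matches less likely, not more.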

What type of stemmer is this?

Porterstemmer_Modified is a suffix-stripping stemmer, which means it transforms words into stems by applying a predetermined sequence of changes to the word's suffix. Other stemmers may function differently, such as by using a lookup table to map inflected forms to their roots or by employing clustering techniques to group various forms around a central form. Each approach comes with its own set of pros and cons. Porterstemmer_Modified, specifically, is a modified version of the original Porter stemmer and includes more comprehensive rules for handling verbs and suffixes.
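To make the contrast between these approaches concrete, here is a minimal sketch of suffix stripping next to a lookup table. The rules in TOY_RULES are loosely modelled on the first step of the original Porter algorithm and are far cruder than what Porterstemmer_Modified actually applies; every name in the block is hypothetical.

```python
# Ordered (suffix, replacement) rules. The first rule whose suffix matches
# and leaves a long enough stem is applied. Illustrative only.
TOY_RULES = [
    ("sses", "ss"),
    ("ies", "i"),
    ("ss", "ss"),
    ("s", ""),
    ("ing", ""),
    ("ed", ""),
]

# A lookup table can handle irregular forms that no suffix rule captures.
TOY_LOOKUP = {"ran": "run", "mice": "mouse"}


def strip_suffix(word, rules=TOY_RULES, min_stem_len=3):
    """Rule-based approach: rewrite the word's suffix using fixed rules."""
    for suffix, replacement in rules:
        if word.endswith(suffix):
            candidate = word[: len(word) - len(suffix)] + replacement
            if len(candidate) >= min_stem_len:
                return candidate
    return word


def lookup_stem(word, table=TOY_LOOKUP):
    """Table-based approach: map known inflected forms to their roots."""
    return table.get(word, word)


for w in ["caresses", "ponies", "walking", "jumped", "cats", "ran"]:
    print(w, strip_suffix(w), lookup_stem(w))
```

The rule-based function generalizes to words it has never seen but cannot handle irregular forms such as "ran", while the table handles exactly the forms it lists and nothing else.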

How do I use it?

Using Porterstemmer_Modified is straightforward: import the stemmer, create an instance, and call its stem method on a word:

from modifiedstemmer import Porterstemmer_Modified

# Create a stemmer instance and print the stem of a single word
stemmer = Porterstemmer_Modified()
print(stemmer.stem('consistent'))

This prints the stem of the word 'consistent'.
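In practice you will usually want to stem every word in a piece of text rather than a single word. The helper below is a sketch under the assumption that the stemmer exposes stem(word) taking one string and returning one string, as in the example above. stem_text and IdentityStemmer are hypothetical names; the placeholder class exists only so the block runs without the package installed.

```python
import re


def stem_text(text, stemmer):
    """Lowercase the text, extract alphabetic tokens, and stem each one."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [stemmer.stem(token) for token in tokens]


class IdentityStemmer:
    """Placeholder with the same stem(word) interface; returns the word unchanged."""

    def stem(self, word):
        return word


# With the package installed, pass Porterstemmer_Modified() instead of
# IdentityStemmer() to get real stems.
print(stem_text("Consistent results, consistently!", IdentityStemmer()))
```

The tokenizer here keeps only runs of ASCII letters, which is a simplification; text with digits, hyphens, or non-English characters would need a different pattern.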

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

modifiedstemmer-0.0.6.tar.gz (8.2 kB)

Uploaded Source

Built Distribution

modifiedstemmer-0.0.6-py3-none-any.whl (2.9 kB)

Uploaded Python 3

File details

Details for the file modifiedstemmer-0.0.6.tar.gz.

File metadata

  • Download URL: modifiedstemmer-0.0.6.tar.gz
  • Size: 8.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.10.13

File hashes

Hashes for modifiedstemmer-0.0.6.tar.gz
Algorithm Hash digest
SHA256 1bea4906b50a69bc610c3bef0f1ddf5fb4f850a2723697ba7962c506aa0adfb9
MD5 abc7c16cc1d98a65697d8aec43d63e11
BLAKE2b-256 bc3620dc6dbd69835a21b0e293129acffafbd39026ee5795450c75ff5915c5fe

File details

Details for the file modifiedstemmer-0.0.6-py3-none-any.whl.

File hashes

Hashes for modifiedstemmer-0.0.6-py3-none-any.whl
Algorithm Hash digest
SHA256 6af246046ba18d2170be17a93b31247d31f372f1379beaf6903c1fd01edff68c
MD5 c6953739869b3b7a90772cb91ff86464
BLAKE2b-256 85bbd553559e511fb346526fcb4469a8a878a5385e8bb29f70739cc15cc3c159
