A modified Porter stemmer with additional rules for verbs and other suffixes.

What is stemming?

Stemming is a technique in Natural Language Processing that reduces various inflected forms of a word to a single invariant root form. This root form, known as the stem, may or may not be identical to the word's morphological root.

What is it good for?

Stemming is useful in many applications; query expansion in information retrieval is a prime example. In a search engine, if a user searches for "cat", the search should also return documents containing the word "cats". With exact term matching, that happens only if both the query and the document index are stemmed. In effect, stemming makes queries less specific: more relevant documents are retrieved (higher recall), at the cost of some irrelevant matches (lower precision).
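The search scenario above can be sketched with a tiny inverted index. This is only an illustration: toy_stem, build_index, and search are hypothetical names, and toy_stem is a stand-in that strips a trailing "s", not the rule set of Porterstemmer_Modified.

```python
from collections import defaultdict


def toy_stem(word):
    """Hypothetical stand-in stemmer: lowercases and strips one trailing 's'."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def build_index(docs, stem=toy_stem):
    """Map each stemmed token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            index[stem(token)].add(doc_id)
    return index


def search(index, query, stem=toy_stem):
    """Return ids of documents that match every stemmed query token."""
    hits = [index.get(stem(token), set()) for token in query.split()]
    return set.intersection(*hits) if hits else set()


docs = {1: "cats sleep all day", 2: "my cat chases mice", 3: "dogs bark"}
index = build_index(docs)
print(sorted(search(index, "cat")))  # documents 1 and 2 both match
```

If the same index is built with a plain lowercasing function instead of a stemmer, the query "cat" matches only document 2, which is the gap stemming is meant to close.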

What type of stemmer is this?

Porterstemmer_Modified is a suffix-stripping stemmer, which means it transforms words into stems by applying a predetermined sequence of changes to the word's suffix. Other stemmers may function differently, such as by using a lookup table to map inflected forms to their roots or by employing clustering techniques to group various forms around a central form. Each approach comes with its own set of pros and cons. Porterstemmer_Modified, specifically, is a modified version of the original Porter stemmer and includes more comprehensive rules for handling verbs and suffixes.
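To make the idea of suffix stripping concrete, here is a minimal sketch of one ordered rule pass. The four rules mirror step 1a of the original Porter algorithm; they are not taken from this package, whose modified rules are not reproduced here. The names strip_suffix, lookup_stem, and LOOKUP_TABLE are hypothetical, and the lookup table is included only to contrast the two approaches.

```python
# Ordered (suffix, replacement) rules. These four mirror step 1a of the
# original Porter algorithm, not this package's modified rule set.
STEP_1A_RULES = [
    ("sses", "ss"),
    ("ies", "i"),
    ("ss", "ss"),
    ("s", ""),
]


def strip_suffix(word, rules=STEP_1A_RULES):
    """Apply the first rule whose suffix matches; otherwise return the word."""
    for suffix, replacement in rules:
        if word.endswith(suffix):
            return word[: len(word) - len(suffix)] + replacement
    return word


# A lookup-table stemmer, by contrast, only knows the forms it was given.
LOOKUP_TABLE = {"cats": "cat", "ponies": "pony"}  # hypothetical entries


def lookup_stem(word, table=LOOKUP_TABLE):
    """Return the table entry for a word, or the word itself if unknown."""
    return table.get(word, word)


for w in ["caresses", "ponies", "caress", "cats"]:
    print(w, "->", strip_suffix(w))
```

The rule-based pass handles words it has never seen (for example "dogs"), but can produce stems that are not dictionary words (such as "poni"). The table gives readable roots but leaves unknown words untouched.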

How do I use it?

Using Porterstemmer_Modified is straightforward: import the stemmer, create an instance, and call stem() on a word:

from Porterstemmer_Modified import Porterstemmer_Modified
stemmer = Porterstemmer_Modified()
print(stemmer.stem('consistent'))

This prints the stem of the word 'consistent'.
