A modified Porter stemmer with additional rules for verbs and other suffixes.

Project description

What is stemming?

Stemming is a technique in Natural Language Processing that reduces various inflected forms of a word to a single invariant root form. This root form, known as the stem, may or may not be identical to the word's morphological root.

What is it good for?

Stemming is useful in many applications; query expansion in information retrieval is a prime example. In a search engine, a query for "cat" should also return documents containing the word "cats". That only happens if both the query and the document index are stemmed. In effect, stemming makes queries less specific: more relevant documents are retrieved (higher recall), at the cost of some irrelevant matches (lower precision).
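
To make the search-engine scenario concrete, here is a minimal sketch of an inverted index in which both the documents and the query are stemmed. The function `toy_stem` is a hypothetical stand-in that only strips a plural "s"; it is not the stemmer shipped in this package, and `build_index`, `search`, and the sample documents are likewise illustrative names.

```python
from collections import defaultdict


def toy_stem(word):
    """Hypothetical stand-in stemmer: lowercase and strip a plural 's'."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def build_index(docs):
    """Map each stemmed token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            index[toy_stem(token)].add(doc_id)
    return index


def search(index, query):
    """Stem the query the same way the index was stemmed, then look it up."""
    return index.get(toy_stem(query), set())


docs = {1: "the cat sat", 2: "two cats played", 3: "a dog barked"}
index = build_index(docs)

print(sorted(search(index, "cat")))   # [1, 2]
print(sorted(search(index, "cats")))  # [1, 2]
```

Because "cat" and "cats" collapse to the same index key, either query finds both documents. Without stemming on both sides, the query "cat" would match only document 1.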

What type of stemmer is this?

Porterstemmer_Modified is a suffix-stripping stemmer, which means it transforms words into stems by applying a predetermined sequence of changes to the word's suffix. Other stemmers may function differently, such as by using a lookup table to map inflected forms to their roots or by employing clustering techniques to group various forms around a central form. Each approach comes with its own set of pros and cons. Porterstemmer_Modified, specifically, is a modified version of the original Porter stemmer and includes more comprehensive rules for handling verbs and suffixes.
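
The difference between suffix stripping and a lookup table can be sketched as follows. The rule list below is a toy example loosely modeled on the first step of the original Porter algorithm; it is an assumption for illustration and not the rule set used by Porterstemmer_Modified. The names `SUFFIX_RULES`, `MIN_STEM_LENGTH`, `strip_suffix`, `LOOKUP`, and `lookup_stem` are hypothetical.

```python
# Ordered (suffix, replacement) rules. The first rule that matches and
# leaves a long enough stem is applied. Illustrative only, NOT the
# package's real rules.
SUFFIX_RULES = [
    ("sses", "ss"),
    ("ies", "i"),
    ("ss", "ss"),
    ("ing", ""),
    ("ed", ""),
    ("s", ""),
]

# Assumed guard so that short words such as "sing" or "bed" are left alone.
MIN_STEM_LENGTH = 3


def strip_suffix(word):
    """Suffix-stripping approach: apply a fixed sequence of suffix rules."""
    word = word.lower()
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix):
            candidate = word[: len(word) - len(suffix)] + replacement
            if len(candidate) >= MIN_STEM_LENGTH:
                return candidate
    return word


# Lookup-table approach: explicit mapping from inflected form to root.
LOOKUP = {"went": "go", "mice": "mouse"}


def lookup_stem(word):
    """Return the mapped root if the word is listed, else the word itself."""
    word = word.lower()
    return LOOKUP.get(word, word)


print(strip_suffix("caresses"))  # caress
print(strip_suffix("ponies"))    # poni
print(strip_suffix("walking"))   # walk
print(strip_suffix("went"))      # went  (irregular form, no rule applies)
print(lookup_stem("went"))       # go
print(lookup_stem("walking"))    # walking  (not in the table)
```

The rules generalize to words they have never seen but miss irregular forms, while the table handles irregular forms but only for the words it lists. Note that "poni" is a valid stem even though it is not a dictionary word.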

How do I use it?

Using the Porterstemmer_Modified is straightforward. Simply import the stemmer, create an instance, and use it to stem words:

```python
from Porterstemmer_Modified import Porterstemmer_Modified

# Create a stemmer instance and stem a single word.
stemmer = Porterstemmer_Modified()
print(stemmer.stem('consistent'))
```

This process will convert the word 'consistent' to its stem form.
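
For whole sentences, one option is a small helper that tokenizes the text and calls `stem` on each token. The sketch below assumes only that the stemmer object exposes a `stem(word)` method, as in the example above. The helper `stem_text` and the class `StandInStemmer` are hypothetical; the stand-in exists only so the snippet runs without the package installed, and its output says nothing about what Porterstemmer_Modified returns.

```python
import re


def stem_text(text, stemmer):
    """Lowercase the text, split it into alphabetic tokens, and stem each one."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [stemmer.stem(token) for token in tokens]


class StandInStemmer:
    """Hypothetical placeholder stemmer: strips a trailing plural 's' only."""

    def stem(self, word):
        if len(word) > 3 and word.endswith("s"):
            return word[:-1]
        return word


print(stem_text("Cats chase birds", StandInStemmer()))  # ['cat', 'chase', 'bird']
```

With the package installed, passing `Porterstemmer_Modified()` instead of `StandInStemmer()` should work the same way, though the resulting stems will differ.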

Download files

Source Distribution

modifiedstemmer-0.0.2.tar.gz (8.2 kB)

Uploaded Source

Built Distribution

modifiedstemmer-0.0.2-py3-none-any.whl (2.9 kB)

Uploaded Python 3

File details

Details for the file modifiedstemmer-0.0.2.tar.gz.

File metadata

  • Download URL: modifiedstemmer-0.0.2.tar.gz
  • Upload date:
  • Size: 8.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.0 CPython/3.10.13

File hashes

Hashes for modifiedstemmer-0.0.2.tar.gz
Algorithm    Hash digest
SHA256       821838e31b7f6b2c16db0eb05e5a694b6fb2f063e9bde5a3f43032dbc83534fc
MD5          7bb0e8914a341561b944735d113d9d60
BLAKE2b-256  440537b07db1a7dd77e981838d9666208f1cd8f548ec7039ce2468cef7f592be

File details

Details for the file modifiedstemmer-0.0.2-py3-none-any.whl.

File hashes

Hashes for modifiedstemmer-0.0.2-py3-none-any.whl
Algorithm    Hash digest
SHA256       2359a29dab9fdaec712b08f1c882d4da4e86c25b39f719efbff82b4f2cb7b80f
MD5          26fb16714a05605c4154d2fe926c2d31
BLAKE2b-256  de74c7c0db303ce6f4f224a401b5f6d06f37331e7f3ff8abeeca1c9510daadd6
