
An NLP package for Portuguese lemmatization.

Project description

This package lemmatizes Portuguese text: it maps inflected words to their base forms (lemmas), taking into account the grammatical rules and variations of the Portuguese language. It handles various types of text input and supports multiple output formats, which makes it suitable for applications such as information retrieval, machine translation, sentiment analysis, and text classification.

The package is customizable: users can specify their own dictionaries and lemmatization rules, and it provides features for error correction and word sense disambiguation. It is intended for researchers, developers, and linguists who work with Portuguese text data and want to make lemmatization faster and more accurate.

"A lemma is a word that stands at the head of a definition in a dictionary." (Wikipedia)

Example

from pt_lemmatizer import Lemmatizer

lemmatizer = Lemmatizer()

# Input words must be lowercased and unidecoded (accents stripped) beforehand.
lemmatizer.lemmatize('apagou')     # returns 'apagar'
lemmatizer.lemmatize('nasalaram')  # returns 'nasalar'
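Because the lemmatizer expects lowercased, accent-free input, raw text usually needs a normalization step first. The following is a minimal sketch of one way to do that, not part of the package: `normalize_word` and `lemmatize_text` are hypothetical helper names, and the accent stripping uses the standard-library `unicodedata` module as a stand-in for the `unidecode` package, which may behave differently on characters outside the Latin alphabet.

```python
import unicodedata
from typing import List


def normalize_word(word: str) -> str:
    """Lowercase a word and strip its diacritics (hypothetical helper)."""
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", word)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


def lemmatize_text(lemmatizer, text: str) -> List[str]:
    """Normalize each whitespace-separated token, then lemmatize it.

    `lemmatizer` is assumed to be any object exposing `lemmatize(word)`,
    such as the `Lemmatizer` instance from the example above.
    Punctuation is not handled here; tokens are split on whitespace only.
    """
    return [lemmatizer.lemmatize(normalize_word(token)) for token in text.split()]
```

With the package installed, this could be called as `lemmatize_text(Lemmatizer(), 'Apagou Nasalaram')`; each token is lowercased and stripped of accents before it reaches `lemmatize`.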





Download files


Source Distribution

pt_lemmatizer-2.1.18.tar.gz (2.4 MB)

Built Distribution

pt_lemmatizer-2.1.18-py3-none-any.whl (2.5 MB, Python 3)
