An NLP package for Portuguese lemmatization.
Project description
This package lemmatizes Portuguese text: it maps inflected words to their base forms (lemmas), taking into account the grammatical rules and variations of the Portuguese language. It handles various types of text input and supports multiple output formats, which makes it suitable for applications such as information retrieval, machine translation, sentiment analysis, and text classification. The package is also customizable: users can specify their own dictionaries and lemmatization rules, and it provides features for error correction and word sense disambiguation. It is aimed at researchers, developers, and linguists working with Portuguese text data who want to save time and improve the accuracy of their analyses.
A lemma is a word that stands at the head of a definition in a dictionary (Wikipedia).
Example
from pt_lemmatizer.lemma import Lemmatizer

lemmatizer = Lemmatizer()
# All input words must be unidecoded (accents stripped) and lowercased.
lemmatizer.lemmatize('apagou')     # returns 'apagar'
lemmatizer.lemmatize('nasalaram')  # returns 'nasalar'
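The example above requires input that is already unidecoded and lowercased. The sketch below shows one way to prepare raw text before calling `lemmatize`. It is not part of the package: `normalize_word` and `normalize_tokens` are hypothetical helper names, and the accent stripping uses the standard library's `unicodedata` module as an approximation of the third-party `unidecode` package, on the assumption that removing combining marks is enough for typical Portuguese words.

```python
import unicodedata
from typing import List


def normalize_word(word: str) -> str:
    """Strip diacritics and lowercase a single word.

    Hypothetical helper: approximates unidecode for Portuguese by
    decomposing characters (NFKD) and dropping combining marks.
    """
    decomposed = unicodedata.normalize("NFKD", word)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


def normalize_tokens(text: str) -> List[str]:
    """Split text on whitespace and normalize each token.

    Assumption: whitespace splitting is a sufficient tokenizer here;
    punctuation is left attached to the tokens.
    """
    return [normalize_word(token) for token in text.split()]


if __name__ == "__main__":
    print(normalize_tokens("Ele Apagou a Canção"))
```

Each normalized token can then be passed to `lemmatize` as in the example above. If the package's behavior depends on exact `unidecode` output, using `unidecode` itself for this step would be the safer choice.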
Download files
Built Distribution
Hashes for pt_lemmatizer-1.1.3-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 514d6b25ebe4246bc43b31f08d0b807ac826998873e01f6f05139f42d719121c
MD5 | 1ed04049e87fc341cb3f01cc82f56714
BLAKE2b-256 | 47276e248ea1b54c2549859db20387f24eee137035592b06d124da97e39148d8