
Takes a sentence as input and augments it

Project description

nlp-augment

It takes a sentence as input and augments it based on word sense, PoS-tag synonyms, and synonyms.

Installation

pip install nlp-augment

For Jupyter Notebook: !pip install nlp-augment

How to use it?

import augment

Method 1: We apply word-sense disambiguation to each word of the input sentence, after a preprocessing stage that removes stopwords and other uncommon characters. The synonymy relation is used to extract the list of senses for each word. Next, to find out which of these senses best fits the context of the sentence, Lesk's algorithm is employed. The original version of this algorithm disambiguates words in short sentences: the gloss of the word to disambiguate (the dictionary definitions of its senses) is compared with the glosses of the other words in the sentence. The sense that shares the largest number of words with the glosses of the other words in the sentence is chosen and assigned to the target word.

Example of hate-sentence augmentation

augment.sense('you are gay')

['you are gay', 'you are queer', 'you are homophile']
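The gloss-overlap idea behind Method 1 can be sketched in a few lines. This is not the package's implementation: it uses the simplified Lesk variant (each gloss is compared with the context words themselves rather than with the glosses of the neighbouring words), and the sense inventory TOY_SENSES, the stopword list, and the function names are all made up for illustration.

```python
# Toy sense inventory: word -> list of (sense label, gloss, synonyms).
# Hypothetical entries for illustration only; this is not WordNet data.
TOY_SENSES = {
    "bank": [
        ("bank.finance", "an institution that accepts deposits and lends money", ["depository"]),
        ("bank.river", "sloping land beside a river or lake", ["riverside"]),
    ],
}

# Small hand-picked stopword list (an assumption, not the package's list).
STOPWORDS = {"a", "an", "the", "of", "or", "and", "that", "to", "at", "i", "in"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


def simplified_lesk(word, sentence):
    """Return the sense of `word` whose gloss shares the most words with the sentence."""
    context = set(tokenize(sentence)) - {word}
    best, best_overlap = None, -1
    for sense in TOY_SENSES.get(word, []):
        _label, gloss, _synonyms = sense
        overlap = len(context & set(tokenize(gloss)))
        if overlap > best_overlap:  # ties keep the earlier sense
            best, best_overlap = sense, overlap
    return best


def augment_by_sense(sentence):
    """Replace each word with the synonyms of its single best-fitting sense."""
    tokens = sentence.split()
    out = [" ".join(tokens)]
    for i, tok in enumerate(tokens):
        sense = simplified_lesk(tok.lower(), sentence)
        if sense is None:
            continue  # word not in the toy inventory
        for syn in sense[2]:
            candidate = " ".join(tokens[:i] + [syn] + tokens[i + 1:])
            if candidate not in out:
                out.append(candidate)
    return out


print(augment_by_sense("I deposit money at the bank"))
```

Because only one sense per word is kept, this approach yields the fewest variants of the three methods, which matches the short output list shown above.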

Method 2: We apply Part-of-Speech (PoS) tagging to each sentence. The tags are then used to extract all meanings (synsets) and synonyms that correspond to each word-PoS combination. This approach can greatly expand the semantic space compared with the previously mentioned data augmentation approach (Method 1), as one word can have multiple meanings within the same part of speech.

augment.pos('you are gay')

['you are gay', 'you are cheery', 'you are sunny', 'you are jocund', 'you are jolly', 'you are jovial', 'you are merry', 'you are mirthful', 'you are brave', 'you are braw', 'you are festal', 'you are festive', 'you are queer', 'you are homophile']
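As a rough sketch of Method 2, the snippet below expands a sentence using every synset that matches a word's PoS tag. It assumes the sentence has already been tagged, and TOY_SYNSETS, the tag names, and augment_by_pos are hypothetical stand-ins rather than the package's actual data or API.

```python
# Toy lexicon keyed by (word, PoS tag) -> list of synsets (lists of lemmas).
# Hypothetical entries for illustration only; this is not WordNet data.
TOY_SYNSETS = {
    ("light", "ADJ"): [["light", "lightweight"], ["light", "pale"]],
    ("light", "NOUN"): [["light", "lamp"]],
    ("light", "VERB"): [["light", "ignite"]],
}


def augment_by_pos(tagged):
    """Expand a pre-tagged sentence with all synonyms matching each word's PoS.

    `tagged` is a list of (token, pos) pairs; producing the tags is assumed
    to happen in an earlier step (for example with an off-the-shelf tagger).
    """
    tokens = [tok for tok, _pos in tagged]
    out = [" ".join(tokens)]
    for i, (tok, pos) in enumerate(tagged):
        # Every synset of this word with the same PoS contributes its lemmas.
        for synset in TOY_SYNSETS.get((tok.lower(), pos), []):
            for lemma in synset:
                candidate = " ".join(tokens[:i] + [lemma] + tokens[i + 1:])
                if candidate not in out:
                    out.append(candidate)
    return out


tagged_sentence = [("the", "DET"), ("bag", "NOUN"), ("is", "VERB"), ("light", "ADJ")]
print(augment_by_pos(tagged_sentence))
```

Both adjective senses of "light" contribute synonyms here, while the noun and verb senses are filtered out by the tag.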

Method 3: We extract all possible meanings (synsets) of every complete word (after preprocessing), and then we retrieve the synonyms associated with every possible meaning. This significantly expands the semantic space of each sentence compared to the first two methods. Here we consider all possible meanings (including every PoS that the word may belong to) as well as the similar words of each meaning, regardless of whether they are coherent with the surrounding context.

augment.synonym('you are gay')

['you are gay', 'you ar gay', 'you be gay', 'you exist gay', 'you equal gay', 'you constitute gay', 'you represent gay', 'you make up gay', 'you comprise gay', 'you follow gay', 'you embody gay', 'you personify gay', 'you live gay', 'you cost gay', 'you are homosexual', 'you are homophile', 'you are homo', 'you are cheery', 'you are sunny', 'you are jocund', 'you are jolly', 'you are jovial', 'you are merry', 'you are mirthful', 'you are brave', 'you are braw', 'you are festal', 'you are festive', 'you are queer']
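Method 3 can be sketched the same way by dropping the PoS filter altogether. Again, TOY_SYNSETS and augment_by_synonym are hypothetical names with made-up data, intended only to show why this method produces the largest and least context-aware set of variants.

```python
# Toy lexicon: word -> list of (PoS tag, lemmas) synsets.
# Hypothetical entries for illustration only; this is not WordNet data.
TOY_SYNSETS = {
    "light": [
        ("ADJ", ["light", "lightweight"]),
        ("ADJ", ["light", "pale"]),
        ("NOUN", ["light", "lamp"]),
        ("VERB", ["light", "ignite"]),
    ],
}


def augment_by_synonym(sentence):
    """Expand a sentence with the lemmas of every synset of every word.

    No disambiguation and no PoS filtering is applied, so some of the
    generated sentences will not be coherent.
    """
    tokens = sentence.split()
    out = [" ".join(tokens)]
    for i, tok in enumerate(tokens):
        for _pos, lemmas in TOY_SYNSETS.get(tok.lower(), []):
            for lemma in lemmas:
                candidate = " ".join(tokens[:i] + [lemma] + tokens[i + 1:])
                if candidate not in out:
                    out.append(candidate)
    return out


print(augment_by_synonym("the bag is light"))
```

The noun and verb synonyms are substituted even though "light" is used as an adjective, which mirrors the incoherent variants visible in the output list above.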

License

© 2021 Md Saroar Jahan

This repository is licensed under the MIT license. See LICENSE for details.



