Word Sense Disambiguation wrapper

In natural language processing, word sense disambiguation (WSD) is the problem of determining which "sense" (meaning) of a word is activated by its use in a particular context, a process that appears to be largely unconscious in people.

This is a simple library that wraps two WSD methods: NLTK (through pywsd) and Babelfy.

Requirements

Install the dependencies with pip:

pip3 install xmltodict
pip3 install nltk
pip3 install pywsd

The NLTK library requires additional configuration; see this link for more details.

Methods

The wsdNLTK method calls the function pywsd.disambiguate and returns a mapping between words of the input text and their WordNet Synsets, together with the start and end character offsets of each word.

wsd = WrapperWSD()
wsd.wsdNLTK(u'My sister has a dog. She loves him.')
#output: [('sister', Synset('sister.n.02'), 3, 9), ('dog', Synset('pawl.n.01'), 16, 19), ('loves', Synset('sleep_together.v.01'), 25, 30)]
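
The character offsets in the output above (for example 3 and 9 for 'sister') can be derived by scanning the input text from left to right. The following is a minimal sketch of that idea, not the library's actual implementation: the helper name attach_spans is hypothetical, and plain strings and None stand in for the Synset objects and for words without a sense.

```python
def attach_spans(text, word_senses):
    """Attach (start, end) character offsets to (word, sense) pairs.

    Hypothetical helper: `word_senses` lists (word, sense) pairs in reading
    order; pairs whose sense is None are left out of the result.
    """
    spans = []
    cursor = 0  # the search resumes after the previous match
    for word, sense in word_senses:
        start = text.find(word, cursor)
        if start == -1:
            continue  # the word does not appear verbatim in the text
        end = start + len(word)  # the end offset is exclusive
        cursor = end
        if sense is not None:
            spans.append((word, sense, start, end))
    return spans


text = 'My sister has a dog. She loves him.'
# Stand-in senses (strings) copied from the output shown above.
pairs = [('My', None), ('sister', 'sister.n.02'), ('has', None),
         ('dog', 'pawl.n.01'), ('loves', 'sleep_together.v.01')]
result = attach_spans(text, pairs)
print(result)
```

Resuming the search at the previous match keeps repeated words from being assigned the same span twice.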

Instead of returning the WordNet Synsets, the method wsdNLTK_offset returns a mapping between words of the input text and their WordNet offsets.

wsd.wsdNLTK_offset(u'My sister has a dog. She loves him.')
#output: [('president', 597265, 21, 30), ('USA', 8394922, 38, 41), ('best', 67379, 54, 58)]
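
In NLTK, a Synset exposes its WordNet offset through the offset() method, so turning Synset tuples into offset tuples is a short transformation. The sketch below assumes this is how the offsets are obtained. The names to_offsets and StubSynset are hypothetical; the stub only exists so the example runs without the WordNet corpus, and the offset 1234 is a placeholder, not a real WordNet offset.

```python
class StubSynset:
    """Stand-in for nltk.corpus.reader.wordnet.Synset (illustration only)."""

    def __init__(self, name, offset):
        self._name = name
        self._offset = offset

    def offset(self):
        # Mirrors Synset.offset(), which returns the offset as an int.
        return self._offset


def to_offsets(disambiguated):
    """Replace the Synset in each (word, synset, start, end) tuple by its offset."""
    return [(word, synset.offset(), start, end)
            for word, synset, start, end in disambiguated]


# Placeholder data: 1234 is not a real WordNet offset.
sample = [('dog', StubSynset('dog.n.01', 1234), 16, 19)]
converted = to_offsets(sample)
print(converted)
```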

A mapping between WordNet and Wikipedia was proposed in [Miller et al] and is available for download here. The next example shows some of its key-value pairs, where each key is a WordNet offset and each value is a Wikipedia URL.

wd2wiki = {
 1740: 'https://en.wikipedia.org/wiki/Madison_Square_Garden,_L.P.',
 2137: 'https://en.wikipedia.org/wiki/Abstraction',
 2452: 'https://en.wikipedia.org/wiki/Object_(philosophy)',
 2684: 'https://en.wikipedia.org/wiki/Computer_file',
 3553: 'https://en.wikipedia.org/wiki/Unit_of_alcohol',
 ...
 }

We use this mapping to link words to Wikipedia entities in those cases where a correspondence exists.

wsd.wsdNLTK_links(u'My sister has a dog. She loves him.')
#output: [{'start': 38, 'end': 41, 'label': 'USA', 'link': 'United_States_Army'}]
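
The linking step amounts to a dictionary lookup on the WordNet offset. Below is a minimal sketch of that lookup, assuming the link is the last path segment of the mapped Wikipedia URL; it is not the library's own code. The function name link_entities is hypothetical, and the mapping entry for offset 8394922 is inferred from the outputs shown in this document rather than copied from the mapping file.

```python
def link_entities(offset_tuples, wn2wiki):
    """Keep only the words whose WordNet offset has a Wikipedia page."""
    links = []
    for label, offset, start, end in offset_tuples:
        url = wn2wiki.get(offset)
        if url is None:
            continue  # no Wikipedia correspondence for this offset
        links.append({
            'start': start,
            'end': end,
            'label': label,
            'link': url.rsplit('/', 1)[-1],  # page title at the end of the URL
        })
    return links


# Assumed entry, inferred from the example outputs in this document.
mapping = {8394922: 'https://en.wikipedia.org/wiki/United_States_Army'}
tuples = [('president', 597265, 21, 30), ('USA', 8394922, 38, 41),
          ('best', 67379, 54, 58)]
linked = link_entities(tuples, mapping)
print(linked)
```

Words without an entry in the mapping are dropped silently, which is why the result can be much shorter than the input.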

The library also includes Babelfy, which targets BabelSynsets instead of WordNet Synsets:

wsd.wsdBabelfy(u'My sister has a dog. She loves him.')
#output: [('sister', 'bn:00071838n', 3, 9), ('dog', 'bn:00015267n', 16, 19), ('loves', 'bn:00090504v', 25, 30)]
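
Each BabelSynset ID in the output above ends with a one-letter part-of-speech tag (n for the two nouns, v for the verb). The sketch below splits an ID into its numeric part and that tag. It assumes the usual `bn:` prefix followed by eight digits and one of the letters n, v, a, r; the helper split_babel_id is hypothetical and not part of this library or of Babelfy.

```python
import re

# Assumed ID shape: 'bn:' + 8 digits + part-of-speech letter.
_BABEL_ID = re.compile(r'^bn:(\d{8})([nvar])$')


def split_babel_id(babel_id):
    """Return (digits, pos) for a BabelSynset ID, or None if it does not match."""
    match = _BABEL_ID.match(babel_id)
    if match is None:
        return None
    return match.group(1), match.group(2)


babelfy_output = [('sister', 'bn:00071838n', 3, 9),
                  ('dog', 'bn:00015267n', 16, 19),
                  ('loves', 'bn:00090504v', 25, 30)]
# Collect the words that Babelfy tagged as verbs.
verbs = [word for word, babel_id, _, _ in babelfy_output
         if split_babel_id(babel_id)[1] == 'v']
print(verbs)
```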

Reference

[Miller et al] WordNet–Wikipedia–Wiktionary: Construction of a Three-way Alignment. Tristan Miller and Iryna Gurevych. 2014 https://pdfs.semanticscholar.org/90cd/22a9cd59dc1fc21f4ec36e9c7d95085f7fb6.pdf
