
spaCy WordNet

spaCy WordNet is a simple custom component for using WordNet, MultiWordNet and WordNet Domains with spaCy.

The component combines the NLTK WordNet interface with WordNet Domains to allow users to:

  • Get all synsets for a processed token. For example, getting all the synsets (word senses) of the word bank.
  • Get and filter synsets by domain. For example, getting synonyms of the verb withdraw in the financial domain (both capabilities are shown in the short sketch below).
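
A minimal sketch of both calls, assuming the en_core_web_sm model and a spaCy 3.x pipeline (the full setup is described under Usage below):

import spacy
from spacy_wordnet.wordnet_annotator import WordnetAnnotator

nlp = spacy.load('en_core_web_sm')
nlp.add_pipe("spacy_wordnet", after='tagger')

token = nlp('withdraw')[0]
token._.wordnet.synsets()                                            # all senses of "withdraw"
token._.wordnet.wordnet_synsets_for_domain(['finance', 'banking'])   # only the financial senses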

Getting started

The spaCy WordNet component can be easily integrated into spaCy pipelines. You just need the following:

Prerequisites

  • Python 3.X
  • spaCy

You also need to install the following NLTK WordNet data:

python -m nltk.downloader wordnet
python -m nltk.downloader omw
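
Equivalently, the same corpora can be fetched from a Python session with NLTK's downloader; a small sketch of the same step:

import nltk

# Download the WordNet and Open Multilingual Wordnet corpora used by spacy-wordnet
nltk.download('wordnet')
nltk.download('omw')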

Install

pip install spacy-wordnet

Supported languages

Almost all Open Multilingual Wordnet languages are supported.
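
If you want to see which language codes the underlying Open Multilingual Wordnet exposes, you can ask NLTK directly. A quick sketch, assuming the wordnet and omw data from the prerequisites are already installed:

from nltk.corpus import wordnet as wn

# Print the ISO-639 language codes available through the Open Multilingual Wordnet
print(sorted(wn.langs()))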

Usage

Once you choose the desired language (from the list of supported ones above), you will need to manually download a spaCy model for it. Check the list of available models for each language for spaCy 2.x or spaCy 3.x.

English example

Download example model:

python -m spacy download en_core_web_sm

Run:

import spacy

from spacy_wordnet.wordnet_annotator import WordnetAnnotator 

# Load a spaCy model
nlp = spacy.load('en_core_web_sm')
# spaCy 3.x
nlp.add_pipe("spacy_wordnet", after='tagger')
# spaCy 2.x
# nlp.add_pipe(WordnetAnnotator(nlp, name="spacy_wordnet"), after='tagger')
token = nlp('prices')[0]

# The wordnet extension links the spaCy token to the NLTK WordNet interface,
# giving access to its synsets and lemmas
token._.wordnet.synsets()
token._.wordnet.lemmas()

# The token is also automatically tagged with WordNet domains
token._.wordnet.wordnet_domains()

spaCy WordNet also lets you find synonyms filtered by a domain of interest, for example the economy:

economy_domains = ['finance', 'banking']
enriched_sentence = []
sentence = nlp('I want to withdraw 5,000 euros')

# For each token in the sentence
for token in sentence:
    # We get those synsets within the desired domains
    synsets = token._.wordnet.wordnet_synsets_for_domain(economy_domains)
    if not synsets:
        enriched_sentence.append(token.text)
    else:
        lemmas_for_synset = [lemma for s in synsets for lemma in s.lemma_names()]
        # If we found a synset in the economy domains
        # we get the variants and add them to the enriched sentence
        enriched_sentence.append('({})'.format('|'.join(set(lemmas_for_synset))))

# Let's see our enriched sentence
print(' '.join(enriched_sentence))
# >> I (need|want|require) to (draw|withdraw|draw_off|take_out) 5,000 euros
    

Portuguese example

Download example model:

python -m spacy download pt_core_news_sm

Run:

import spacy

from spacy_wordnet.wordnet_annotator import WordnetAnnotator 

# Load a spaCy model
nlp = spacy.load('pt_core_news_sm')
# spaCy 3.x
nlp.add_pipe("spacy_wordnet", after='tagger', config={'lang': nlp.lang})
# spaCy 2.x
# nlp.add_pipe(WordnetAnnotator(nlp.lang), after='tagger')
text = "Eu quero retirar 5.000 euros"
economy_domains = ['finance', 'banking']
enriched_sentence = []
sentence = nlp(text)

# For each token in the sentence
for token in sentence:
    # We get those synsets within the desired domains
    synsets = token._.wordnet.wordnet_synsets_for_domain(economy_domains)
    if not synsets:
        enriched_sentence.append(token.text)
    else:
        lemmas_for_synset = [lemma for s in synsets for lemma in s.lemma_names('por')]
        # If we found a synset in the economy domains
        # we get the variants and add them to the enriched sentence
        enriched_sentence.append('({})'.format('|'.join(set(lemmas_for_synset))))

# Let's see our enriched sentence
print(' '.join(enriched_sentence))
# >> Eu (querer|desejar|esperar) retirar 5.000 euros

