Skip to main content

A library for African WordNet.

Project description

AfricanWordNet: Implementation of WordNets for African languages

This library extends OMW implemented in NLTK to add support for the following African languages.

  • Sepedi (nso)
  • Xitsonga (tsn)
  • Tshivenda (ven)
  • isiZulu (zul)
  • isiXhosa (xho)

License: CC BY-NC-SA 4.0 GitHub release Wheel python [TotalDownloads]

Requirements

Installation

  • From Pypi
    • pip install africanwordnet
  • From source
    • pip install https://github.com/JosephSefara/AfricanWordNet.git

Citation Paper

@inproceedings{sefara2020practical,
  title={Paper Title},
  author={Sefara, Tshephisho and Mokgonyane, Tumisho and Marivate, Vukosi},
  booktitle={Proceedings of the Eleventh Global Wordnet Conference},
  paages={},
  year={2020},
}

Usage

>>> from nltk.corpus import wordnet as wn
>>> import africanwordnet

>>> wn.langs()
['nso', 'tsn', 'ven', 'zul', 'xho']

Setswana WordNet

>>> wn.synsets('phêpafatsa',lang=('tsn'))
[Synset('scavenge.v.04'),
 Synset('tidy.v.01'),
 Synset('refine.v.04'),
 Synset('refine.v.03'),
 Synset('purify.v.01'),
 Synset('purge.v.04'),
 Synset('purify.v.02'),
 Synset('clean.v.08'),
 Synset('clean.v.01'),
 Synset('houseclean.v.01')]

>>> wn.lemmas('phêpafatsa', lang='tsn')
[Lemma('scavenge.v.04.phêpafatsa'),
 Lemma('tidy.v.01.phêpafatsa'),
 Lemma('refine.v.04.phêpafatsa'),
 Lemma('refine.v.03.phêpafatsa'),
 Lemma('purify.v.01.phêpafatsa'),
 Lemma('purge.v.04.phêpafatsa'),
 Lemma('purify.v.02.phêpafatsa'),
 Lemma('clean.v.08.phêpafatsa'),
 Lemma('clean.v.01.phêpafatsa'),
 Lemma('houseclean.v.01.phêpafatsa')]

>>> wn.synset('purify.v.01').lemma_names('tsn')
['phêpafatsa']

>>> lemma = wn.lemma('purify.v.01.phêpafatsa', lang='tsn')
>>> whole_lemma.lang()
'tsn'

Sepedi WordNet

>>> wn.synsets('taelo',lang=('nso'))
[Synset('call.n.12'),
 Synset('mandate.n.03'),
 Synset('command.n.01'),
 Synset('order.n.01'),
 Synset('commission.n.06'),
 Synset('commandment.n.01'),
 Synset('directive.n.01'),
 Synset('injunction.n.01')]

>>> wn.lemmas('taelo', lang='nso')
[Lemma('call.n.12.taelo'),
 Lemma('mandate.n.03.taelo'),
 Lemma('command.n.01.taelo'),
 Lemma('order.n.01.taelo'),
 Lemma('commission.n.06.taelo'),
 Lemma('commandment.n.01.taelo'),
 Lemma('directive.n.01.taelo'),
 Lemma('injunction.n.01.taelo')]

>>> wn.synset('call.n.12').lemma_names('nso')
['taelo']

>>> lemma = wn.lemma('call.n.12.taelo', lang='nso')
>>> whole_lemma.lang()
'nso'

isiZulu WordNet

>>> wn.synsets('iqoqo', lang='zul')
[Synset('whole.n.02'),
 Synset('conspectus.n.01'),
 Synset('overview.n.01'),
 Synset('sketch.n.03'),
 Synset('compilation.n.01'),
 Synset('collection.n.01'),
 Synset('team.n.02'),
 Synset('set.n.01')]

>>> wn.lemmas('iqoqo', lang='zul')
[Lemma('whole.n.02.iqoqo'),
 Lemma('conspectus.n.01.iqoqo'),
 Lemma('overview.n.01.iqoqo'),
 Lemma('sketch.n.03.iqoqo'),
 Lemma('compilation.n.01.iqoqo'),
 Lemma('collection.n.01.iqoqo'),
 Lemma('team.n.02.iqoqo'),
 Lemma('set.n.01.iqoqo')]

>>> wn.synset('whole.n.02').lemma_names('zul')
['iqoqo']

>>> whole_lemma = wn.lemma('whole.n.02.iqoqo', lang='zul')
>>> whole_lemma.lang()
'zul'

isiXhosa WordNet

>>> wn.synsets('imali',lang=('xho'))
[Synset('finance.n.03'),
 Synset('wealth.n.04'),
 Synset('capital.n.01'),
 Synset('store.n.02'),
 Synset('credit.n.02'),
 Synset('money.n.01'),
 Synset('currency.n.01'),
 Synset('purse.n.02'),
 Synset('franc.n.01'),
 Synset('cent.n.01')]

>>> wn.lemmas('imali', lang='xho')
[Lemma('finance.n.03.imali'),
 Lemma('wealth.n.04.imali'),
 Lemma('capital.n.01.imali'),
 Lemma('store.n.02.imali'),
 Lemma('credit.n.02.imali'),
 Lemma('money.n.01.imali'),
 Lemma('currency.n.01.imali'),
 Lemma('purse.n.02.imali'),
 Lemma('franc.n.01.imali'),
 Lemma('cent.n.01.imali')]

>>> wn.synset('wealth.n.04').lemma_names('xho')
['imali']

>>> lemma = wn.lemma('wealth.n.04.imali', lang='xho')
>>> lemma.lang()
'xho'

Tshivenda WordNet

>>> wn.synsets('tshifanyiso',lang=('ven'))
[Synset('picture.n.05'), 
 Synset('word_picture.n.01'), 
 Synset('portrayal.n.01')]

>>> wn.lemmas('tshifanyiso', lang='ven')
[Lemma('picture.n.05.tshifanyiso'),
 Lemma('word_picture.n.01.tshifanyiso'),
 Lemma('portrayal.n.01.tshifanyiso')]

>>> wn.synset('picture.n.05').lemma_names('ven')
['tshifanyiso']

>>> lemma = wn.lemma('picture.n.05.tshifanyiso', lang='ven')
>>> whole_lemma.lang()
'ven'

Find related words

The word taelo in Sepedi is related to

  • tagafalo
  • molao
  • tlhalošo
words = set()
synsets = wn.synsets('taelo',lang=('nso'))
for synset in synsets: # synset is in english
     for hypo in synset.hyponyms():
        for lemma in hypo.lemmas("nso"):
            words.add(lemma.name())
print('taelo', '---', words)

taelo --- {'taelo', 'tagafalo', 'molao', 'tlhalošo'}

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

africanwordnet-0.0.1.tar.gz (4.5 kB view details)

Uploaded Source

Built Distribution

africanwordnet-0.0.1-py3-none-any.whl (5.0 kB view details)

Uploaded Python 3

File details

Details for the file africanwordnet-0.0.1.tar.gz.

File metadata

  • Download URL: africanwordnet-0.0.1.tar.gz
  • Upload date:
  • Size: 4.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.7.7

File hashes

Hashes for africanwordnet-0.0.1.tar.gz
Algorithm Hash digest
SHA256 4c8d4c69493888a628c5ea1090aabbbcce9945dfd55987bdf0fd1432e4b0a8fa
MD5 be02ca764f904fb2653fe2293362e56c
BLAKE2b-256 cd37176176368105595e22c1d9924bdf018d369005b81679a28089d7495b99e0

See more details on using hashes here.

File details

Details for the file africanwordnet-0.0.1-py3-none-any.whl.

File metadata

  • Download URL: africanwordnet-0.0.1-py3-none-any.whl
  • Upload date:
  • Size: 5.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/50.3.2 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.7.7

File hashes

Hashes for africanwordnet-0.0.1-py3-none-any.whl
Algorithm Hash digest
SHA256 c978a52cb5a7ffe72f87630a3725a244acd46993ef40b3acf709d620495e436c
MD5 36311556915334774fc95f67decf9339
BLAKE2b-256 03f43d85ef2a03169d470c0ba746b2d313a3f691423f59692ec45a91f9698ea5

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page