
A fast Python implementation of the extended LESK algorithm for Word-Sense Disambiguation (WSD)


Le's Lesk

A fast Python 3 Word-Sense Disambiguation package (WSD) using the extended LESK algorithm
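For intuition only, the core idea behind Lesk-style disambiguation can be sketched as picking the sense whose dictionary gloss shares the most words with the surrounding sentence. The sketch below is a toy version of the simplified Lesk idea, not the extended algorithm and not lelesk's implementation; the glosses, the stopword list, and the names `TOY_SENSES`, `tokenize`, and `toy_lesk` are all made up for illustration.

```python
# Toy simplified Lesk: choose the sense whose gloss overlaps most with the context.
# The glosses are hand-written for illustration; they do not come from WordNet or lelesk.
STOPWORDS = {"a", "an", "the", "to", "of", "i", "at", "that", "and", "or", "beside"}

TOY_SENSES = {
    "bank": {
        "financial institution": "an institution that accepts deposits and lends money",
        "river side": "sloping land beside a river or lake",
    }
}


def tokenize(text):
    """Lowercase, strip basic punctuation, and drop stopwords."""
    words = (w.strip(".,!?").lower() for w in text.split())
    return {w for w in words if w and w not in STOPWORDS}


def toy_lesk(word, sentence, senses=TOY_SENSES):
    """Return the sense label whose gloss shares the most words with the sentence."""
    context = tokenize(sentence) - {word}
    scores = {
        label: len(context & tokenize(gloss))
        for label, gloss in senses[word].items()
    }
    # On a tie, max() returns the first label in dictionary order.
    return max(scores, key=scores.get)


print(toy_lesk("bank", "I go to the bank to withdraw money."))  # financial institution
print(toy_lesk("bank", "I sat at the river bank."))             # river side
```

Extended variants of Lesk typically enlarge each gloss with text from related senses before counting overlaps, which this toy version does not attempt.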

Install

lelesk is available on PyPI and can be installed using pip:

pip install lelesk

Lelesk uses the NLTK lemmatizer and the yawlib WordNet API. To install the NLTK data, start a Python prompt, import nltk, and run the download command (only the book package is required):

$ python3
Python 3.6.9 (default, Jan 26 2021, 15:33:00) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> nltk.download("book")

Download yawlib databases and extract all the db files to ~/wordnet. For more information:

Command-line tools

To disambiguate a sentence, run this command on the terminal:

python3 -m lelesk wsd "I go to the bank to get money."

To perform word-sense disambiguation on a text file, prepare a text file in which each line is a sentence.

For example, here is the content of the file demo.txt:

I go to the bank to withdraw money.
I sat at the river bank.

You can then run one of the following commands:

# output to TTL/JSON (a single file)
python3 -m lelesk file demo.txt demo_wsd_output.json --ttl json

# output to TTL/TSV (multiple TSV files)
python3 -m lelesk file demo.txt demo_wsd_output.json --ttl tsv
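If the sentences come from another program rather than a hand-written file, the same steps can be scripted. The following is a minimal sketch, assuming only the command-line form shown above; the helper names `write_sentences` and `build_command` are hypothetical and not part of lelesk, and `sys.executable` stands in for `python3`.

```python
import sys
from pathlib import Path


def write_sentences(path, sentences):
    """Write one sentence per line, the input layout described above."""
    text = "\n".join(s.strip() for s in sentences) + "\n"
    Path(path).write_text(text, encoding="utf-8")


def build_command(input_path, output_path, ttl_format="json"):
    """Build the argument list for `python3 -m lelesk file ...`.

    Hypothetical helper: it only mirrors the command shown above and uses
    the current interpreter (sys.executable) in place of `python3`.
    """
    return [
        sys.executable, "-m", "lelesk", "file",
        str(input_path), str(output_path),
        "--ttl", ttl_format,
    ]


write_sentences("demo.txt", [
    "I go to the bank to withdraw money.",
    "I sat at the river bank.",
])
cmd = build_command("demo.txt", "demo_wsd_output.json")
print(" ".join(cmd[1:]))
# To actually run it (requires lelesk and its data to be installed):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.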

Issues

If you run into any issues, please report them at https://github.com/letuananh/lelesk/issues


