YAWlib - Yet Another WordNet library for Python
A Python library for accessing major WordNet releases using relational databases for high-performance batch processing.
Supported wordnets:
- Princeton WordNet 3.0
- NTU Open Multilingual WordNet
- Gloss WordNet
- and more to be added in future versions
Installation
yawlib is available on PyPI:
pip install yawlib
Prebuilt database files are available on the author's Open Science Framework project page: https://osf.io/9udjk/.
Download them and extract them to ~/wordnet/ in your home folder.
On Linux it should look something like:
/home/username/wordnet/
- glosstag.db
- sqlite-30.db
- wn-ntumc.db
On macOS:
/Users/username/wordnet/
- glosstag.db
- sqlite-30.db
- wn-ntumc.db
On Windows:
C:\Users\<username>\wordnet\
- glosstag.db
- sqlite-30.db
- wn-ntumc.db
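A quick way to confirm that the files landed in the right place is to open each one and list its tables. The sketch below uses only the Python standard library and assumes that all three files are SQLite databases (suggested by the sqlite-30.db name, but not stated for the other two). The helper names list_tables and inspect_wordnet_home are hypothetical and are not part of yawlib.

```python
import sqlite3
from pathlib import Path

# File names taken from the directory listings above.
EXPECTED_DBS = ("glosstag.db", "sqlite-30.db", "wn-ntumc.db")


def list_tables(db_path):
    """Return the table names of one SQLite file, sorted by name."""
    # Open read-only so that a mistyped path does not create an empty file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


def inspect_wordnet_home(home=None):
    """Map each expected file name to its table list, or None if it is missing."""
    home = Path(home) if home is not None else Path.home() / "wordnet"
    report = {}
    for filename in EXPECTED_DBS:
        path = home / filename
        report[filename] = list_tables(path) if path.is_file() else None
    return report


if __name__ == "__main__":
    for filename, tables in inspect_wordnet_home().items():
        status = "missing" if tables is None else f"{len(tables)} table(s)"
        print(f"{filename}: {status}")
```

Because the query only reads sqlite_master, it makes no assumption about the schema inside each database.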
To verify that yawlib is working properly, you can use the info command.
# Show yawlib information
python3 -m yawlib info
Command-line tools
yawlib includes a command-line tool for querying wordnets directly from the terminal.
For example, to search for synsets by the lemma research, run:
python3 -m yawlib lemma research
Looking for synsets by term (Provided: research | pos = None)
〔Synset〕00636921-n 〔Lemmas〕research 〔Keys〕research%1:04:00::
------------------------------------------------------------
(def) “systematic investigation to establish facts;”
〔Synset〕05797597-n 〔Lemmas〕inquiry; enquiry; research 〔Keys〕inquiry%1:09:01:: enquiry%1:09:00:: research%1:09:00::
------------------------------------------------------------
(def) “a search for knowledge;”
(ex) their pottery deserves more research than it has received;
〔Synset〕00648224-v 〔Lemmas〕research; search; explore 〔Keys〕research%2:31:00:: search%2:31:00:: explore%2:31:00::
------------------------------------------------------------
(def) “inquire into;”
(ex) the students had to research the history of the Second World War for their history project;
(ex) He searched for information on his relatives on the web;
(ex) Scientists are exploring the nature of consciousness;
〔Synset〕00877327-v 〔Lemmas〕research 〔Keys〕research%2:32:00::
------------------------------------------------------------
(def) “attempt to find out in a systematically and scientific manner;”
(ex) The student researched the history of that word;
Found 4 synset(s)
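For batch use it can be handy to turn this terminal output into structured data. The following is a minimal sketch that parses text shaped like the listing above. It assumes the output layout (the 〔Synset〕, 〔Lemmas〕 and 〔Keys〕 markers and the (def) and (ex) prefixes) stays as shown here. The function parse_lemma_output is a hypothetical helper, not a yawlib API.

```python
import re

# One header line per synset, e.g.
# 〔Synset〕00636921-n 〔Lemmas〕research 〔Keys〕research%1:04:00::
HEADER = re.compile(r"〔Synset〕\s*(\S+)\s*〔Lemmas〕\s*(.*?)\s*〔Keys〕\s*(.*)")

SAMPLE = """\
〔Synset〕00636921-n 〔Lemmas〕research 〔Keys〕research%1:04:00::
------------------------------------------------------------
(def) “systematic investigation to establish facts;”
〔Synset〕05797597-n 〔Lemmas〕inquiry; enquiry; research 〔Keys〕inquiry%1:09:01:: enquiry%1:09:00:: research%1:09:00::
------------------------------------------------------------
(def) “a search for knowledge;”
(ex) their pottery deserves more research than it has received;
"""


def parse_lemma_output(text):
    """Parse CLI-style output into a list of synset dictionaries."""
    synsets = []
    for raw in text.splitlines():
        line = raw.strip()
        match = HEADER.match(line)
        if match:
            synset_id, lemmas, keys = match.groups()
            synsets.append({
                "id": synset_id,
                # Lemmas are separated by semicolons, sense keys by whitespace.
                "lemmas": [part.strip() for part in lemmas.split(";") if part.strip()],
                "keys": keys.split(),
                "definition": None,
                "examples": [],
            })
        elif synsets and line.startswith("(def)"):
            # Drop the prefix and the surrounding curly quotes.
            synsets[-1]["definition"] = line[len("(def)"):].strip().strip("“”")
        elif synsets and line.startswith("(ex)"):
            synsets[-1]["examples"].append(line[len("(ex)"):].strip())
        # Separator lines and the "Found N synset(s)" summary are ignored.
    return synsets


if __name__ == "__main__":
    for synset in parse_lemma_output(SAMPLE):
        print(synset["id"], synset["lemmas"], synset["definition"])
```

Scraping terminal output is brittle, so this is best treated as a stopgap for quick scripts.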
Development
Go to the yawlib folder, execute the config script, and then run wntk.sh to generate the glosstag DB file.
git clone https://github.com/letuananh/yawlib
cd yawlib
# create virtual environment
python3 -m venv yawlib_py3
. yawlib_py3/bin/activate
# install required packages
pip install -r requirements.txt
pip install -r requirements-optional.txt
# to show information
python -m yawlib info
Compiling glosstag.db from source
Make sure that the glosstag source folder and sqlite-30.db are available in ~/wordnet.
The directory should look like this:
/home/user/wordnet
├── glosstag
│ ├── dtd
│ │ └── glosstag.dtd
│ ├── LICENSE.txt
│ ├── merged
│ │ ├── adj.xml
│ │ ├── adv.xml
│ │ ├── noun.xml
│ │ └── verb.xml
│ ├── README.txt
│ ├── standoff
│ │ ├── 00
│ │ ├── 01
│ │ ├── 02
│ │ ├── ....
│ │ ├── index.byid.tab
│ │ ├── index.bylem.adj.tab
│ │ ├── index.bylem.adv.tab
│ │ ├── index.bylem.noun.tab
│ │ ├── index.bylem.tab
│ │ ├── index.bylem.verb.tab
│ │ └── index.bysk.tab
│ └── statistics.tab
├── glosstag.db
├── sqlite-30.db
├── wn-ntumc.db
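The layout above can be checked from Python before starting the build. This is a rough sketch only: the list of required paths is an assumption drawn from the tree above (the four merged XML files plus sqlite-30.db), since the exact files the build reads are not documented here. The function missing_glosstag_inputs is a hypothetical helper, not part of yawlib.

```python
from pathlib import Path

# Assumed inputs, relative to ~/wordnet, taken from the directory tree above.
REQUIRED = (
    "sqlite-30.db",
    "glosstag/merged/adj.xml",
    "glosstag/merged/adv.xml",
    "glosstag/merged/noun.xml",
    "glosstag/merged/verb.xml",
)


def missing_glosstag_inputs(home=None):
    """Return the required relative paths that are not regular files under home."""
    home = Path(home) if home is not None else Path.home() / "wordnet"
    return [rel for rel in REQUIRED if not (home / rel).is_file()]


if __name__ == "__main__":
    missing = missing_glosstag_inputs()
    if missing:
        print("Missing inputs:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected inputs were found.")
```

An empty result only means the files exist; it does not validate their contents.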
Then run the create command to generate the database:
python -m yawlib create
Original sources
- WordNet 3.0 SQLite: https://sourceforge.net/projects/wnsql/files/wnsql3/sqlite/3.0/
- WordNet glosstag (XML): http://wordnet.princeton.edu/glosstag.shtml
- NTU Open Multilingual Wordnet: http://compling.hss.ntu.edu.sg/omw/