Library for using the Polish Wordnet in the plwnxml format
Project description
Polish Wordnet Python library
A simple, easy-to-use and reasonably fast library for working with Słowosieć (also known as PlWordNet), a lexico-semantic database of the Polish language. PlWordNet can also be browsed online.
I created this library because, since version 2.9, PlWordNet cannot be easily loaded into Python (for example with nltk): it is only provided in a custom plwnxml format.
Usage
Load wordnet from an XML file (this will take about 20 seconds), and print basic statistics.
import plwordnet
wn = plwordnet.load('plwordnet_4_2.xml')
print(wn)
Expected output:
PlWordnet
lexical units: 513410
synsets: 353586
relation types: 306
synset relations: 1477849
lexical relations: 393137
Find lexical units named leśny and print all relations in which that unit is in the subject/parent position.
for lu in wn.lemmas('leśny'):
    for s, p, o in wn.lexical_relations_where(subject=lu):
        print(p.format(s, o))
Expected output:
leśny.2 tworzy kolokację z polana.1
leśny.2 jest synonimem mpar. do las.1
leśny.3 przypomina las.1
leśny.4 jest derywatem od las.1
leśny.5 jest derywatem od las.1
leśny.6 przypomina las.1
Print all relation types and their ids:
for id, rel in wn.relation_types.items():
    print(id, rel.name)
Expected output:
10 hiponimia
11 hiperonimia
12 antonimia
13 konwersja
...
See more usage examples in the examples notebook.
Installation
Note: plwordnet requires Python 3.7 or newer.
pip install plwordnet
Version support
This library should be able to read future versions of PlWordNet without modification, even if more relation types are added. Still, if you use this library with a version of PlWordNet that is not listed below, please consider reporting whether it is supported. Versions known to work:
- PlWordNet 4.2
- PlWordNet 4.0
- PlWordNet 3.2
- PlWordNet 3.0
- PlWordNet 2.3
- PlWordNet 2.2
- PlWordNet 2.1
Documentation
See plwordnet/wordnet.py for RelationType, Synset and LexicalUnit class definitions.
Package functions
load(source): Reads PlWordNet, where source is a path to the wordnet XML file or to a pickled wordnet object. The path can point to a file compressed with gzip or lzma.
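To illustrate what accepting gzip- or lzma-compressed paths can look like, here is a minimal sketch that picks an opener from the leading magic bytes of a file. It is not the library's implementation: open_maybe_compressed is a hypothetical helper, the payload is a placeholder, and plwordnet may detect compression differently (for example, by file extension).

```python
import gzip
import lzma
import os
import tempfile

# Magic bytes that identify each container format.
GZIP_MAGIC = b"\x1f\x8b"
XZ_MAGIC = b"\xfd7zXZ\x00"


def open_maybe_compressed(path):
    """Hypothetical helper: open `path` for binary reading, decompressing if needed."""
    with open(path, "rb") as f:
        head = f.read(6)
    if head.startswith(GZIP_MAGIC):
        return gzip.open(path, "rb")
    if head.startswith(XZ_MAGIC):
        return lzma.open(path, "rb")
    return open(path, "rb")


# Write the same placeholder payload three ways and read it back
# through a single code path.
payload = b"example payload"
results = {}
with tempfile.TemporaryDirectory() as tmp:
    plain = os.path.join(tmp, "data.xml")
    gz = os.path.join(tmp, "data.xml.gz")
    xz = os.path.join(tmp, "data.xml.xz")
    with open(plain, "wb") as f:
        f.write(payload)
    with gzip.open(gz, "wb") as f:
        f.write(payload)
    with lzma.open(xz, "wb") as f:
        f.write(payload)
    for name, path in [("plain", plain), ("gzip", gz), ("lzma", xz)]:
        with open_maybe_compressed(path) as f:
            results[name] = f.read()
```

Sniffing the content rather than trusting the extension keeps the helper working when a compressed file has been renamed.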
Wordnet instance properties
- lexical_relations: List of (subject, predicate, object) triples
- synset_relations: List of (subject, predicate, object) triples
- relation_types: Mapping from relation type id to object
- relation_by_name: Mapping from human readable relation name to relation ids
- lexical_units: Mapping from lexical unit id to unit object
- lexical_units_by_name: Mapping from lexical unit name to a set of matching lexical unit ids
- synsets: Mapping from synset id to object
- (lexical|synset)_relations_(s|o|p): Mapping from id of subject/object/predicate to a set of matching lexical unit/synset relation ids
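Per-position mappings of this kind turn a filtered lookup into a set intersection. The following toy sketch shows the idea with invented integer ids. The names build_index and where are hypothetical, and using list positions as relation ids is an assumption, so read it as an illustration of the data layout rather than the library's code.

```python
from collections import defaultdict

# Toy (subject, predicate, object) triples using made-up integer ids.
relations = [
    (1, 10, 2),
    (1, 11, 3),
    (4, 10, 2),
    (2, 12, 1),
]


def build_index(triples, position):
    """Map the id found at `position` (0=s, 1=p, 2=o) to a set of relation ids.

    A relation id here is simply the triple's position in the list (assumption).
    """
    index = defaultdict(set)
    for rel_id, triple in enumerate(triples):
        index[triple[position]].add(rel_id)
    return index


relations_s = build_index(relations, 0)
relations_p = build_index(relations, 1)
relations_o = build_index(relations, 2)


def where(subject=None, predicate=None, obj=None):
    """Return the triples that match every argument that is not None."""
    candidates = set(range(len(relations)))
    for value, index in (
        (subject, relations_s),
        (predicate, relations_p),
        (obj, relations_o),
    ):
        if value is not None:
            # Intersect with the relation ids recorded for this value.
            candidates &= index.get(value, set())
    return [relations[i] for i in sorted(candidates)]


print(where(subject=1))            # [(1, 10, 2), (1, 11, 3)]
print(where(predicate=10, obj=2))  # [(1, 10, 2), (4, 10, 2)]
```

Each filter costs one dictionary lookup plus a set intersection, so a query does not need to scan the full list of triples.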
Wordnet methods
- lemmas(value): Returns a list of LexicalUnit objects whose name is equal to value
- lexical_relations_where(subject, predicate, object): Returns lexical relation triples with matching subject and/or predicate and/or object. The subject, predicate and object arguments can be integer ids or LexicalUnit and RelationType objects.
- synset_relations_where(subject, predicate, object): Returns synset relation triples with matching subject and/or predicate and/or object. The subject, predicate and object arguments can be integer ids or Synset and RelationType objects.
- hypernyms(synset, interlingual=False): Returns hypernyms of a synset (synset can be an integer id or a Synset object)
- hyponyms(synset, interlingual=False): Returns hyponyms of a synset (synset can be an integer id or a Synset object)
- hypernym_paths(synset, full_search=False, interlingual=False): Returns a hypernym path from the synset to a synset with no hypernyms (or all possible paths if full_search=True)
- dump(dst): Pickles the Wordnet object to the opened file dst or to a new file with path dst.
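The difference between a single hypernym path and a full search is easiest to see on a small graph. Below is a toy sketch over an invented hypernym mapping. The graph, the helper all_hypernym_paths and the list-of-lists return shape are assumptions made for illustration; the library's own traversal order and return type may differ.

```python
# Toy hypernym graph: synset name -> list of direct hypernyms (invented data).
HYPERNYMS = {
    "jamnik": ["pies"],
    "pies": ["ssak", "zwierzę domowe"],
    "ssak": ["zwierzę"],
    "zwierzę domowe": ["zwierzę"],
    "zwierzę": [],
}


def all_hypernym_paths(synset, _seen=()):
    """Yield every path from `synset` up to a synset that has no hypernyms."""
    path_here = _seen + (synset,)
    # Skip parents already on the current path to guard against cycles.
    parents = [p for p in HYPERNYMS.get(synset, []) if p not in path_here]
    if not parents:
        yield list(path_here)
        return
    for parent in parents:
        yield from all_hypernym_paths(parent, path_here)


def hypernym_paths(synset, full_search=False):
    """Return one path by default, or all paths when full_search is True."""
    paths = all_hypernym_paths(synset)
    return list(paths) if full_search else [next(paths)]


print(hypernym_paths("jamnik"))
print(hypernym_paths("jamnik", full_search=True))
```

Because the paths come from a generator, the default call stops after the first root is reached and only the full search explores every branch.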
RelationType methods
format(x, y, short=False): Substitutes x and y into the RelationType display format (display). If short is true, x and y are separated by the short relation name (shortcut).
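As a rough illustration of the long and short forms, here is a stand-in class that mimics the described behaviour. ToyRelationType, its relation name, its shortcut value and the {x}/{y} placeholder syntax are all assumptions. The real RelationType in plwordnet/wordnet.py and the templates stored in the plwnxml file may look different.

```python
from dataclasses import dataclass


@dataclass
class ToyRelationType:
    """Stand-in for RelationType; the template syntax here is an assumption."""

    name: str
    display: str   # long template with {x} and {y} placeholders (assumed syntax)
    shortcut: str  # short relation name (invented value below)

    def format(self, x, y, short=False):
        if short:
            # Short form: the two units separated by the short relation name.
            return f"{x} {self.shortcut} {y}"
        # Long form: substitute both units into the display template.
        return self.display.format(x=x, y=y)


rel = ToyRelationType(
    name="derywacyjność",
    display="{x} jest derywatem od {y}",
    shortcut="der",
)
long_form = rel.format("leśny.4", "las.1")
short_form = rel.format("leśny.4", "las.1", short=True)
print(long_form)   # leśny.4 jest derywatem od las.1
print(short_form)  # leśny.4 der las.1
```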
Project details
Download files
File details
Details for the file plwordnet-0.1.5.tar.gz.
File metadata
- Download URL: plwordnet-0.1.5.tar.gz
- Upload date:
- Size: 13.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/32.0 requests/2.25.1 requests-toolbelt/0.9.1 urllib3/1.26.6 tqdm/4.61.2 importlib-metadata/4.10.1 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.9.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 96c407b360cf66a2f3c5e612cc0251a40bef86b9f6af3be5a6a30b76cfd2e9a6 |
| MD5 | 417b924f4621df40167b7ee635024975 |
| BLAKE2b-256 | abd121aab93099d3f207bf56036f7aa094536320f61c2f2f2f0a4b1f39aa06fc |
File details
Details for the file plwordnet-0.1.5-py2.py3-none-any.whl.
File metadata
- Download URL: plwordnet-0.1.5-py2.py3-none-any.whl
- Upload date:
- Size: 12.1 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/32.0 requests/2.25.1 requests-toolbelt/0.9.1 urllib3/1.26.6 tqdm/4.61.2 importlib-metadata/4.10.1 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.9.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 385cb97b5a32cbb74b4728c927acf8a8a6605fb866ad265b768c8d3d5ccd0f79 |
| MD5 | 10757f44ea7202fdaa24fda290350573 |
| BLAKE2b-256 | 19d76dd4caa97bb125423c004aa7cb8c3ae9154859ddfb9e113235a388508988 |