Library for using the Polish Wordnet in the plwnxml format

Project description

Polish Wordnet Python library

A simple, easy-to-use and reasonably fast library for working with Słowosieć (also known as PlWordNet), a lexico-semantic database of the Polish language. PlWordNet can also be browsed online.

I created this library because, since version 2.9, PlWordNet cannot be easily loaded into Python (for example with nltk): it is only provided in a custom plwnxml format.

Usage

Load wordnet from an XML file (this will take about 20 seconds), and print basic statistics.

import plwordnet
wn = plwordnet.load('plwordnet_4_2.xml')
print(wn)

Expected output:

PlWordnet
  lexical units: 513410
  synsets: 353586
  relation types: 306
  synset relations: 1477849
  lexical relations: 393137
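
Because parsing the XML takes a while, it can be worth caching the parsed object. The sketch below relies only on what the Documentation section states: load accepts a path to a pickled wordnet object, and dump writes one. The helper name load_with_cache and the file names in the usage comment are illustrative assumptions, not part of the library.

```python
import os


def load_with_cache(xml_path, cache_path, load):
    """Return a wordnet object, preferring a pickled cache over the XML file.

    `load` is the loader function to call (for example plwordnet.load).
    It is passed in as an argument so this helper does not import
    the library itself.
    """
    if os.path.exists(cache_path):
        # Assumes the cache was written by an earlier call to this helper.
        return load(cache_path)
    wn = load(xml_path)
    # dump(dst) is documented to pickle the object to a new file at path dst.
    wn.dump(cache_path)
    return wn


# Hypothetical usage (file names are placeholders):
# import plwordnet
# wn = load_with_cache('plwordnet_4_2.xml', 'plwordnet_4_2.pkl', plwordnet.load)
```

The first call parses the XML and writes the cache; later calls read the pickle instead. Delete the cache file after upgrading the library or the PlWordNet data, since a stale pickle may no longer match.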

Find lexical units with the name leśny and print all relations where that unit is in the subject/parent position.

for lu in wn.lemmas('leśny'):
    for s, p, o in wn.lexical_relations_where(subject=lu):
        print(p.format(s, o))

Expected output:

leśny.2 tworzy kolokację z polana.1
leśny.2 jest synonimem mpar. do las.1
leśny.3 przypomina las.1
leśny.4 jest derywatem od las.1
leśny.5 jest derywatem od las.1
leśny.6 przypomina las.1
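
The same query can be turned around to find relations in which a unit is the object. The sketch below uses only the documented lemmas, lexical_relations_where and format calls; the helper name incoming_relations is an assumption made for illustration, and no output is shown because it depends on the PlWordNet version loaded.

```python
def incoming_relations(wn, lemma):
    """Return formatted relations in which a unit named `lemma` is the object."""
    lines = []
    for lu in wn.lemmas(lemma):
        # `object` is the keyword listed in the documented signature
        # lexical_relations_where(subject, predicate, object).
        for s, p, o in wn.lexical_relations_where(object=lu):
            lines.append(p.format(s, o))
    return lines


# Hypothetical usage, with `wn` loaded as shown earlier:
# for line in incoming_relations(wn, 'las'):
#     print(line)
```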

Print all relation types and their ids:

for id, rel in wn.relation_types.items():
    print(id, rel.name)

Expected output:

10 hiponimia
11 hiperonimia
12 antonimia
13 konwersja
...

See more usage examples in the examples notebook.

Installation

Note: plwordnet requires Python 3.7 or newer.

pip install plwordnet

Version support

This library should be able to read future versions of PlWordNet without modification, even if more relation types are added. Still, if you use this library with a version of PlWordNet that is not listed below, please consider reporting whether it is supported.

  • PlWordNet 4.2
  • PlWordNet 4.0
  • PlWordNet 3.2
  • PlWordNet 3.0
  • PlWordNet 2.3
  • PlWordNet 2.2
  • PlWordNet 2.1

Documentation

See plwordnet/wordnet.py for RelationType, Synset and LexicalUnit class definitions.

Package functions

  • load(source): Reads PlWordNet, where source is a path to the wordnet XML file or a path to a pickled wordnet object. Either path can point to a file compressed with gzip or lzma.
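
Since load accepts gzip-compressed files, the large XML file can be stored compressed. Below is a minimal sketch using only the Python standard library; the helper name gzip_copy and the .gz suffix are assumptions, and how the library recognises a compressed file (by suffix or by content) is not stated here.

```python
import gzip
import shutil


def gzip_copy(src_path, dst_path):
    """Write a gzip-compressed copy of src_path to dst_path and return dst_path."""
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        # Stream in chunks so the whole file never has to fit in memory.
        shutil.copyfileobj(src, dst)
    return dst_path


# Hypothetical usage (file names are placeholders):
# gzip_copy('plwordnet_4_2.xml', 'plwordnet_4_2.xml.gz')
# wn = plwordnet.load('plwordnet_4_2.xml.gz')
```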

Wordnet instance properties

  • lexical_relations: List of (subject, predicate, object) triples
  • synset_relations: List of (subject, predicate, object) triples
  • relation_types: Mapping from relation type id to object
  • relation_by_name: Mapping from human-readable relation name to relation type ids
  • lexical_units: Mapping from lexical unit id to unit object
  • lexical_units_by_name: Mapping from lexical unit name to a set of matching lexical unit ids
  • synsets: Mapping from synset id to object
  • (lexical|synset)_relations_(s|o|p): Mapping from id of subject/object/predicate to a set of matching lexical unit/synset relation ids
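
To illustrate how per-position indexes like these can answer subject/predicate/object queries, here is a toy sketch over plain integer triples. It is not the library's implementation: the function names are hypothetical, and it assumes a relation id is simply the position of the triple in the list.

```python
from collections import defaultdict


def build_indexes(triples):
    """Map each subject, predicate and object id to the set of relation ids using it."""
    by_s, by_p, by_o = defaultdict(set), defaultdict(set), defaultdict(set)
    for rel_id, (s, p, o) in enumerate(triples):
        by_s[s].add(rel_id)
        by_p[p].add(rel_id)
        by_o[o].add(rel_id)
    return by_s, by_p, by_o


def relations_where(triples, indexes, subject=None, predicate=None, obj=None):
    """Return triples matching every filter that is not None."""
    by_s, by_p, by_o = indexes
    candidates = None
    for index, key in ((by_s, subject), (by_p, predicate), (by_o, obj)):
        if key is None:
            continue
        matches = index.get(key, set())
        # Intersect with earlier filters so all given conditions must hold.
        candidates = matches if candidates is None else candidates & matches
    if candidates is None:
        # No filter given: every relation matches.
        candidates = range(len(triples))
    return [triples[i] for i in sorted(candidates)]


triples = [(1, 10, 2), (1, 11, 3), (4, 10, 2)]
indexes = build_indexes(triples)
print(relations_where(triples, indexes, subject=1, predicate=10))
```

Each filter is a set lookup followed by an intersection, so a query does not need to scan the full relation list.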

Wordnet methods

  • lemmas(value): Returns a list of LexicalUnit objects whose name equals value
  • lexical_relations_where(subject, predicate, object): Returns lexical relation triples matching the given subject, predicate and/or object. Each argument can be an integer id or the corresponding LexicalUnit or RelationType object.
  • synset_relations_where(subject, predicate, object): Returns synset relation triples matching the given subject, predicate and/or object. Each argument can be an integer id or the corresponding Synset or RelationType object.
  • hypernyms(synset, interlingual=False): Returns hypernyms of a synset (synset can be an integer id or a Synset object)
  • hyponyms(synset, interlingual=False): Returns hyponyms of a synset (synset can be an integer id or a Synset object)
  • hypernym_paths(synset, full_search=False, interlingual=False): Returns a hypernym path to a synset with no hypernyms (or all possible paths if full_search=True)
  • dump(dst): Pickles the Wordnet object to an open file object dst or to a new file at path dst.
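
The difference between a single hypernym path and a full search can be shown on a toy graph. The sketch below is an illustration only, not the library's code: the function name toy_hypernym_paths, the dictionary layout and the sample words are assumptions, and it always returns a list of paths, which may differ from what the real method returns.

```python
def toy_hypernym_paths(hypernyms, start, full_search=False):
    """Walk upwards from `start` until a node with no hypernyms is reached.

    `hypernyms` maps a node to the list of its direct hypernyms.
    Returns one path by default, or every path if full_search is True.
    """
    paths = []
    stack = [[start]]
    while stack:
        path = stack.pop()
        # Skip nodes already on the path so a cycle cannot loop forever.
        parents = [p for p in hypernyms.get(path[-1], []) if p not in path]
        if not parents:
            paths.append(path)
            if not full_search:
                break
            continue
        # Push in reverse so the first listed hypernym is explored first.
        for parent in reversed(parents):
            stack.append(path + [parent])
    return paths


graph = {
    "pies": ["ssak", "zwierzę domowe"],
    "ssak": ["zwierzę"],
    "zwierzę domowe": ["zwierzę"],
    "zwierzę": [],
}
print(toy_hypernym_paths(graph, "pies"))
print(toy_hypernym_paths(graph, "pies", full_search=True))
```

The first call stops at the first complete path; the second keeps exploring and collects both routes to the root.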

RelationType methods

  • format(x, y, short=False): Substitutes x and y into the RelationType display format (display). If short is True, x and y are instead separated by the short relation name (shortcut).
