Demo library

Semantic similarity computation with different metrics

Description

TaxoVec is a Python library for semantic similarity. It implements metrics such as Resnik, Jiang-Conrath (JCN), and HSS.

Requirements

  • Python 3.6 or later
  • NLTK
  • NumPy
  • Pandas

Installation

There are several ways to install TaxoVec. The recommended method is to use pip (the Python package manager):

pip install TaxoVec==0.1.0

Usage

Using the Wikipedia corpus to calculate the information content (IC):

from TaxoVec.functions import semantic_similarity
semantic_similarity('cat', 'dog', 'resnik')

6.169410755220327

Calculating information content from a given corpus:

from TaxoVec.calculate_IC import calculate_IC
from TaxoVec.functions import semantic_similarity

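# path_to_corpus: path to your corpus; path_to_save_IC_file: where the IC file is written
calculate_IC(path_to_corpus, path_to_save_IC_file)
# Pass the saved IC file as the fourth argument
semantic_similarity('cat', 'dog', 'resnik', path_to_save_IC_file)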
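
To make the idea of information content concrete, here is a minimal sketch that does not use TaxoVec at all. It assumes the common definition IC(c) = -log p(c), where p(c) is estimated from corpus counts of a concept and its descendants, and it takes Resnik similarity to be the IC of the most informative common ancestor. The toy taxonomy, the corpus, and every function name below are hypothetical, and TaxoVec's own calculate_IC may work differently.

```python
import math
from collections import Counter

# Hypothetical toy is-a taxonomy (child -> parent), for illustration only.
PARENT = {"cat": "mammal", "dog": "mammal", "mammal": "animal",
          "bird": "animal", "animal": None}

def ancestors(concept):
    """Return the concept followed by its ancestors up to the root."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain

def information_content(tokens):
    """IC(c) = log(total / count(c)); each token also counts for its ancestors."""
    counts = Counter()
    for token in tokens:
        if token in PARENT:
            for concept in ancestors(token):
                counts[concept] += 1
    total = counts["animal"]  # the root is counted once per known token
    return {c: math.log(total / n) for c, n in counts.items()}

def resnik(a, b, ic):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = set(ancestors(a)) & set(ancestors(b))
    return max(ic[c] for c in common)

corpus = "cat dog dog bird cat dog bird bird".split()
ic = information_content(corpus)
print(resnik("cat", "dog", ic))   # IC of "mammal"
print(resnik("cat", "bird", ic))  # IC of the root "animal"
```

In this toy setup, rare concepts get a high IC and the root gets an IC of zero, so two words that only share the root score zero under Resnik.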

Semantic similarity functions

The function semantic_similarity(word1, word2, kind, ic) accepts the following values for the kind argument:

  • hss -> HSS
  • wup -> Wu-Palmer (WUP)
  • lcs -> Leacock-Chodorow (LC)
  • path_sim -> Shortest Path
  • resnik -> Resnik
  • jcn -> Jiang-Conrath
  • lin -> Lin
  • seco -> Seco
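
For orientation, the textbook formulas behind several of these options can be sketched as below. This is not TaxoVec's code: the function names are hypothetical, the inputs (IC values and taxonomy depths) are assumed to be precomputed, and the Jiang-Conrath variant shown is the inverse-distance form. HSS is left out because this page does not define it.

```python
import math

def resnik_sim(ic_lcs):
    """Resnik: IC of the least common subsumer (LCS) of the two concepts."""
    return ic_lcs

def lin_sim(ic_a, ic_b, ic_lcs):
    """Lin: 2 * IC(lcs) / (IC(a) + IC(b)); defined as 0 when both ICs are 0."""
    denominator = ic_a + ic_b
    return 0.0 if denominator == 0 else 2.0 * ic_lcs / denominator

def jcn_sim(ic_a, ic_b, ic_lcs):
    """Jiang-Conrath: inverse of the distance IC(a) + IC(b) - 2 * IC(lcs)."""
    distance = ic_a + ic_b - 2.0 * ic_lcs
    return math.inf if distance == 0 else 1.0 / distance

def wup_sim(depth_a, depth_b, depth_lcs):
    """Wu-Palmer: 2 * depth(lcs) / (depth(a) + depth(b)), root at depth 1."""
    return 2.0 * depth_lcs / (depth_a + depth_b)

# Hypothetical values: IC(a) = 3, IC(b) = 5, IC(lcs) = 2; depths 4, 4 and 3.
print(lin_sim(3.0, 5.0, 2.0))  # 0.5
print(jcn_sim(3.0, 5.0, 2.0))  # 0.25
print(wup_sim(4, 4, 3))        # 0.75
```

Resnik, Lin, and Jiang-Conrath all depend on information content, which is why the ic argument matters for those kinds, while Wu-Palmer depends only on the shape of the taxonomy.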

Benchmark

Each cell gives Pearson / Spearman correlation. The last row gives the time per pair in seconds.

                   HSS (ours)      WUP             LC              Shortest Path   Resnik          Jiang-Conrath   Lin             Seco
MEN                0.41 / 0.33     0.36 / 0.33     0.14 / 0.05     0.07 / 0.03     0.05 / 0.03     -0.05 / -0.04   0.05 / 0.04     -0.01 / 0.03
MC30               0.74 / 0.69     0.74 / 0.73     0.33 / 0.21     0.22 / 0.3      0.13 / 0.03     -0.06 / -0.01   0.05 / 0.01     0.13 / -0.09
WSS                0.68 / 0.65     0.58 / 0.59     0.36 / 0.23     0.16 / 0.1      0.02 / -0.03    0.04 / 0.06     0.03 / 0.06     -0.01 / -0.04
Simlex999          0.4 / 0.38      0.45 / 0.43     0.26 / 0.15     0.2 / 0.16      -0.04 / -0.04   0.12 / 0.14     0.12 / 0.14     -0.02 / -0.08
MT287              0.46 / 0.31     0.4 / 0.28      0.26 / 0.12     0.11 / 0.11     0.03 / 0.04     0.18 / 0.16     0.22 / 0.17     0 / -0.06
MT771              0.44 / 0.4      0.43 / 0.49     0.06 / 0.02     0.1 / 0.13      0 / -0.01       0 / 0           0 / 0           -0.05 / -0.03
Time per pair (s)  0.0007          0.008           0.0055          0.0064          0.5586          0.551           0.5866          0.0013
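
The two correlation measures in the benchmark can be computed as in the sketch below. It assumes that each dataset pairs words with human ratings and that the reported figures correlate a metric's scores with those ratings, which is the usual setup for these datasets but is not stated on this page. The rating and score lists are made-up placeholders, and the functions are plain-Python illustrations rather than the code used to produce the table.

```python
import math

def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(xs), ranks(ys))

human = [9.0, 7.5, 3.0, 1.0]   # hypothetical human ratings for four word pairs
scores = [0.9, 0.8, 0.2, 0.3]  # hypothetical metric scores for the same pairs
print(pearson(human, scores))
print(spearman(human, scores))
```

Spearman only looks at the ordering of the scores, so it is less sensitive than Pearson to the very different value ranges the metrics produce.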

License
