implementations of models and metrics for semantic text similarity. that's it.

semantic-text-similarity

an easy-to-use interface to fine-tuned BERT models for computing semantic similarity. that's it.

This project contains an interface to fine-tuned, BERT-based semantic text similarity models. It modifies pytorch-transformers by abstracting away all the research benchmarking code for ease of real-world applicability.

Model               Dataset   Dev. Correlation
Web STS BERT        STS-B     0.893
Clinical STS BERT   MED-STS   0.854

Installation

Install with pip:

pip install semantic-text-similarity

or directly:

pip install git+https://github.com/AndriyMulyar/semantic-text-similarity

Use

Each model maps batches of sentence pairs to real-valued similarity scores in the range [0, 5].

from semantic_text_similarity.models import WebBertSimilarity
from semantic_text_similarity.models import ClinicalBertSimilarity

web_model = WebBertSimilarity(device='cpu', batch_size=10)  # run on CPU; omit device to default to GPU

clinical_model = ClinicalBertSimilarity(device='cuda', batch_size=10)  # run on GPU (the default)

web_model.predict([("She won an olympic gold medal", "The woman is an olympic champion")])

More examples.

Notes

  • A GPU is effectively required for reasonable prediction speed; CPU inference is supported but slow.
  • Model downloads are cached in ~/.cache/torch/semantic_text_similarity/. Try clearing this folder if you have issues.
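Clearing the cache folder mentioned above can be done by hand or scripted. The following is a minimal sketch using only the standard library; `clear_model_cache` and `DEFAULT_CACHE` are hypothetical names, and the default path is taken from the note above. Defining the function deletes nothing; calling it removes every cached model, so they are downloaded again on next use.

```python
import shutil
from pathlib import Path

# Cache location as stated in the notes above.
DEFAULT_CACHE = Path.home() / ".cache" / "torch" / "semantic_text_similarity"


def clear_model_cache(cache_dir: Path = DEFAULT_CACHE) -> bool:
    """Delete the model cache directory.

    Returns True if a directory was removed, False if there was nothing to remove.
    """
    cache_dir = Path(cache_dir).expanduser()
    if not cache_dir.is_dir():
        return False
    shutil.rmtree(cache_dir)
    return True
```

Calling `clear_model_cache()` with no arguments targets the default location; pass a different path if your cache lives elsewhere.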

Acknowledgement

Clinical models in this project were submitted to the 2019 N2C2 Shared Task Track 1. Implementation and model training in this project were supported by funding from the Mark Dredze Lab at Johns Hopkins University.
