implementations of models and metrics for semantic text similarity. that's it.
an easy-to-use interface to fine-tuned BERT models for computing semantic similarity. that's it.
This project contains an interface to fine-tuned, BERT-based semantic text similarity models. It modifies pytorch-transformers by abstracting away all the research benchmarking code for ease of real-world applicability.
| Model | Dataset | Dev. Correlation |
|---|---|---|
| Web STS BERT | STS-B | 0.893 |
| Clinical STS BERT | MED-STS | 0.854 |
Install with pip:
pip install semantic-text-similarity
or directly from the GitHub repository:
pip install git+https://github.com/AndriyMulyar/semantic-text-similarity
Maps batches of sentence pairs to real-valued scores in the range [0,5]
from semantic_text_similarity.models import WebBertSimilarity
from semantic_text_similarity.models import ClinicalBertSimilarity

web_model = WebBertSimilarity(device='cpu', batch_size=10)  # run on CPU; defaults to GPU prediction
clinical_model = ClinicalBertSimilarity(device='cuda', batch_size=10)  # defaults to GPU prediction

web_model.predict([("She won an olympic gold medal", "The woman is an olympic champion")])
- You will need a GPU to apply these models if you would like any hint of speed in your predictions.
- Model downloads are cached in ~/.cache/torch/semantic_text_similarity/. Try clearing this folder if you have issues.
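The two notes above can be turned into small helpers. This is a sketch under stated assumptions: `pick_device` and `clear_model_cache` are hypothetical names that do not come from this library, the device strings are the `'cpu'` and `'cuda'` values used in the usage example, and the default path is the cache folder quoted in the note. The device check relies on PyTorch being importable and falls back to CPU when it is not.

```python
import shutil
from pathlib import Path

# Cache location quoted in the note above.
DEFAULT_CACHE = Path.home() / ".cache" / "torch" / "semantic_text_similarity"


def pick_device() -> str:
    """Return 'cuda' when PyTorch reports a usable GPU, otherwise 'cpu'."""
    try:
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def clear_model_cache(cache_dir: Path = DEFAULT_CACHE) -> bool:
    """Delete the cache directory; return True if something was removed."""
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        return False
    shutil.rmtree(cache_dir)
    return True
```

Passing `device=pick_device()` to a model constructor would then select the GPU only when one is available. After `clear_model_cache()` the models would presumably be downloaded again on next use.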
Clinical models in this project were submitted to the 2019 N2C2 Shared Task Track 1. Implementation and model training in this project were supported by funding from the Mark Dredze Lab at Johns Hopkins University.