lm-scorer
📃 Language Model based sentences scoring library
Synopsis
This package provides a simple programming interface to score sentences using different ML language models.
A simple CLI is also available for quick prototyping.
Install
pip install lm-scorer
Usage
from lm_scorer.models.auto import AutoLMScorer as LMScorer
LMScorer.supported_model_names()
# => ["gpt2", "gpt2-medium", "gpt2-large", "gpt2-xl", "distilgpt2"]
scorer = LMScorer.from_pretrained("gpt2")
scorer.score("I like this package.")
# => -25.835
scorer.score("I like this package.", return_tokens=True)
# => -25.835, {
# "I": -3.9997,
# "Ġlike": -5.0142,
# "Ġthis": -2.5178,
# "Ġpackage": -7.4062,
# ".": -1.2812,
# "<|endoftext|>": -5.6163,
# }
scorer.score("I like this package.", return_log_prob=False)
# => 6.0231e-12
scorer.score("I like this package.", return_log_prob=False, return_tokens=True)
# => 6.0231e-12, {
# "I": 0.018321,
# "Ġlike": 0.0066431,
# "Ġthis": 0.080633,
# "Ġpackage": 0.00060745,
# ".": 0.27772,
# "<|endoftext|>": 0.0036381,
# }
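The example values above are consistent with the sentence score being the sum of the per-token log probabilities (equivalently, the product of the per-token probabilities). The following is a minimal sketch that checks this arithmetic using only the standard library. The variable names are illustrative and are not part of lm-scorer.

```python
import math

# Per-token log probabilities copied from the example output above.
token_log_probs = {
    "I": -3.9997,
    "Ġlike": -5.0142,
    "Ġthis": -2.5178,
    "Ġpackage": -7.4062,
    ".": -1.2812,
    "<|endoftext|>": -5.6163,
}

# Assumption: the sentence log probability is the sum of the token values.
sentence_log_prob = sum(token_log_probs.values())

# Exponentiating a log probability recovers the plain probability.
sentence_prob = math.exp(sentence_log_prob)

print(round(sentence_log_prob, 3))  # -25.835
print(f"{sentence_prob:.3e}")
```

Up to rounding of the displayed values, the two results match the scores returned with and without return_log_prob=False.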
CLI
The pip package includes a CLI that you can use to score sentences.
usage: lm-scorer [-h] [--model-name MODEL_NAME] [--tokens] [--log-prob]
                 [--debug]
                 sentences-file-path

Get sentences probability using a language model.

positional arguments:
  sentences-file-path   A file containing sentences to score, one per line. If
                        - is given as filename it reads from stdin instead.

optional arguments:
  -h, --help            show this help message and exit
  --model-name MODEL_NAME, -m MODEL_NAME
                        The pretrained language model to use. Can be one of:
                        gpt2, gpt2-medium, gpt2-large, gpt2-xl, distilgpt2.
  --tokens, -t          If provided it provides the probability of each token
                        of each sentence.
  --log-prob, -lp       If provided log probabilities are returned instead.
  --debug               If provided it provides additional logging in case of
                        errors.
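As an illustration of the options listed in the help text, the sketch below assembles an lm-scorer command line from Python. The helper build_command and its default model name are hypothetical and only exist for this example; the flags themselves are the ones documented above. The command is printed rather than executed, so the sketch makes no claim about the CLI's output format.

```python
def build_command(sentences_path, model_name="gpt2", tokens=False, log_prob=False):
    """Assemble an lm-scorer command line (hypothetical helper, not part of the package)."""
    # Assumption: the model is always passed explicitly via --model-name.
    cmd = ["lm-scorer", "--model-name", model_name]
    if tokens:
        cmd.append("--tokens")    # per-token probabilities
    if log_prob:
        cmd.append("--log-prob")  # log probabilities instead of probabilities
    # The positional argument comes last; "-" means "read from stdin".
    cmd.append(str(sentences_path))
    return cmd


# Score the sentences in a file, one sentence per line.
print(" ".join(build_command("sentences.txt", log_prob=True)))

# Read sentences from stdin and report per-token values.
print(" ".join(build_command("-", model_name="distilgpt2", tokens=True)))
```

The resulting list can be passed to subprocess.run once the package is installed.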
Authors
- Simone Primarosa - simonepri
See also the list of contributors who participated in this project.
License
This project is licensed under the MIT License - see the license file for details.