Language Model based sentences scoring library
lm-scorer
Synopsis
This package provides a simple programming interface to score sentences using different ML language models.
A simple CLI is also available for quick prototyping.
Install
pip install lm-scorer
Usage
from lm_scorer.models.auto import AutoLMScorer as LMScorer
LMScorer.supported_model_names()
# => ["gpt2", "gpt2-medium", "gpt2-large", "gpt2-xl", "distilgpt2"]
scorer = LMScorer.from_pretrained("gpt2")
scorer.score("I like this package.")
# => -25.835
scorer.score("I like this package.", return_tokens=True)
# => -25.835, {
# "I": -3.9997,
# "Ġlike": -5.0142,
# "Ġthis": -2.5178,
# "Ġpackage": -7.4062,
# ".": -1.2812,
# "<|endoftext|>": -5.6163,
# }
scorer.score("I like this package.", return_log_prob=False)
# => 6.0231e-12
scorer.score("I like this package.", return_log_prob=False, return_tokens=True)
# => 6.0231e-12, {
# "I": 0.018321,
# "Ġlike": 0.0066431,
# "Ġthis": 0.080633,
# "Ġpackage": 0.00060745,
# ".": 0.27772,
# "<|endoftext|>": 0.0036381,
# }
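The sentence score and the per-token scores in the outputs above are related: the sentence log probability is the sum of the token log probabilities, and exponentiating it gives the plain probability. The sketch below only re-checks that arithmetic with the numbers printed above. It does not call lm-scorer, and the variable names are illustrative.

```python
import math

# Per-token log probabilities copied from the return_tokens=True output above.
token_log_probs = {
    "I": -3.9997,
    "Ġlike": -5.0142,
    "Ġthis": -2.5178,
    "Ġpackage": -7.4062,
    ".": -1.2812,
    "<|endoftext|>": -5.6163,
}

# Summing log probabilities is equivalent to multiplying probabilities.
sentence_log_prob = sum(token_log_probs.values())

# Exponentiating the summed log probability recovers the sentence probability.
sentence_prob = math.exp(sentence_log_prob)

print(round(sentence_log_prob, 3))
print(f"{sentence_prob:.3e}")
```

Up to rounding, the two printed values agree with the -25.835 and 6.0231e-12 reported by scorer.score above.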
CLI
The pip package includes a CLI that you can use to score sentences.
usage: lm-scorer [-h] [--model-name MODEL_NAME] [--tokens] [--log-prob]
                 [--debug]
                 sentences-file-path

Get sentences probability using a language model.

positional arguments:
  sentences-file-path   A file containing sentences to score, one per line. If
                        - is given as filename it reads from stdin instead.

optional arguments:
  -h, --help            show this help message and exit
  --model-name MODEL_NAME, -m MODEL_NAME
                        The pretrained language model to use. Can be one of:
                        gpt2, gpt2-medium, gpt2-large, gpt2-xl, distilgpt2.
  --tokens, -t          If provided it provides the probability of each token
                        of each sentence.
  --log-prob, -lp       If provided log probabilities are returned instead.
  --debug               If provided it provides additional logging in case of
                        errors.
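If you want to drive the CLI from a script, one option is to assemble the argument list in Python. The following is a minimal sketch that uses only the flags listed in the help text above. The helper name build_lm_scorer_command and its parameters are hypothetical and are not part of lm-scorer. The sketch builds the command but does not run it.

```python
from typing import List, Optional


def build_lm_scorer_command(
    sentences_file_path: str,
    model_name: Optional[str] = None,
    tokens: bool = False,
    log_prob: bool = False,
    debug: bool = False,
) -> List[str]:
    """Build an argument list for the lm-scorer CLI (hypothetical helper)."""
    command = ["lm-scorer"]
    if model_name is not None:
        # One of: gpt2, gpt2-medium, gpt2-large, gpt2-xl, distilgpt2.
        command += ["--model-name", model_name]
    if tokens:
        # Also report the score of each token of each sentence.
        command.append("--tokens")
    if log_prob:
        # Report log probabilities instead of probabilities.
        command.append("--log-prob")
    if debug:
        # Ask for additional logging in case of errors.
        command.append("--debug")
    # The positional argument goes last; "-" means read sentences from stdin.
    command.append(sentences_file_path)
    return command


print(build_lm_scorer_command("-", model_name="gpt2", log_prob=True))
```

The resulting list can be passed to subprocess.run. When the path is "-", the sentences can be supplied through its input argument, one per line. The format of the CLI output is not described here, so any parsing of it would need to be checked against a real run.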
Authors
- Simone Primarosa - simonepri
See also the list of contributors who participated in this project.
License
This project is licensed under the MIT License - see the license file for details.