Language Model based sentences scoring library
lm-scorer
Synopsis
This package provides a simple programming interface for scoring sentences with a choice of pretrained language models.
A simple CLI is also available for quick prototyping.
Install
pip install lm-scorer
Usage
from lm_scorer.models.auto import AutoLMScorer as LMScorer
LMScorer.supported_model_names()
# => ["gpt2", "gpt2-medium", "gpt2-large", "gpt2-xl", "distilgpt2"]
scorer = LMScorer.from_pretrained("gpt2")
scorer.score("I like this package.")
# => -25.835
scorer.score("I like this package.", return_tokens=True)
# => -25.835, {
# "I": -3.9997,
# "Ġlike": -5.0142,
# "Ġthis": -2.5178,
# "Ġpackage": -7.4062,
# ".": -1.2812,
# "<|endoftext|>": -5.6163,
# }
scorer.score("I like this package.", return_log_prob=False)
# => 6.0231e-12
scorer.score("I like this package.", return_log_prob=False, return_tokens=True)
# => 6.0231e-12, {
# "I": 0.018321,
# "Ġlike": 0.0066431,
# "Ġthis": 0.080633,
# "Ġpackage": 0.00060745,
# ".": 0.27772,
# "<|endoftext|>": 0.0036381,
# }
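The outputs above are consistent with the sentence score being the sum of the per-token log probabilities (natural logarithm), and with the `return_log_prob=False` values being their exponentials. The following is a minimal sketch that checks this arithmetic using only the numbers printed above; it does not call the library, and the variable names are illustrative.

```python
import math

# Per-token log probabilities copied from the return_tokens=True output above.
token_log_probs = {
    "I": -3.9997,
    "Ġlike": -5.0142,
    "Ġthis": -2.5178,
    "Ġpackage": -7.4062,
    ".": -1.2812,
    "<|endoftext|>": -5.6163,
}

# Summing the token log probabilities gives the sentence log probability.
sentence_log_prob = sum(token_log_probs.values())

# Exponentiating moves from log space to plain probabilities.
sentence_prob = math.exp(sentence_log_prob)
token_probs = {tok: math.exp(lp) for tok, lp in token_log_probs.items()}

# Multiplying the token probabilities is equivalent to summing their logs.
product_of_token_probs = math.prod(token_probs.values())

print(round(sentence_log_prob, 3))
print(f"{sentence_prob:.4e}")
print(f"{product_of_token_probs:.4e}")
```

Working in log space avoids the numerical underflow that the plain product runs into on longer sentences, which is presumably why log probabilities are the default return value.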
CLI
The pip package includes a CLI that you can use to score sentences.
usage: lm-scorer [-h] [--model-name MODEL_NAME] [--tokens] [--log-prob]
                 [--debug]
                 sentences-file-path

Get sentences probability using a language model.

positional arguments:
  sentences-file-path   A file containing sentences to score, one per line. If
                        - is given as filename it reads from stdin instead.

optional arguments:
  -h, --help            show this help message and exit
  --model-name MODEL_NAME, -m MODEL_NAME
                        The pretrained language model to use. Can be one of:
                        gpt2, gpt2-medium, gpt2-large, gpt2-xl, distilgpt2.
  --tokens, -t          If provided it provides the probability of each token
                        of each sentence.
  --log-prob, -lp       If provided log probabilities are returned instead.
  --debug               If provided it provides additional logging in case of
                        errors.
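As an illustration of how the options in the help text combine, here is a small sketch that assembles an `lm-scorer` command line from Python. The helper `build_lm_scorer_command` is hypothetical and not part of the package; it only uses the flags listed above and does not run the CLI itself.

```python
from typing import List


def build_lm_scorer_command(
    sentences_path: str,
    model_name: str = "gpt2",
    tokens: bool = False,
    log_prob: bool = False,
) -> List[str]:
    """Hypothetical helper: build an argument list for the lm-scorer CLI."""
    command = ["lm-scorer", "--model-name", model_name]
    if tokens:
        # Ask for the probability of each token of each sentence.
        command.append("--tokens")
    if log_prob:
        # Ask for log probabilities instead of plain probabilities.
        command.append("--log-prob")
    # The sentences file comes last; "-" means read from stdin.
    command.append(sentences_path)
    return command


command = build_lm_scorer_command(
    "-", model_name="distilgpt2", tokens=True, log_prob=True
)
print(" ".join(command))
# prints: lm-scorer --model-name distilgpt2 --tokens --log-prob -
```

With the package installed, a list like this could be passed to `subprocess.run(command, input="I like this package.\n", text=True)` to score sentences supplied on stdin.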
Authors
- Simone Primarosa - simonepri
See also the list of contributors who participated in this project.
License
This project is licensed under the MIT License - see the license file for details.
Download files
File details
Details for the file lm-scorer-0.1.0.tar.gz.
File metadata
- Download URL: lm-scorer-0.1.0.tar.gz
- Upload date:
- Size: 8.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.0.5 CPython/3.7.7 Darwin/19.4.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 27ae5b805b072010d0f81d6294d1fc42ae9f6257821cc5b2d9aee8ba4192356b |
| MD5 | 356655831e73eb6329b35864b3781e46 |
| BLAKE2b-256 | 3fc670801e35754723c215feb7eeb4eacd68689196032aeea9493de6a001bdd2 |
File details
Details for the file lm_scorer-0.1.0-py3-none-any.whl.
File metadata
- Download URL: lm_scorer-0.1.0-py3-none-any.whl
- Upload date:
- Size: 7.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.0.5 CPython/3.7.7 Darwin/19.4.0
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 80af612b2f6fc1e49380c0a371a83a9d8b8d6f684b2b04e5e5a0a24fc9210ef1 |
| MD5 | 6d07504f27a7fad50c69af3bb6eb08fe |
| BLAKE2b-256 | c4bd24e165c523e19f2161d639b283606c25f9cd715882ef3bc3950e06e06485 |