Python implementation for text classification inference with CamemBERT fine-tuned models
infer-camembert
This is a simple Python implementation of the inference step for a fine-tuned text classification model based on the camembert-base Transformer model and hosted on HuggingFace™.
Usage
$ pip install infer-camembert
For a private model, you must provide your HuggingFace token, either as an environment variable or under the ~/.huggingface folder:
$ HUGGINGFACE_TOKEN=<value> python3 -m infer-camembert --input=example.jsonl --dictionary=labels.json --model="your-public-or-private-model-on-huggingface" --threshold=0.1 > results.jsonl
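The same call can be assembled from a Python script. The following is a minimal sketch that only reuses the flags and the HUGGINGFACE_TOKEN variable shown in the command above; the helper names `build_command` and `build_env` are illustrative assumptions and not part of the package.

```python
import os


def build_command(input_path, dictionary_path, model, threshold):
    """Return the argument list for the command-line call shown above."""
    return [
        "python3", "-m", "infer-camembert",
        f"--input={input_path}",
        f"--dictionary={dictionary_path}",
        f"--model={model}",
        f"--threshold={threshold}",
    ]


def build_env(token=None):
    """Copy the current environment, adding HUGGINGFACE_TOKEN only when a token is given."""
    env = dict(os.environ)
    if token is not None:
        env["HUGGINGFACE_TOKEN"] = token
    return env


cmd = build_command(
    "example.jsonl",
    "labels.json",
    "your-public-or-private-model-on-huggingface",
    0.1,
)

# To actually run it and capture the results (requires the package and model access):
# import subprocess
# with open("results.jsonl", "w", encoding="utf-8") as out:
#     subprocess.run(cmd, env=build_env("<value>"), stdout=out, check=True)
```

Passing the token through a copied environment keeps it out of the argument list, which is usually easier to log safely.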
Inputs must be in the form of a dict whose keys are your unique IDs and whose values are the texts on which to perform inference, e.g.:
{
  "id1": "Very nice time spent in a gorgeous site.",
  "id2": "Still a problem after three years: intolerable!!!!!!"
}
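A minimal sketch of how such an input file might be prepared from a list of texts, assuming the file holds a single JSON object as in the example above. The helper name `build_inputs` and the `id1`, `id2`, ... numbering scheme are illustrative assumptions, not part of the package.

```python
import json


def build_inputs(texts):
    """Map each text to a unique ID of the form id1, id2, ... (hypothetical scheme)."""
    return {f"id{i}": text for i, text in enumerate(texts, start=1)}


inputs = build_inputs([
    "Very nice time spent in a gorgeous site.",
    "Still a problem after three years: intolerable!!!!!!",
])

# Serialize as strict JSON; ensure_ascii=False keeps accented French characters readable.
payload = json.dumps(inputs, ensure_ascii=False, indent=2)

# Writing the file is left commented out so the sketch has no side effects:
# with open("example.jsonl", "w", encoding="utf-8") as f:
#     f.write(payload)
```

Using `json.dumps` rather than writing the text by hand avoids quoting mistakes and trailing commas, which strict JSON parsers reject.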
The same goes for the dictionary of labels, where the keys should be your short custom labels and the values their corresponding long labels, e.g.:
{
  "label0": "undefined",
  "label1": "pleasure",
  "label2": "fun",
  "label3": "anger"
}
The results are presented as an array of predictions, one per input, e.g.:
[
  {
    "id": "id1",
    "text": "Very nice time spent in a gorgeous site.",
    "labels": [
      "pleasure",
      "fun"
    ]
  },
  {
    "id": "id2",
    "text": "Still a problem after three years: intolerable!!!!!!",
    "labels": [
      "anger"
    ]
  }
]
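Once the results are available, they can be post-processed with the standard library. The sketch below assumes the output is a single JSON array shaped like the example above; the sample data is copied from that example, and the variable names are illustrative.

```python
import json
from collections import Counter

# Sample results copied from the example above.
results_text = """
[
  {"id": "id1", "text": "Very nice time spent in a gorgeous site.", "labels": ["pleasure", "fun"]},
  {"id": "id2", "text": "Still a problem after three years: intolerable!!!!!!", "labels": ["anger"]}
]
"""

predictions = json.loads(results_text)

# Index the predicted labels by input ID for quick lookup.
labels_by_id = {item["id"]: item["labels"] for item in predictions}

# Count how often each label was predicted across all inputs.
label_counts = Counter(label for item in predictions for label in item["labels"])
```

With real output, `results_text` would instead be read from the results.jsonl file produced by the command-line call.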
Used as a Python library:
from infercamembert.inference import infer
from infercamembert.labels import Labels
from infercamembert.parameters import ModelParameters

# Texts to classify, keyed by unique ID
inputs = {
    "id1": "Very nice time spent in a gorgeous site.",
    "id2": "Still a problem after three years: intolerable!!!!!!",
}

# Short custom labels mapped to their long labels
labels = Labels(
    {
        "label0": "undefined",
        "label1": "pleasure",
        "label2": "fun",
        "label3": "anger",
    }
)

# Model name and 0.1, the value used for --threshold in the command-line example
params = ModelParameters("your-public-or-private-model-on-huggingface", 0.1)
outputs = infer(inputs, labels, params)
License
This module is distributed under an MIT license. See the LICENSE file.
© 2024 Cyril Dever. All rights reserved.
Hashes for infer_camembert-0.0.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 5d8c09a41f2b3ace206b55ed7520a752719eb12516158e3ae98d88c0c6ab6234
MD5 | 95023a56211fcd621180221cc077a00c
BLAKE2b-256 | 69c231b335b6ca16e8da4bf05573804e87ebb20f8d8f7860cd52a917fcb5db60