Python implementation for text classification inference with CamemBERT fine-tuned models

infer-camembert

This is a simple Python implementation of the inference step for a fine-tuned text classification model based on the camembert-base Transformer model and hosted on HuggingFace™.

Usage

$ pip install infer-camembert

Run the module from the command line. For a private model, you must provide your HuggingFace token, either as the HUGGINGFACE_TOKEN environment variable or under the ~/.huggingface folder:

$ HUGGINGFACE_TOKEN=<value> python3 -m infercamembert --input=example.jsonl --dictionary=labels.json --model="your-public-or-private-model-on-huggingface" --threshold=0.1 > results.jsonl
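The two token locations mentioned above can be checked before launching a long run. The following is only a sketch, not the package's own lookup code: `find_token` is a hypothetical helper, and the file name `token` inside the ~/.huggingface folder is an assumption.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_token(env: Mapping[str, str] = os.environ,
               home: Optional[Path] = None) -> Optional[str]:
    """Return the token from the environment, else from ~/.huggingface, else None."""
    token = env.get("HUGGINGFACE_TOKEN")
    if token:
        return token.strip()
    # Assumption: the folder holds a plain-text file named "token".
    token_file = (home or Path.home()) / ".huggingface" / "token"
    if token_file.is_file():
        return token_file.read_text(encoding="utf-8").strip() or None
    return None


if find_token() is None:
    print("No HuggingFace token found: only public models will be reachable.")
```

Passing `env` and `home` explicitly makes the helper easy to exercise without touching the real environment.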

Inputs must be a dict whose keys are your unique IDs and whose values are the texts on which to perform inference, e.g.

{
  "id1": "Very nice time spent in a gorgeous site.",
  "id2": "Still a problem after three years: intolerable!!!!!!"
}
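A dict of this shape can be generated rather than written by hand. The sketch below assumes your texts start out as a plain list; `build_inputs` and the `id` prefix are hypothetical names chosen for illustration. How the resulting text should be laid out in the input file is not specified beyond the example above.

```python
import json


def build_inputs(texts, prefix="id"):
    """Map each text to a unique ID such as "id1", "id2", ..."""
    return {f"{prefix}{i}": text for i, text in enumerate(texts, start=1)}


inputs = build_inputs([
    "Very nice time spent in a gorgeous site.",
    "Still a problem after three years: intolerable!!!!!!",
])

# json.dumps always emits strict JSON; ensure_ascii=False keeps accented
# characters readable, which is convenient for French text.
payload = json.dumps(inputs, ensure_ascii=False, indent=2)
print(payload)
```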

The dictionary of labels follows the same format: the keys are your short custom labels and the values are their corresponding long labels, e.g.

{
  "label0": "undefined",
  "label1": "pleasure",
  "label2": "fun",
  "label3": "anger"
}
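To illustrate the short-to-long mapping, here is a minimal sketch that parses such a dictionary and translates a list of short labels. `load_labels` and `to_long_labels` are hypothetical helpers, not the package's `Labels` class, and leaving unknown labels unchanged is an assumption made for the example.

```python
import json


def load_labels(raw: str) -> dict:
    """Parse a JSON object of short label -> long label and check its shape."""
    labels = json.loads(raw)
    if not isinstance(labels, dict):
        raise ValueError("the dictionary of labels must be a JSON object")
    for short, long_label in labels.items():
        if not isinstance(long_label, str) or not short or not long_label:
            raise ValueError(f"invalid entry for label {short!r}")
    return labels


def to_long_labels(short_labels, labels):
    """Translate short labels into long ones, leaving unknown labels unchanged."""
    return [labels.get(short, short) for short in short_labels]


labels = load_labels(
    '{"label0": "undefined", "label1": "pleasure", "label2": "fun", "label3": "anger"}'
)
print(to_long_labels(["label1", "label2"], labels))
```

Since `json.loads` is strict, a trailing comma after the last entry is rejected with an error instead of being silently accepted.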

The results are presented as an array of predictions, one entry per input, e.g.

[
  {
    "id": "id1",
    "text": "Very nice time spent in a gorgeous site.",
    "labels": [
      "pleasure",
      "fun"
    ]
  },
  {
    "id": "id2",
    "text": "Still a problem after three years: intolerable!!!!!!",
    "labels": [
      "anger"
    ]
  }
]
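Results in this shape are straightforward to post-process. As a sketch, assuming results.jsonl holds an array like the one above, the hypothetical helper `ids_by_label` below groups input IDs under each predicted label:

```python
import json
from collections import defaultdict


def ids_by_label(results):
    """Group the IDs of the inputs under each label predicted for them."""
    grouped = defaultdict(list)
    for prediction in results:
        for label in prediction["labels"]:
            grouped[label].append(prediction["id"])
    return dict(grouped)


# Sample data copied from the example above; in practice, load it from the
# file produced by the command line.
results = json.loads("""
[
  {"id": "id1", "text": "Very nice time spent in a gorgeous site.",
   "labels": ["pleasure", "fun"]},
  {"id": "id2", "text": "Still a problem after three years: intolerable!!!!!!",
   "labels": ["anger"]}
]
""")
print(ids_by_label(results))
```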

It can also be used as a Python library:

from infercamembert import infer, Labels, ModelParameters

inputs = {
    "id1": "Very nice time spent in a gorgeous site.",
    "id2": "Still a problem after three years: intolerable!!!!!!",
}
labels = Labels(
    {
        "label0": "undefined",
        "label1": "pleasure",
        "label2": "fun",
        "label3": "anger",
    }
)
params = ModelParameters("your-public-or-private-model-on-huggingface", 0.1)
outputs = infer(inputs, labels, params)

License

This module is distributed under an MIT license.
See the LICENSE file.


© 2024 Cyril Dever. All rights reserved.
