Python implementation for text classification inference with CamemBERT fine-tuned models

infer-camembert

This is a simple Python implementation of the inference step for a text classification model fine-tuned from the camembert-base Transformer model and hosted on HuggingFace™.

Usage

$ pip install infer-camembert

For a private model, you must provide your HuggingFace token, either as an environment variable or under the ~/.huggingface folder:

$ HUGGINGFACE_TOKEN=<value> python3 -m infercamembert --input=example.jsonl --dictionary=labels.json --model="your-public-or-private-model-on-huggingface" --threshold=0.1 > results.jsonl

Inputs must be a dict whose keys are your unique IDs and whose values are the texts on which to perform inference, e.g.

{
  "id1": "Very nice time spent in a gorgeous site.",
  "id2": "Still a problem after three years: intolerable!!!!!!"
}
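
As a quick sanity check before running inference, the shape described above can be verified in a few lines. This is only a sketch: `load_inputs` and `_reject_duplicate_ids` are hypothetical helpers, not part of infer-camembert, and the sketch assumes the inputs are available as a single JSON object in a string.

```python
import json


def _reject_duplicate_ids(pairs):
    """Build a dict from decoded JSON pairs, refusing repeated keys."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate ID: {key!r}")
        seen[key] = value
    return seen


def load_inputs(raw: str) -> dict:
    """Parse a JSON object of {id: text} and check its shape.

    Hypothetical helper (an assumption for illustration only).
    """
    # json.loads would silently keep the last value of a repeated key,
    # so a hook is used to enforce that IDs are unique.
    data = json.loads(raw, object_pairs_hook=_reject_duplicate_ids)
    if not isinstance(data, dict):
        raise TypeError("inputs must be a JSON object of {id: text}")
    for key, text in data.items():
        if not isinstance(text, str) or not text.strip():
            raise ValueError(f"text for ID {key!r} must be a non-empty string")
    return data


raw = """
{
  "id1": "Very nice time spent in a gorgeous site.",
  "id2": "Still a problem after three years: intolerable!!!!!!"
}
"""
inputs = load_inputs(raw)
```

Note that strict JSON does not allow a trailing comma after the last entry, whereas a Python dict literal does.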

The same goes for the dictionary of labels, whose keys should be your short custom labels and whose values are the corresponding long labels, e.g.

{
  "label0": "undefined",
  "label1": "pleasure",
  "label2": "fun",
  "label3": "anger"
}
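
To make the role of this dictionary concrete, here is a minimal sketch of translating short labels into long ones. The function `to_long_labels` is hypothetical and written only for illustration; how the package itself performs this mapping is not described here.

```python
label_dictionary = {
    "label0": "undefined",
    "label1": "pleasure",
    "label2": "fun",
    "label3": "anger",
}


def to_long_labels(short_labels, dictionary):
    """Translate short labels to long ones; unknown labels are kept as-is.

    Hypothetical helper (an assumption for illustration only).
    """
    return [dictionary.get(label, label) for label in short_labels]


long_labels = to_long_labels(["label1", "label2"], label_dictionary)
print(long_labels)  # ['pleasure', 'fun']
```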

The results are presented as an array of predictions per input line, e.g.

[
  {
    "id": "id1",
    "text": "Very nice time spent in a gorgeous site.", 
    "labels": [
      "pleasure",
      "fun"
    ]
  },
  {
    "id": "id2",
    "text": "Still a problem after three years: intolerable!!!!!!",
    "labels": [
      "anger"
    ]
  }
]
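
Downstream code will often want these predictions indexed by ID or aggregated by label. The sketch below assumes the results are available as one JSON array in a string, exactly as in the example above; since the command writes to results.jsonl, the actual file may need to be read line by line instead. The variable names are illustrative only.

```python
import json
from collections import Counter

results_json = """
[
  {
    "id": "id1",
    "text": "Very nice time spent in a gorgeous site.",
    "labels": ["pleasure", "fun"]
  },
  {
    "id": "id2",
    "text": "Still a problem after three years: intolerable!!!!!!",
    "labels": ["anger"]
  }
]
"""

predictions = json.loads(results_json)

# Index the predicted labels by input ID for direct lookup.
labels_by_id = {item["id"]: item["labels"] for item in predictions}

# Count how many inputs received each label.
label_counts = Counter(label for item in predictions for label in item["labels"])

print(labels_by_id["id1"])  # ['pleasure', 'fun']
print(label_counts["anger"])  # 1
```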

Used as a Python library:

from infercamembert import infer, Labels, ModelParameters

inputs = {
    "id1": "Very nice time spent in a gorgeous site.",
    "id2": "Still a problem after three years: intolerable!!!!!!",
}
labels = Labels(
    {
        "label0": "undefined",
        "label1": "pleasure",
        "label2": "fun",
        "label3": "anger",
    }
)
params = ModelParameters("your-public-or-private-model-on-huggingface", 0.1)
outputs = infer(inputs, labels, params)

License

This module is distributed under an MIT license.
See the LICENSE file.


© 2024 Cyril Dever. All rights reserved.
