Generalist and Lightweight Model for Text Classification

⭐ GLiClass: Generalist and Lightweight Model for Sequence Classification

This is an efficient zero-shot classifier inspired by the GLiNER work. It matches the performance of a cross-encoder while being more compute-efficient, because classification is done in a single forward pass.

It can be used for topic classification, sentiment analysis and as a reranker in RAG pipelines.
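As a rough illustration of the reranking use case, the sketch below sorts candidate passages by a relevance score and keeps the best ones. The names `rerank`, `score_fn`, and `overlap_score` are hypothetical and not part of gliclass; `overlap_score` is a toy word-overlap scorer standing in for a real model, and how a GLiClass model would be wired in as `score_fn` is left open here.

```python
from typing import Callable, List, Tuple


def rerank(
    query: str,
    passages: List[str],
    score_fn: Callable[[str, str], float],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score every passage against the query and return the top_k best."""
    scored = [(passage, score_fn(query, passage)) for passage in passages]
    # Python's sort is stable, so passages with equal scores keep their input order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def overlap_score(query: str, passage: str) -> float:
    """Toy stand-in scorer (an assumption): fraction of query words found in the passage."""
    query_words = set(query.lower().split())
    passage_words = set(passage.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & passage_words) / len(query_words)


candidates = [
    "cats sleep a lot",
    "how to train a classifier",
    "zero-shot text classifier training tips",
]
top = rerank("train a classifier", candidates, overlap_score, top_k=2)
```

In a RAG pipeline the retriever would supply `candidates`, and only the passages in `top` would be passed on to the generator.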

Installation:

pip install gliclass

How to use:

from gliclass import GLiClassModel, ZeroShotClassificationPipeline
from transformers import AutoTokenizer

# Load the pretrained model and its tokenizer
model = GLiClassModel.from_pretrained("knowledgator/gliclass-small-v1.0")
tokenizer = AutoTokenizer.from_pretrained("knowledgator/gliclass-small-v1.0")

# 'cuda:0' requires a GPU
pipeline = ZeroShotClassificationPipeline(model, tokenizer, classification_type='multi-label', device='cuda:0')

text = "One day I will see the world!"
labels = ["travel", "dreams", "sport", "science", "politics"]
# The pipeline returns one list of results per input text; take the first because we pass a single text
results = pipeline(text, labels, threshold=0.5)[0]

for result in results:
    print(result["label"], "=>", result["score"])
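To make the role of `threshold` and `classification_type` more concrete, here is a minimal sketch of how a list of results in the shape shown above (dictionaries with `"label"` and `"score"`) could be reduced to final labels. The helper names are hypothetical, the scores in `sample` are invented for illustration and are not real model output, and the `>=` comparison is an assumption about how a threshold might be applied rather than a statement of what the pipeline does internally.

```python
from typing import Dict, List, Optional


def select_multi_label(results: List[Dict], threshold: float = 0.5) -> List[str]:
    """Keep every label whose score reaches the threshold, best score first."""
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    return [r["label"] for r in ranked if r["score"] >= threshold]


def select_single_label(results: List[Dict]) -> Optional[str]:
    """Pick only the highest-scoring label, or None when there are no results."""
    if not results:
        return None
    return max(results, key=lambda r: r["score"])["label"]


# Invented scores, for illustration only.
sample = [
    {"label": "dreams", "score": 0.78},
    {"label": "travel", "score": 0.91},
    {"label": "sport", "score": 0.04},
]

multi = select_multi_label(sample, threshold=0.5)
single = select_single_label(sample)
```

Raising the threshold trades recall for precision: with `threshold=0.8` only the strongest label in `sample` survives.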

How to train:

Prepare training data in the following format:

[
  {"text": "Some text here!", "all_labels": ["sport", "science", "business", ...], "true_labels": ["other"]},
  ...
]
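Before training, it can help to check that every example has the three fields from the format above. The following is a minimal sketch of such a check; `validate_example` and `dump_dataset` are hypothetical helpers, not part of gliclass, and the output path passed to `dump_dataset` is an assumption, since the file name the training script expects is not stated here.

```python
import json
from typing import Any, Dict, List

REQUIRED_KEYS = ("text", "all_labels", "true_labels")


def validate_example(example: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one training example (empty if none)."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in example]
    if problems:
        return problems
    if not isinstance(example["text"], str):
        problems.append("text must be a string")
    for key in ("all_labels", "true_labels"):
        value = example[key]
        if not isinstance(value, list) or not all(isinstance(x, str) for x in value):
            problems.append(f"{key} must be a list of strings")
    return problems


def dump_dataset(examples: List[Dict[str, Any]], path: str) -> None:
    """Validate all examples, then write them to `path` as a JSON list."""
    for index, example in enumerate(examples):
        problems = validate_example(example)
        if problems:
            raise ValueError(f"example {index}: {'; '.join(problems)}")
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(examples, handle, ensure_ascii=False, indent=2)


good = {"text": "Some text here!", "all_labels": ["sport", "science"], "true_labels": ["sport"]}
bad = {"text": "Some text here!", "all_labels": "sport"}
```

Calling `dump_dataset([good], "train.json")` would write a file in the format described above, while `dump_dataset([bad], ...)` raises a `ValueError` naming the offending example.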

Specify your training parameters in the train.py script.
