
Transformer/LLM-based zero and few-shot classification in scikit-learn pipelines


stormtrooper


Zero/few-shot learning components for scikit-learn pipelines with large language models and transformers.


New in 1.0.0

Trooper

With the new Trooper interface you no longer have to specify which model type you want to use. Stormtrooper automatically detects the model type from the model name you pass in.

from stormtrooper import Trooper

# This loads a setfit model
model = Trooper("all-MiniLM-L6-v2")

# This loads an OpenAI model
model = Trooper("gpt-4")

# This loads a Text2Text model
model = Trooper("google/flan-t5-base")

Unified zero and few-shot classification

You no longer have to specify whether a model should be a few-shot or a zero-shot classifier when initialising it. If you do not pass any training examples, the model is automatically treated as zero-shot.

# This is a zero-shot model
model.fit(None, ["dog", "cat"])

# This is a few-shot model
model.fit(["he was a good boy", "just lay down on my laptop"], ["dog", "cat"])
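The two calls above differ in what the second argument means. With None as the first argument it is the list of candidate labels, and with example texts it carries one label per example, following the scikit-learn fit(X, y) convention. A minimal sketch of that dispatch is given here, using a hypothetical stand-in class: UnifiedShotSketch and its attributes are illustrative assumptions, not part of stormtrooper and not a description of its internals.

```python
class UnifiedShotSketch:
    """Hypothetical stand-in that only mimics the fit(X, y) calling convention."""

    def fit(self, X, y):
        if X is None:
            # No training examples: y is the list of candidate class labels.
            self.mode_ = "zero-shot"
            self.examples_ = []
        else:
            # Training examples given: y holds one label per example text.
            if len(X) != len(y):
                raise ValueError("X and y must have the same length")
            self.mode_ = "few-shot"
            self.examples_ = list(zip(X, y))
        # In both modes the set of classes comes from y.
        self.classes_ = sorted(set(y))
        return self


zero = UnifiedShotSketch().fit(None, ["dog", "cat"])
few = UnifiedShotSketch().fit(
    ["he was a good boy", "just lay down on my laptop"], ["dog", "cat"]
)
print(zero.mode_, few.mode_)  # zero-shot few-shot
```

Either way the classifier ends up with the same set of classes, so the predict call looks identical in both modes.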

Model types

You can use all sorts of transformer models for few and zero-shot classification in Stormtrooper.

  1. Instruction fine-tuned generative models, e.g. Trooper("HuggingFaceH4/zephyr-7b-beta")
  2. Encoder models with SetFit, e.g. Trooper("all-MiniLM-L6-v2")
  3. Text2Text models, e.g. Trooper("google/flan-t5-base")
  4. OpenAI models, e.g. Trooper("gpt-4")
  5. NLI models, e.g. Trooper("facebook/bart-large-mnli")

Example usage

Find more in our docs.

pip install stormtrooper

from stormtrooper import Trooper

class_labels = ["atheism/christianity", "astronomy/space"]
example_texts = [
    "God came down to earth to save us.",
    "A new nebula was recently discovered in the proximity of the Oort cloud."
]
new_texts = ["God bless the railway workers", "The frigate is ready to launch from the spaceport"]

# Zero-shot classification
model = Trooper("google/flan-t5-base")
model.fit(None, class_labels)
model.predict(new_texts)
# ["atheism/christianity", "astronomy/space"]

# Few-shot classification
model = Trooper("google/flan-t5-base")
model.fit(example_texts, class_labels)
model.predict(new_texts)
# ["atheism/christianity", "astronomy/space"]

Fuzzy Matching

By default, generative and text2text models fuzzy match their output to the closest class label. You can disable this behavior by specifying fuzzy_match=False.

To speed up fuzzy matching, install python-Levenshtein.
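To illustrate what fuzzy matching does with free-form model output, here is a minimal sketch that maps a raw string to the most similar class label. It uses difflib from the Python standard library as a stand-in. The helper closest_label is hypothetical, and stormtrooper's own matcher may use a different similarity measure.

```python
import difflib


def closest_label(raw_output, class_labels):
    """Return the class label most similar to the raw model output.

    Hypothetical helper for illustration; not stormtrooper's implementation.
    """
    cleaned = raw_output.strip().lower()
    # Similarity ratio in [0, 1] between the output and each candidate label.
    scores = [
        difflib.SequenceMatcher(None, cleaned, label.lower()).ratio()
        for label in class_labels
    ]
    best_index = max(range(len(class_labels)), key=scores.__getitem__)
    return class_labels[best_index]


labels = ["atheism/christianity", "astronomy/space"]
print(closest_label("Astronomy / space.", labels))  # astronomy/space
```

With fuzzy matching disabled you would presumably receive the raw generated text instead, so a post-processing step like this one becomes your responsibility.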

Inference on GPU

From version 0.2.2 you can run models on GPU. You can specify the device when initializing a model:

classifier = Trooper("all-MiniLM-L6-v2", device="cuda:0")
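If the same script has to run on machines with and without a GPU, the device string can be chosen at runtime. The sketch below assumes PyTorch is the backend that provides CUDA support and falls back to "cpu" when torch is missing or no GPU is visible. The helper pick_device is hypothetical, not part of stormtrooper.

```python
def pick_device():
    """Return "cuda:0" when a CUDA GPU is usable, otherwise "cpu"."""
    try:
        import torch  # assumed backend; optional for this sketch
    except ImportError:
        return "cpu"
    return "cuda:0" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(device)

# The result can then be passed on as in the example above:
# classifier = Trooper("all-MiniLM-L6-v2", device=device)
```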

Inference on multiple GPUs

You can spread a model across multiple devices by using the device_map argument. Devices are used in order of priority: GPU -> CPU + RAM -> Disk. Note that this only works with text2text and generative models.

model = Trooper("HuggingFaceH4/zephyr-7b-beta", device_map="auto")
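The idea of a device priority order can be pictured with a toy greedy placement: model parts go onto the highest-priority device until it is full, then spill over to the next one. This is only an illustration under simplified assumptions. The function place_layers, the layer names and the sizes are all made up, and the real placement is decided by the underlying library and may work differently.

```python
def place_layers(layer_sizes, capacities):
    """Assign each layer to the first device, in priority order, with room left.

    Both arguments are dicts; capacities must be ordered from highest to
    lowest priority (dicts preserve insertion order).
    """
    devices = list(capacities)
    remaining = dict(capacities)
    placement = {}
    current = 0
    for name, size in layer_sizes.items():
        # Move on to the next device once the current one cannot fit the layer.
        while current < len(devices) and remaining[devices[current]] < size:
            current += 1
        if current == len(devices):
            raise MemoryError(f"no device has room for {name}")
        placement[name] = devices[current]
        remaining[devices[current]] -= size
    return placement


# Made-up sizes in GB: the GPU fills up first, then CPU RAM, then disk.
layers = {"block0": 4, "block1": 4, "block2": 4, "head": 1}
budget = {"cuda:0": 8, "cpu": 4, "disk": 100}
print(place_layers(layers, budget))
```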
