zeroshot-engine

A scientific zero-shot text classification engine based on various LLM models.

Description

This project provides a flexible framework for zero-shot text classification using large language models (LLMs) and pandas. It lets you classify text into categories without labeled training data for those categories: all instructions to the LLM are given as natural-language prompts. The framework is designed to support a wide range of text classification tasks, including multi-label, multi-class, and single-class scenarios.
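
As a rough illustration of the idea, the sketch below classifies a pandas column using only a natural-language prompt. It does not use the zeroshot-engine API: `build_prompt`, `fake_llm`, `classify`, and the label set are hypothetical names invented for this example, and `fake_llm` is a keyword-based stand-in for a real model call (for instance to OpenAI or Ollama).

```python
import pandas as pd

LABELS = ["politics", "sports", "science"]  # hypothetical categories


def build_prompt(text: str, labels: list[str]) -> str:
    """Describe the task in plain language; no labeled examples are included."""
    return (
        "Classify the text into exactly one of these categories: "
        + ", ".join(labels)
        + ".\nAnswer with the category name only.\n\nText: "
        + text
    )


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; answers by simple keyword matching."""
    text = prompt.split("Text: ", 1)[1].lower()
    if "election" in text:
        return "politics"
    if "match" in text:
        return "sports"
    return "science"


def classify(text: str, labels: list[str], llm=fake_llm) -> str:
    """Send the prompt to the model and normalize its answer."""
    answer = llm(build_prompt(text, labels)).strip().lower()
    # Guard against answers that fall outside the label set.
    return answer if answer in labels else "unknown"


df = pd.DataFrame({"text": [
    "The election results were announced today.",
    "The match ended in a draw.",
    "Researchers measured the particle's mass.",
]})
df["label"] = df["text"].apply(lambda t: classify(t, LABELS))
print(df)
```

Swapping `fake_llm` for a function that calls an actual model would leave the rest of the sketch unchanged, since the categories are defined only in the prompt.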

Features

  • Handles multi-label, multi-class, and single-class classification tasks.
  • Optional few-shot learning through flexible prompt engineering.
  • Supports multiple LLM providers (e.g., OpenAI, Ollama).
  • Easy-to-use command-line interface for demo purposes.
  • Customizable prompts.
  • Integration with pandas for data handling.

Key Concepts

  • Zero-Shot Learning: The ability of a model to make predictions on unseen classes or tasks without prior training on those specific classes or tasks. The system learns entirely through natural language instructions, eliminating the need for labeled examples or fine-tuning.
  • Sequential Classification: A process in which classification tasks are performed one after another, without strict dependencies between them (IDZSC approach).
  • Hierarchical Classification: A structured approach that breaks down complex classification tasks into a series of simpler decisions following a predefined hierarchy with explicit dependencies (HDZSC approach).
  • Multi-Prompting: The use of multiple different prompts for different tasks to elicit more comprehensive and reliable predictions from the model.
  • Modular Prompt Design: Prompts are assembled from reusable text blocks, which facilitates manual testing and refinement of prompts to improve classification accuracy. This refinement is not automated in the current implementation.
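
The modular prompt design and multi-prompting ideas above can be sketched as follows. This is an illustration only, not the package's own prompt format: the block names, their wording, the `RECIPES` mapping, and `assemble_prompt` are all assumptions made for this example.

```python
# Hypothetical reusable text blocks; names and wording are illustrative only.
BLOCKS = {
    "role": "You are a careful annotator of news texts.",
    "task_political": "Decide whether the text is about politics.",
    "task_sentiment": "Decide whether the text expresses a negative sentiment.",
    "example": "Example: 'Parliament passed the budget.' -> 1",
    "format": "Answer with 1 (yes) or 0 (no) and nothing else.",
}

# One recipe per task (multi-prompting); adding the "example" block turns a
# zero-shot prompt into a few-shot one without touching the other blocks.
RECIPES = {
    "political": ["role", "task_political", "format"],
    "political_few_shot": ["role", "task_political", "example", "format"],
    "sentiment": ["role", "task_sentiment", "format"],
}


def assemble_prompt(block_names, text, blocks=BLOCKS):
    """Join the selected blocks in order and append the text to classify."""
    parts = [blocks[name] for name in block_names]
    parts.append(f"Text: {text}")
    return "\n\n".join(parts)


sample = "Parliament criticized the new tax law."
prompts = {task: assemble_prompt(names, sample) for task, names in RECIPES.items()}
print(prompts["political_few_shot"])
```

Because each block is a separate string, a single block can be reworded and re-tested by hand while the rest of the prompt stays fixed.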

Installation

pip install zeroshot-engine

Demo

zeroshot-engine demo

Usage

zeroshot-engine --help

Core Modules

Iterative Double Validated Zero-Shot Classification (IDZSC)

IDZSC is a core module that refines zero-shot classification results through an iterative process. It uses a double validation technique to improve the robustness and accuracy of the classifications.
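
This README does not spell out the validation procedure, so the following is only one plausible reading, stated as an assumption: each label is classified twice, and a third call breaks the tie when the two answers disagree. The labels are processed one after another with no dependencies between them. `double_validated` and `make_scripted_classifier` are hypothetical helpers for this sketch, not functions of the package.

```python
from collections import Counter


def double_validated(classify_once, text, label):
    """Assumed scheme: ask twice; on disagreement, a third call decides."""
    first = classify_once(text, label)
    second = classify_once(text, label)
    if first == second:
        return first
    third = classify_once(text, label)
    # With binary answers, three votes always yield a majority.
    return Counter([first, second, third]).most_common(1)[0][0]


def make_scripted_classifier(answers):
    """Stand-in for an LLM call that replays a fixed sequence of answers."""
    remaining = iter(answers)
    calls = []

    def classify_once(text, label):
        calls.append(label)
        return next(remaining)

    return classify_once, calls


labels = ["political", "negative"]  # independent labels, handled in sequence
clf, calls = make_scripted_classifier([1, 1, 0, 1, 1])
result = {label: double_validated(clf, "some text", label) for label in labels}
print(result, calls)
```

In this scripted run the first label needs two calls (both agree) and the second needs three (the first two disagree), which shows that the extra cost is paid only for unstable answers.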

Hierarchical Double Validated Zero-Shot Classification (HDZSC)

HDZSC extends the zero-shot classification capabilities to hierarchical category structures. It leverages a double validation approach to maintain accuracy while navigating the complexities of hierarchical classification.
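
A minimal sketch of classification along a hierarchy with explicit dependencies: a child label is only evaluated when its parent was classified as positive, and is otherwise set to 0 without calling the model. The hierarchy, the label names, and `stub_classifier` (a keyword matcher standing in for an LLM call) are assumptions made for illustration, and the validation step is left out for brevity.

```python
# Hypothetical hierarchy: each label maps to the labels that depend on it.
HIERARCHY = {
    "political": ["policy_critique", "election_related"],
    "policy_critique": [],
    "election_related": [],
}
ROOTS = ["political"]

KEYWORDS = {
    "political": "government",
    "policy_critique": "criticized",
    "election_related": "election",
}


def stub_classifier(text, label):
    """Stand-in for an LLM call: 1 if the label's keyword occurs in the text."""
    return int(KEYWORDS[label] in text.lower())


def classify_hierarchy(text, classify, roots=ROOTS, hierarchy=HIERARCHY):
    """Walk the hierarchy top-down, skipping children of negative parents."""
    results = {}

    def visit(label, parent_positive):
        # Only spend a model call when the dependency is satisfied.
        results[label] = classify(text, label) if parent_positive else 0
        for child in hierarchy[label]:
            visit(child, results[label] == 1)

    for root in roots:
        visit(root, True)
    return results


print(classify_hierarchy("The government was criticized for the reform.", stub_classifier))
print(classify_hierarchy("The election of the club president was close.", stub_classifier))
```

In the second call the word "election" appears, but the child label stays 0 because the parent label was negative; that is the kind of dependency a flat, sequential run would not enforce.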

Planned Features

  • Improved documentation and examples.
  • Prompting guidelines.
  • Better integration and testing of validation metrics, to support a structured approach to benchmarking and prompt engineering.
  • Automated logging system.
  • Contribution guidelines.
  • Support for more LLMs and APIs.

Documentation

For more detailed information about the framework and its implementation, please refer to the following documentation:

  • Overview of IDZSC and HDZSC - A comprehensive explanation of the Iterative and Hierarchical Double Validated Zero-Shot Classification approaches, including detailed examples and usage patterns.

  • Performance Evaluation - Benchmark results and performance metrics across different models and classification tasks.

License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.

Contributing

We welcome contributions! Feel free to open issues for bug reports or feature requests. If you'd like to contribute code directly, please see our contributing guidelines.

Author

Lucas Schwarz

Contact

luc.schwarz@posteo.de

Citation

If you use zeroshot-engine in your research, please cite it as follows:

@misc{zeroshotengine,
  author = {Lucas Schwarz},
  title = {zeroshot-engine: A scientific zero-shot text classification engine based on various LLM models},
  year = {2025},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/TheLucasSchwarz/zeroshotENGINE}}
}
