A zero-shot classification engine based on various LLM models

zeroshot-engine

A scientific zero-shot text classification engine based on various LLM models.

Description

This project provides a flexible framework for zero-shot classification with large language models (LLMs) and pandas. It classifies text into categories without requiring labeled training data for those categories; all instructions to the LLM are given as natural language prompts. The framework supports a wide range of text classification tasks, including multi-label, multi-class, and single-class scenarios.

Features

  • Handles multi-label, multi-class, and single-class classification tasks.
  • Optional few-shot learning through flexible prompt engineering.
  • Supports multiple LLM providers (e.g., OpenAI, Ollama).
  • Easy-to-use command-line interface for demo purposes.
  • Customizable prompts.
  • Integration with pandas for data handling.

Key Concepts

  • Zero-Shot Learning: The ability of a model to make predictions on unseen classes or tasks without prior training on those specific classes or tasks. The system learns entirely through natural language instructions, eliminating the need for labeled examples or fine-tuning.
  • Sequential Classification: A process where tasks are performed in a series of steps without strict dependencies (IDZSC approach).
  • Hierarchical Classification: A structured approach that breaks down complex classification tasks into a series of simpler decisions following a predefined hierarchy with explicit dependencies (HDZSC approach).
  • Multi-Prompting: The use of multiple different prompts for different tasks to elicit more comprehensive and reliable predictions from the model.
  • Modular Prompt Design: While not automated in the current implementation, the modular prompt design with text blocks facilitates manual testing and refinement of prompts to improve classification accuracy.
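
The modular prompt design described above can be illustrated with a minimal sketch. The block names (`role`, `task`, `output_format`, `text`) and the `build_prompt` helper are hypothetical and are not taken from the zeroshot-engine API; the point is only that a prompt assembled from named text blocks lets you swap one block at a time when comparing variants by hand.

```python
# Hypothetical illustration of modular prompt design; not the zeroshot-engine API.

# Each named block holds one piece of the instruction.
BASE_BLOCKS = {
    "role": "You are a careful text annotator.",
    "task": "Decide whether the text below contains the category '{label}'.",
    "output_format": "Answer with exactly one digit: 1 (present) or 0 (absent).",
    "text": "Text: {text}",
}

BLOCK_ORDER = ["role", "task", "output_format", "text"]


def build_prompt(blocks, label, text, order=BLOCK_ORDER):
    """Join the blocks in the given order and fill in the placeholders."""
    template = "\n\n".join(blocks[name] for name in order)
    return template.format(label=label, text=text)


# Swap a single block to create a prompt variant for manual comparison.
variant_blocks = {**BASE_BLOCKS, "role": "You are a domain expert coding short texts."}

prompt_a = build_prompt(BASE_BLOCKS, label="economy", text="Taxes went up again.")
prompt_b = build_prompt(variant_blocks, label="economy", text="Taxes went up again.")
```

Because only the `role` block differs between `prompt_a` and `prompt_b`, any change in model output can be attributed to that block.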

Installation

pip install zeroshot-engine

Demo

zeroshot-engine demo

Usage

zeroshot-engine --help

Core Modules

Iterative Double Validated Zero-Shot Classification (IDZSC)

IDZSC is a core module that refines zero-shot classification results through an iterative process. It uses a double validation technique to ensure the robustness and accuracy of the classifications.
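
This README does not spell out the validation procedure, so the following is only a sketch of one plausible reading: each label is queried independently, the model is asked twice, and an answer is accepted only when both responses agree. The functions `double_validated_label` and `classify_independent`, the `max_rounds` parameter, and the `keyword_stub` stand-in for an LLM call are all assumptions made for illustration, not the package's API.

```python
# Hypothetical sketch; the actual IDZSC procedure may differ.
from typing import Callable, Dict, List, Optional


def double_validated_label(
    ask_model: Callable[[str, str], int],
    text: str,
    label: str,
    max_rounds: int = 3,
) -> Optional[int]:
    """Query the model twice per round; accept the answer once both agree."""
    for _ in range(max_rounds):
        first = ask_model(text, label)
        second = ask_model(text, label)
        if first == second:
            return first
    return None  # no agreement within max_rounds


def classify_independent(
    ask_model: Callable[[str, str], int], text: str, labels: List[str]
) -> Dict[str, Optional[int]]:
    """Classify each label on its own; no label depends on another."""
    return {label: double_validated_label(ask_model, text, label) for label in labels}


def keyword_stub(text: str, label: str) -> int:
    """Deterministic stand-in for an LLM call: 1 if the label word occurs in the text."""
    return int(label.lower() in text.lower())


result = classify_independent(
    keyword_stub,
    "The economy and health care dominate the debate.",
    ["economy", "health", "sports"],
)
```

With a real model, `ask_model` would send a prompt and parse the reply; returning `None` on persistent disagreement keeps uncertain cases visible for manual review.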

Hierarchical Double Validated Zero-Shot Classification (HDZSC)

HDZSC extends the zero-shot classification capabilities to hierarchical category structures. It leverages a double validation approach to maintain accuracy while navigating the complexities of hierarchical classification.
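
The idea of explicit dependencies between classification steps can be sketched as follows. This is a simplified illustration under stated assumptions: the `HIERARCHY` structure, the label names, and `classify_hierarchical` are hypothetical, the validation step is omitted for brevity, and a keyword check stands in for the LLM call.

```python
# Hypothetical sketch of dependency-driven classification; not the zeroshot-engine API.
from typing import Callable, Dict, List, Optional, Tuple

# Each entry is (label, parent). A label is only queried if its parent is present.
# Parents must be listed before their children.
HIERARCHY: List[Tuple[str, Optional[str]]] = [
    ("complaint", None),
    ("billing", "complaint"),
    ("delivery", "complaint"),
]


def classify_hierarchical(
    ask_model: Callable[[str, str], int],
    text: str,
    hierarchy: List[Tuple[str, Optional[str]]] = HIERARCHY,
) -> Dict[str, int]:
    """Walk the hierarchy in order and skip children whose parent is absent."""
    results: Dict[str, int] = {}
    for label, parent in hierarchy:
        if parent is not None and results[parent] != 1:
            results[label] = 0  # parent absent: no model call is made
        else:
            results[label] = ask_model(text, label)
    return results


queried: List[str] = []  # records which labels were actually sent to the model


def keyword_stub(text: str, label: str) -> int:
    """Deterministic stand-in for an LLM call that also logs each query."""
    queried.append(label)
    return int(label in text.lower())


with_parent = classify_hierarchical(keyword_stub, "Complaint: the billing amount is wrong.")
calls_with_parent = list(queried)

queried.clear()
without_parent = classify_hierarchical(keyword_stub, "Thanks for the fast delivery!")
calls_without_parent = list(queried)
```

In the second call only the top-level label is queried: "delivery" is never sent to the model because its parent "complaint" was classified as absent, which saves calls and prevents child labels from being assigned without their parent.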

Planned Features

  • Improved documentation and examples.
  • Create prompting guidelines.
  • Better integration and testing of validation metrics.
  • Support for structured benchmarking and prompt engineering.
  • Automated logging system.
  • Add contribution guidelines.
  • Support for more LLMs and APIs.

Documentation

For more detailed information about the framework and its implementation, please refer to the following documentation:

  • Overview of IDZSC and HDZSC - A comprehensive explanation of the Iterative and Hierarchical Double Validated Zero-Shot Classification approaches, including detailed examples and usage patterns.

  • Performance Evaluation - Benchmark results and performance metrics across different models and classification tasks.

License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.

Contributing

We welcome contributions! Feel free to open issues for bug reports or feature requests. If you'd like to contribute code directly, please see our contributing guidelines.

Author

Lucas Schwarz

Contact

luc.schwarz@posteo.de

Citation

If you use zeroshot-engine in your research, please cite it as follows:

@misc{zeroshotengine,
  author = {Lucas Schwarz},
  title = {zeroshot-engine: A scientific zero-shot text classification engine based on various LLM models},
  year = {2025},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/TheLucasSchwarz/zeroshotENGINE}}
}
