A zero-shot classification engine based on various LLM models
zeroshot-engine
A scientific zero-shot text classification engine based on various LLM models.
Description
This project provides a flexible framework for zero-shot classification with large language models and pandas. It lets you classify text into categories without labeled training data for those categories. All instructions to the LLM are given purely as natural language prompts. The framework is designed to support a wide range of text classification tasks, including multi-label, multi-class, and single-class scenarios.
Features
- Handles multi-label, multi-class, and single-class classification tasks.
- Optional few-shot learning through flexible prompt engineering.
- Supports multiple LLM providers (e.g., OpenAI, Ollama).
- Easy-to-use command-line interface for demo purposes.
- Customizable prompts.
- Integration with pandas for data handling.
Key Concepts
- Zero-Shot Learning: The ability of a model to make predictions on unseen classes or tasks without prior training on those specific classes or tasks. The system learns entirely through natural language instructions, eliminating the need for labeled examples or fine-tuning.
- Sequential Classification: Classification tasks are run one after another, with no task depending strictly on the result of another (IDZSC approach).
- Hierarchical Classification: A structured approach that breaks down complex classification tasks into a series of simpler decisions following a predefined hierarchy with explicit dependencies (HDZSC approach).
- Multi-Prompting: The use of multiple different prompts for different tasks to elicit more comprehensive and reliable predictions from the model.
- Modular Prompt Design: While not automated in the current implementation, the modular prompt design with text blocks facilitates manual testing and refinement of prompts to improve classification accuracy.
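As a rough illustration of modular prompt design and multi-prompting, the sketch below assembles one prompt per label from reusable text blocks. Every name in it (`BLOCKS`, `LABEL_DEFINITIONS`, `build_prompt`, `build_prompts`) and the block wording are assumptions made for this example; they are not taken from zeroshot-engine's API.

```python
# Hypothetical text blocks; none of these names come from zeroshot-engine.
BLOCKS = {
    "role": "You are a careful annotator of short texts.",
    "task": "Decide whether the text below matches the label '{label}'.",
    "definition": "Definition of '{label}': {definition}",
    "answer_format": "Answer with exactly one digit: 1 for yes, 0 for no.",
    "text": "Text: {text}",
}

# Hypothetical label definitions used to fill the blocks.
LABEL_DEFINITIONS = {
    "political": "The text discusses politics, parties, or government.",
    "attack": "The text criticizes or attacks a person or group.",
}

DEFAULT_ORDER = ("role", "task", "definition", "answer_format", "text")


def build_prompt(label, text, block_order=DEFAULT_ORDER):
    """Assemble one prompt from the named text blocks, in the given order."""
    values = {"label": label, "definition": LABEL_DEFINITIONS[label], "text": text}
    # str.format ignores keyword arguments a block does not use.
    return "\n".join(BLOCKS[name].format(**values) for name in block_order)


def build_prompts(text):
    """Multi-prompting: one dedicated prompt per label."""
    return {label: build_prompt(label, text) for label in LABEL_DEFINITIONS}


prompts = build_prompts("The minister's new tax plan is a disaster.")
print(prompts["attack"])
```

Because each block is a separate string, a single block (for example the definition) can be reworded or dropped through `block_order` and the effect on the classification compared by hand.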
Installation
pip install zeroshot-engine
Demo
zeroshot-engine demo
Usage
zeroshot-engine --help
Core Modules
Iterative Double Validated Zero-Shot Classification (IDZSC)
IDZSC is a core module that refines zero-shot classification results through an iterative process. It uses a double validation technique to improve the robustness and accuracy of the classifications.
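This description does not spell out the validation procedure. One plausible reading of "double validated", used here purely as an assumption, is to ask the model the same question twice and request a third answer only when the first two disagree. In the sketch below, `ask_model`, `make_scripted_model`, `double_validated`, and `classify_labels` are hypothetical names, and the scripted model stands in for a real LLM call.

```python
def make_scripted_model(answers):
    """Return a fake model that replays the given answers in order (stand-in for an LLM)."""
    remaining = iter(answers)

    def ask_model(prompt):
        return next(remaining)

    return ask_model


def double_validated(ask_model, prompt):
    """Query twice; if the two answers disagree, a third answer breaks the tie."""
    first = ask_model(prompt)
    second = ask_model(prompt)
    if first == second:
        return first
    # With binary answers, the third answer agrees with one of the first two.
    return ask_model(prompt)


def classify_labels(ask_model, text, labels):
    """Independent labels: each label gets its own validated query, one after another."""
    return {
        label: double_validated(ask_model, f"Does the text match '{label}'? Text: {text}")
        for label in labels
    }


# The first label gets two agreeing answers (1, 1);
# the second gets a disagreement (0, 1) resolved by a third answer (1).
model = make_scripted_model([1, 1, 0, 1, 1])
result = classify_labels(model, "some text", ["political", "attack"])
print(result)  # {'political': 1, 'attack': 1}
```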
Hierarchical Double Validated Zero-Shot Classification (HDZSC)
HDZSC extends the zero-shot classification capabilities to hierarchical category structures. It leverages a double validation approach to maintain accuracy while navigating the complexities of hierarchical classification.
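To make the idea of explicit dependencies concrete, here is a minimal sketch in which a child label is only evaluated when its parent label was assigned. The hierarchy, the keyword-based `keyword_classifier` (a stand-in for an LLM call), and the function `classify_hierarchical` are all assumptions for illustration and do not reflect the package's actual interface.

```python
# Hypothetical hierarchy: each label maps to its parent (None for a root label).
# Parents must be listed before their children.
HIERARCHY = {
    "political": None,
    "attack": "political",        # only asked if "political" is 1
    "target_is_party": "attack",  # only asked if "attack" is 1
}

# Toy keyword rules replacing a real model, so the example runs offline.
KEYWORDS = {"political": "minister", "attack": "disaster", "target_is_party": "party"}
calls = []  # records which labels were actually sent to the classifier


def keyword_classifier(label, text):
    """Stand-in for an LLM call: returns 1 if the label's keyword occurs in the text."""
    calls.append(label)
    return int(KEYWORDS[label] in text.lower())


def classify_hierarchical(classify, text, hierarchy):
    """Walk the hierarchy top-down and skip labels whose parent was not assigned."""
    results = {}
    for label, parent in hierarchy.items():
        if parent is not None and results.get(parent) != 1:
            results[label] = 0  # parent absent, so no query is made
            continue
        results[label] = classify(label, text)
    return results


result = classify_hierarchical(keyword_classifier, "Lovely weather today.", HIERARCHY)
print(result)  # {'political': 0, 'attack': 0, 'target_is_party': 0}
print(calls)   # ['political']
```

Skipping dependent labels saves model calls and keeps the output consistent with the hierarchy, at the cost that an error at a parent label propagates to all of its children.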
Planned Features
- Improved documentation and examples.
- Create prompting guidelines.
- Better integration and testing of validation metrics for structured benchmarking and prompt engineering.
- Automated logging system.
- Add contribution guidelines.
- Support for more LLMs and APIs.
Documentation
For more detailed information about the framework and its implementation, please refer to the following documentation:
- Overview of IDZSC and HDZSC - A comprehensive explanation of the Iterative and Hierarchical Double Validated Zero-Shot Classification approaches, including detailed examples and usage patterns.
- Performance Evaluation - Benchmark results and performance metrics across different models and classification tasks.
License
This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
Contributing
We welcome contributions! Feel free to open issues for bug reports or feature requests. If you'd like to contribute code directly, please see our contributing guidelines.
Author
Lucas Schwarz
Contact
Citation
If you use zeroshot-engine in your research, please cite it as follows:
@misc{zeroshotengine,
author = {Lucas Schwarz},
title = {zeroshot-engine: A scientific zero-shot text classification engine based on various LLM models},
year = {2025},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/TheLucasSchwarz/zeroshotENGINE}}
}