General Purpose Tagger using GPT
GPTagger 🏷️
GPT Tagger is a text tagger built on the GPT model. It extracts tags from a given text by leveraging the capabilities of GPT. Using GPT as a text tagger is not trivial, however: GPT tends to return text that does not appear in the input, that is fabricated outright, or that has been altered from the original. To mitigate this, GPT Tagger ensures that the generated tags are derived from the input text while still allowing GPT to process the extracted tags to some extent.
Below is an example of how GPT may respond incorrectly.
Text: "I earn $1000 this week!"
Prompt: "Extract how much he/she earns"
# Non-existent text
GPT: "one thousand dollar"
# Made-up text
GPT: "$999999"
# Processed text
GPT: "$1,000"
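The three failure modes above share one property: the returned string is not a verbatim substring of the input. A minimal sketch of that check follows. The helper name `is_grounded` is hypothetical and is not part of the GPTagger API; it only illustrates the idea of rejecting tags that cannot be found in the source text.

```python
def is_grounded(tag: str, text: str) -> bool:
    """Return True only if the tag is non-empty and appears verbatim in the text."""
    return bool(tag) and tag in text


text = "I earn $1000 this week!"
# The three incorrect responses from the example, plus the exact span.
candidates = ["one thousand dollar", "$999999", "$1,000", "$1000"]

for tag in candidates:
    status = "kept" if is_grounded(tag, text) else "rejected"
    print(f"{tag!r}: {status}")
```

Only the exact span `$1000` passes. A strict substring test is the simplest possible grounding rule, and GPT Tagger itself may be more permissive, since it allows GPT to process extracted tags to some extent.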
Introduction
These incorrect responses highlight the importance of using a reliable tag extraction tool like GPT Tagger. To achieve that, GPT Tagger follows three main steps:
- 🕵️‍♀️ Extraction: GPT Tagger sniffs out all possible tags by following your instructions to GPT.
- 🔍 Indexing: It spots the exact locations of these tags within the text.
- ✅ Validation: GPT Tagger's trusty validator steps in to check whether the extracted tags pass the rule-based and ML-based checks.
See the example above for how we extract ingredients from a yummy recipe text. 😋
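The three steps can be sketched in plain Python as follows. This is an illustration under stated assumptions, not GPTagger's implementation: `extract` is a hypothetical stand-in that returns canned candidates instead of calling GPT, and `index` and `validate` are hypothetical helpers that show exact-offset lookup and a rule-based check.

```python
import re
from typing import Dict, List, Tuple


def extract(text: str) -> List[str]:
    """Hypothetical stand-in for the GPT call; the last candidate is fabricated."""
    return ["2 eggs", "1 cup of flour", "3 tbsp of sugar"]


def index(text: str, tag: str) -> List[Tuple[int, int]]:
    """Return (start, end) character offsets of every exact occurrence of tag."""
    return [m.span() for m in re.finditer(re.escape(tag), text)]


def validate(tag: str, max_len: int = 32, pattern: str = r"\d") -> bool:
    """Rule-based check: bounded length and at least one regex match."""
    return len(tag) <= max_len and re.search(pattern, tag) is not None


recipe = "Whisk 2 eggs with 1 cup of flour, then add a pinch of salt."

results: Dict[str, List[Tuple[int, int]]] = {}
for tag in extract(recipe):
    spans = index(recipe, tag)
    # Keep a tag only if it was located in the text and passes validation.
    if spans and validate(tag):
        results[tag] = spans

print(results)
```

The fabricated candidate is dropped at the indexing step because it has no location in the text, so the validator only ever sees tags that exist in the input.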
Features ✨
Scale up GPT annotators and switch between GPT-3.5 and GPT-4 easily
- Want higher precision? Try using GPT-4!
- Want higher recall? Scale up the number of GPT annotators!
Instead of crafting a perfect prompt, use validators to shave off bad extractions
- Simple validator: Length, Regex...
- ML validator: GPT validator (Consider it like a chain of GPTs!)
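The trade-off described above can be sketched as follows: taking the union of several annotator runs raises recall, and a chain of simple validators then restores precision. All names here (`max_length`, `matches`, `merge_annotators`, `apply_validators`) are hypothetical and are not GPTagger's API, and the annotator outputs are hard-coded instead of coming from GPT.

```python
import re
from typing import Callable, Iterable, List, Set

Validator = Callable[[str], bool]


def max_length(limit: int) -> Validator:
    """Build a validator that rejects tags longer than limit characters."""
    return lambda tag: len(tag) <= limit


def matches(pattern: str) -> Validator:
    """Build a validator that requires the regex to match somewhere in the tag."""
    compiled = re.compile(pattern)
    return lambda tag: compiled.search(tag) is not None


def merge_annotators(runs: Iterable[Iterable[str]]) -> Set[str]:
    """Union the candidates from several annotator runs (favours recall)."""
    merged: Set[str] = set()
    for run in runs:
        merged.update(run)
    return merged


def apply_validators(tags: Iterable[str], validators: List[Validator]) -> List[str]:
    """Keep only tags accepted by every validator (favours precision)."""
    return sorted(t for t in tags if all(v(t) for v in validators))


# Hard-coded outputs standing in for three independent GPT annotator runs.
runs = [
    ["12 May 2023"],
    ["12 May 2023", "next week"],
    ["2023-05-12", "sometime around the middle of May 2023"],
]
validators = [max_length(16), matches(r"\d")]

kept = apply_validators(merge_annotators(runs), validators)
print(kept)
```

A GPT-based validator would slot into the same list as another callable that takes a tag and returns a boolean, which is one way to read the "chain of GPTs" idea.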
How to Use 🚀
Setup
make install
export OPENAI_API_KEY=<your-key>
Pre-defined NER pipeline
The easiest way to dive into the GPT Tagger is through the Gradio web demo! Fire it up with a single command:
poetry run python GPTagger/app.py
If you prefer having the power of GPT Tagger at your fingertips in Python, check out this snippet:
from pathlib import Path
from GPTagger import *
cfg = NerConfig(
    tag_name='date',
    tag_regex=r"\d",
    tag_max_len=128,
)
prompt = PromptTemplate.from_template(Path('<path-to-prompt>').read_text())
pipeline = NerPipeline.from_config(cfg)
doc = Path('<path-to-doc>').read_text()
tags = pipeline(doc, prompt)
Build Custom Pipelines 🎉
We believe that the possibilities of using GPT as a text tagger are endless! We invite you to contribute your own custom pipelines. Together, we'll unlock the true potential of GPT Tagger and make text tagging a better experience.
Leave a star if you find GPTagger useful for your product or company! 🌟