Text Labeling AI Wizard (tailwiz)
tailwiz is an AI-powered tool for labeling text. It has three main capabilities: classifying text (tailwiz.classify), parsing text given context and prompts (tailwiz.parse), and generating text given prompts (tailwiz.generate).
Quickstart
Install tailwiz by entering the following at the command line:
python -m pip install tailwiz
Then run the following in a Python environment for a quick example of text classification:
import tailwiz
import pandas as pd

# Create a pandas DataFrame of labeled text. Notice the 'label'
# column contains 'mean' or 'nice' as labels for each text.
labeled_examples = pd.DataFrame(
    [
        ['You make me vomit', 'mean'],
        ['Love you lots', 'nice'],
        ['You are the best', 'nice'],
    ],
    columns=['text', 'label'],
)

# Create a pandas DataFrame of text to be classified by tailwiz.
# Notice that this DataFrame does not have a 'label' column.
# The labels here will be created by tailwiz.
text = pd.DataFrame(
    ['Have a great day', 'I hate you'],
    columns=['text'],
)

# Classify text using labeled_examples as reference data.
results = tailwiz.classify(
    text,
    labeled_examples=labeled_examples,
)

# Note how the results are a copy of text with a new column
# populated with AI-generated labels.
print(results)
Installation
Install tailwiz through pip:
python -m pip install tailwiz
Usage
In this section, we outline the three main functions of tailwiz and provide examples.
tailwiz.classify(to_classify, labeled_examples=None, output_metrics=False)
Given text, classify the text.
Parameters:
- to_classify: pandas.DataFrame with a column named 'text' (str). Text to be classified.
- labeled_examples: pandas.DataFrame with columns named 'text' (str) and 'label' (str, int), default None. Labeled examples to enhance the performance of the classification task. The classified text is in the 'text' column and the text's labels are in the 'label' column.
- output_metrics: bool, default False. Whether to output performance_estimate together with results in a tuple.
Returns:
- results: pandas.DataFrame. A copy of to_classify with a new column, 'label_from_tailwiz', containing classification results.
- performance_estimate: Dict[str, float]. Dictionary of metric name to metric value mappings. Included together with results in a tuple if output_metrics is True. Uses labeled_examples to give an estimate of the accuracy of the classification. One-vs-all metrics are given for multiclass classification.
Example:
import tailwiz
import pandas as pd

labeled_examples = pd.DataFrame(
    [
        ['You make me vomit', 'mean'],
        ['Love you lots', 'nice'],
        ['You are the best', 'nice'],
    ],
    columns=['text', 'label'],
)
text = pd.DataFrame(
    ['Have a great day', 'I hate you'],
    columns=['text'],
)
results = tailwiz.classify(
    text,
    labeled_examples=labeled_examples,
)
print(results)
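When output_metrics is True, the call returns a tuple instead of a bare DataFrame, so calling code may want to handle both shapes. The following is a minimal sketch of one way to do that. The helper name split_output is hypothetical (it is not part of tailwiz), and the tuple order (results, performance_estimate) is an assumption based on the order in which the return values are listed above.

```python
def split_output(output):
    """Normalize a tailwiz return value to (results, performance_estimate).

    Hypothetical helper, not part of tailwiz. Assumes that with
    output_metrics=True the tuple order is (results, performance_estimate).
    """
    if isinstance(output, tuple):
        results, performance_estimate = output
        return results, performance_estimate
    # output_metrics=False: only the results DataFrame is returned.
    return output, None


# Intended usage (requires tailwiz and the DataFrames from the example above):
#   output = tailwiz.classify(
#       text,
#       labeled_examples=labeled_examples,
#       output_metrics=True,
#   )
#   results, performance_estimate = split_output(output)
#   if performance_estimate is not None:
#       for name, value in performance_estimate.items():
#           print(name, value)
```

The same pattern should apply to tailwiz.parse and tailwiz.generate, since all three functions document the same output_metrics behavior.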
tailwiz.parse(to_parse, labeled_examples=None, output_metrics=False)
Given a prompt and a context, parse the answer from the context.
Parameters:
- to_parse: pandas.DataFrame with columns named 'context' (str) and 'prompt' (str). Labels will be parsed directly from contexts in 'context' according to the prompts in 'prompt'.
- labeled_examples: pandas.DataFrame with columns named 'context' (str), 'prompt' (str), and 'label' (str), default None. Labeled examples to enhance the performance of the parsing task. The labels in 'label' must be extracted exactly from the contexts in 'context' (as whole words) according to the prompts in 'prompt'.
- output_metrics: bool, default False. Whether to output performance_estimate together with results in a tuple.
Returns:
- results: pandas.DataFrame. A copy of to_parse with a new column, 'label_from_tailwiz', containing parsed results.
- performance_estimate: Dict[str, float]. Dictionary of metric name to metric value mappings. Included together with results in a tuple if output_metrics is True. Uses labeled_examples to give an estimate of the accuracy of the parsing job.
Example:
import tailwiz
import pandas as pd

labeled_examples = pd.DataFrame(
    [
        ['Extract the money.', 'He owed me $100', '$100'],
        ['Extract the money.', '¥5000 bills are common', '¥5000'],
        ['Extract the money.', 'Eggs rose to €5 this week', '€5'],
    ],
    columns=['prompt', 'context', 'label'],
)
text = pd.DataFrame(
    [['Extract the money.', 'Try to save at least £10']],
    columns=['prompt', 'context'],
)
results = tailwiz.parse(
    text,
    labeled_examples=labeled_examples,
)
print(results)
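Because labels for tailwiz.parse must be extracted exactly from their contexts as whole words, it can help to check labeled examples before passing them in. The sketch below is one possible check using only the standard library. The function names are hypothetical, and the reading of "whole words" (the label must not be directly preceded or followed by a letter, digit, or underscore) is an assumption, not tailwiz's own validation logic.

```python
import re


def label_in_context(context, label):
    """Return True if label appears verbatim in context as whole words."""
    if not label:
        return False
    # Lookarounds are used instead of \b so that labels starting with a
    # symbol such as '$' or '€' are still matched correctly.
    pattern = r'(?<!\w)' + re.escape(label) + r'(?!\w)'
    return re.search(pattern, context) is not None


def invalid_rows(rows):
    """Return the (context, label) pairs whose label fails the check."""
    return [(context, label) for context, label in rows
            if not label_in_context(context, label)]


rows = [
    ('He owed me $100', '$100'),
    ('¥5000 bills are common', '¥5000'),
    ('Eggs rose to €5 this week', '€5'),
    ('He owed me $100', '$10'),  # cuts a number in half, so not a whole word
]
print(invalid_rows(rows))  # [('He owed me $100', '$10')]
```

With a DataFrame, the pairs could be obtained with something like labeled_examples[['context', 'label']].itertuples(index=False).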
tailwiz.generate(to_generate, labeled_examples=None, output_metrics=False)
Given a prompt, generate an answer.
Parameters:
- to_generate: pandas.DataFrame with a column named 'prompt' (str). Prompts according to which labels will be generated.
- labeled_examples: pandas.DataFrame with columns named 'prompt' (str) and 'label' (str), default None. Labeled examples to enhance the performance of the text generation task. The labels in 'label' should be responses to the prompts in 'prompt'.
- output_metrics: bool, default False. Whether to output performance_estimate together with results in a tuple.
Returns:
- results: pandas.DataFrame. A copy of to_generate with a new column, 'label_from_tailwiz', containing generated results.
- performance_estimate: Dict[str, float]. Dictionary of metric name to metric value mappings. Included together with results in a tuple if output_metrics is True. Uses labeled_examples to give an estimate of the accuracy of the text generation job.
Example:
import tailwiz
import pandas as pd

labeled_examples = pd.DataFrame(
    [
        ['Label this sentence as "positive" or "negative": I love puppies!', 'positive'],
        ['Label this sentence as "positive" or "negative": I do not like you at all.', 'negative'],
        ['Label this sentence as "positive" or "negative": Love you lots.', 'positive'],
    ],
    columns=['prompt', 'label'],
)
text = pd.DataFrame(
    ['Label this sentence as "positive" or "negative": I am crying my eyes out.'],
    columns=['prompt'],
)
results = tailwiz.generate(
    text,
    labeled_examples=labeled_examples,
)
print(results)
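The prompts in the example above all share the same instruction followed by a sentence. When starting from raw sentences, the prompts can be built programmatically so that the labeled examples and the text to be labeled use identical wording. This is a minimal sketch; build_prompts and INSTRUCTION are hypothetical names, not part of tailwiz.

```python
# Instruction text copied from the example above.
INSTRUCTION = 'Label this sentence as "positive" or "negative":'


def build_prompts(sentences, instruction=INSTRUCTION):
    """Prefix each sentence with the same instruction text."""
    return [f'{instruction} {sentence.strip()}' for sentence in sentences]


prompts = build_prompts(['I am crying my eyes out.', 'Love you lots.'])
for prompt in prompts:
    print(prompt)

# The list can then be wrapped for tailwiz.generate, for example:
#   to_generate = pd.DataFrame(prompts, columns=['prompt'])
```

Keeping the instruction in one place avoids small wording differences between labeled prompts and new prompts.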
Templates (Notebooks)
Use these Jupyter Notebook examples as templates to help load your data and run any of the three tailwiz functions:
- For an example of tailwiz.classify, see examples/classify.ipynb
- For an example of tailwiz.parse, see examples/parse.ipynb
- For an example of tailwiz.generate, see examples/generate.ipynb
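When loading your own data, a quick check that the expected columns are present can catch naming mistakes before a tailwiz function is called. The sketch below encodes the column names documented in the Usage section. The helper missing_columns and the REQUIRED_COLUMNS table are hypothetical and are not part of tailwiz.

```python
# Column names each function expects, taken from the Usage section.
REQUIRED_COLUMNS = {
    'classify': {'text'},
    'parse': {'context', 'prompt'},
    'generate': {'prompt'},
}


def missing_columns(function_name, columns, labeled=False):
    """Return the sorted column names that are required but absent.

    Set labeled=True when checking a labeled_examples DataFrame,
    which additionally needs a 'label' column.
    """
    required = set(REQUIRED_COLUMNS[function_name])
    if labeled:
        required.add('label')
    return sorted(required - set(columns))


print(missing_columns('parse', ['prompt', 'context']))                # []
print(missing_columns('parse', ['prompt', 'context'], labeled=True))  # ['label']
print(missing_columns('classify', ['Text']))                          # ['text']
```

With pandas, the columns argument can be the DataFrame's own columns attribute, for example missing_columns('classify', df.columns).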