Text Labeling AI Wizard (tailwiz)

tailwiz is an AI-powered tool for labeling text. It has three main capabilities: classifying text (tailwiz.classify), parsing text given context and prompts (tailwiz.parse), and generating text given prompts (tailwiz.generate).

Quickstart

Install tailwiz by running the following on the command line:

python -m pip install tailwiz

Then run the following in a Python environment for a quick example of text classification:

import tailwiz
import pandas as pd

prelabeled_text = pd.DataFrame(
    [
        ['Love you to the moon', 'nice'],
        ['I hate you', 'mean'],
        ['Have a great day', 'nice'],
    ],
    columns=['text', 'label'],
)
text_to_label = pd.DataFrame(
    ['You are the best!', 'You make me sick'],
    columns=['text'],
)
results = tailwiz.classify(
    text_to_label=text_to_label,
    prelabeled_text=prelabeled_text,
)
print(results)

Installation

Install tailwiz through pip:

python -m pip install tailwiz

Usage

In this section, we outline the three main functions of tailwiz and provide examples.

tailwiz.classify(text_to_label, prelabeled_text=None, output_metrics=False)

Classify each row of text in text_to_label.

Parameters:

  • text_to_label : pandas.DataFrame. Data structure containing text to classify. Must contain a string column named text.
  • prelabeled_text : pandas.DataFrame, default None. Pre-labeled text to enhance the performance of the classification task. Must contain a string column for the classified text named text and a column for the labels named label.
  • output_metrics : bool, default False. Whether to output performance_estimate together with results in a tuple.

Returns:

  • results : pandas.DataFrame. A copy of text_to_label with a new column, label_from_tailwiz, containing classification results.
  • performance_estimate : Dict[str, float]. Dictionary of metric name to metric value mappings. Included together with results in a tuple if output_metrics is True. Uses prelabeled_text to give an estimate of the accuracy of the classification. One vs. all metrics are given for multiclass classification.
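
When output_metrics is True, the return value is a tuple rather than a single DataFrame, so the call site has to unpack it. The sketch below shows one way to do that. The wrapper name run_with_metrics is hypothetical and not part of tailwiz; it takes the labeling function as an argument so the same pattern can be used with tailwiz.classify, tailwiz.parse, or tailwiz.generate. This README does not list the metric names, so the sketch simply iterates over whatever keys come back.

```python
def run_with_metrics(label_fn, text_to_label, prelabeled_text):
    """Call a tailwiz-style function with output_metrics=True and unpack the tuple.

    `run_with_metrics` is a hypothetical helper, not part of tailwiz.
    `label_fn` is expected to be tailwiz.classify, tailwiz.parse, or
    tailwiz.generate.
    """
    results, performance_estimate = label_fn(
        text_to_label=text_to_label,
        prelabeled_text=prelabeled_text,
        output_metrics=True,
    )
    # performance_estimate maps metric names to float values.
    for metric_name, metric_value in performance_estimate.items():
        print(f"{metric_name}: {metric_value:.3f}")
    return results, performance_estimate
```

A call would then look like results, performance_estimate = run_with_metrics(tailwiz.classify, text_to_label, prelabeled_text).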

Example:

import tailwiz
import pandas as pd

prelabeled_text = pd.DataFrame(
    [
        ['Love you to the moon', 'nice'],
        ['I hate you', 'mean'],
        ['Have a great day', 'nice'],
    ],
    columns=['text', 'label'],
)
text_to_label = pd.DataFrame(
    ['You are the best!', 'You make me sick'],
    columns=['text'],
)
results = tailwiz.classify(
    text_to_label=text_to_label,
    prelabeled_text=prelabeled_text,
)
print(results)

tailwiz.parse(text_to_label, prelabeled_text=None, output_metrics=False)

Given a prompt and a context, parse the answer from the context.

Parameters:

  • text_to_label : pandas.DataFrame. Data containing prompts and contexts from which answers will be parsed. Must contain a string column for the context named context and a string column for the prompt named prompt.
  • prelabeled_text : pandas.DataFrame, default None. Pre-labeled text to enhance the performance of the parsing task. Must contain a string column for the context named context, a string column for the prompt named prompt, and a string column for the label named label.
  • output_metrics : bool, default False. Whether to output performance_estimate together with results in a tuple.

Returns:

  • results : pandas.DataFrame. A copy of text_to_label with a new column, label_from_tailwiz, containing parsed results.
  • performance_estimate : Dict[str, float]. Dictionary of metric name to metric value mappings. Included together with results in a tuple if output_metrics is True. Uses prelabeled_text to give an estimate of the accuracy of the parsing job.
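
Because tailwiz.parse extracts the answer from the context, a pre-labeled label that never occurs in its context is probably not a useful example. That is an assumption about how such rows behave, not something this README states. Under that assumption, a quick sanity check could look like the sketch below; labels_missing_from_context is a hypothetical helper, not part of tailwiz.

```python
def labels_missing_from_context(rows):
    """Return indices of rows whose label does not occur verbatim in the context.

    `rows` is an iterable of dicts with 'context' and 'label' keys, e.g. the
    output of prelabeled_text.to_dict('records') for a pandas DataFrame.
    """
    return [
        index
        for index, row in enumerate(rows)
        if row["label"] not in row["context"]
    ]


rows = [
    {"prompt": "Extract the number.", "context": "Noon is twelve oclock", "label": "twelve"},
    {"prompt": "Extract the number.", "context": "10 jumping jacks", "label": "10"},
    # Deliberately inconsistent row: the context says "3", the label says "three".
    {"prompt": "Extract the number.", "context": "I have 3 eggs", "label": "three"},
]
print(labels_missing_from_context(rows))  # [2]
```

The check is a plain case-sensitive substring test, so labels that differ from the context only in case or whitespace are also reported.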

Example:

import tailwiz
import pandas as pd

prelabeled_text = pd.DataFrame(
    [
        ['Extract the number.', 'Noon is twelve oclock', 'twelve'],
        ['Extract the number.', '10 jumping jacks', '10'],
        ['Extract the number.', 'I have 3 eggs', '3'],
    ],
    columns=['prompt', 'context', 'label'],
)
text_to_label = pd.DataFrame(
    [['Extract the number.', 'Figure 8']],
    columns=['prompt', 'context'],
)
results = tailwiz.parse(
    text_to_label=text_to_label,
    prelabeled_text=prelabeled_text,
)
print(results)

tailwiz.generate(text_to_label, prelabeled_text=None, output_metrics=False)

Given a prompt, generate an answer.

Parameters:

  • text_to_label : pandas.DataFrame. Data structure containing prompts for which answers will be generated. Must contain a string column for the prompt named prompt.
  • prelabeled_text : pandas.DataFrame, default None. Pre-labeled text to enhance the performance of the text generation task. Must contain a string column for the prompt named prompt and a string column for the label named label.
  • output_metrics : bool, default False. Whether to output performance_estimate together with results in a tuple.

Returns:

  • results : pandas.DataFrame. A copy of text_to_label with a new column, label_from_tailwiz, containing generated results.
  • performance_estimate : Dict[str, float]. Dictionary of metric name to metric value mappings. Included together with results in a tuple if output_metrics is True. Uses prelabeled_text to give an estimate of the accuracy of the text generation job.
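
The column requirements listed in the Parameters sections for the three functions can be checked before calling tailwiz. The sketch below is one way to do that; REQUIRED_COLUMNS and missing_columns are hypothetical names, not part of tailwiz, and the column lists are copied from this README.

```python
# Required column names, taken from the Parameters sections of this README.
REQUIRED_COLUMNS = {
    "classify": {"text_to_label": ["text"], "prelabeled_text": ["text", "label"]},
    "parse": {
        "text_to_label": ["context", "prompt"],
        "prelabeled_text": ["context", "prompt", "label"],
    },
    "generate": {"text_to_label": ["prompt"], "prelabeled_text": ["prompt", "label"]},
}


def missing_columns(task, argument, columns):
    """Return the required columns absent from `columns`, in documented order."""
    present = set(columns)
    return [name for name in REQUIRED_COLUMNS[task][argument] if name not in present]


print(missing_columns("parse", "prelabeled_text", ["prompt", "context"]))  # ['label']
print(missing_columns("classify", "text_to_label", ["text"]))  # []
```

With a pandas DataFrame, pass its columns attribute as the third argument, for example missing_columns("generate", "text_to_label", text_to_label.columns).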

Example:

import tailwiz
import pandas as pd

prelabeled_text = pd.DataFrame(
    [
        ['Is this sentence Happy or Sad? I love puppies!', 'Happy'],
        ['Is this sentence Happy or Sad? I do not like you at all.', 'Sad'],
    ],
    columns=['prompt', 'label']
)
text_to_label = pd.DataFrame(
    ['Is this sentence Happy or Sad? I am crying my eyes out.'],
    columns=['prompt']
)
results = tailwiz.generate(
    text_to_label=text_to_label,
    prelabeled_text=prelabeled_text,
)
print(results)
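
In the example above, every prompt repeats the same question in front of a different sentence. When there are many sentences, the prompts can be built programmatically. The sketch below is a minimal way to do that; build_prompts is a hypothetical helper, not part of tailwiz, and the second sentence is made up for illustration.

```python
def build_prompts(question, sentences):
    """Prefix each sentence with the same question, separated by one space."""
    return [f"{question} {sentence}" for sentence in sentences]


prompts = build_prompts(
    "Is this sentence Happy or Sad?",
    ["I am crying my eyes out.", "What a lovely morning."],
)
print(prompts[0])  # Is this sentence Happy or Sad? I am crying my eyes out.
```

The resulting list can be wrapped as pd.DataFrame(prompts, columns=['prompt']) to produce a text_to_label argument for tailwiz.generate.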

Templates (Notebooks)

Use these Jupyter Notebook examples as templates to help load your data and run any of the three tailwiz functions:

Download files

Source Distribution

tailwiz-0.0.9.tar.gz (9.5 kB)

Built Distribution

tailwiz-0.0.9-py3-none-any.whl (11.4 kB)
