Metrics and visualizations for evaluating a chatbot's AI utilization.

Project description

TakeAiEvaluation

TakeAiEvaluation is a tool that provides metrics and visualizations for evaluating a chatbot's AI utilization. It currently addresses two types of evaluation: Knowledge Base Quality and Message Base Information.

Installation

The take_ai_evaluation package can be installed from PyPI:

pip install take_ai_evaluation

Usage

As input, either a pandas.DataFrame or a CSV file path can be used.
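
For example, a pandas.DataFrame with the same columns can be passed instead of a file path. A minimal sketch (the column names 'id', 'intent' and 'predicted' mirror the CSV examples below; the rows are illustrative):

import pandas as pd
from take_ai_evaluation import AiEvaluation

# Build a small analysed base in memory instead of reading 'knowledge-base.csv'
knowledge_base = pd.DataFrame({
    'id': ['how do I pay my bill?', 'talk to an agent'],
    'intent': ['billing', 'handoff'],
    'predicted': ['billing', 'smalltalk']
})

ai_evaluation = AiEvaluation(analysed_base=knowledge_base,
                             sentence_col='id',
                             intent_col='intent',
                             predict_col='predicted')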

  1. All vs all confusion matrix
import matplotlib.pyplot as plt
from take_ai_evaluation import AiEvaluation

ai_evaluation = AiEvaluation(analysed_base='knowledge-base.csv', 
                             sentence_col='id', 
                             intent_col='intent', 
                             predict_col='predicted')

ai_evaluation.get_all_vs_all_confusion_matrix(title='All vs All')

plt.show()
  2. One vs all confusion matrix
import matplotlib.pyplot as plt
from take_ai_evaluation import AiEvaluation

ai_evaluation = AiEvaluation(analysed_base='knowledge-base.csv', 
                             sentence_col='id', 
                             intent_col='intent', 
                             predict_col='predicted')

ai_evaluation.get_one_vs_all_confusion_matrix(intent='Intent', title='One vs All')

plt.show()
  3. Best intent
  • Just the values for the default metric, which is 'accuracy'
import matplotlib.pyplot as plt
from take_ai_evaluation import AiEvaluation

ai_evaluation = AiEvaluation(analysed_base='knowledge-base.csv', 
                             sentence_col='id', 
                             intent_col='intent', 
                             predict_col='predicted')

ai_evaluation.get_best_intent()

plt.show()
  • Just the values for the 'recall' metric
import matplotlib.pyplot as plt
from take_ai_evaluation import AiEvaluation

ai_evaluation = AiEvaluation(analysed_base='knowledge-base.csv', 
                             sentence_col='id', 
                             intent_col='intent', 
                             predict_col='predicted')

ai_evaluation.get_best_intent(metric='recall')

plt.show()
  • As a graph
import matplotlib.pyplot as plt
from take_ai_evaluation import AiEvaluation

ai_evaluation = AiEvaluation(analysed_base='knowledge-base.csv', 
                             sentence_col='id', 
                             intent_col='intent', 
                             predict_col='predicted')

ai_evaluation.get_best_intent(as_graph=True)

plt.show()
  4. Worst intent
  • Just the values for the default metric, which is 'accuracy'
import matplotlib.pyplot as plt
from take_ai_evaluation import AiEvaluation

ai_evaluation = AiEvaluation(analysed_base='knowledge-base.csv', 
                             sentence_col='id', 
                             intent_col='intent', 
                             predict_col='predicted')

ai_evaluation.get_worst_intent()

plt.show()
  • Just the values for the 'recall' metric
import matplotlib.pyplot as plt
from take_ai_evaluation import AiEvaluation

ai_evaluation = AiEvaluation(analysed_base='knowledge-base.csv', 
                             sentence_col='id', 
                             intent_col='intent', 
                             predict_col='predicted')

ai_evaluation.get_worst_intent(metric='recall')

plt.show()
  • As a graph
import matplotlib.pyplot as plt
from take_ai_evaluation import AiEvaluation

ai_evaluation = AiEvaluation(analysed_base='knowledge-base.csv', 
                             sentence_col='id', 
                             intent_col='intent', 
                             predict_col='predicted')

ai_evaluation.get_worst_intent(as_graph=True)

plt.show()
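
The 'metric' and 'as_graph' arguments shown above belong to the same methods, so they can plausibly be combined. A hedged sketch, assuming the two keyword arguments compose, that plots recall per intent:

import matplotlib.pyplot as plt
from take_ai_evaluation import AiEvaluation

ai_evaluation = AiEvaluation(analysed_base='knowledge-base.csv',
                             sentence_col='id',
                             intent_col='intent',
                             predict_col='predicted')

# Assumption: metric and as_graph can be passed together
ai_evaluation.get_worst_intent(metric='recall', as_graph=True)

plt.show()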
  5. Classification Report
from take_ai_evaluation import AiEvaluation

ai_evaluation = AiEvaluation(analysed_base='knowledge-base.csv', 
                             sentence_col='id', 
                             intent_col='intent', 
                             predict_col='predicted')

ai_evaluation.get_classification_report()
  • As a pandas DataFrame
from take_ai_evaluation import AiEvaluation

ai_evaluation = AiEvaluation(analysed_base='knowledge-base.csv', 
                             sentence_col='id', 
                             intent_col='intent', 
                             predict_col='predicted')

ai_evaluation.get_classification_report(as_dataframe=True)
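
The DataFrame form is convenient for inspecting the report programmatically. A minimal sketch, assuming the result follows scikit-learn's classification_report layout with one row per intent and an 'f1-score' column:

from take_ai_evaluation import AiEvaluation

ai_evaluation = AiEvaluation(analysed_base='knowledge-base.csv',
                             sentence_col='id',
                             intent_col='intent',
                             predict_col='predicted')

report = ai_evaluation.get_classification_report(as_dataframe=True)

# Assumption: the report has an 'f1-score' column, as in scikit-learn
print(report.sort_values('f1-score'))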

Author

Take Blip Data&Analytics Research (ROps)

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

take_ai_evaluation-0.2.3.tar.gz (11.0 kB)

Built Distribution

take_ai_evaluation-0.2.3-py3-none-any.whl (15.5 kB)

File details

Details for the file take_ai_evaluation-0.2.3.tar.gz.

File metadata

  • Download URL: take_ai_evaluation-0.2.3.tar.gz
  • Upload date:
  • Size: 11.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.6.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.61.2 CPython/3.9.5

File hashes

Hashes for take_ai_evaluation-0.2.3.tar.gz

  • SHA256: cb5d7879e1924829b316c9aed782cb5f9ba883cff1774590c45baa6e895f4e09
  • MD5: c38c2769884bc9a8beaafaded85d79fa
  • BLAKE2b-256: 7e94665c7ff22601d0cb74f998284f15f1dd922bc8c58e70b118a6662e33bc80

File details

Details for the file take_ai_evaluation-0.2.3-py3-none-any.whl.

File metadata

  • Download URL: take_ai_evaluation-0.2.3-py3-none-any.whl
  • Upload date:
  • Size: 15.5 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.6.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.61.2 CPython/3.9.5

File hashes

Hashes for take_ai_evaluation-0.2.3-py3-none-any.whl

  • SHA256: c337046818fd80f55a56ef33058c1481d0659bc4315b331cada2743e978dd748
  • MD5: 15005dfdd2a51228ba74f9bde03e85b0
  • BLAKE2b-256: cbadc13dc5c836b46cc7b81c258190ffda1fbff351960658b2575da467ae6f54
