A confusion matrix package.

💫 Dazed - A Confusion Matrix Package

Dazed is a little confusion matrix package designed to make your life easier. Its key features are:

  • support for lots of different data formats (sparse integers, sparse strings, one-hot arrays, dataframes)

  • support for multilabel data

  • ability to list most confused labels

  • ability to index sample information by confused label names

  • prints out nicely

Installation

For the basic installation:

$ pip install dazed

To include pandas dataframe support:

$ pip install "dazed[pandas]"

Basic Usage

To give you an idea of why you might want to use dazed, here is a toy example demonstrating the kind of investigation it was designed to help with. Note: I am using sparse string labels here, but dazed’s interfaces can cope with integers, one-hot encoded arrays and dataframes as well (refer to the API Reference for more information).

Imagine you’re building a machine learning model to catalogue a pet store’s inventory (primarily cats, dogs and fish). The owner has given you an image of each animal, and you’ve trained your model and made some predictions. Your data looks like:

filenames = [
   "img0.jpg", "img1.jpg", "img2.jpg", "img3.jpg", "img4.jpg", "img5.jpg"
]
truth = ["cat", "dog", "cat", "dog", "fish", "dog"]
pred = ["cat", "dog", "dog", "cat", "fish", "cat"]

In order to understand how your model is doing, you make a quick confusion matrix:

from dazed import ConfusionMatrix

cm = ConfusionMatrix.from_sparse(truth, pred, info=filenames)
print(cm)
  | 0 1 2     index | label
---------     -------------
0 | 1 1 0         0 |   cat
1 | 2 1 0         1 |   dog
2 | 0 0 1         2 |  fish
---------     -------------
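To make the layout of the printed matrix concrete, here is a minimal pure-Python sketch that counts the same cells by hand. It is not dazed's implementation. It assumes that rows are true labels, columns are predicted labels, and labels are indexed in sorted order, which is consistent with the output above. The names `labels`, `pair_counts` and `matrix` are illustrative only.

```python
from collections import Counter

truth = ["cat", "dog", "cat", "dog", "fish", "dog"]
pred = ["cat", "dog", "dog", "cat", "fish", "cat"]

# Assumption: sorted unique labels define the row/column order.
labels = sorted(set(truth) | set(pred))

# Count how often each (true label, predicted label) pair occurs.
pair_counts = Counter(zip(truth, pred))

# Assumption: rows are true labels, columns are predicted labels.
matrix = [[pair_counts[(t, p)] for p in labels] for t in labels]

for label, row in zip(labels, matrix):
    print(f"{label:>4} | {' '.join(str(n) for n in row)}")
```

Reading the second row: two images whose true label is dog were predicted as cat, and one was predicted correctly.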

From the confusion matrix it looks like the model might be prone to thinking that dogs are actually cats. To double check:

cm.most_confused()
[('dog', 'cat', 2), ('cat', 'dog', 1)]
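For intuition, the same ranking can be reproduced with the standard library. This is only a sketch of the idea (count the mismatched pairs, then sort by count), not how dazed computes it internally. The names `confusions` and `ranked` are made up for the example.

```python
from collections import Counter

truth = ["cat", "dog", "cat", "dog", "fish", "dog"]
pred = ["cat", "dog", "dog", "cat", "fish", "cat"]

# Keep only the pairs where the prediction differs from the truth.
confusions = Counter((t, p) for t, p in zip(truth, pred) if t != p)

# most_common() orders the pairs by count, highest first.
ranked = [(t, p, n) for (t, p), n in confusions.most_common()]
print(ranked)
```

Each tuple reads as (true label, predicted label, number of samples).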

Ah yes, dogs were predicted to be cats twice and cats to be dogs once. To try and find out what the problem might be, you decide to check the images. To get the appropriate images:

cm.label_pair_info("dog", "cat")
['img3.jpg', 'img5.jpg']
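The lookup works because the `info` values line up one-to-one with the samples. A hand-rolled sketch of the same idea, assuming nothing about dazed's internals, groups the filenames by (true label, predicted label) pair. The name `info_by_pair` is hypothetical.

```python
from collections import defaultdict

filenames = [
    "img0.jpg", "img1.jpg", "img2.jpg", "img3.jpg", "img4.jpg", "img5.jpg"
]
truth = ["cat", "dog", "cat", "dog", "fish", "dog"]
pred = ["cat", "dog", "dog", "cat", "fish", "cat"]

# Group each sample's filename under its (true, predicted) label pair.
info_by_pair = defaultdict(list)
for t, p, name in zip(truth, pred, filenames):
    info_by_pair[(t, p)].append(name)

# Filenames of the dogs that were predicted to be cats.
print(info_by_pair[("dog", "cat")])
```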

Upon investigating the images, you notice that both dogs are white. You decide to go back through and label your images for animal colour, then rebuild the confusion matrix with multilabel=True:

truth = [
   ["cat", "white"],
   ["dog", "brown"],
   ["cat", "brown"],
   ["dog", "white"],
   ["fish", "orange"],
   ["dog", "white"]
]
pred = [
   ["cat", "white"],
   ["dog", "brown"],
   ["dog", "brown"],
   ["cat", "white"],
   ["fish", "orange"],
   ["cat", "white"]
]
cm = ConfusionMatrix.from_sparse(
   truth, pred, info=filenames, multilabel=True
)
print(cm)
  | 0 1 2 3 4     index |        label
-------------     --------------------
0 | 0 0 1 0 0         0 |   cat, brown
1 | 0 1 0 0 0         1 |   cat, white
2 | 0 0 1 0 0         2 |   dog, brown
3 | 0 2 0 0 0         3 |   dog, white
4 | 0 0 0 0 1         4 | fish, orange
-------------     --------------------
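One way to read this output is that each sample's set of labels is treated as a single combined label. The sketch below reproduces the table under that assumption: it joins each sample's labels with ", " in the order given, then counts pairs as in the single-label case. This is an illustration of the idea, not dazed's actual algorithm, and the helper `combine` is hypothetical.

```python
from collections import Counter

truth = [
    ["cat", "white"], ["dog", "brown"], ["cat", "brown"],
    ["dog", "white"], ["fish", "orange"], ["dog", "white"],
]
pred = [
    ["cat", "white"], ["dog", "brown"], ["dog", "brown"],
    ["cat", "white"], ["fish", "orange"], ["cat", "white"],
]


def combine(sample):
    # Assumption: labels are joined in the order given, e.g. "cat, white".
    return ", ".join(sample)


truth_combined = [combine(s) for s in truth]
pred_combined = [combine(s) for s in pred]

# Sorted combined labels define the row/column order.
labels = sorted(set(truth_combined) | set(pred_combined))
pair_counts = Counter(zip(truth_combined, pred_combined))

# Rows are true combined labels, columns are predicted combined labels.
matrix = [[pair_counts[(t, p)] for p in labels] for t in labels]

for index, (label, row) in enumerate(zip(labels, matrix)):
    print(index, "|", " ".join(str(n) for n in row), "|", label)
```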

Hmm, it looks like all white dogs were misclassified as white cats. To double check:

cm.most_confused()
[('dog, white', 'cat, white', 2), ('cat, brown', 'dog, brown', 1)]

Ah yes, it looks like your model might be basing much of its prediction on animal colour. Maybe it's time to go collect some more data.

To find out more about dazed, take a look at the API Reference.
