CLANA is a toolkit for classifier analysis.


clana

clana is a toolkit for classifier analysis. One key contribution of clana is Confusion Matrix Ordering (CMO), as explained in chapter 5 of Analysis and Optimization of Convolutional Neural Network Architectures. CMO can be applied to any multi-class classifier and helps you see which groups of classes are most similar.
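As a rough illustration of the idea, the sketch below scores a class ordering by weighting every confusion with the distance of its cell from the diagonal, then looks for the ordering with the lowest score. The scoring formula, the names `cmo_score` and `best_ordering`, and the brute-force search are assumptions made for this example, not clana's actual implementation. Brute force is only feasible for a handful of classes.

```python
import itertools


def cmo_score(cm, perm):
    """Sum of confusions weighted by their distance from the diagonal.

    cm[i][j] counts samples of true class i predicted as class j.
    perm lists the original class indices in their new order.
    This weighting is an assumption for illustration.
    """
    n = len(perm)
    return sum(
        cm[perm[i]][perm[j]] * abs(i - j)
        for i in range(n)
        for j in range(n)
    )


def best_ordering(cm):
    """Try every permutation and keep the one with the lowest score."""
    return min(
        itertools.permutations(range(len(cm))),
        key=lambda perm: cmo_score(cm, perm),
    )


# Toy matrix: classes 0 and 2 are confused with each other, as are 1 and 3.
cm = [
    [50, 0, 9, 0],
    [0, 50, 0, 7],
    [8, 0, 50, 0],
    [0, 6, 0, 50],
]

identity = tuple(range(len(cm)))
best = best_ordering(cm)
print("score before:", cmo_score(cm, identity))
print("score after: ", cmo_score(cm, best))
print("ordering:    ", best)
```

In the reordered matrix, classes that are confused with each other sit next to each other, so the errors cluster near the diagonal.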

Installation

The recommended way to install clana is:

$ pip install clana --user

If you want the latest version:

$ git clone https://github.com/MartinThoma/clana.git; cd clana
$ pip install -e . --user

Usage

$ clana --help
Usage: clana [OPTIONS] COMMAND [ARGS]...

Options:
  --version  Show the version and exit.
  --help     Show this message and exit.

Commands:
  distribution   Get the distribution of classes in a dataset.
  get-cm         Calculate the confusion matrix (CSV inputs).
  get-cm-simple  Calculate the confusion matrix (one label per...
  visualize      Optimize confusion matrix.
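A confusion matrix counts, for each true class, how often each class was predicted. Here is a minimal sketch of that computation, assuming integer labels 0 to n-1 and rows indexed by the ground truth. `confusion_matrix` is a hypothetical helper, not part of clana's API, and the CSV and JSON formats that clana reads and writes are not covered here.

```python
def confusion_matrix(gt, preds, n):
    """Return an n x n matrix where cm[i][j] counts true class i predicted as j."""
    cm = [[0] * n for _ in range(n)]
    for truth, pred in zip(gt, preds):
        cm[truth][pred] += 1
    return cm


gt = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(gt, preds, n=3)
print(cm)
```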

The visualize command gives you images like this:

Confusion Matrix after Confusion Matrix Ordering of the WiLI-2018 dataset

MNIST example

$ cd docs/
$ python mnist_example.py  # creates `train-pred.csv` and `test-pred.csv`
$ clana get-cm --gt gt-train.csv  --predictions train-pred.csv --n 10
2019-09-14 09:47:30,655 - root - INFO - cm was written to 'cm.json'
$ clana visualize --cm cm.json --zero_diagonal
Score: 13475
2019-09-14 09:49:41,593 - root - INFO - n=10
2019-09-14 09:49:41,593 - root - INFO - ## Starting Score: 13475.00
2019-09-14 09:49:41,594 - root - INFO - Current: 13060.00 (best: 13060.00, hot_prob_thresh=100.0000%, step=0, swap=False)
[...]
2019-09-14 09:49:41,606 - root - INFO - Current: 9339.00 (best: 9339.00, hot_prob_thresh=100.0000%, step=238, swap=False)
Score: 9339
Perm: [0, 6, 5, 8, 3, 2, 1, 7, 9, 4]
2019-09-14 09:49:41,639 - root - INFO - Classes: [0, 6, 5, 8, 3, 2, 1, 7, 9, 4]
Accuracy: 93.99%
2019-09-14 09:49:41,725 - root - INFO - Save figure at '/home/moose/confusion_matrix.tmp.pdf'
2019-09-14 09:49:41,876 - root - INFO - Found threshold for local connection: 398
2019-09-14 09:49:41,876 - root - INFO - Found 9 clusters
2019-09-14 09:49:41,877 - root - INFO - silhouette_score=-0.012313948323292875
    1: [0]
    1: [6]
    1: [5]
    1: [8]
    1: [3]
    1: [2]
    1: [1]
    2: [7, 9]
    1: [4]

This gives a plot of the reordered confusion matrix, saved to the path shown in the log above.
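To make the `Perm` and `Accuracy` lines of the log more concrete, here is a minimal sketch of applying a permutation to a confusion matrix and computing accuracy from it. It assumes that `Perm` lists the original class indices in their new order. The helpers `reorder` and `accuracy` are hypothetical names for this example, not clana functions.

```python
def reorder(cm, perm):
    """Reorder rows and columns of cm so that class perm[k] ends up at index k."""
    return [[cm[i][j] for j in perm] for i in perm]


def accuracy(cm):
    """Fraction of samples on the diagonal (correct predictions)."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


# Toy 3-class matrix: classes 0 and 2 are confused with each other.
cm = [
    [8, 0, 2],
    [0, 9, 1],
    [3, 0, 7],
]
perm = [0, 2, 1]

reordered = reorder(cm, perm)
for row in reordered:
    print(row)
print(f"Accuracy: {accuracy(reordered):.2%}")
```

Reordering only moves cells around, so the accuracy is the same before and after.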

Label Manipulation

Prepare a labels.csv, which must have a header row, and pass it to the visualize command with --labels:

$ clana visualize --cm cm.json --zero_diagonal --labels mnist/labels.csv

Data distribution

$ clana distribution --gt gt.csv --labels labels.csv [--out out/] [--long]

prints one line per label, e.g.

60% cat (56789 elements)
20% dog (12345 elements)
 5% mouse (1337 elements)
 1% tux (314 elements)

If --out is specified, it creates a horizontal bar chart. The first bar is the most common class, the second bar is the second most common class, and so on.

It uses the short labels unless --long is added to the command.
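The per-label output can be approximated in a few lines. This is a sketch under assumptions: it takes a plain list of label strings rather than clana's CSV files, and `distribution_lines` is a hypothetical helper that only mimics the line format shown above.

```python
from collections import Counter


def distribution_lines(labels):
    """Return one formatted line per label, most common label first."""
    counts = Counter(labels)
    total = sum(counts.values())
    lines = []
    for label, count in counts.most_common():
        percent = 100 * count / total
        lines.append(f"{percent:2.0f}% {label} ({count} elements)")
    return lines


labels = ["cat"] * 6 + ["dog"] * 3 + ["mouse"]
for line in distribution_lines(labels):
    print(line)
```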

Metrics

$ clana metrics --gt gt.csv --preds preds.csv

gives the following metrics, one per line:

  • Line 1: Accuracy
  • Line 2: Precision
  • Line 3: Recall
  • Line 4: F1-Score
  • Line 5: Mean accuracy
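For reference, the first four metrics can be computed from two label lists as sketched below. The macro averaging over classes is an assumption; the source does not say which averaging clana uses, so its numbers may differ. Mean accuracy is left out because its definition is not given here. `classification_metrics` is a hypothetical name for this example.

```python
def classification_metrics(gt, preds):
    """Accuracy plus macro-averaged precision, recall and F1 (averaging assumed)."""
    pairs = list(zip(gt, preds))
    classes = sorted(set(gt) | set(preds))
    accuracy = sum(g == p for g, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(g == c and p == c for g, p in pairs)
        fp = sum(g != c and p == c for g, p in pairs)
        fn = sum(g == c and p != c for g, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(classes)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


gt = ["cat", "cat", "dog", "dog"]
preds = ["cat", "dog", "dog", "dog"]
for name, value in classification_metrics(gt, preds).items():
    print(f"{name}: {value:.4f}")
```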

Visualizations

See visualizations

Development

Run the tests with tox.

