Extended Decision Fusion

eFusor: Extended Decision Fusion

Decision Fusion is the combination of the decisions of multiple classifiers into a common decision, i.e. a classifier ensemble operation. The prediction vectors from multiple classifiers are fused into a single prediction vector, from which the decision is taken via argmax.

The eFusor library provides an interface to common Decision Fusion methods, such as Majority Voting and the lesser-known Tournament-style Borda Counting, as well as basic operations like max and average.

The expected input for fusion is either a tensor or a matrix.

  • Vector = list[float] -- ordered list of prediction scores from a model for a query
  • Matrix = list[Vector] -- ordered list of vectors; prediction scores for a query from a number of models
  • Tensor = list[Matrix] -- ordered list of matrices; batch of predictions for several documents
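As a rough illustration of these shapes, the sketch below writes the three types as plain Python aliases and takes a decision from a fused vector via argmax. The alias definitions follow the list above; the `decide` helper and the second matrix in the batch are hypothetical and are not part of eFusor.

```python
# Type aliases mirroring the definitions above (illustrative only).
Vector = list[float]
Matrix = list[Vector]
Tensor = list[Matrix]

# One query scored by two models over three classes.
matrix: Matrix = [
    [0.25, 0.60, 0.15],
    [0.00, 0.80, 0.00],
]

# A batch of two queries, each with its own matrix of model predictions.
# The second matrix is made up for this example.
tensor: Tensor = [
    matrix,
    [[0.10, 0.20, 0.70], [0.30, 0.30, 0.40]],
]


def decide(vector: Vector) -> int:
    """Hypothetical helper: index of the highest-scoring class (argmax)."""
    return max(range(len(vector)), key=vector.__getitem__)


fused: Vector = [0.125, 0.7, 0.075]  # e.g. the column-wise average of `matrix`
print(decide(fused))  # 1
```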

Motivation

eFusor was developed specifically to address the scenario where predictors (classifiers) may have different label spaces. Consequently, the library distinguishes between classes predicted with a low score (0.0) and classes that were not predicted at all (nan).
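To see why the distinction matters, consider a minimal sketch in which one model does not know the third class at all. The `nan_average` helper below is hypothetical and only illustrates one plausible way to treat nan (skipping it); how eFusor handles nan internally is not shown here.

```python
from math import isnan, nan


def nan_average(column: list[float]) -> float:
    """Mean of the scores that are present; nan if no model predicted the class."""
    present = [score for score in column if not isnan(score)]
    return sum(present) / len(present) if present else nan


# Model A knows classes 0, 1 and 2; model B knows only classes 0 and 1.
model_a = [0.20, 0.50, 0.30]
model_b = [0.00, 1.00, nan]

fused = [nan_average(list(column)) for column in zip(model_a, model_b)]
print(fused)  # [0.1, 0.75, 0.3]
```

Class 0 is pulled down by model B's explicit 0.0, while class 2 keeps model A's score because model B simply has no opinion on it. Had the nan been stored as 0.0, class 2 would have been averaged down to 0.15.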

Fusion Methods

Basic Fusion Methods

Decision fusion of prediction vectors boils down to reducing a matrix to a vector column-wise, i.e. reducing each column vector to a scalar. Consequently, any mathematical operation that maps a vector of numbers to a single number applies.

Kittler, Hatef, Duin, and Matas (1998), "On Combining Classifiers", IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(3), use the functions below as basic classifier combination schemes.

method    notes
average   mean value of a vector; requires well-calibrated scores
product   product rule; note the known product rule issues
sum       approximation of product; assumes posteriors are not far from priors
max       approximation of sum
min       bound version of product
median    approximation of sum; robust version of average
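The column-wise reduction behind the methods listed above can be sketched in a few lines of plain Python. The `reduce_columns` helper is hypothetical, not eFusor's implementation. Only max, min, median and average are shown: Kittler et al. define the sum and product rules with class priors, so a plain column sum or product would not necessarily match what a library following that paper returns.

```python
from statistics import mean, median


def reduce_columns(matrix, reducer):
    """Apply `reducer` to every column of `matrix` and return the fused vector."""
    return [reducer(column) for column in zip(*matrix)]


matrix = [[0.25, 0.60, 0.15], [0.00, 0.80, 0.00]]

reducers = [("max", max), ("min", min), ("median", median), ("average", mean)]
for name, reducer in reducers:
    print(f"{name}: {reduce_columns(matrix, reducer)}")
```

With only two classifiers, median and average coincide, since the median of two values is their mean.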

Voting Fusion Methods

The basic fusion methods operate on the classifier prediction scores, i.e. real-valued vectors. The problem can instead be reduced to operating on one-hot vectors, taking each classifier's decision first rather than postponing it. Decision vectors are commonly combined with a majority rule.

scikit-learn provides VotingClassifier as an ensemble method and distinguishes between Hard Voting and Soft Voting. Hard Voting is Majority Voting, whereas Soft Voting is simply the average (or the weighted arithmetic mean, if weights are provided).
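A minimal sketch of majority (hard) voting, assuming each classifier votes for its argmax class and the fused vector holds the vote counts. The `hard_voting` function below is a standalone illustration, not eFusor's or scikit-learn's code, and its tie handling (first maximum wins) is an assumption.

```python
def hard_voting(matrix):
    """Count, per class, how many classifiers rank that class first."""
    votes = [0] * len(matrix[0])
    for vector in matrix:
        # argmax; on ties the first maximum wins (an assumption of this sketch)
        winner = max(range(len(vector)), key=vector.__getitem__)
        votes[winner] += 1
    return votes


matrix = [[0.25, 0.60, 0.15], [0.00, 0.80, 0.00]]
print(hard_voting(matrix))  # [0, 2, 0]
```

Both classifiers pick class 1, so it receives both votes. The individual scores no longer matter once the per-classifier decision has been taken.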

Rank-based Voting Methods

Rank-based voting, specifically tournament-style Borda count, is a decision technique commonly used in elections. While majority voting transforms prediction scores to a one-hot vector, rank-based voting transforms them to an integer vector of ranks (the higher the score, the lower the rank).

The benefit is that all predictions are still considered for fusion, while well-calibrated scores are not required.
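One way to sketch a tournament-style Borda count is to give each class one point for every class it outscores within a classifier's vector, then sum the points over classifiers. The `borda` function below is hypothetical. In particular, its tie handling (tied classes earn nothing from each other) is an assumption and may differ from eFusor's.

```python
def borda(matrix):
    """Tournament-style Borda count: one point per class that a class outscores."""
    totals = [0] * len(matrix[0])
    for vector in matrix:
        for i, score in enumerate(vector):
            # assumption: tied classes earn no points from each other
            totals[i] += sum(1 for other in vector if other < score)
    return totals


matrix = [[0.25, 0.60, 0.15], [0.00, 0.80, 0.00]]
print(borda(matrix))  # [1, 4, 0]
```

Only the ordering of the scores within each row enters the result, which is why calibration across classifiers is not needed.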

Weighted Fusion

In certain scenarios (e.g. fusing the decisions of rule-based and machine learning predictors), it is desirable to weight classifiers differently. Weighted Average is a commonly used scheme.
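A weighted average can be sketched as a column-wise weighted arithmetic mean with one weight per classifier. The `weighted_average` function and its `weights` argument are illustrative names for this sketch only; they are not taken from the eFusor API.

```python
def weighted_average(matrix, weights):
    """Column-wise weighted arithmetic mean; one weight per classifier (row)."""
    if len(weights) != len(matrix):
        raise ValueError("expected one weight per classifier")
    total = sum(weights)
    return [
        sum(w * score for w, score in zip(weights, column)) / total
        for column in zip(*matrix)
    ]


# Hypothetical setup: trust the second predictor three times as much as the first.
matrix = [[0.25, 0.60, 0.15], [0.00, 0.80, 0.00]]
print(weighted_average(matrix, weights=[1.0, 3.0]))
```

With equal weights this reduces to the plain average.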

(Borda Count also allows classifiers to be weighted differently, but this is not implemented.)

Usage:

The primary decision fusion function is fuse.

from efusor import fuse

methods = [
    "max", "min", "sum", "product", "median", "average", 
    "hard_voting", "soft_voting", 
    "borda"
]

matrix = [[0.25, 0.60, 0.15], [0.00, 0.80, 0.00]]

# unweighted results
for method in methods:
    result = fuse(matrix, method=method)
    print(f"{method}: {result}")

Output:

max:         [0.25, 0.8, 0.15]
min:         [0.0, 0.6, 0.0]
sum:         [0.0, 1.07, 0.0]
product:     [0.0, 0.16, 0.0]
median:      [0.125, 0.7, 0.075]
average:     [0.125, 0.7, 0.075]
hard_voting: [0, 2, 0]
soft_voting: [0.125, 0.7, 0.075]
borda:       [1.0, 4.0, 0.0]
