DecisionTree

A Python module for decision-tree based classification of multidimensional data

Project description

Version 3.0.1 is a minor revision that smooths out the documentation in a couple of important places. I have also fixed the typos that I discovered after the previous version was released.

Version 3.0 adds bagging capability to the decision tree module. If you have a large enough training dataset, you can now construct multiple decision trees and have the final classification be based on a majority vote from all the trees. This can average out the noise in the classification process.
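The majority-vote step described above can be illustrated with a minimal sketch. This is not the module's own bagging code: the function name `majority_vote`, the tie-breaking rule, and the class labels are assumptions made purely for illustration.

```python
from collections import Counter


def majority_vote(labels):
    """Return the class label predicted by the largest number of trees.

    `labels` holds one predicted label per tree. Ties are broken in favour
    of the label that appears earliest in `labels` (an arbitrary choice
    made for this sketch).
    """
    labels = list(labels)
    if not labels:
        raise ValueError("need at least one prediction to vote on")
    counts = Counter(labels)
    best_count = max(counts.values())
    for label in labels:
        if counts[label] == best_count:
            return label


# Hypothetical predictions from three independently trained trees.
votes = ["benign", "malignant", "benign"]
final_label = majority_vote(votes)
print(final_label)
```

A single tree that is misled by noise in its share of the training data is outvoted as long as most of the other trees agree on the label.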

Version 2.3.3 makes it clear in the documentation that the data entries in CSV training files do NOT need to be double quoted.

Version 2.3.2 incorporates enhancements to the introspection capabilities of the module.

Version 2.3 gives the module a new capability: the ability to introspect about the classification decisions at the nodes of the decision tree.

With regard to the purpose of the module: once you have placed your training data in a CSV file, all you have to do is supply the name of the file to the module, and it takes care of the rest, so that classifying a new data sample requires little effort on your part.

A decision tree classifier consists of feature tests that are arranged in the form of a tree. The feature test associated with the root node is the one that can be expected to maximally disambiguate the different possible class labels for a new data record. From the root node hangs a child node for each possible outcome of the feature test at the root. This maximal class-label disambiguation rule is applied recursively at the child nodes until you reach the leaf nodes. A leaf node corresponds either to the maximum depth desired for the decision tree or to the case when there is nothing further to gain from a feature test at the node.
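One common way to make "maximally disambiguate" concrete is to choose the feature whose test leaves the lowest average class entropy in the resulting child nodes. The sketch below illustrates that idea for symbolic features only. It is an assumption-laden illustration, not the module's implementation: the helper names `entropy` and `best_feature`, the `smoker` feature, and the toy records and labels are all made up for the example.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * math.log2(n / total)
               for n in Counter(labels).values())


def best_feature(records, labels, features):
    """Pick the feature whose test leaves the lowest average class entropy."""
    def remaining_entropy(feature):
        # Group the class labels by the outcome of the test on `feature`.
        groups = defaultdict(list)
        for record, label in zip(records, labels):
            groups[record[feature]].append(label)
        # Weight each child's entropy by the fraction of records it receives.
        return sum(len(g) / len(labels) * entropy(g) for g in groups.values())

    return min(features, key=remaining_entropy)


# Toy data, invented for illustration only.
records = [
    {"ploidy": "diploid", "smoker": "yes"},
    {"ploidy": "diploid", "smoker": "no"},
    {"ploidy": "aneuploid", "smoker": "yes"},
    {"ploidy": "aneuploid", "smoker": "no"},
]
labels = ["benign", "benign", "malignant", "malignant"]

root_feature = best_feature(records, labels, ["smoker", "ploidy"])
print(root_feature)
```

In this toy data a test on `ploidy` separates the two classes completely, while a test on `smoker` leaves each child node evenly split, so `ploidy` would be placed at the root. Applying the same selection to each child's subset of records gives the recursive construction described above.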

Typical usage syntax:

    import DecisionTree

    training_datafile = "stage3cancer.csv"
    dt = DecisionTree.DecisionTree(
                    training_datafile = training_datafile,
                    csv_class_column_index = 2,
                    csv_columns_for_features = [3,4,5,6,7,8],
                    entropy_threshold = 0.01,
                    max_depth_desired = 8,
                    symbolic_to_numeric_cardinality_threshold = 10,
         )

    # Load the training data and compute the probabilities the tree needs.
    dt.get_training_data()
    dt.calculate_first_order_probabilities()
    dt.calculate_class_priors()
    dt.show_training_data()

    # Build the tree and display it.
    root_node = dt.construct_decision_tree_classifier()
    root_node.display_decision_tree("   ")

    # Classify a new sample, given as a list of 'feature = value' strings.
    test_sample  = ['g2 = 4.2',
                    'grade = 2.3',
                    'gleason = 4',
                    'eet = 1.7',
                    'age = 55.0',
                    'ploidy = diploid']
    classification = dt.classify(root_node, test_sample)
    print("Classification: " + str(classification))
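The test sample in the usage above is a list of 'feature = value' strings. If your feature values live in some other structure, a small helper can produce that list. The helper below is hypothetical and not part of the module; it simply reproduces the spacing used in the example above, on the assumption that this is the format `classify()` expects.

```python
def make_test_sample(feature_values):
    """Format (feature, value) pairs as 'feature = value' strings.

    Hypothetical helper, not provided by the DecisionTree module.
    """
    return ["{} = {}".format(name, value) for name, value in feature_values]


# The same feature values as in the usage example, as (name, value) pairs.
feature_values = [
    ("g2", 4.2),
    ("grade", 2.3),
    ("gleason", 4),
    ("eet", 1.7),
    ("age", 55.0),
    ("ploidy", "diploid"),
]
test_sample = make_test_sample(feature_values)
print(test_sample)
```

The resulting list could then be passed as the second argument to `dt.classify(root_node, test_sample)`.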

Release history

3.4.3    May 14, 2016
3.4.2    May 2, 2016
3.4.1    May 1, 2016
3.4.0    Apr 4, 2016
3.3.2    Feb 13, 2016
3.3.1    Jan 31, 2016
3.3.0    Jan 26, 2016
3.2.4    Nov 23, 2015
3.2.3    Oct 26, 2015
3.2.2    Oct 25, 2015
3.2.1    Jun 14, 2015
3.2.0    Jun 10, 2015
3.0.1    May 28, 2015 (this version)
3.0      May 18, 2015
2.3.4    May 5, 2015
2.3.3    Mar 27, 2015
2.3.2    Mar 22, 2015
2.3.1    Mar 17, 2015
2.3      Mar 16, 2015
2.2.6    Mar 11, 2015
2.2.5    Nov 25, 2014
2.2.4    Jun 18, 2014
2.2.3    Jun 13, 2014
2.2.2    May 3, 2014
2.2.1    Sep 5, 2013
2.2      Sep 2, 2013
2.1      Aug 17, 2013
2.0      Jun 19, 2013
1.7.1    May 22, 2013
1.7      Jul 29, 2012
1.6.1    Jun 22, 2012
1.6      Jun 20, 2012
1.5      May 16, 2011
1.0      May 16, 2011

Download files


Source Distribution

DecisionTree-3.0.1.tar.gz (247.2 kB)

Uploaded Oct 28, 2015 Source

File details

Details for the file DecisionTree-3.0.1.tar.gz.

File metadata

Download URL: DecisionTree-3.0.1.tar.gz
Upload date: Oct 28, 2015
Size: 247.2 kB
Tags: Source
Uploaded using Trusted Publishing? No

File hashes

Hashes for DecisionTree-3.0.1.tar.gz
Algorithm	Hash digest
SHA256	`1a15d87ce069caa81e9c2f287fd71ed4a59075c04980ff754bade4e931c9bf1d`
MD5	`118243289b3f669dbaddc4fa54232032`
BLAKE2b-256	`6d40d0ad33b02dbaf844c278ffec7fdaad6bedab7bab910e066d5201ec8c300c`
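A downloaded archive can be checked against the SHA256 digest listed above using Python's standard `hashlib` module. This is a minimal sketch: the function name `sha256_of_file` and the local file path are assumptions, and the archive is assumed to be in the current directory.

```python
import hashlib

# SHA256 digest as listed for DecisionTree-3.0.1.tar.gz.
PUBLISHED_SHA256 = "1a15d87ce069caa81e9c2f287fd71ed4a59075c04980ff754bade4e931c9bf1d"


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage, assuming the archive has been downloaded to the working directory:
#     if sha256_of_file("DecisionTree-3.0.1.tar.gz") != PUBLISHED_SHA256:
#         raise SystemExit("digest mismatch: do not install this file")
```

Reading the file in chunks keeps memory use constant regardless of the archive size.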

