DecisionTree·PyPI

A Python module for constructing a decision tree from multidimensional training data and for using the decision tree for classifying new data

Project description

Version 2.1 is a cleaned-up version of Version 2.0. It should run faster on large training data files.

Version 2.0 is a major rewrite of the DecisionTree module. This revision was prompted by a number of users wanting to see numeric features incorporated in the construction of decision trees. So here it is! This version allows you to use either purely symbolic features, or purely numeric features, or a mixture of the two. (A feature is numeric if it can take any floating-point value over an interval.)
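
To make the symbolic/numeric distinction concrete, here is a minimal sketch of how a column of raw values might be sorted into one kind or the other. The helper is_numeric_feature and its cardinality_threshold argument are hypothetical and not part of the module's API. The rule that a number-valued column with only a few distinct values is treated as symbolic is an assumption, loosely modeled on the symbolic_to_numeric_cardinality_threshold constructor option.

```python
def is_numeric_feature(values, cardinality_threshold=10):
    """Return True if a column of raw string values looks numeric.

    Hypothetical helper for illustration only. Assumption: a column counts
    as numeric when every value parses as a float AND it has more than
    `cardinality_threshold` distinct values; otherwise it is symbolic.
    """
    try:
        distinct = {float(v) for v in values}
    except ValueError:
        return False          # at least one value is not a number
    return len(distinct) > cardinality_threshold


ages = [str(40 + 0.5 * i) for i in range(30)]   # 30 distinct floats
grades = ["1", "2", "3", "2", "1"]              # numbers, but only 3 distinct
ploidy = ["diploid", "aneuploid", "diploid"]    # not numbers at all

print(is_numeric_feature(ages))     # True
print(is_numeric_feature(grades))   # False
print(is_numeric_feature(ploidy))   # False
```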

As for the purpose of the module: assuming you have arranged your training data as a table in a text file, all you have to do is supply the name of that file to the module, and it does the rest for you.

A decision tree classifier consists of feature tests arranged in the form of a tree. The feature test at the root node is the one that can be expected to maximally disambiguate the different possible class labels for an unlabeled data record. From the root node hangs a set of child nodes, one for each value of the feature tested at the root. At each child node, the feature test selected is the one that is most class-discriminative given that you have already applied the feature test at the root node and observed the value of that feature. This process continues until you reach the leaf nodes of the tree. A leaf node corresponds either to the maximum depth desired for the decision tree or to the case where you run out of features to test.
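
The description above does not say how "maximally disambiguate" is measured. A common choice is class entropy: at each node, pick the feature whose split leaves the least remaining uncertainty about the class label. The sketch below assumes that criterion and handles symbolic features only. The names entropy, split_entropy, and build_tree and the toy data are hypothetical; this is an illustration of the idea, not the module's implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = float(len(labels))
    return -sum((n / total) * math.log(n / total, 2)
                for n in Counter(labels).values())


def split_entropy(records, labels, feature):
    """Average class entropy after splitting the records on `feature`."""
    groups = {}
    for rec, lab in zip(records, labels):
        groups.setdefault(rec[feature], []).append(lab)
    total = float(len(labels))
    return sum(len(g) / total * entropy(g) for g in groups.values())


def build_tree(records, labels, features, max_depth, entropy_threshold=0.01):
    """Recursively choose feature tests; returns a nested dict or a leaf label."""
    majority = Counter(labels).most_common(1)[0][0]
    # Leaf: depth budget used up, no features left, or labels (nearly) pure.
    if max_depth == 0 or not features or entropy(labels) < entropy_threshold:
        return majority
    # The test at this node is the feature leaving the least class uncertainty.
    best = min(features, key=lambda f: split_entropy(records, labels, f))
    remaining = [f for f in features if f != best]
    children = {}
    # One child per observed value of the chosen feature.
    for value in sorted(set(rec[best] for rec in records)):
        pairs = [(r, lab) for r, lab in zip(records, labels) if r[best] == value]
        children[value] = build_tree([r for r, _ in pairs],
                                     [lab for _, lab in pairs],
                                     remaining, max_depth - 1, entropy_threshold)
    return {"test": best, "children": children}


# Toy training table (hypothetical data).
records = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

tree = build_tree(records, labels, ["outlook", "windy"], max_depth=8)
print(tree)
```

On this toy table the root test is outlook, because splitting on it separates the two classes completely, and both children are leaves. Splitting on windy would leave each branch as mixed as the original data.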

Typical usage syntax:

import DecisionTree

training_datafile = "stage3cancer.csv"

# Columns are addressed by index: one holds the class label,
# the others hold the features.
dt = DecisionTree.DecisionTree(
    training_datafile = training_datafile,
    csv_class_column_index = 2,
    csv_columns_for_features = [3,4,5,6,7,8],
    entropy_threshold = 0.01,
    max_depth_desired = 8,
    symbolic_to_numeric_cardinality_threshold = 10,
)

# Load the training data and compute the probabilities
# before building the tree.
dt.get_training_data()
dt.calculate_first_order_probabilities()
dt.calculate_class_priors()
dt.show_training_data()

# Build the tree and display it.
root_node = dt.construct_decision_tree_classifier()
root_node.display_decision_tree("   ")

# Each entry of a test sample is a 'feature = value' string.
test_sample = ['g2 = 4.2',
               'grade = 2.3',
               'gleason = 4',
               'eet = 1.7',
               'age = 55.0',
               'ploidy = diploid']

classification = dt.classify(root_node, test_sample)
print("Classification: " + str(classification))
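
The test sample in the usage above is a list of 'feature = value' strings. As a minimal sketch of how such strings can be taken apart, the hypothetical helper below (parse_test_sample, not part of the DecisionTree module) turns them into a dictionary, converting values that parse as floats and leaving symbolic values as strings. How the module itself parses these entries is not stated here.

```python
def parse_test_sample(sample):
    """Turn ['name = value', ...] into a {name: value} dict.

    Hypothetical helper, not part of the DecisionTree module. Values that
    parse as floats are converted; everything else stays a string.
    """
    parsed = {}
    for entry in sample:
        name, sep, raw = entry.partition("=")
        if not sep:
            raise ValueError("expected 'feature = value', got %r" % entry)
        raw = raw.strip()
        try:
            value = float(raw)
        except ValueError:
            value = raw           # symbolic value such as 'diploid'
        parsed[name.strip()] = value
    return parsed


test_sample = ['g2 = 4.2', 'grade = 2.3', 'gleason = 4',
               'eet = 1.7', 'age = 55.0', 'ploidy = diploid']
features = parse_test_sample(test_sample)
print(features["age"])      # 55.0
print(features["ploidy"])   # diploid
```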

Release history

3.4.3    May 14, 2016
3.4.2    May 2, 2016
3.4.1    May 1, 2016
3.4.0    Apr 4, 2016
3.3.2    Feb 13, 2016
3.3.1    Jan 31, 2016
3.3.0    Jan 26, 2016
3.2.4    Nov 23, 2015
3.2.3    Oct 26, 2015
3.2.2    Oct 25, 2015
3.2.1    Jun 14, 2015
3.2.0    Jun 10, 2015
3.0.1    May 28, 2015
3.0      May 18, 2015
2.3.4    May 5, 2015
2.3.3    Mar 27, 2015
2.3.2    Mar 22, 2015
2.3.1    Mar 17, 2015
2.3      Mar 16, 2015
2.2.6    Mar 11, 2015
2.2.5    Nov 25, 2014
2.2.4    Jun 18, 2014
2.2.3    Jun 13, 2014
2.2.2    May 3, 2014
2.2.1    Sep 5, 2013
2.2      Sep 2, 2013
2.1      Aug 17, 2013  (this version)
2.0      Jun 19, 2013
1.7.1    May 22, 2013
1.7      Jul 29, 2012
1.6.1    Jun 22, 2012
1.6      Jun 20, 2012
1.5      May 16, 2011
1.0      May 16, 2011
