Trees can model both independent feature effects and second-order (interaction) effects, so they might be good candidates for this domain. A tree is a set of rules chained together: each rule splits the instances that arrive at it into subgroups, which pass on to the rules below it.
Tree learners generate rules, chain them together, and stop growing the tree when the rules become too specific, to avoid overfitting. Overfitting means constructing a model that is more complex than the concept we are looking for: an overfitted model performs well on the training data, but poorly on new data.
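To make overfitting concrete, here is a minimal sketch (using the J48 learner introduced just below, and the same trainset/testset as the rest of this section) that grows a fully unpruned tree and compares its accuracy on the training data against its accuracy on the test data. A large gap between the two numbers is the signature of overfitting.
//Sketch: detecting overfitting by comparing train and test accuracy
//Assumes the trainset/testset Instances used throughout this section
J48 unpruned = new J48();
//Grow the full tree without any pruning (the -U option)
unpruned.setUnpruned(true);
unpruned.buildClassifier(trainset);
Evaluation onTrain = new Evaluation(trainset);
onTrain.evaluateModel(unpruned, trainset);
Evaluation onTest = new Evaluation(trainset);
onTest.evaluateModel(unpruned, testset);
//If the first number is much higher than the second, the tree overfits
System.out.println("Train accuracy: " + onTrain.pctCorrect() + "%");
System.out.println("Test accuracy: " + onTest.pctCorrect() + "%");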
We use J48, a Java implementation of the popular C4.5 algorithm.
//We train a tree using J48
//J48 is a Java implementation of the C4.5 algorithm
J48 classifier4 = new J48();
//We set its confidence factor to 0.1
//The confidence factor tells J48 how specific a rule may become before it gets pruned
classifier4.setOptions(weka.core.Utils.splitOptions("-C 0.1"));
classifier4.buildClassifier(trainset);
// Next we test it against the testset
Test = new Evaluation(trainset);
Test.evaluateModel(classifier4, testset);
System.out.println(Test.toSummaryString());
System.out.print(classifier4.toString());
//We raise its confidence factor to 0.5
//Allowing the tree to keep more specific rules
classifier4.setOptions(weka.core.Utils.splitOptions("-C 0.5"));
classifier4.buildClassifier(trainset);
// Next we test it against the testset
Test = new Evaluation(trainset);
Test.evaluateModel(classifier4, testset);
System.out.println(Test.toSummaryString());
System.out.print(classifier4.toString());
The tree learner trained with the highest confidence factor generates the most specific rules, and has the best performance on the test set; apparently the extra specificity is warranted.
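Rather than eyeballing the two summary strings, one could also compare the two settings side by side. A minimal sketch, reusing trainset/testset from above (setConfidenceFactor() and pctCorrect() are standard J48/Evaluation accessors):
//Sketch: comparing the two pruning settings numerically
double[] confidences = {0.1, 0.5};
for (double c : confidences) {
    J48 tree = new J48();
    //Same effect as splitOptions("-C " + c)
    tree.setConfidenceFactor((float) c);
    tree.buildClassifier(trainset);
    Evaluation eval = new Evaluation(trainset);
    eval.evaluateModel(tree, testset);
    System.out.println("-C " + c + ": " + eval.pctCorrect() + "% correct on the test set");
}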
Note: Both learners start with a rule on petal-width. Remember how we noticed this dimension in the visualization?