This is a very simple learner, but it may be useful as a baseline to compare your learner against.
The decisionstump learner runs in time proportional to the number of training examples. It also requires memory proportional to (number of classes) * (number of attributes) * (number of values). Note that this can be very large for continuous attributes, which in the worst case have a separate value for each training example. The maxThresholds argument can be used to control this.
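To make the time and memory costs concrete, here is a minimal Python sketch of a decision stump built from a table of counts. It is not the decisionstump program's actual implementation: the function names train_stump and predict are hypothetical, every attribute is treated as discrete, and nothing like maxThresholds is modeled. The table holds one counter per (attribute, value, class) combination, which is where the classes * attributes * values memory bound comes from, and it is filled in a single pass over the training examples.

```python
from collections import Counter, defaultdict


def train_stump(rows, labels):
    """Pick the single attribute whose per-value majority vote
    makes the fewest errors on the training examples.

    Hypothetical helper for illustration only; all attribute
    values are treated as discrete symbols.
    """
    n_attrs = len(rows[0])
    # counts[attr][value][label] -> number of matching training examples
    counts = [defaultdict(Counter) for _ in range(n_attrs)]
    for row, label in zip(rows, labels):  # one pass over the data
        for attr, value in enumerate(row):
            counts[attr][value][label] += 1

    # Fallback prediction for attribute values never seen in training.
    default = Counter(labels).most_common(1)[0][0]

    best = None
    for attr, table in enumerate(counts):
        # Examples outside the majority class of their value are errors.
        errors = sum(sum(c.values()) - max(c.values()) for c in table.values())
        if best is None or errors < best[0]:
            rule = {v: c.most_common(1)[0][0] for v, c in table.items()}
            best = (errors, attr, rule)

    _, attr, rule = best
    return attr, rule, default


def predict(stump, row):
    """Classify one example with a stump returned by train_stump."""
    attr, rule, default = stump
    return rule.get(row[attr], default)
```

With a continuous attribute, each distinct number would become its own key in the table, which is the worst case described above.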
The learner reads its input and writes its output in C4.5 format. It expects to find the files <stem>.names and <stem>.data.
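For readers unfamiliar with the format, the following Python sketch reads a simplified subset of it. It assumes the common C4.5 layout: the .names file lists the class names first and then one "attribute: values." entry per line, the .data file holds one comma-separated example per line with the class last, and "|" starts a comment. The functions parse_names and parse_data are hypothetical and are not part of decisionstump; details such as entries that span several lines are not handled.

```python
def _strip(line):
    """Drop a trailing '|' comment, surrounding spaces, and a final period."""
    return line.split("|", 1)[0].strip().rstrip(".")


def parse_names(text):
    """Return (classes, attributes) from the text of a .names file.

    attributes is a list of (name, values) pairs; a continuous
    attribute appears as (name, ["continuous"]).
    """
    entries = [e for e in (_strip(line) for line in text.splitlines()) if e]
    classes = [c.strip() for c in entries[0].split(",")]
    attributes = []
    for entry in entries[1:]:
        name, _, values = entry.partition(":")
        attributes.append((name.strip(), [v.strip() for v in values.split(",")]))
    return classes, attributes


def parse_data(text):
    """Return (rows, labels) from the text of a .data or .test file."""
    rows, labels = [], []
    for line in text.splitlines():
        entry = _strip(line)
        if not entry:
            continue
        # The last field is the class; everything before it is an attribute.
        *values, label = [field.strip() for field in entry.split(",")]
        rows.append(tuple(values))
        labels.append(label)
    return rows, labels
```

Both functions take the file contents as a string, so they can be tried on small inline examples before being pointed at real files.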
Depending on the command-line arguments, it will either output the decision stump or test its error rate on <stem>.test.
decisionstump -f banana -source datasets/banana
Looks for a dataset named 'banana' in the 'datasets/banana' directory and outputs the decision stump learned from that dataset.