Decision/Regression stumps using ClassificationTree.template or RegressionTree.template

I wonder whether it is possible to control the number of splits that the fitensemble function is allowed to make for each individual tree. I have looked at ClassificationTree.template but have not found how the number of splits is controlled. How would I set up ClassificationTree.template if, for example, stumps were desired?

Accepted Answer

Ilya on 11 Oct 2011
You get stumps by default for any boosting algorithm from fitensemble. Or, equivalently, you can set 'minparent' to the number of observations, as you noted. The trees are saved in the Trained property. You can inspect them by calling the view method, for example, view(ens.Trained{1}).
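A minimal sketch of the stump setup described above, assuming a release in which ClassificationTree.template is still available (newer releases provide templateTree instead). The ionosphere sample data set, the 50 learning cycles, the variable names, and the use of the NumNodes property to count nodes are illustrative assumptions, not part of the original post:

```matlab
% Illustrative data: ionosphere ships with the toolbox (X: 351x34, Y: 'b'/'g')
load ionosphere
N = size(X, 1);

% MinParent = N: only the root node holds enough observations to be split
t = ClassificationTree.template('MinParent', N);

% 50 learning cycles is an arbitrary choice for this sketch
ens = fitensemble(X, Y, 'GentleBoost', 50, t);

% Print the first weak learner; a stump shows a single split
view(ens.Trained{1});

% A binary tree with one split has 3 nodes (1 root + 2 leaves)
disp(ens.Trained{1}.NumNodes);
```

If every tree reports at most 3 nodes, the ensemble is built from stumps.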
For anything other than a stump, you can control the number of splits approximately with the minleaf or minparent arguments. I haven't seen a case where controlling the number of splits would produce better accuracy for the overall ensemble than controlling the leaf size.
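A rough sketch of the approximate control mentioned above, under the same assumptions (ClassificationTree.template available, ionosphere sample data, NumNodes property). The leaf size of N/10 is an arbitrary illustration, and both MinLeaf and MinParent are set explicitly so that the result does not depend on release-specific defaults:

```matlab
load ionosphere
N = size(X, 1);

% Hypothetical leaf size: larger values allow fewer splits per tree
minLeaf = floor(N / 10);
t = ClassificationTree.template('MinLeaf', minLeaf, 'MinParent', 2 * minLeaf);

ens = fitensemble(X, Y, 'GentleBoost', 50, t);

% A binary tree with k splits has 2k + 1 nodes, so k = (NumNodes - 1) / 2
nodes  = cellfun(@(tr) tr.NumNodes, ens.Trained);
splits = (nodes - 1) / 2;

% The split count varies from tree to tree; the control is only approximate
fprintf('splits per tree: min %d, max %d\n', min(splits), max(splits));
```

The spread between the minimum and maximum shows how loosely the leaf size pins down the number of splits.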

More Answers (2)

Miro on 11 Oct 2011
Some comments to my original post:
Currently I am using the MinLeaf option to approximately control the number of splits, but what I really want is to enforce a certain number of splits. I am not able to do so even for the simplest case (stumps) using the MinLeaf or MinParent options:
For example, training a GentleBoost ensemble with MinLeaf set to N/2 (where N is the number of examples in the training set) does not produce results similar to those from training a stump-based GentleBoost classifier using my own code or other boosting packages. Neither does MinParent = N, which according to my understanding of the documentation should enforce stumps.

Miro on 11 Oct 2011
I see. I assumed that stumps were not being used, because of some differences in test-set performance compared with another implementation. Now I see (using the view function) that stumps are actually used with MinParent = N (and by default).
However, as I understand it, apart from the 1-split case it is not possible (without modifying the statistics package) to fix the number of splits exactly, only approximately?
