Classification Score "fitcensemble" with Decision Trees - Ambiguous MATLAB Documentation

Hey,
I am trying to figure out how the classification score is calculated when using decision trees with the fitcensemble function. In my opinion, the following link is ambiguous:
First, it is said that the score is equal to the following:
"A matrix with one row per observation and one column per class. For each observation and each class, the score generated by each tree is the probability of this observation originating from this class computed as the fraction of observations of this class in a tree leaf. predict averages these scores over all trees in the ensemble"
However, in my understanding this definition would yield scores in [0,1], which is not the case when applying fitcensemble. Instead, a ScoreTransform is required, as explained in https://de.mathworks.com/matlabcentral/answers/395526-how-do-i-obtain-scores-as-probabilistic-estimates-using-the-predict-function-on-a-fitcensemble-model.
Furthermore, the first link also provides the following definition: "Different ensemble algorithms have different definitions for their scores. Furthermore, the range of scores depends on ensemble type."
So could anyone explain what the real definition of the score is when using fitcensemble with decision trees? Does it depend on whether boosting or bagging is used?
Thanks for your help!

Answers (1)

Aditya Patil on 20 Aug 2020
The statement about the score in the Output Arguments section of the compact classification ensemble documentation is about individual trees. Trees do indeed return probabilities as scores:
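load fisheriris.mat
mdl = fitctree(meas, species);      % train a single classification tree
[~, score] = predict(mdl, meas);    % one row per observation, one column per class
sum(score, 2)                       % every row sums to 1, so the scores are probabilities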
However, for an ensemble, the score depends on how the ensemble algorithm computes it. This is explained in the documentation for ensemble algorithms. This answer explains how to convert these scores to probabilities. Note that it might not be trivial or obvious how to do so in all cases.
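To make the difference concrete, here is a minimal sketch that compares the score ranges of a bagged and a boosted tree ensemble. It is only an illustration under some assumptions: it uses the ionosphere sample data because AdaBoostM1 needs a two-class problem, and it assumes that 'doublelogit' is the appropriate ScoreTransform for AdaBoostM1, as suggested in the answer linked in the question. Other boosting methods may need a different transform.

```matlab
load ionosphere                      % X: predictors, Y: two-class labels ('b' / 'g')
rng(1)                               % make the bagging result reproducible

% Bagged trees: scores are averaged leaf class fractions
bag = fitcensemble(X, Y, 'Method', 'Bag');
[~, sBag] = predict(bag, X);

% Boosted trees: raw scores are not probabilities
ada = fitcensemble(X, Y, 'Method', 'AdaBoostM1');
[~, sAda] = predict(ada, X);

bagRange = [min(sBag(:)) max(sBag(:))]   % expected to lie within [0, 1]
adaRange = [min(sAda(:)) max(sAda(:))]   % typically extends outside [0, 1]

% Assumption: 'doublelogit' maps AdaBoostM1 scores to probability-like values
ada.ScoreTransform = 'doublelogit';
[~, pAda] = predict(ada, X);
pRange = [min(pAda(:)) max(pAda(:))]     % lies within [0, 1] after the transform
```

If the bagged scores stay within [0,1] while the raw AdaBoostM1 scores do not, that matches the documentation's remark that the range of scores depends on the ensemble type.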
  2 comments
Dario Walter on 20 Aug 2020
Dear Aditya,
thanks for your reply. The Output Arguments section you mentioned is not only about individual trees. The last sentence is "... predict averages these scores over all trees in the ensemble". This should be changed since, as you mentioned, the score depends on the ensemble technique.
Aditya Patil on 25 Aug 2020
I have brought this issue to the notice of the concerned people.

Release

R2020a
