Cannot test model using cross-validation with crossval and kFoldLoss
I am very new to machine learning, but by following my course materials I have been able to fit a random forest to my data and get an error rate that makes sense (it beats a dumb prediction and improves with better-chosen features).
My predictor matrix (z-scored; this is a subset) is:
-0.0767889379600161 1.43666113298993 4.83220576535887 4.59650550158967
-0.0767889379600161 -0.114493297876403 -0.217229093905045 -0.187718580390875
-0.0767889379600161 -0.114493297876403 -0.217229093905045 -0.187718580390875
-0.0767889379600161 -0.114493297876403 -0.187208672625236 -0.00955946380486005
-0.0767889379600161 -0.114493297876403 -0.217229093905045 -0.187718580390875
-0.0767889379600161 -0.114493297876403 -0.217229093905045 -0.187718580390875
7.39424877391969 1.12643024681666 -0.145180082833503 -0.187718580390875
-0.0767889379600161 2.05712290533646 -0.211225009649084 -0.187718580390875
-0.0767889379600161 0.195737588296863 1.35584098115696 0.229434473078818
And my response is:
'Highly Active'
'Inactive'
'Inactive'
'Inactive'
'Inactive'
'Highly Active'
'Highly Active'
'Highly Active'
'Inactive'
'Highly Active'
'Inactive'
'Highly Active'
My previous method was:
rng default
c = cvpartition(catresponse, 'HoldOut', 0.3);
% Extract the indices of the training and test sets.
trainIdx = training(c);
testIdx = test(c);
% Create the training and test data sets.
XTrain = predictormatrix(trainIdx, :);
XTest = predictormatrix(testIdx, :);
yTrain = catresponse(trainIdx);
yTest = catresponse(testIdx);
% Create an ensemble of 100 trees.
forestModel = fitensemble(XTrain, yTrain, 'Bag', 100,...
'Tree', 'Type', 'Classification');
% Predict and evaluate the ensemble model.
forestPred = predict(forestModel, XTest);
% errs = forestPred ~= yTest;
% testErrRateForest = 100*sum(errs)/numel(errs);
% display(testErrRateForest)
% Perform 10-fold cross validation.
cvModel = crossval(forestModel); % 10-fold is default
cvErrorForest = 100*kfoldLoss(cvModel);
display(cvErrorForest)
% Confusion matrix.
C = confusionmat(yTest, forestPred);
figure % (figOpts is not defined in this snippet)
imagesc(C)
colorbar
colormap('cool')
[Xgrid, Ygrid] = meshgrid(1:size(C, 1));
Ctext = num2str(C(:));
text(Xgrid(:), Ygrid(:), Ctext)
labels = categories(catresponse);
set(gca, 'XTick', 1:size(C, 1), 'XTickLabel', labels, ...
'YTick', 1:size(C, 1), 'YTickLabel', labels, ...
'XTickLabelRotation', 30, ...
'TickLabelInterpreter', 'none')
xlabel('Predicted Class')
ylabel('Known Class')
title('Forest Confusion Matrix')
Questions:
- Am I doing the cross-validation the right way? My kfoldLoss call operates on a model built from the 70% training split of the holdout partition, not on something like a cvpartition with 'KFold', so I am unsure what kfoldLoss is actually calculating here.
- Is my confusion matrix based on the cross-validation, or on the simpler holdout predictions from the code above?
- How can I alter my code so that the whole model is "cross validated"?
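For context, one common pattern (a sketch, not necessarily the only correct approach) is to run k-fold cross-validation over the full data set rather than over the 70% training split, and to build the confusion matrix from the out-of-fold predictions so that it reflects the cross-validation rather than the holdout test set:

```matlab
% Sketch: 10-fold cross-validation over the FULL data set.
% Assumes predictormatrix and catresponse from the question above.
rng default
fullModel = fitensemble(predictormatrix, catresponse, 'Bag', 100, ...
    'Tree', 'Type', 'Classification');

% crossval repartitions the data the model was trained on into 10 folds,
% retraining on 9 folds and evaluating on the held-out fold each time.
cvModel = crossval(fullModel, 'KFold', 10);
cvErrorForest = 100*kfoldLoss(cvModel);   % average misclassification rate (%)
display(cvErrorForest)

% Out-of-fold predictions give a confusion matrix that matches the CV error.
cvPred = kfoldPredict(cvModel);
C = confusionmat(catresponse, cvPred);
```

Note that calling crossval on a model that was fit only to XTrain cross-validates within that 70% split, so the earlier holdout test set never enters the calculation; that is why the sketch fits on the full data before partitioning.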