Recognize overfitting in retraining

Federico Ambrogio on 25 Sep 2015
Commented: Greg Heath on 27 Sep 2015
I wrote the following code, inspired by those proposed in the Neural Network Toolbox manual, to retrain a network:
load dati_MRTA.mat % where IN_MRTA = 13x49 double and TARGET_MRTA = 1x49 double
Q = size(IN_MRTA,2);
Q1 = floor(Q*0.9);
Q2 = Q - Q1;
ind = randperm(Q);
ind1 = ind(1:Q1);
ind2 = ind(Q1+(1:Q2));
x1 = IN_MRTA(:,ind1);
t1 = TARGET_MRTA(:,ind1);
x2 = IN_MRTA(:,ind2);
t2 = TARGET_MRTA(:,ind2);
net = feedforwardnet(13,'trainlm');
numNN = 10;
NN = cell(1,numNN);
tr = cell(1,numNN);
perfs = zeros(3,numNN);
for i = 1:numNN
    disp(['Training ' num2str(i) '/' num2str(numNN)])
    [NN{i},tr{i}] = train(net,x1,t1);
    y2 = NN{i}(x2);
    perfs(1,i) = sqrt(tr{i}.best_perf);
    perfs(2,i) = sqrt(tr{i}.best_vperf);
    perfs(3,i) = sqrt(mse(net,t2,y2));
end
The best results I've obtained in a single run are RMSE(training) = 4.8730, RMSE(validation) = 7.8195, RMSE(test) = 10.3158; the corresponding performance plot is attached.
Does this represent a good result, or is it an indication of possible overfitting?
  1 comment
Greg Heath on 26 Sep 2015
Either post your data or choose an example from MATLAB's NN examples.
help nndatasets
and
doc nndatasets
Hope this helps.
Greg


Accepted Answer

Greg Heath on 27 Sep 2015
1. OVERFITTING and OVERTRAINING
When
[ I N ] = size(input) % [ 13 49 ]
[ O N ] = size(target) % [ 1 49 ]
and
H is the number of hidden nodes % 13
Nw = (I+1)*H+(H+1)*O = 14*13+14*1 = 196
is the number of unknown weights that have to be estimated. OVERFITTING occurs when there are more unknown weights than there are training equations
Nw =196 > 49 = N*O >= Ntrn*O = Ntrneq
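A quick arithmetic check of this count (sketched here in Python purely for illustration; the formula is the one stated above):

```python
# Count the unknown weights of an I-H-O feedforward net and compare
# against the number of training equations, per the formula above.
I, H, O = 13, 13, 1   # inputs, hidden nodes, outputs (from the question)
N = 49                # total examples

Nw = (I + 1) * H + (H + 1) * O   # input-to-hidden plus hidden-to-output weights
Ntrneq = N * O                   # training equations (at most, if all data were used)

print(Nw, Ntrneq)    # 196 weights vs at most 49 equations
print(Nw > Ntrneq)   # True: the net is overfit
```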
Problems occur when you OVERTRAIN an OVERFIT net. Three methods of preventing overtraining are
a. Do not overfit: keep H <= Hub (upper bound), where,
using the MATLAB default Ntrn ~ 0.7*N,
Hub = floor((Ntrn*O - O)/(I+O+1))
    = floor((0.7*N*O - O)/(I+O+1)) = 2
b. Validation Stopping
c. Regularization using MSEREG or TRAINBR.
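The bound in item (a) can be checked numerically (a minimal sketch in Python, using the Hub formula above with the question's dimensions):

```python
import math

# Dimensions from the question: 13 inputs, 1 output, 49 examples
I, O, N = 13, 1, 49
Ntrn = 0.7 * N   # MATLAB's default training fraction of the data

# Upper bound on hidden nodes that keeps the weight count at or below
# the number of training equations
Hub = math.floor((Ntrn * O - O) / (I + O + 1))
print(Hub)   # 2 -- far below the H = 13 used in the question
```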
2. Train automatically divides the data into three
subsets trn/val/tst with ratios 0.7/0.15/0.15.
Therefore, either accept this or EXPLICITLY override it with net.divideParam.trainRatio, etc., to get three ratios that sum to one. There is no need for you to divide the data explicitly!
3. When training in a loop, the net must be reconfigured with the function CONFIGURE at the beginning of each pass through the loop. Otherwise you will just continue training the same net, with initial weights equal to the final weights obtained from the previous pass.
4. Initialize the RNG before using the first random number so you can reproduce results that depend on both the random data division and the random weight initialization.
5. Use the regression function fitnet instead of the general function feedforwardnet.
6. You can obtain zillions of examples by searching both the NEWSGROUP and ANSWERS using the search words
greg fitnet tutorial
greg fitnet Hub
greg fitnet Ntrials
Hope this helps.
Thank you for formally accepting my answer
Greg
  1 comment
Greg Heath on 27 Sep 2015
For regression, the best measures of performance are the normalized MSE,
NMSE = mse(error)/mean(var(target',1))
and the coefficient of determination (a.k.a. R-squared),
Rsq = 1 - NMSE
which is interpreted as the fraction of target variance that is modeled by the net.
For regression I set my training goal so that NMSE <= 0.01, i.e., Rsq >= 0.99.
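As a concrete illustration of these two measures (a toy sketch in plain Python; the data and the constant model error are hypothetical), using the same biased variance as MATLAB's var(x',1):

```python
# Toy example: targets 0..48 (like a 1x49 series); model output off by a constant 0.1
target = [float(i) for i in range(49)]
output = [t + 0.1 for t in target]

# MSE of the error
mse_err = sum((t - y) ** 2 for t, y in zip(target, output)) / len(target)

# Biased (population) variance of the target, matching var(target',1)
mean_t = sum(target) / len(target)
var_t = sum((t - mean_t) ** 2 for t in target) / len(target)   # 200.0

nmse = mse_err / var_t   # normalized MSE, about 5e-05
rsq = 1 - nmse           # fraction of target variance modeled, about 0.99995
print(nmse, rsq)
```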
Hope this helps.
Greg


More Answers (0)
