About the network I'm trying to learn: there are 1039 input nodes. I use one hidden layer and one output layer, with two biases (one for the hidden layer, one for the output layer) and one weight connecting the hidden layer to the output. Thus I'm trying to learn 1042 weights. I use the tangent sigmoid (tansig) as my hidden-layer transfer function and the log sigmoid (logsig) as my output-layer transfer function. There are only two classes to classify into.
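For reference, a minimal sketch of how the architecture described above might be set up with the Neural Network Toolbox; the variable names and the choice H = 1 are assumptions inferred from the stated weight count, not taken from the post:

```matlab
% Sketch only: assumes H = 1 hidden unit, which is consistent with the
% 1042 weights quoted above (1039 input weights + 1 hidden bias
% + 1 hidden-to-output weight + 1 output bias).
H = 1;
net = feedforwardnet(H, 'trainscg');   % train with scaled conjugate gradient
net.layers{1}.transferFcn = 'tansig';  % hidden-layer transfer function
net.layers{2}.transferFcn = 'logsig';  % output-layer transfer function
net.performFcn = 'mse';                % performance function
net.divideFcn  = 'dividerand';         % random train/val/test split
```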
Scaled Conjugate Gradient - NN toolbox
9 views (last 30 days)
Pooja Narayan
on 12 Aug 2014
Answered: saba momeni on 1 Feb 2019
Hi,
I have used MATLAB's 'trainscg' and NETLAB's 'scg', both with 'mse' as the performance function, on the same training data set, yet I still do not obtain the same generalisation on my other data files.
I have used the same Nguyen-Widrow initialisation method for the weights and biases, and the same 'dividerand' method to split the data into training, validation and testing sets.
I know the difference could lie in the various parameter settings. In the original paper (http://www.sciencedirect.com/science/article/pii/S0893608005800565), the lambda values are specified not as exact values but as inequalities; I have used values that do not violate the constraints laid down by the author.
Also, one thing that seems a bit bizarre to me: MATLAB stops the learning in just 23 epochs, but NETLAB runs until it exceeds the maximum number of iterations. I understand the stopping criteria may differ.
Has anyone here worked with both of these toolboxes and found a way of obtaining the same results from both? I would appreciate general ideas and tips for making NETLAB's SCG give results similar to MATLAB's TRAINSCG.
Any help or advice will be greatly appreciated.
Thank you. Pooja
Accepted Answer
Greg Heath
on 12 Aug 2014
Your description is incorrect and confusing.
[I N] = size(input)   % = ?
[O N] = size(target)  % = ?
Ntrn = ? % Matlab default = N-2*round(0.15*N)
Ntrneq = Ntrn*O % Number of training equations
For an I-H-O net, the number of unknown weights to be estimated is
Nw = (I+1)*H+(H+1)*O % The "1s" are for biases
To prevent overfitting, choose H so that Ntrneq >= Nw.
To prevent non-robustness w.r.t. noise and interference, choose Ntrneq >> Nw.
Otherwise use regularization (trainbr or msereg) or validation subset stopping.
Nw can be lowered by removing input and/or hidden nodes.
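Applied to the numbers in the question (assuming I = 1039, O = 1, and H = 1, which is the choice that reproduces the 1042 weights quoted there):

```matlab
I = 1039; H = 1; O = 1;    % assumed from the question, not stated by Greg
Nw = (I+1)*H + (H+1)*O     % = 1040*1 + 2*1 = 1042
```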
I assume you mean you have 1039 input NODES. I doubt that you need that many. You should probably use input-variable reduction (e.g., help PLSregress) to obtain a more reasonable number.
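A hypothetical sketch of such a reduction with plsregress (Statistics and Machine Learning Toolbox); the variable names and the number of components, ncomp = 20, are assumptions for illustration:

```matlab
% inputs is I-by-N and targets is O-by-N, as in the size() calls above;
% plsregress expects observations in rows, hence the transposes.
ncomp = 20;                                % assumed; tune using PCTVAR
[XL, YL, XS, YS, BETA, PCTVAR] = plsregress(inputs', targets', ncomp);
reducedInputs = XS';                       % ncomp-by-N scores to feed the net
```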
Need to know N, Ntrn and H. Need to reduce I.
Hope this helps.
Thank you for formally accepting my answer
Greg
More Answers (1)
saba momeni
on 1 Feb 2019
Hi everyone
I am training my feedforward neural network with scaled conjugate gradient.
I am not sure whether scaled conjugate gradient does its optimization in batch mode or with mini-batch training.
I only specify the lambda and the sigma for it, not a batch size.
I appreciate your answer.
Cheers
S