Hi Syed,
To address the problem of weights and biases not updating during training, we need to make sure the gradients are actually being traced and that the parameter updates are applied correctly. The most common cause in this pattern is computing the loss outside dlfeval, which prevents dlgradient from tracing the forward pass. Let's make the necessary adjustments to the code:
% Initialize the Adam optimizer state
trailingAvgR = [];
trailingAvgSqR = [];

% Define learning parameters
miniBatchSize = 600;
learnRate = 0.01;
numEpochs = 200;

% Define the neural network layers
layers = [
    % Your network layers here
];
net = dlnetwork(layers);

% Assuming 'dataMat' is your input data
dldata = arrayDatastore(dataMat, 'IterationDimension', 3);
mbq = minibatchqueue(dldata, 'MiniBatchSize', miniBatchSize, ...
    'OutputEnvironment', 'cpu');

iteration = 0;
for epoch = 1:numEpochs
    shuffle(mbq);
    while hasdata(mbq)
        iteration = iteration + 1;
        XTrain = next(mbq);
        XTrain = dlarray(XTrain, 'TBC');

        % Evaluate the loss and gradients inside dlfeval so that
        % dlgradient can trace the forward pass
        [lossR, gradientsR, datafromRNN] = dlfeval(@modelLoss, net, XTrain);

        % Update the network parameters with the Adam optimizer
        [net, trailingAvgR, trailingAvgSqR] = adamupdate(net, gradientsR, ...
            trailingAvgR, trailingAvgSqR, iteration, learnRate);

        disp(['Iteration ', num2str(iteration), ...
            ', Loss: ', num2str(extractdata(lossR))]);
    end
end

function [loss, gradients, datafromRNN] = modelLoss(net, data)
    % Forward pass: request the intermediate encoder output and the
    % final reconstruction (layer names taken from your original code)
    [datafromRNN, last] = forward(net, data, 'Outputs', {'maxpool1d_2', 'fc_3'});

    % Reconstruction loss against the input
    loss = mse(last, data);

    % Gradients of the loss with respect to the learnable parameters
    gradients = dlgradient(loss, net.Learnables);
end
The key change is that the forward pass and the dlgradient call now both happen inside the function evaluated by dlfeval. In the original code the loss was computed outside dlfeval, so dlgradient had no trace of the operations that produced it, and adamupdate received no usable gradients. With this fix, the weights and biases of the network should change between iterations, leading to effective training progress.
Now let’s answer your questions.
Why are the weights and biases not updating, and why do the biases remain zero? Biases are commonly initialized to zero, so if they stay exactly zero after many iterations, the most likely explanation is that no update is reaching them at all, i.e., the gradients passed to adamupdate are empty or zero. The dlfeval pattern above addresses this. If the parameters do update but only by tiny amounts, also check that the learning rate is not too low. A quick sanity check is shown below.
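As a sanity check (a minimal sketch, assuming the training loop above), you can snapshot one learnable parameter before and after adamupdate and confirm that it changes:

% Inside the training loop (sketch): snapshot one parameter before updating
before = extractdata(net.Learnables.Value{1});
[net, trailingAvgR, trailingAvgSqR] = adamupdate(net, gradientsR, ...
    trailingAvgR, trailingAvgSqR, iteration, learnRate);
after = extractdata(net.Learnables.Value{1});
fprintf('Max change in first parameter: %g\n', max(abs(after(:) - before(:))));

If this prints 0 on every iteration, the gradients themselves are the problem rather than the optimizer settings.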
How can I ensure that the gradients computed are correct and being applied effectively?
To verify the correctness of computed gradients and their effective application, you can employ various techniques:
- Gradient Checking: Implement a numerical (finite-difference) check and compare the result against the computed gradients. Discrepancies indicate an issue in the gradient computation (see the sketch after this list).
- Visualizing Gradients: Plot the gradients or their per-layer norms to confirm they have reasonable, non-vanishing magnitudes.
- Debugging the Gradient Function: Review the loss function (modelLoss above) to confirm that dlgradient is called on the traced loss with respect to net.Learnables.
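Here is a minimal finite-difference check on a single parameter element (a sketch, assuming the modelLoss function defined above; the row index idx and perturbation epsilon are illustrative):

% Finite-difference check on one parameter element (sketch)
epsilon = 1e-4;
idx = 1;  % row of net.Learnables to probe (illustrative)

W = net.Learnables.Value{idx};
delta = zeros(size(W), 'like', W);
delta(1) = epsilon;

netPlus = net;
netPlus.Learnables.Value{idx} = W + delta;
netMinus = net;
netMinus.Learnables.Value{idx} = W - delta;

lossPlus  = dlfeval(@modelLoss, netPlus, XTrain);
lossMinus = dlfeval(@modelLoss, netMinus, XTrain);
numericalGrad = (extractdata(lossPlus) - extractdata(lossMinus)) / (2*epsilon);

[~, gradientsR] = dlfeval(@modelLoss, net, XTrain);
g = extractdata(gradientsR.Value{idx});
fprintf('numerical: %g, analytic: %g\n', numericalGrad, g(1));

The two values should agree to several significant digits; a large mismatch points to a bug in the loss or gradient computation.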
Are there any specific settings or modifications I should consider to resolve this issue?
To enhance the training process and address the issues at hand, consider the following settings and modifications:
- Learning Rate Adjustment: Experiment with different learning rates to find a value that produces meaningful weight and bias updates without causing instability.
- Regularization Techniques: Introduce L1 or L2 regularization to prevent overfitting and encourage smoother weight updates (a sketch is shown after this list).
- Batch Normalization: Verify the implementation of any batch normalization layers, as they stabilize training and improve gradient flow.
- Network Architecture: Evaluate the complexity and design of your neural network to confirm it is suitable for the task and does not impede effective weight updates.
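For example, L2 regularization can be folded directly into the loss function (a minimal sketch based on the modelLoss function above; the l2Factor value is illustrative and should be tuned):

function [loss, gradients, datafromRNN] = modelLossL2(net, data)
    [datafromRNN, last] = forward(net, data, 'Outputs', {'maxpool1d_2', 'fc_3'});
    loss = mse(last, data);

    % Add an L2 penalty over all learnable parameters (l2Factor is illustrative)
    l2Factor = 1e-4;
    for i = 1:height(net.Learnables)
        W = net.Learnables.Value{i};
        loss = loss + l2Factor * sum(W.^2, 'all');
    end

    gradients = dlgradient(loss, net.Learnables);
end

Because the penalty is added before dlgradient is called, it is included in the traced computation and shrinks the weights a little on every update.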
I hope this will help resolve your issues.