Not able to calculate gradient of loss function in a neural network program

Question

Dr. Veerababu Dharanalakota el 8 de Abr. de 2023

0
Enlazar

Enlace directo a esta pregunta

https://es.mathworks.com/matlabcentral/answers/1943639-not-able-to-calculate-gradient-of-loss-function-in-a-neural-network-program

Comentada: Jast el 4 de En. de 2024

Respuesta aceptada: Richard

Abrir en MATLAB Online

Hi,

I am trying to solve a phisics-informed neural network problem in which I constructed a loss function as follows

function [loss,gradients] = loss_fun(parameters,x,C,alpha)
    % C is a complex-valued constant
    % alpha is a real-valued constant
    NN = model(parameters,x);                       % Feedforward neural network 
    f = C*NN;                                       % Intermediate function
    
    g = fxx+alpha*f;                                % Objective function
    
    gr = real(g);                                   % Real-part of g
    gi = imag(g);                                   % Imaginary-part of g
    
    zeroTarget_r = zeros(size(gr),"like",gr);       % Zero targets for the real-part    
    loss_r = l2loss(gr, zeroTarget_r);              % Real-part loss function
    
    zeroTarget_i = zeros(size(gi),"like",gi);       % Zero targets for the imaginary-part
    loss_i = l2loss(gi, zeroTarget_i);              % Imaginary-part loss function
    
    loss = loss_r+loss_i;                           % Total loss function (real-valued)
    
    gradients = dlgradient(loss,parameters);        % Loss function gradients with respect to parameters
end

The function 'model' returns a feedforward neural network

. I would like the minimize the function g with respect to the parameters (θ). The input variable x as well as the parameters θ of the neural network are real-valued. Here,

which is a double derivative of f with respect to x, is calculated as

. The presence of complex-valued constant C makes the objective function g a complex-valued. Hence, I split it into real and imaginary parts, calculated individual loss functions and added them.

While calculating the gradients I am encountering the following error

"Encountered complex value when computing gradient with respect to an input to fullyconnect. Convert all inputs to fullyconnect to real".

I checked indivial loss values and the parameter values. They are purely real.

I would be grateful to you if you could tell possible reasons for the error and resolution steps.

I am using fmincon with lbfgs hessian approximation for the optimization.

2 comentarios
Mostrar NingunoOcultar Ninguno

Richard el 17 de Mayo de 2023

I posted an answer regarding the complex value issue, but as an aside, you might be interested in the lbfgsupdate function which was recently added to Deep Learning Toolbox in R2023a.

Dr. Veerababu Dharanalakota el 18 de Mayo de 2023

Thank you, Richard. Addition of lbfgsupdate function is a great help for the researchers working on physics-informed neural networks.

Iniciar sesión para comentar.

Iniciar sesión para responder a esta pregunta.

Answer 1

Richard el 17 de Mayo de 2023

0
Enlazar

Enlace directo a esta respuesta

https://es.mathworks.com/matlabcentral/answers/1943639-not-able-to-calculate-gradient-of-loss-function-in-a-neural-network-program#answer_1238034

Abrir en MATLAB Online

I think this may be due to your introduction of the complex value into the output of the model, NN. Even though you are later splitting this into two real halves, the gradient backwards computation will be stepping back through this (complex) C*(real) NN operation which reintroduces a complex gradient during the backwards.

Try calling NN = real(NN) before this step to insulate the real-valued model from the complex part of the calculation:

    NN = model(parameters,x);                       % Feedforward neural network 
    NN = real(NN);
    f = C*NN;                                       % Intermediate function

It may seem counter-intuitive to apply this before the complex values are created, and indeed in the forwards computation this will have no effect because NN is already real. But in the backwards pass for gradients the computation flows in the other direction through the code, and the backwards for real(NN) will be after the backwards for C*NN. It will discard the imaginary parts of the gradient, which at this point have no meaning because there is no imaginary part of the NN value.

4 comentarios
Mostrar 2 comentarios más antiguosOcultar 2 comentarios más antiguos

Richard el 18 de Mayo de 2023

Abrir en MATLAB Online

Thanks for the code, I do reproduce the same issue. The reason that one call to real(U) did not fix it is that the dlgradient calls for Ux and Uxx are creating additional backward passes through U after your real(U) call. This means that they are before the real(U) when the final backwards pass is performed in the last dlgradient call.

The solution is to move/add more real(...) "assertions" so that they cover the Uxx output as well. You can either do this by adding a Uxx=real(Uxx) call after the dlgradient line that creates Uxx, or you can perform the real() call after U and Uxx are combined together, i.e. on the f variable, right before it is multiplied by the complex constant:

function [loss,gradients] = modelLoss2(net,X,X0,U0,k)
C = 2+3j;
% Make predictions with the initial conditions.
U = forward(net,X);
% Calculate derivatives with respect to X.
Ux = dlgradient(sum(U,"all"),X,EnableHigherDerivatives=true);
% Calculate second-order derivatives with respect to X.
Uxx = dlgradient(sum(Ux,"all"),X,EnableHigherDerivatives=true);
% Calculate mseF. Enforce Helmholtz equation.
f = Uxx + k^2*U;
% Enforce that f is real during forwards and backwards calculations
f = real(f);
f = f*C;
f_r = real(f);
f_i = imag(f);
zeroTarget_r = zeros(size(f_r),"like",f_r);
loss_r = l2loss(f_r,zeroTarget_r);
zeroTarget_i = zeros(size(f_i),"like",f_i);
loss_i = l2loss(f_i,zeroTarget_i);
U0Pred = forward(net,X0);
loss_b = l2loss(U0Pred,U0);
loss = loss_r + loss_i + loss_b;
% Calculate gradients with respect to the learnable parameters.
gradients = dlgradient(loss,net.Learnables);
end

Dr. Veerababu Dharanalakota el 19 de Mayo de 2023

Thank you, Richard. This resolved the issue.

Jast el 4 de En. de 2024

This answer is really appreciated. Thanks so much! It helped me during a late night debugging session!!!

Iniciar sesión para comentar.

Answer 2

Kartik el 17 de Mayo de 2023

0
Enlazar

Enlace directo a esta respuesta

https://es.mathworks.com/matlabcentral/answers/1943639-not-able-to-calculate-gradient-of-loss-function-in-a-neural-network-program#answer_1237889

Abrir en MATLAB Online

Hi,

The error message suggests that there is a complex value in the input to the fully connected layer of your neural network model. This could be due to the fact that the output of the intermediate function "f" includes a complex constant "C" multiplied by the neural network output "NN". If "C" is complex, then "f" will be complex-valued as well, and the subsequent computations involving "f" may introduce complex values.

To resolve this error and perform backpropagation through your neural network, you need to ensure that all inputs to the neural network are real-valued. One way to do this would be to separate the real and imaginary parts of the complex input to the fully connected layer, and pass them separately as inputs. You can do this by using the "real" and "imag" functions to extract the real and imaginary parts of "f" separately:

NN = model(parameters,x);                              % Feedforward neural network 
f = C*NN;                                              % Intermediate function
f_real = real(f);                                      % Real part of f
f_imag = imag(f);                                      % Imaginary part of f
fc_in = [f_real; f_imag];                               % Concatenate f_real and f_imag
fc_out = fullyconnect(fc_in, weights_fc, bias_fc);      % Fully connected layer output

Here, the "fc_in" matrix is formed by concatenating the real and imaginary parts of "f", and then passed to the fully connected layer.

Refer to the following MathWorks documentation for more information:

https://www.mathworks.com/help/deeplearning/ug/train-network-with-complex-valued-data.html?searchHighlight=complex%20valued%20neural%20networks&s_tid=srchtitle_complex%2520valued%2520neural%2520networks_3

3 comentarios
Mostrar 1 comentario más antiguoOcultar 1 comentario más antiguo

Kartik el 18 de Mayo de 2023

Yes, that can be a possible work around, if we seperate the real and imaginary parts and perform all the other calculations like loss calculation and gradient descent on them seperately.

Dr. Veerababu Dharanalakota el 19 de Mayo de 2023

Okay. Splitting real and imaginary parts results in several loss functions which are to be optimized simultaneously. It may pose a problem during training. But, I will give a try and get back to you.

Iniciar sesión para comentar.

Not able to calculate gradient of loss function in a neural network program

2 comentarios
Mostrar NingunoOcultar Ninguno

Respuesta aceptada

4 comentarios
Mostrar 2 comentarios más antiguosOcultar 2 comentarios más antiguos

Más respuestas (1)

3 comentarios
Mostrar 1 comentario más antiguoOcultar 1 comentario más antiguo

Ver también

Categorías

Etiquetas

Productos

Versión

Community Treasure Hunt

Not able to calculate gradient of loss function in a neural network program

2 comentarios Mostrar NingunoOcultar Ninguno

Respuesta aceptada

4 comentarios Mostrar 2 comentarios más antiguosOcultar 2 comentarios más antiguos

Más respuestas (1)

3 comentarios Mostrar 1 comentario más antiguoOcultar 1 comentario más antiguo

Ver también

Categorías

Etiquetas

Productos

Versión

Community Treasure Hunt

2 comentarios
Mostrar NingunoOcultar Ninguno

4 comentarios
Mostrar 2 comentarios más antiguosOcultar 2 comentarios más antiguos

3 comentarios
Mostrar 1 comentario más antiguoOcultar 1 comentario más antiguo