Why Are Hidden State and Cell State Vectors Zero After Training an LSTM Model with trainNetwork Functionality?

Question

Shubham Baisthakur on 10 Oct 2023

Direct link to this question:

https://es.mathworks.com/matlabcentral/answers/2031634-why-are-hidden-state-and-cell-state-vectors-zero-after-training-an-lstm-model-with-trainnetwork-func

Commented: Shubham Baisthakur on 20 Oct 2023

I am training an LSTM model using trainNetwork, and the following is the architecture of my model:

    layers = [ ...
        sequenceInputLayer(size(X_train{1},1))
        layerNormalizationLayer
        lstmLayer(x.num_hidden_units,'OutputMode','sequence')
        fullyConnectedLayer(x.num_layers_ffnn)
        dropoutLayer(0.1)
        fullyConnectedLayer(1)
        regressionLayer];
    

And I am training this using the following command:

    options = trainingOptions('adam', ...
        'MaxEpochs', 75, ...
        'MiniBatchSize', x.batch_size, ...
        'SequenceLength', 'longest', ...
        'Shuffle', 'once', ...
        'L2Regularization', 0.01, ...
        'ValidationData', {X_val,Y_val}, ...
        'ValidationFrequency', 10, ...
        'Verbose', false, ...
        'ExecutionEnvironment', 'multi-gpu');
    % Train the LSTM network
    net = trainNetwork(X_train, Y_train, layers, options);
    

After training the model, the hidden state and cell state of the LSTM layer are both vectors of zeros. Why is this happening? I expected these vectors to have non-zero values, so that the long-term dependency between the input and output parameters is captured.


Answer 1

Neha on 20 Oct 2023

Direct link to this answer:

https://es.mathworks.com/matlabcentral/answers/2031634-why-are-hidden-state-and-cell-state-vectors-zero-after-training-an-lstm-model-with-trainnetwork-func#answer_1337381

Hi Shubham,

The LSTM (long short-term memory) layer is designed to remember values over arbitrary time intervals, which helps it learn long-term dependencies. However, the hidden and cell states are not learnable parameters: they are activations computed as data passes through the layer, and after training they are reset to zero. What the layer learns is stored in its weights and biases (the InputWeights, RecurrentWeights, and Bias properties). This is standard behavior for LSTMs, and it does not mean that the LSTM layer has not learned anything or that it is not working properly.
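To see where the learned information lives, it can help to inspect the LSTM layer of a trained network. The following is a minimal sketch on synthetic data, not the model from the question: the data, the sizes (numFeatures, numHidden), and the short CPU training run are assumptions chosen only so the example runs quickly. It compares the layer's state properties with its learned weights.

```matlab
% Synthetic data (hypothetical): 20 sequences, 3 features, 15 time steps each
rng(0);
numFeatures = 3;
numHidden = 8;
X = cell(20,1);
Y = cell(20,1);
for i = 1:20
    X{i} = randn(numFeatures,15);
    Y{i} = cumsum(sum(X{i},1));   % 1-by-15 target that depends on past inputs
end

% Small sequence-to-sequence regression network
layers = [ ...
    sequenceInputLayer(numFeatures)
    lstmLayer(numHidden,'OutputMode','sequence')
    fullyConnectedLayer(1)
    regressionLayer];
options = trainingOptions('adam', ...
    'MaxEpochs',5, ...
    'Verbose',false, ...
    'ExecutionEnvironment','cpu');
net = trainNetwork(X,Y,layers,options);

% The LSTM layer is the second layer of this small network
lstm = net.Layers(2);

% State properties: the initial state used when a new sequence starts
stateIsZero = all(lstm.HiddenState == 0) && all(lstm.CellState == 0);

% Learned parameters: this is where training leaves its mark
weightsAreNonzero = any(lstm.InputWeights(:) ~= 0) && ...
    any(lstm.RecurrentWeights(:) ~= 0);

disp(stateIsZero)
disp(weightsAreNonzero)
disp(size(lstm.InputWeights))       % 4*numHidden-by-numFeatures
disp(size(lstm.RecurrentWeights))   % 4*numHidden-by-numHidden
```

In the network from the question, the LSTM layer is the third layer, so the equivalent inspection would use net.Layers(3).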

If you want to carry the LSTM state from one call to the next (for example, in time series prediction, where the model should remember the state from the previous sequence), see the explanation of open loop forecasting and closed loop forecasting in the following documentation:

https://www.mathworks.com/help/deeplearning/ug/time-series-forecasting-using-deep-learning.html

That example uses the "predictAndUpdateState" function, which returns the network with its state updated after each call, so the state can be carried forward from one time step to the next.
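As a rough illustration of how the state is carried between calls, here is a minimal sketch using a network with several input features. The synthetic data, the sizes, and the variable names (Xnew, Ychunked, and so on) are assumptions made for the example, not taken from the question. predictAndUpdateState accepts a numFeatures-by-numTimeSteps matrix, so the input does not have to be a single signal.

```matlab
% Train a small network on synthetic multivariate sequences (hypothetical data)
rng(0);
numFeatures = 3;
numHidden = 8;
X = cell(20,1);
Y = cell(20,1);
for i = 1:20
    X{i} = randn(numFeatures,15);
    Y{i} = cumsum(sum(X{i},1));
end
layers = [ ...
    sequenceInputLayer(numFeatures)
    lstmLayer(numHidden,'OutputMode','sequence')
    fullyConnectedLayer(1)
    regressionLayer];
options = trainingOptions('adam', ...
    'MaxEpochs',5, ...
    'Verbose',false, ...
    'ExecutionEnvironment','cpu');
net = trainNetwork(X,Y,layers,options);

% A new sequence with 3 features and 10 time steps
Xnew = randn(numFeatures,10);

% Process the sequence in two chunks, carrying the state between calls
[net1,Y1] = predictAndUpdateState(net,Xnew(:,1:5));
[net2,Y2] = predictAndUpdateState(net1,Xnew(:,6:10));
Ychunked = [Y1 Y2];

% Process the same sequence in one call, starting from the initial state
Yfull = predict(net,Xnew);

% The two results should agree up to floating-point error
maxDiff = max(abs(Ychunked - Yfull));
disp(maxDiff)

% State of the LSTM layer after the updates
disp(net2.Layers(2).HiddenState')

% Return to the initial state before starting an unrelated sequence
netReset = resetState(net2);
```

The original net is left unchanged by these calls, because predictAndUpdateState returns an updated copy. Calling resetState before an unrelated sequence keeps the state of one sequence from leaking into the next.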

Hope this helps!

1 comment

Shubham Baisthakur on 20 Oct 2023

Thanks, Neha! Regarding the 'predictAndUpdateState' function you mentioned, is it applicable to LSTM networks with multivariate input features? The example in the linked documentation uses the previous time steps of a signal to predict its future steps, which is not the kind of problem I am working on.
