How to use Levenberg-Marquardt backprop with GPU?

11 views (last 30 days)
psimeson
psimeson on 13 Dec 2019
Commented: Amanjit Dulai on 17 Jan 2020
Levenberg-Marquardt backprop trains my shallow neural net very efficiently and gives very good results. However, it doesn't seem to support GPU training. Is there a way to implement GPU support for Levenberg-Marquardt backprop?
Thanks

Answers (1)

Joss Knight
Joss Knight on 15 Dec 2019
This isn't supported out of the box yet. You could convert your network to use dlarray and train it with a custom training loop. Then you could write your own LevMarq solver.
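As a minimal sketch of what the Levenberg-Marquardt update itself could look like: assume you have flattened all the weights and biases into a single column vector w and can build the residual vector r and its Jacobian J (for example, one row at a time with dlgradient). The function name and damping scheme below are illustrative only, not an existing API:

function w = iLevenbergMarquardtStep(w, r, J, mu)
% One Levenberg-Marquardt step for minimizing 0.5*sum(r.^2).
%   w  - all network weights and biases flattened into a column vector
%   r  - column vector of residuals, r = Y - T, stacked over outputs and samples
%   J  - Jacobian dr/dw, a numel(r)-by-numel(w) matrix
%   mu - damping factor; a common scheme multiplies mu by 10 when a step
%        increases the error and divides it by 10 when the step is accepted
H = J'*J;                          % Gauss-Newton approximation of the Hessian
g = J'*r;                          % gradient of 0.5*sum(r.^2) with respect to w
dw = -(H + mu*eye(numel(w))) \ g;  % damped Newton-like step
w = w + dw;
end

If w, r and J are gpuArray data, the matrix products and the linear solve above run on the GPU. The hard parts, assembling the Jacobian and adapting mu, are left out of this sketch.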
  10 comments
psimeson
psimeson on 6 Jan 2020
Please keep me in the loop. Thanks a lot.
Amanjit Dulai
Amanjit Dulai on 17 Jan 2020
Here is how you might translate the example above to use some of our newer functionality. In the example below, we train the network you described earlier on a simple regression problem. See the function 'iNetworkForward' for the definition of the network:
% Load the data
[X,T] = simplefit_dataset;
inputSize = size(X,1);
outputSize = size(T,1);

% Split the data into test and training data
rng('default');
testFraction = 0.15;
[XTrain, TTrain, XTest, TTest] = ...
    iSplitIntoTrainAndTestSets(X, T, testFraction);

% Initialize the weights for the network
layerSizes = [10 20 20];
params = iInitializeWeights(inputSize, layerSizes, outputSize);

% Specify the training options
executionEnvironment = "cpu";
velocity = [];
miniBatchSize = 20;
numEpochs = 1000;
numObservations = size(XTrain,2);
numIterationsPerEpoch = floor(numObservations./miniBatchSize);

% Cast the data to dlarray
XTrain = dlarray(XTrain, 'CB');
TTrain = dlarray(TTrain, 'CB');
if executionEnvironment == "gpu"
    XTrain = gpuArray(XTrain);
    TTrain = gpuArray(TTrain);
end

% Train the model
for epoch = 1:numEpochs
    for iteration = 1:numIterationsPerEpoch
        % Get a batch of data.
        indices = (iteration-1)*miniBatchSize+1:iteration*miniBatchSize;
        XBatch = XTrain(:,indices);
        TBatch = TTrain(:,indices);
        % Get the loss and gradients
        [loss, gradients] = dlfeval( ...
            @iNetworkForwardWithLoss, params, XBatch, TBatch );
        % Update the network
        [params, velocity] = sgdmupdate(params, gradients, velocity);
        % Report the loss
        fprintf('Loss: %f\n', extractdata(loss));
    end
end

% Run the network on test data
XTest = dlarray(XTest, 'CB');
YTest = iNetworkForward(params, XTest);
YTest = extractdata(YTest);

% Plot the ground truth and predicted data
plot([TTest' YTest']);

%% Helper functions
% Randomly split the observations (columns) into training and test sets
function [XTrain,TTrain,XTest,TTest] = iSplitIntoTrainAndTestSets( ...
        X, T, testFraction )
    numObservations = size(X,2);
    idx = randperm(numObservations);
    splitIndex = floor(numObservations*testFraction);
    testIdx = idx(1:splitIndex);
    trainIdx = idx((splitIndex+1):end);
    XTrain = X(:,trainIdx);
    TTrain = T(:,trainIdx);
    XTest = X(:,testIdx);
    TTest = T(:,testIdx);
end

% Create the learnable parameters for the four fully connected layers
function params = iInitializeWeights(inputSize, layerSizes, outputSize)
    params = struct;
    params.W1 = dlarray( iGlorot(inputSize, layerSizes(1)) );
    params.b1 = dlarray( zeros([layerSizes(1) 1]) );
    params.W2 = dlarray( iGlorot(layerSizes(1), layerSizes(2)) );
    params.b2 = dlarray( zeros([layerSizes(2) 1]) );
    params.W3 = dlarray( iGlorot(layerSizes(2), layerSizes(3)) );
    params.b3 = dlarray( zeros([layerSizes(3) 1]) );
    params.W4 = dlarray( iGlorot(layerSizes(3), outputSize) );
    params.b4 = dlarray( zeros([outputSize 1]) );
end

% Glorot (Xavier) uniform initialization
function weights = iGlorot(fanIn, fanOut)
    weights = (2*rand([fanOut fanIn])-1) * sqrt(6/(fanIn+fanOut));
end

% Forward pass through the four-layer network
function Y = iNetworkForward(params, X)
    Z1 = fullyconnect(X, params.W1, params.b1);  % 1st fully connected layer
    Z1 = sigmoid(Z1);                            % Logistic sigmoid
    Z2 = fullyconnect(Z1, params.W2, params.b2); % 2nd fully connected layer
    Z2 = exp(-Z2.^2);                            % Radial basis function
    Z3 = fullyconnect(Z2, params.W3, params.b3); % 3rd fully connected layer
    Z3 = sigmoid(Z3);                            % Logistic sigmoid
    Y = fullyconnect(Z3, params.W4, params.b4);  % 4th fully connected layer
end

% Forward pass plus mean squared error loss and its gradients
function [loss, dLossdW] = iNetworkForwardWithLoss(weights, X, T)
    Y = iNetworkForward(weights, X);
    loss = mse(Y, T)/size(T,1);
    dLossdW = dlgradient(loss, weights);
end
There are a few differences between doing things this way and using 'feedforwardnet':
  • As you mentioned, 'feedforwardnet' is trained with Levenberg-Marquardt by default. The example above instead uses stochastic gradient descent with momentum, which is simpler.
  • The example above uses the 'Glorot' weight initializer, which is a more modern technique associated with deep learning. 'feedforwardnet' uses the Nguyen-Widrow method.
  • The example above does not perform any scaling on the data. By default, 'feedforwardnet' rescales the input and target data to the range -1 to 1, which can sometimes help training; a sketch of how to add this to the loop above follows this list.
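If you want that preprocessing in the custom loop above, one option (just a sketch, reusing the variable names from the earlier script) is to rescale with mapminmax on the plain numeric arrays, before the dlarray conversion for the inputs and after extractdata for the predictions:

% Rescale inputs and targets to [-1, 1], as feedforwardnet does by default
[XTrain, xSettings] = mapminmax(XTrain);
[TTrain, tSettings] = mapminmax(TTrain);
% Apply the same input scaling to the test data
XTest = mapminmax('apply', XTest, xSettings);
% After prediction, map the network output back to the original target range
YTest = mapminmax('reverse', YTest, tSettings);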

