MATLAB Answers

Error executing of the example code for training a custom Mask R-CNN using cocodataset 2014

17 views (last 30 days)
I followed the instructions in "Instance Segmentation Using Mask R-CNN Deep Learning" (ref[1]).
All the code worked perfectly until the last section "Train network" (ref[2]).
iteration = 1;
start = tic;
% Create subplots for the learning rate and mini-batch loss
fig = figure;
[lossPlotter] = helper.configureTrainingProgressPlotter(fig);
% Initialize verbose output
helper.initializeVerboseOutput([]);
% Custom training loop
for epoch = 1:numEpochs
reset(mbqTrain)
shuffle(mbqTrain)
while hasdata(mbqTrain)
% Get next batch from minibatchqueue
[X,gtBox,gtClass,gtMask] = next(mbqTrain);
% Evaluate the model gradients and loss using dlfeval
[gradients,loss,state] = dlfeval(@networkGradients,X,gtBox,gtClass,gtMask,dlnet,params);
dlnet.State = state;
% Compute the learning rate for the current iteration
learnRate = initialLearnRate/(1 + decay*iteration);
if(~isempty(gradients) && ~isempty(loss))
[dlnet.Learnables,velocity] = sgdmupdate(dlnet.Learnables,gradients,velocity,learnRate,momentum);
else
continue;
end
helper.displayVerboseOutputEveryEpoch(start,learnRate,epoch,iteration,loss);
% Plot loss/accuracy metric
D = duration(0,0,toc(start),'Format','hh:mm:ss');
addpoints(lossPlotter,numdetectMaskRCNN,Iteration,double(gather(extractdata(loss))))
subplot(2,1,2)
title(strcat("Epoch: ",num2str(epoch),", Elapsed: "+string(D)))
drawnow
iteration = iteration + 1;
end
end
net = dlnet;
% Save the trained network
modelDateTime = string(datetime('now','Format',"yyyy-MM-dd-HH-mm-ss"));
save(strcat("trainedMaskRCNN-",modelDateTime,"-Epoch-",num2str(numEpochs),".mat"),'net');
First, there is no "numdetectMaskRCNN" predefined.
I simply deleted it and reexecuted the section. It then showes the following error:
Error using nnet.internal.cnn.dlnetwork/forward (line 239)
Layer 'bn2a_branch2a': Invalid input data. The value of 'Variance' is invalid. Expected input to be positive.
Error in nnet.internal.cnn.dlnetwork/CodegenOptimizationStrategy/propagateWithFallback (line 122)
[varargout{1:nargout}] = fcn(net, X, layerIndices, layerOutputIndices);
Error in nnet.internal.cnn.dlnetwork/CodegenOptimizationStrategy/forward (line 62)
[varargout{1:nargout}] = propagateWithFallback(strategy, functionSlot, @forward, net, X, layerIndices, layerOutputIndices);
Error in nnet.internal.cnn.dlnetwork/DefaultOptimizationStrategy/propagate (line 143)
[varargout{1:nargout}] = inferenceMethod(strategy.CodegenStrategyOriginal,...
Error in nnet.internal.cnn.dlnetwork/DefaultOptimizationStrategy/forward (line 77)
[varargout{1:nargout}] = propagate(strategy, net, X, ...
Error in dlnetwork/forward (line 503)
[varargout{1:nargout}] = strategy.forward(net.PrivateNetwork, x, layerIndices, layerOutputIndices);
Error in networkGradients (line 21)
[YRPNRegDeltas, proposal, YRCNNClass, YRCNNReg, YRPNClass, YMask, state] = forward(...
Error in deep.internal.dlfeval (line 18)
[varargout{1:nout}] = fun(x{:});
Error in dlfeval (line 41)
[varargout{1:nout}] = deep.internal.dlfeval(fun,varargin{:});
I am wondering if there is anything I misunderstood so that the code doesn't work for me.
It will be of great help if this could be figured out or fixed. Thank you!
  3 Comments

Sign in to comment.

Accepted Answer

Yi-Ping Hsueh
Yi-Ping Hsueh on 29 Mar 2021
(copied from my previous comment to myself...)
I figured out a solution to this issue from other resource.
The problem comes from the negative value returned by "state". The original code is as below:
[gradients,loss,state] = dlfeval(@networkGradients,X,gtBox,gtClass,gtMask,dlnet,params);
dlnet.State = state;
Replace the last line (dlnet.State = state;) with the followings to ensure that all values assigned to "dlnet.State" are positive.
idx = dlnet.State.Parameter == "TrainedVariance";
boundAwayFromZero = @(X) max(X, eps('single'));
dlnet.State(idx,:) = dlupdate(boundAwayFromZero, dlnet.State(idx,:));
This will make the code work then.
But then I am now facing another problem. The training process takes so much time (days), probably because the network is really huge. I thought my GPU should be good enough but it turns out that even setting the mini-batch size to 2 requires more memory on GPU than what I have. For now, only cpu is capable of performing such computation.
My GPU is as follows:
Name: 'GeForce GTX 1080'
Index: 1
ComputeCapability: '6.1'
SupportsDouble: 1
DriverVersion: 11.2000
ToolkitVersion: 11
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
SIMDWidth: 32
TotalMemory: 8.5899e+09
AvailableMemory: 7.4505e+09
MultiprocessorCount: 20
ClockRateKHz: 1771000
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 1
Hope this information helps those who want to train their own mask R-CNN on MATLAB.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by