(copied from my previous comment to myself...)
I found a solution to this issue in another resource.
The problem comes from negative values in the "state" returned by dlfeval. The original code is:
[gradients,loss,state] = dlfeval(@networkGradients,X,gtBox,gtClass,gtMask,dlnet,params);
dlnet.State = state;
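Keep the last line (dlnet.State = state;) and add the following right after it, so that the batch-normalization variance entries in "dlnet.State" (the rows whose Parameter is "TrainedVariance") are bounded below by eps('single') and can never be zero or negative. If the assignment is removed instead, the snippet below acts on the old state and the newly returned "state" is never stored in the network.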
idx = dlnet.State.Parameter == "TrainedVariance";
boundAwayFromZero = @(X) max(X, eps('single'));
dlnet.State(idx,:) = dlupdate(boundAwayFromZero, dlnet.State(idx,:));
With this change the code runs.
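To see what the clamp itself does, here is a minimal sketch on a plain single-precision array. The values in v are made up for illustration; in the real training loop the function is applied through dlupdate to the "TrainedVariance" rows of dlnet.State, as shown above.

```matlab
% Hypothetical variance values, including a zero and a negative entry.
v = single([0.25, 0, -1e-3, 4]);

% Same anonymous function as in the fix: anything below eps('single')
% is raised to eps('single'); larger values pass through unchanged.
boundAwayFromZero = @(X) max(X, eps('single'));

vClamped = boundAwayFromZero(v);
disp(vClamped)

% After clamping, every entry is strictly positive.
allPositive = all(vClamped > 0);
```

Only the zero and negative entries change, so well-behaved variance values are left as they were.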
I am now facing another problem, though. Training takes a very long time (days), probably because the network is so large. I thought my GPU would be good enough, but even a mini-batch size of 2 needs more GPU memory than I have. For now, I can only run the training on the CPU.
My GPU is as follows:
Name: 'GeForce GTX 1080'
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
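The listing above does not show memory, which is the figure that matters here. A minimal sketch for checking it before choosing where to train, assuming Parallel Computing Toolbox is installed and a release that has canUseGPU (the variable name env is just an illustration):

```matlab
% Pick an execution environment based on whether a usable GPU is present.
if canUseGPU
    g = gpuDevice;  % currently selected GPU
    % TotalMemory and AvailableMemory are reported in bytes.
    fprintf('%s: %.2f GB available of %.2f GB total\n', ...
        g.Name, g.AvailableMemory/2^30, g.TotalMemory/2^30);
    env = "gpu";
else
    env = "cpu";
end
```

Comparing AvailableMemory before and after a single forward pass with one image should give a rough idea of whether any mini-batch size can fit on the device.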
I hope this information helps those who want to train their own Mask R-CNN in MATLAB.