Why is the CNN predict function faster when running a set of images from ImageDataStore compared to running each image individually?

I am trying this example code
One thing I notice is that running the classify function with the imageDatastore (imdsValidation) is much faster than running one image multiple times. Is this some sort of batch processing that uses MATLAB vectorization to speed things up, or is it inherent to a CNN?
YPred = classify(net,imdsValidation);
And a related question: when using codegen or cnncodegen, can the resulting C++ code also run multiple images like this? I cannot see a way to do it with the generated C++ code.

Answers (2)

Vineet Joshi on 20 Jul 2021
Hi
As you can see on the documentation page for classify (MiniBatchSize), larger mini-batches can result in faster prediction but require more memory, so the speedup is not something specific to CNNs.
As for your second question, the cnncodegen function only generates code for the network; how you run inference with it is up to you. You can write code that runs inference on the network sequentially and generate C++ code from that, or use other techniques such as multiple workers and parallel computing to make it faster in a batch setting.
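To see the effect of the mini-batch size directly, something like the following could be tried. It is only a sketch: it assumes net and imdsValidation already exist in the workspace as in the question, and the sizes 1 and 128 are arbitrary values chosen for illustration.

```matlab
% Sketch: time classify at two mini-batch sizes.
% Assumes `net` and `imdsValidation` are already in the workspace.
classify(net, imdsValidation, 'MiniBatchSize', 128);   % warm-up call, not timed

tic
YPredOne = classify(net, imdsValidation, 'MiniBatchSize', 1);
tOne = toc;

tic
YPredBatch = classify(net, imdsValidation, 'MiniBatchSize', 128);
tBatch = toc;

fprintf('MiniBatchSize 1: %.3f s, MiniBatchSize 128: %.3f s\n', tOne, tBatch);
```

The warm-up call is there so that one-time setup cost is not charged to the first timed run.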
Hope this was helpful.
Thanks
  1 comment
Eric Louchard on 20 Jul 2021
Thanks for the reply! I have been doing some more experiments and found that passing a 4D array of images takes advantage of some parallel processing in MATLAB. I have another question once I describe the findings.
~~~~~~~~~~~~~~
I stacked 64 images into a 4D datacube and compared running it versus running one image 64 times. The 4D datacube run was around 10x faster.
tic
for loop = 1:64
score = Fastnet_LWIR.predict(imLWIR);
end
toc
tic
score = Fastnet_LWIR.predict(im4D);
toc
Elapsed time is 0.212989 seconds.
Elapsed time is 0.023708 seconds.
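For anyone who wants to reproduce the comparison, a slightly more careful version might look like the sketch below. It assumes Fastnet_LWIR is the trained network and imLWIR is a single 64x64x1 image; repeating the same image with repmat is only a stand-in for a real batch of different frames.

```matlab
% Sketch: loop of single-image predictions versus one batched call.
% Assumes `Fastnet_LWIR` (trained network) and `imLWIR` (one 64x64x1 image)
% are already in the workspace.
numImages = 64;
im4D = repmat(imLWIR, 1, 1, 1, numImages);   % height x width x channels x images

predict(Fastnet_LWIR, imLWIR);               % warm-up call, not timed

tic
for k = 1:numImages
    scoreSingle = predict(Fastnet_LWIR, imLWIR);
end
tLoop = toc;

tic
scoreBatch = predict(Fastnet_LWIR, im4D, 'MiniBatchSize', numImages);
tBatch = toc;

fprintf('Loop: %.4f s, batch: %.4f s, ratio: %.1fx\n', tLoop, tBatch, tLoop / tBatch);
```

Setting MiniBatchSize to the number of images asks predict to process the whole datacube as one mini-batch rather than splitting it.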
So, knowing this, I tried to make a MEX file using codegen and a simple prediction function, with args {ones(64,64,1,64,'uint8')} for a 64-deep 4D datacube.
cfg = coder.gpuConfig('mex');
cfg.TargetLang = 'C++';
cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn');
codegen -config cfg Fastnet_LWIR_predict -args {ones(64,64,1,64,'uint8')} -report
This resulted in a MEX file that accepts only the 4D datacube as input, not a single frame, but it had the same kind of speed improvement.
However, when trying to make a C++ DLL or lib, I kept getting errors, so I tried cnncodegen instead. It worked, but I think the result makes only a single call to predict and does not do any sort of parallel processing. I tried 'batchsize', 64 in the call to cnncodegen below.
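The prediction function itself is not shown in this thread. A typical entry-point function for this kind of codegen call might look like the sketch below; the MAT-file name 'Fastnet_LWIR.mat' is an assumption, and the real function may differ.

```matlab
function out = Fastnet_LWIR_predict(in) %#codegen
% Sketch of an entry-point function for codegen.
% 'Fastnet_LWIR.mat' is a hypothetical file name for the saved network.

% Load the network once and reuse it on later calls.
persistent net;
if isempty(net)
    net = coder.loadDeepLearningNetwork('Fastnet_LWIR.mat');
end

% The size of `in` is fixed by the -args given to codegen, so a build made
% with ones(64,64,1,64,'uint8') expects a 64-image batch on every call.
out = predict(net, in);
end
```

Because the input size is fixed at code generation time, a build that should also accept a single frame would need its own codegen call with -args {ones(64,64,1,'uint8')}.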
cnncodegen(Fastnet_LWIR,'targetlib','cudnn','ComputeCapability','6.1','targetparams',struct('AutoTuning',true,'DataType','FP32'),'batchsize',64,'codegenonly',1)
~~~~~~~~~~~~~~~~~ now to the question
Is there a way to call cnncodegen to write C++ code and have it work on 4D datacubes, taking advantage of parallel processing?
This is the error I got when trying to use codegen and 'lib'
cfg = coder.gpuConfig('lib');
cfg.TargetLang = 'C++';
cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn');
codegen -config cfg Fastnet_LWIR_predict -args {ones(64,64,1,64,'uint8')} -report
**********************************************************************
** Visual Studio 2017 Developer Command Prompt v15.5.0
** Copyright (c) 2017 Microsoft Corporation
**********************************************************************
[vcvarsall.bat] Environment initialized for: 'x64'
Microsoft (R) Program Maintenance Utility Version 14.12.25830.2
Copyright (C) Microsoft Corporation. All rights reserved.
nvcc -c -Xcompiler "/wd 4819" -Xcompiler "/MD" -rdc=true -Xcudafe "--display_error_number --diag_suppress=unsigned_compare_with_zero" -O3 -arch sm_35 -D MW_CUDA_ARCH=350 -D BUILDING_TEST_CNN_PREDICTOR -D MODEL=test_CNN_predictor -D MODEL=test_CNN_predictor -o "MWElementwiseAffineLayer.obj" "D:\Fastnet_LWIR\codegen\dll\test_CNN_predictor\MWElementwiseAffineLayer.cpp"
MWElementwiseAffineLayer.cpp
C:\EngTools\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.12.25827\include\crtdefs.h(10): fatal error C1083: Cannot open include file: 'corecrt.h': No such file or directory
NMAKE : fatal error U1077: '"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin\nvcc.EXE"' : return code '0x2'
Stop.
The make command returned an error of 2
Error(s) encountered while building "test_CNN_predictor":
### Failed to generate all binary outputs.
------------------------------------------------------------------------
??? Build error: C++ compiler produced errors. See the Build Log for further details.
More information
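A note on the log above: corecrt.h belongs to the Universal C Runtime, which ships with the Windows 10 SDK rather than with the MSVC toolset, so this error usually means the SDK is missing or is not being picked up by the build environment. One way to check the setup from MATLAB is sketched below. It uses coder.checkGpuInstall with the same cuDNN target as above; the particular set of checks enabled here is just one reasonable choice.

```matlab
% Sketch: ask GPU Coder to validate the host build environment.
envCfg = coder.gpuEnvConfig('host');
envCfg.BasicCodegen = 1;          % check that basic CUDA code generation works
envCfg.DeepLibTarget = 'cudnn';   % same deep learning library as in the codegen call
envCfg.DeepCodegen = 1;           % check deep learning code generation
envCfg.Quiet = 0;                 % print the individual check results
results = coder.checkGpuInstall(envCfg);
disp(results)
```

Running mex -setup C++ also shows which C++ compiler MATLAB has selected.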


Eric Louchard on 21 Jul 2021
When I test codegen, I now get this:
clear cfg
cfg = coder.gpuConfig('lib');
cfg.TargetLang = 'C++';
cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn');
codegen -config cfg Fastnet_LWIR_predict -args {ones(64,64,1,'uint8')} -report
Warning: Validation warning(s):
The following macro(s) in the build configuration options were not found in both the declared list of toolchain macros and
standard code-generation macros:
conlibs
If the above macro(s) is not defined at the point when the makefile is invoked, the build may fail.
> In coder.make.ToolchainInfo/validate
In coder.make.invokeBuilder
In coder.make.invokeBuilder
In RTW.genMakefileAndBuild (line 458)
In coder.internal.doCompile
In emcBuildRTW
In emcBuildRTW
In emcGenMakefileAndBuild
In emcGenMakefileAndBuild
In emcBuildTarget
In emlcprivate
In coder.internal.compile
In emlckernel
In emlckernel
In emlckernel
In emlcprivate
In codegen
------------------------------------------------------------------------
**********************************************************************
** Visual Studio 2017 Developer Command Prompt v15.5.0
** Copyright (c) 2017 Microsoft Corporation
**********************************************************************
[vcvarsall.bat] Environment initialized for: 'x64'
Microsoft (R) Program Maintenance Utility Version 14.12.25830.2
Copyright (C) Microsoft Corporation. All rights reserved.
nvcc -c -Xcompiler "/wd 4819" -Xcompiler "/MD" -rdc=true -Xcudafe "--display_error_number --diag_suppress=unsigned_compare_with_zero" -O3 -arch sm_35 -D MW_CUDA_ARCH=350 -D MODEL=Fastnet_LWIR_predict -D MODEL=Fastnet_LWIR_predict -o "MWElementwiseAffineLayer.obj" "D:\Fastnet_LWIR\codegen\lib\Fastnet_LWIR_predict\MWElementwiseAffineLayer.cpp"
MWElementwiseAffineLayer.cpp
nvcc -c -Xcompiler "/wd 4819" -Xcompiler "/MD" -rdc=true -Xcudafe "--display_error_number --diag_suppress=unsigned_compare_with_zero" -O3 -arch sm_35 -D MW_CUDA_ARCH=350 -D MODEL=Fastnet_LWIR_predict -D MODEL=Fastnet_LWIR_predict -o "MWFusedConvReLULayer.obj" "D:\Fastnet_LWIR\codegen\lib\Fastnet_LWIR_predict\MWFusedConvReLULayer.cpp"
MWFusedConvReLULayer.cpp
nvcc -c -Xcompiler "/wd 4819" -Xcompiler "/MD" -rdc=true -Xcudafe "--display_error_number --diag_suppress=unsigned_compare_with_zero" -O3 -arch sm_35 -D MW_CUDA_ARCH=350 -D MODEL=Fastnet_LWIR_predict -D MODEL=Fastnet_LWIR_predict -o "cnn_api.obj" "D:\Fastnet_LWIR\codegen\lib\Fastnet_LWIR_predict\cnn_api.cpp"
cnn_api.cpp
nvcc -c -Xcompiler "/wd 4819" -Xcompiler "/MD" -rdc=true -Xcudafe "--display_error_number --diag_suppress=unsigned_compare_with_zero" -O3 -arch sm_35 -D MW_CUDA_ARCH=350 -D MODEL=Fastnet_LWIR_predict -D MODEL=Fastnet_LWIR_predict -o "MWCNNLayerImpl.obj" "D:\Fastnet_LWIR\codegen\lib\Fastnet_LWIR_predict\MWCNNLayerImpl.cu"
MWCNNLayerImpl.cu
nvcc -c -Xcompiler "/wd 4819" -Xcompiler "/MD" -rdc=true -Xcudafe "--display_error_number --diag_suppress=unsigned_compare_with_zero" -O3 -arch sm_35 -D MW_CUDA_ARCH=350 -D MODEL=Fastnet_LWIR_predict -D MODEL=Fastnet_LWIR_predict -o "MWElementwiseAffineLayerImpl.obj" "D:\Fastnet_LWIR\codegen\lib\Fastnet_LWIR_predict\MWElementwiseAffineLayerImpl.cu"
MWElementwiseAffineLayerImpl.cu
nvcc -c -Xcompiler "/wd 4819" -Xcompiler "/MD" -rdc=true -Xcudafe "--display_error_number --diag_suppress=unsigned_compare_with_zero" -O3 -arch sm_35 -D MW_CUDA_ARCH=350 -D MODEL=Fastnet_LWIR_predict -D MODEL=Fastnet_LWIR_predict -o "MWElementwiseAffineLayerImplKernel.obj" "D:\Fastnet_LWIR\codegen\lib\Fastnet_LWIR_predict\MWElementwiseAffineLayerImplKernel.cu"
MWElementwiseAffineLayerImplKernel.cu
nvcc -c -Xcompiler "/wd 4819" -Xcompiler "/MD" -rdc=true -Xcudafe "--display_error_number --diag_suppress=unsigned_compare_with_zero" -O3 -arch sm_35 -D MW_CUDA_ARCH=350 -D MODEL=Fastnet_LWIR_predict -D MODEL=Fastnet_LWIR_predict -o "MWFusedConvReLULayerImpl.obj" "D:\Fastnet_LWIR\codegen\lib\Fastnet_LWIR_predict\MWFusedConvReLULayerImpl.cu"
MWFusedConvReLULayerImpl.cu
nvcc -c -Xcompiler "/wd 4819" -Xcompiler "/MD" -rdc=true -Xcudafe "--display_error_number --diag_suppress=unsigned_compare_with_zero" -O3 -arch sm_35 -D MW_CUDA_ARCH=350 -D MODEL=Fastnet_LWIR_predict -D MODEL=Fastnet_LWIR_predict -o "MWTargetNetworkImpl.obj" "D:\Fastnet_LWIR\codegen\lib\Fastnet_LWIR_predict\MWTargetNetworkImpl.cu"
MWTargetNetworkImpl.cu
nvcc -c -Xcompiler "/wd 4819" -Xcompiler "/MD" -rdc=true -Xcudafe "--display_error_number --diag_suppress=unsigned_compare_with_zero" -O3 -arch sm_35 -D MW_CUDA_ARCH=350 -D MODEL=Fastnet_LWIR_predict -D MODEL=Fastnet_LWIR_predict -o "Fastnet_LWIR_predict_rtwutil.obj" D:\Fastnet_LWIR\codegen\lib\Fastnet_LWIR_predict\Fastnet_LWIR_predict_rtwutil.cu
Fastnet_LWIR_predict_rtwutil.cu
nvcc -c -Xcompiler "/wd 4819" -Xcompiler "/MD" -rdc=true -Xcudafe "--display_error_number --diag_suppress=unsigned_compare_with_zero" -O3 -arch sm_35 -D MW_CUDA_ARCH=350 -D MODEL=Fastnet_LWIR_predict -D MODEL=Fastnet_LWIR_predict -o "Fastnet_LWIR_predict_data.obj" D:\Fastnet_LWIR\codegen\lib\Fastnet_LWIR_predict\Fastnet_LWIR_predict_data.cu
Fastnet_LWIR_predict_data.cu
nvcc -c -Xcompiler "/wd 4819" -Xcompiler "/MD" -rdc=true -Xcudafe "--display_error_number --diag_suppress=unsigned_compare_with_zero" -O3 -arch sm_35 -D MW_CUDA_ARCH=350 -D MODEL=Fastnet_LWIR_predict -D MODEL=Fastnet_LWIR_predict -o "Fastnet_LWIR_predict_initialize.obj" D:\Fastnet_LWIR\codegen\lib\Fastnet_LWIR_predict\Fastnet_LWIR_predict_initialize.cu
Fastnet_LWIR_predict_initialize.cu
nvcc -c -Xcompiler "/wd 4819" -Xcompiler "/MD" -rdc=true -Xcudafe "--display_error_number --diag_suppress=unsigned_compare_with_zero" -O3 -arch sm_35 -D MW_CUDA_ARCH=350 -D MODEL=Fastnet_LWIR_predict -D MODEL=Fastnet_LWIR_predict -o "Fastnet_LWIR_predict_terminate.obj" D:\Fastnet_LWIR\codegen\lib\Fastnet_LWIR_predict\Fastnet_LWIR_predict_terminate.cu
Fastnet_LWIR_predict_terminate.cu
nvcc -c -Xcompiler "/wd 4819" -Xcompiler "/MD" -rdc=true -Xcudafe "--display_error_number --diag_suppress=unsigned_compare_with_zero" -O3 -arch sm_35 -D MW_CUDA_ARCH=350 -D MODEL=Fastnet_LWIR_predict -D MODEL=Fastnet_LWIR_predict -o "Fastnet_LWIR_predict.obj" D:\Fastnet_LWIR\codegen\lib\Fastnet_LWIR_predict\Fastnet_LWIR_predict.cu
Fastnet_LWIR_predict.cu
nvcc -c -Xcompiler "/wd 4819" -Xcompiler "/MD" -rdc=true -Xcudafe "--display_error_number --diag_suppress=unsigned_compare_with_zero" -O3 -arch sm_35 -D MW_CUDA_ARCH=350 -D MODEL=Fastnet_LWIR_predict -D MODEL=Fastnet_LWIR_predict -o "DeepLearningNetwork.obj" D:\Fastnet_LWIR\codegen\lib\Fastnet_LWIR_predict\DeepLearningNetwork.cu
DeepLearningNetwork.cu
nvcc -c -Xcompiler "/wd 4819" -Xcompiler "/MD" -rdc=true -Xcudafe "--display_error_number --diag_suppress=unsigned_compare_with_zero" -O3 -arch sm_35 -D MW_CUDA_ARCH=350 -D MODEL=Fastnet_LWIR_predict -D MODEL=Fastnet_LWIR_predict -o "predict.obj" D:\Fastnet_LWIR\codegen\lib\Fastnet_LWIR_predict\predict.cu
predict.cu
nvcc -c -Xcompiler "/wd 4819" -Xcompiler "/MD" -rdc=true -Xcudafe "--display_error_number --diag_suppress=unsigned_compare_with_zero" -O3 -arch sm_35 -D MW_CUDA_ARCH=350 -D MODEL=Fastnet_LWIR_predict -D MODEL=Fastnet_LWIR_predict -o "MWCudaDimUtility.obj" "D:\Fastnet_LWIR\codegen\lib\Fastnet_LWIR_predict\MWCudaDimUtility.cu"
MWCudaDimUtility.cu
'cmd' is not recognized as an internal or external command,
operable program or batch file.
NMAKE : fatal error U1077: 'cmd' : return code '0x1'
Stop.
The make command returned an error of 2
Error(s) encountered while building "Fastnet_LWIR_predict":
### Failed to generate all binary outputs.
------------------------------------------------------------------------
??? Build error: C++ compiler produced errors. See the Build Log for further details.
More information
Code generation failed: View Error Report
Error using codegen
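In this second log the source files compile and the build stops only at the line 'cmd' is not recognized, which means NMAKE could not find cmd.exe through the PATH it was given. A quick check from the same MATLAB session is sketched below; it assumes a standard Windows layout where cmd.exe lives under %SystemRoot%\System32.

```matlab
% Sketch: check whether the Windows system folder is on the PATH seen by MATLAB.
% Assumes a standard Windows layout (cmd.exe under %SystemRoot%\System32).
sysDir = fullfile(getenv('SystemRoot'), 'System32');
cmdExe = fullfile(sysDir, 'cmd.exe');

cmdExists = exist(cmdExe, 'file') == 2;
onPath = contains(lower(getenv('PATH')), lower(sysDir));

fprintf('cmd.exe present: %d, System32 on PATH: %d\n', cmdExists, onPath);
```

If the folder turns out to be missing from PATH, restoring it in the system environment variables and restarting MATLAB is one thing to try before rerunning codegen.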

Categories

Deep Learning with GPU Coder

Release

R2020a
