Why doesn't MATLAB use the full capacity of my computer while training a neural network?

23 views (last 30 days)
My goal is to train a neural network to classify objects from pictures taken with my webcam. I use transfer learning with AlexNet and I have a labeled training data set with 25,000 images.
My training script works perfectly, but the progress of the iterations during training is very slow. I have the Parallel Computing Toolbox installed and the training runs on the single GPU. But when looking at the Task Manager, MATLAB only uses 13% of the CPU and just 2% of the GPU. Why doesn't MATLAB use more resources to speed up the training process?
The operating system is Windows 10 and I have the newest 64-bit version of MATLAB installed.

Accepted Answer

John D'Errico
John D'Errico on 28 Feb 2019
Edited: John D'Errico on 28 Feb 2019
I do not think the issue is single threading. Not at all.
You surely don't keep all 25,000 images in memory at once. Instead, you probably use and re-use them, over and over again. So what you should probably be looking at is your disk access rate. Reading an image takes a relatively long time, but it is not really CPU time that is consumed; the CPU spends much of its time just waiting for data, so the CPU is not shown as busy. This is just my prediction of course, suggested by the statistics you report.
Can you confirm this? Of course! You need to learn about the profiler tool that MATLAB provides. It lets you turn the profiler on, run your code, and then check where MATLAB spent the most time. This is an important thing to do for ANY code. My prediction is that when you profile your code, it will show that most of the time is consumed in one line of code - probably a simple imread call.
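A minimal sketch of that workflow (trainMyNetwork is just a placeholder name for your own training script or function):

```matlab
% Turn the profiler on, run the code under investigation, then open the report.
profile on
trainMyNetwork        % placeholder for your own training code
profile viewer        % opens the interactive profile report

% Alternatively, inspect the results programmatically:
p = profile('info');
% Sort the recorded functions by total time to find the hot spot.
[~, idx] = sort([p.FunctionTable.TotalTime], 'descend');
p.FunctionTable(idx(1)).FunctionName
```

If the top entry is dominated by file reads rather than arithmetic, that supports the disk-bound explanation above.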
Now, why do I say the issue is NOT single threading versus multi-threading?
For example, my computer has 4 cores. I can easily force MATLAB to use all 4 cores in an operation, because I know which operations MATLAB will tend to automatically multi-thread. Likewise, I can just as easily force MATLAB to run flat out on only ONE thread, since I also know which operations are NOT automatically parallelized.
An example of the former would be the multiplication of two very large matrices: MATLAB will do that on as many cores as it can get. A good example of the latter is a very large symbolic computation. That is a problem that is often not so easily parallelized, so no matter how large it is, it will run on only one core.
Again, you can see this happening in a monitor that reports the activity of all of your cores. In the first case of the matrix multiply, it will show 400% core usage. (As well, my system fan will kick on to dissipate all the heat generated when that much work is being done.) But in the latter case of the single-threaded symbolic computation, only one core will be seen, running flat out at 100%.
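You can reproduce that contrast yourself with a sketch like this (the matrix size is arbitrary; pick one large enough to keep your cores busy for a few seconds):

```matlab
% Multi-threaded case: a large dense matrix multiply uses all available cores.
A = randn(6000);  B = randn(6000);
tic, C = A*B; toc            % watch your CPU monitor: all cores busy

% Now restrict MATLAB to a single computational thread and repeat.
nOld = maxNumCompThreads(1); % returns the previous thread count
tic, C = A*B; toc            % same work, now pinned to one core
maxNumCompThreads(nOld);     % restore the original setting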
But what did you see? You saw 13%. What you probably could have looked at was your disk access rate. I'd bet that is high for the entire time your code runs.
Can you fix this? Many things are fixable through better coding. Sometimes you need to completely change your algorithm, keeping more information in memory at once to avoid those repeated image loads. The problem is you may not have that much memory. Well, you might, and that could be one option: can you afford to buy enough RAM to store them all in system memory, NOT on a hard disk?
Another option is to make sure you are using the FASTEST disk possible. SSD drives are pretty cheap these days, and they are fast.
Yes, I know that spending money on your computer may not be an option. But you asked to know what the problem is. One of the ways to solve it, if I am correct in my assessment, is to make those image reads much faster. Or, you can find a way to write better, more efficient code. No matter what, that will start with the profile utility in MATLAB.
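If you are reading the images through an imageDatastore, one software-side option worth trying (a sketch, not a guaranteed fix; the folder name is an assumption) is to let MATLAB prefetch images in the background while the GPU trains, via the DispatchInBackground training option:

```matlab
% Assumes 'trainingFolder' holds the labeled images, one subfolder per class.
imds = imageDatastore('trainingFolder', ...
    'IncludeSubfolders', true, 'LabelSource', 'foldernames');
% Resize on the fly to AlexNet's 227x227x3 input size.
augimds = augmentedImageDatastore([227 227 3], imds);
opts = trainingOptions('sgdm', ...
    'ExecutionEnvironment', 'gpu', ...
    'DispatchInBackground', true, ...  % prefetch images while the GPU works
    'MiniBatchSize', 128);
% net = trainNetwork(augimds, layers, opts);  % 'layers' from your transfer setup
```

Background dispatch overlaps the slow disk reads with the GPU computation, which directly targets the "CPU waiting for data" pattern described above.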
And, when all else fails, you can recognize that big problems take big time. You can reduce the time needed by solving smaller problems. Or you can change the algorithms you use so that computation is more efficient.
  1 comment
Christoph Müßig
Christoph Müßig on 2 Mar 2019
Thank you very much for taking the time to discuss my problem in detail. I also suspected that the bottleneck lies in loading the pictures, which have to be imported into MATLAB again and again for the training. But when I look in the Task Manager, the utilization of the hard disk is also very low. I have a Samsung 960 EVO 250 GB, which should be fast enough.
I sped up the process by reducing the import size with feature extraction, and while it still does not take full advantage of the computer's resources, it is now fast enough to complete the training in a few days.
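For reference, the feature-extraction shortcut mentioned above typically looks something like this sketch (the layer name 'fc7' and the SVM classifier are assumptions, not details from the comment):

```matlab
net = alexnet;                                  % pretrained network
% imds = your labeled imageDatastore; resize to AlexNet's input size.
augimds = augmentedImageDatastore([227 227 3], imds);
% Run the images through the network ONCE and keep the activations of a
% late fully connected layer as fixed features.
features = activations(net, augimds, 'fc7', 'OutputAs', 'rows');
% Train a cheap classifier (here a multiclass SVM) on those features.
classifier = fitcecoc(features, imds.Labels);
```

Because each image is read and processed only once, this sidesteps the repeated per-epoch image loading that full network training requires.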


More Answers (1)

Munish Raj
Munish Raj on 28 Feb 2019
Please have a look at this answer.
