Improve GPU utilization during regression deep learning

Question

0 votos

I'm having trouble improving GPU utilization on, I think, a fairly straightforward deep learning example, and wonder if there is anything clearly being done incorrectly - I'm not an expert on this field, and so am not quite sure exactly what information is most relevant to provide.

I'm using a 3090 GPU, the actual neural net architecture is a few fully-connected layers, each with ~100 neurons. The input data is a featureInput with 3 inputs, and ~20k points, going to one regression output.

The relatively sparse training options are as follows:

options = trainingOptions("adam", ...
                        MaxEpochs=500, ...
                        Shuffle="every-epoch", ...
                        InitialLearnRate=0.001,...
                        MiniBatchSize=128);

However, when I train the network, I only reach ~10% gpu utilization. I'm assuming that somehow I'm either being bottlenecked by some other step of the process.

My goal ultimately is actually to train the model ~100s of times, each with different choices of initial data. So in that sense, though my input data is relatively small (which perhaps is leading to a bottleneck?), I'm hoping to find some way to paralellize multiple trainings on the same gpu. Is this possible, or is there some other thing I've clearly overlooked when it comes to improving the utilization?

1 comentario
Mostrar -1 comentarios más antiguos Ocultar -1 comentarios más antiguos

Joss Knight el 12 de Abr. de 2023

What is your data? What does the MATLAB Profiler say about where time is being spent? Have you tried to maximize the MiniBatchSize to improve throughput?

Iniciar sesión para comentar.

Iniciar sesión para responder a esta pregunta.

Iniciar sesión para seguir la actividad

Answer 1

Aishwarya Shukla el 2 de Mayo de 2023

0 votos

Hi @Adam Shaw

It's hard to say exactly what's causing the low GPU utilization without more information, but here are a few potential issues to consider:

Batch size: With a mini-batch size of 128, it's possible that your GPU is underutilized because the batches are too small to fully occupy the GPU. You could try increasing the batch size to see if that improves GPU utilization.
Data loading: If your data loading process is slow, then the GPU may be waiting for data to arrive during training, leading to low utilization. Consider using data augmentation techniques or pre-loading your data onto the GPU to improve data loading performance.
Model complexity: Your neural network may not be complex enough to fully utilize the GPU. Consider adding more layers or increasing the number of neurons per layer to see if that improves GPU utilization.
Other system constraints: It's possible that your GPU is being bottlenecked by other system constraints, such as CPU or memory bandwidth. You can monitor these metrics during training to see if they are limiting GPU utilization.

Regarding parallel training, it is possible to train multiple models simultaneously on the same GPU using parallel computing libraries such as PyTorch's DistributedDataParallel or TensorFlow's MirroredStrategy. However, keep in mind that training multiple models on the same GPU will increase memory usage, potentially leading to memory errors or slower training times.

1 comentario
Mostrar -1 comentarios más antiguos Ocultar -1 comentarios más antiguos

Joss Knight el 7 de Mayo de 2023

Or perhaps, since you're using MATLAB not python, use MATLAB to train multiple models such as described in our documentation .

Even better use the App Experiment Manager which is specifically designed to help with this.

Iniciar sesión para comentar.

Improve GPU utilization during regression deep learning

1 comentario
Mostrar -1 comentarios más antiguos Ocultar -1 comentarios más antiguos

Respuestas (1)

1 comentario
Mostrar -1 comentarios más antiguos Ocultar -1 comentarios más antiguos

Categorías

Etiquetas

Community Treasure Hunt

Improve GPU utilization during regression deep learning

1 comentario Mostrar -1 comentarios más antiguos Ocultar -1 comentarios más antiguos

Respuestas (1)

1 comentario Mostrar -1 comentarios más antiguos Ocultar -1 comentarios más antiguos

Categorías

Etiquetas

Ver también

Community Treasure Hunt

1 comentario
Mostrar -1 comentarios más antiguos Ocultar -1 comentarios más antiguos

1 comentario
Mostrar -1 comentarios más antiguos Ocultar -1 comentarios más antiguos