What is Validation data in deep learning?
Syed JABBAR SHAH
on 23 May 2021
Edited: Cris LaPierre
on 12 Apr 2024
Hi, I am new to deep learning and I am struggling to grasp the concept of validation and how to interpret it in training progress plots.
As far as I understand after some Googling, it is a check on how the model is behaving during training, before deploying it on test data.
My question is: what is validation data, what is validation accuracy, and how should I interpret it? I would appreciate it if someone could help me understand in simple words.
Best,
Jabbar
Accepted Answer
Philip Brown
on 21 Jun 2021
The "validation data" is a set of data held separate from your training data. It's used during the training process to see how the network would perform on data it hasn't been directly trained on.
The "training data" is what's used in the process of updating the layer weights via backpropagation. Training data is fed through the network every iteration, the loss is computed, and the layer weights are updated via backpropagation to reduce the loss for that iteration. If the training is going well, every iteration updates the weights of the network so it becomes better and better at predicting on the training data.
However, there's a danger the network becomes too good at predicting on the training data. It can learn very specific features of the training set, rather than generally useful features which would be helpful for predicting on new data. This is called "overfitting". To check for that, we can use "validation data". The validation data is not used directly to train the network. It's instead used to see how the network is performing. In your training plot, these validation checks happen every 50th iteration. Results from the validation data are not used to update the network weights.
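To make the split between "used for weight updates" and "used only for checking" concrete, here is a minimal Python sketch. It is not MATLAB's training code: the model (a one-input logistic regression), the synthetic data, and every name in it (make_data, train, validation_frequency) are assumptions chosen for illustration. The default of 50 simply mirrors the "every 50th iteration" in your plot.

```python
import math
import random


def make_data(n, rng):
    """Synthetic binary data (illustrative assumption): label is 1 when x plus noise is positive."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = 1 if x + rng.gauss(0.0, 0.2) > 0 else 0
        data.append((x, y))
    return data


def predict_prob(w, b, x):
    """Logistic model: probability that the label is 1."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


def accuracy(w, b, data):
    """Fraction of samples whose thresholded prediction matches the label."""
    correct = sum((predict_prob(w, b, x) >= 0.5) == (y == 1) for x, y in data)
    return correct / len(data)


def train(train_data, val_data, iterations=200, lr=0.5, validation_frequency=50):
    w, b = 0.0, 0.0
    history = []  # (iteration, training accuracy, validation accuracy)
    for it in range(1, iterations + 1):
        # The gradient is computed from the TRAINING data only.
        grad_w = grad_b = 0.0
        for x, y in train_data:
            err = predict_prob(w, b, x) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / len(train_data)
        b -= lr * grad_b / len(train_data)

        # Validation check: measure, record, and change nothing.
        if it % validation_frequency == 0:
            history.append((it, accuracy(w, b, train_data), accuracy(w, b, val_data)))
    return w, b, history


rng = random.Random(0)
training_set = make_data(200, rng)
validation_set = make_data(100, rng)
w, b, history = train(training_set, validation_set)
for it, train_acc, val_acc in history:
    print(f"iteration {it}: training accuracy {train_acc:.2f}, validation accuracy {val_acc:.2f}")
```

Note that val_data only appears inside the accuracy call. Swapping in a different validation set changes the numbers that get recorded, but not the weights the loop ends up with.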
As you're seeing in the training plot, the validation accuracy is a little bit below the training accuracy - the network is better at predicting on the data it's been directly trained on. This is quite common. If you saw the validation accuracy start to drop substantially as you train further, that would be more evidence for overfitting: your network would be learning specific features of the training set, rather than general features also useful for the validation set.
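If it helps to see that reading of the plot as a rule of thumb, the sketch below summarizes two accuracy curves recorded at the same validation checks. The function name summarize_validation, the 0.05 tolerance, and the example numbers are all made up for illustration; they are not from your plot and not a standard threshold.

```python
def summarize_validation(train_acc, val_acc, drop_tolerance=0.05):
    """Summarize training and validation accuracy recorded at the same checks."""
    if not val_acc or len(train_acc) != len(val_acc):
        raise ValueError("need two non-empty curves of equal length")
    # Index of the check with the highest validation accuracy (first one on ties).
    best_check = max(range(len(val_acc)), key=lambda i: val_acc[i])
    final_gap = train_acc[-1] - val_acc[-1]
    drop_from_best = val_acc[best_check] - val_acc[-1]
    return {
        "best_check": best_check,
        "final_gap": final_gap,
        "drop_from_best": drop_from_best,
        # Hypothetical rule: flag a run whose validation accuracy has fallen
        # more than drop_tolerance below its own peak.
        "possible_overfitting": drop_from_best > drop_tolerance,
    }


# Made-up curves: training accuracy keeps rising while validation peaks and then falls.
train_curve = [0.70, 0.85, 0.93, 0.97, 0.99]
val_curve = [0.68, 0.82, 0.88, 0.84, 0.79]
print(summarize_validation(train_curve, val_curve))
```

A small, steady gap between the two curves (like the one in your plot) would not trip this flag. A validation curve that peaks and then falls while training accuracy keeps climbing would.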
2 comments
nika mentges
on 11 Apr 2024
May I ask what the difference is between the validation data and the testing data?
Cris LaPierre
on 11 Apr 2024
Edited: Cris LaPierre
on 12 Apr 2024
Validation data is used during the training process to evaluate the model. From the doc:
- "Validation estimates model performance on new data compared to the training data, and helps you choose the best model. Validation protects against overfitting."
Test data is used to evaluate the final trained model. From the doc:
- "You can use the test set to evaluate the performance of a trained model. In particular, you can check whether the validation metrics provide good estimates for the model performance on new data."
Of note, the model has been trained without ever seeing the test data, so the test set can help highlight issues with your trained model that the validation data does not.
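As a rough illustration of the three roles, here is a small Python sketch that shuffles a dataset once and cuts it into non-overlapping training, validation, and test parts. The function name split_dataset and the 15% defaults are assumptions for the example, not something taken from the doc.

```python
import random


def split_dataset(samples, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle once, then cut into disjoint training, validation, and test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)
    test = shuffled[:n_test]                 # set aside until the final evaluation
    val = shuffled[n_test:n_test + n_val]    # checked during training, never used for weight updates
    train = shuffled[n_test + n_val:]        # the only part used to update weights
    return train, val, test


train_set, val_set, test_set = split_dataset(range(100))
print(len(train_set), len(val_set), len(test_set))
```

Because the validation set is consulted repeatedly while you compare models and settings, your choices can end up tuned to it. The test set is used once at the end, so it gives a cleaner estimate.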
More Answers (2)
Lei Liu
on 29 May 2021
I'm also a newcomer to neural networks. Perhaps this page will help you:
https://www.mathworks.com/help/deeplearning/ug/setting-up-parameters-and-training-of-a-convnet.html?searchHighlight=overfitting&s_tid=doc_srchtitle
Klara Husonuk
on 26 Sept 2023
Validation accuracy is the fraction of validation samples your model predicts correctly, so it tells you how well the model is doing on data it was not trained on. It's like a grade on a practice test. Higher validation accuracy indicates that your model is generalizing rather than memorizing the training set. Monitoring it during training helps you decide when to stop training and which settings to adjust for better results.