
Different training results for neural network when using full dataset versus partial dataset

2 views (last 30 days)
I'm training a network using 'narxnet' and 'train'.
My training data is a part of a larger dataset. These are the two scenarios in which I get different results.
  1. Trim the dataset so the entire input data is the training data. 'trainInd' = the entire dataset; no validation or test indices are provided
  2. Use the entire dataset, but specify the training data by 'trainInd' (using the indices of the exact data from scenario 1); no validation or test indices are provided
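A minimal sketch of the two scenarios (the variable names `X`, `T`, and `trainIdx`, plus the delay and hidden-layer sizes, are placeholders, not taken from the original post):

```matlab
% Scenario 1: trim the dataset so the entire input is the training data.
Xtrim = X(trainIdx);
Ttrim = T(trainIdx);
net1 = narxnet(1:2, 1:2, 10);            % example delays / hidden size
net1.divideFcn = 'divideind';
net1.divideParam.trainInd = 1:numel(trainIdx);
net1.divideParam.valInd   = [];          % no validation indices
net1.divideParam.testInd  = [];          % no test indices
[Xs, Xi, Ai, Ts] = preparets(net1, Xtrim, {}, Ttrim);
net1 = train(net1, Xs, Ts, Xi, Ai);

% Scenario 2: keep the full dataset but restrict training by index.
net2 = narxnet(1:2, 1:2, 10);
net2.divideFcn = 'divideind';
net2.divideParam.trainInd = trainIdx;    % same samples as scenario 1
net2.divideParam.valInd   = [];
net2.divideParam.testInd  = [];
[Xs, Xi, Ai, Ts] = preparets(net2, X, {}, T);
net2 = train(net2, Xs, Ts, Xi, Ai);
```

Note that `preparets` drops the first samples to fill the tapped delay lines, so indices in scenario 2 are shifted relative to the raw data by the number of delays, which is the adjustment the question mentions trying.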
The training terminates under the same conditions, and I'm using the same data, but I get different results. I've also experimented with shifting the training-data indices in scenario 2 to account for the number of delays specified, with no luck.
Does anyone have any insight into what might be causing this? (I'm aware of the issues with not specifying validation data; I'm just trying to replicate the behavior at the moment.)

Accepted Answer

Mrutyunjaya Hiremath
Mrutyunjaya Hiremath on 21 Jul 2023
The difference in results between scenario 1 and scenario 2 could be due to the different order of data samples seen during training. When you trim the dataset so that the entire input data is used for training (scenario 1), the network sees the data in the same order as it appears in the dataset. However, when you specify the training data using the indices (scenario 2), the network sees the data in a different order based on the selected indices.
In a neural network, the order in which data samples are presented during training can have an impact on the convergence and final performance of the model. Different orders of data samples can lead to different weight updates during training, potentially resulting in slightly different results.
To address this issue and ensure more consistent results, you can try the following:
  1. Shuffle the dataset: Before creating the neural network and specifying the trainInd in scenario 2, shuffle the entire dataset. This randomizes the order of data samples and can lead to more consistent training.
  2. Set the random seed: If you are using a random number generator during training (e.g., weight initialization or mini-batch shuffling), set a fixed random seed before running both scenarios. This ensures that the randomization process during training is the same for both scenarios, leading to more reproducible results.
By shuffling the dataset and setting the random seed, you should get more consistent results between scenario 1 and scenario 2. Keep in mind that neural networks are still sensitive to other factors such as network architecture, learning rate, and training parameters, so it's possible to see slight differences even with these measures in place. However, the consistency should be improved.
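The second point can be sketched as follows: resetting the random number generator to the same state before each run makes weight initialization identical across the two scenarios (the seed value and network sizes here are arbitrary examples, not from the original post):

```matlab
% Fix the RNG state so both scenarios start from identical weights.
rng(0, 'twister');                  % same seed before each run
net = narxnet(1:2, 1:2, 10);
net = configure(net, X, T);         % weights drawn with the fixed seed
```

With the RNG fixed, any remaining difference between the two scenarios must come from the data actually presented to `train`, which helps isolate the indexing question.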
  4 comments


More Answers (0)
