Test Size and Prediction Size Difference

Murat Can on 16 Dec 2017
Answered: Sourabh on 19 Feb 2025
Greetings,
I've been working on a time series prediction program in MATLAB. My question is about the difference between the number of rows in my test data and the number of rows in the predicted data.
I divided my data set into training data (1200 rows) and testing data (30 rows), so I expected 30 rows of predicted data, but I received only 5. (Similarly, in another trial I set the test size to 230 rows and received 195 rows of predicted data.)
Here is the piece from my code:
%CREATE THE NETWORK
net = narnet(feedbackDelays,hiddenLayerSize,'open',trainFcn);
[x_tr,xi_tr,ai_tr,t_tr] = preparets(net,{},{},trainSeries);
net.divideFcn = 'divideblock';
net.divideParam.trainRatio = 100/100;
net.performFcn = 'mse';
net.plotFcns = {'plotperform','plottrainstate', 'ploterrhist', ...
'plotregression', 'plotresponse', 'ploterrcorr', 'plotinerrcorr'};
assignin('base','hiddenLayerSize',hiddenLayerSize);
%TRAIN THE NETWORK
[net,tr] = train(net,x_tr,t_tr,xi_tr,ai_tr);
y_tr = net(x_tr,xi_tr,ai_tr);
% TEST THE NETWORK
[x_ts,xi_ts,ai_ts,t_ts] = preparets(net,{},{},testSeries);
y_ts = net(x_ts,xi_ts,ai_ts);
assignin('base','t_ts',t_ts);
assignin('base','y_ts',y_ts);
e_ts = gsubtract(t_ts,y_ts);
%%CALCULATE MAPE
mat_e_ts = cell2mat(e_ts);
mat_t_ts = cell2mat(t_ts);
thePredictMape = mean(abs(mat_e_ts./mat_t_ts))*100;
assignin('base','thePredictMape',thePredictMape);
trainSeries is my 1200 rows of training data, testSeries is my 30 rows of testing data, and y_ts is the predicted data: a cell array of 5 rows, which I expected to have 30 rows.
Can you please explain the reason for this difference? I'm sure the error rate I calculate is still relevant, but I want to know whether I'm doing something wrong or whether this is just the way MATLAB does prediction.
Have a nice day
Murat

Answers (1)

Sourabh on 19 Feb 2025
The difference between the number of rows in your test data and in the predicted data comes from how the NAR neural network you are using in MATLAB works.
The NAR network predicts the next value in the time series from a specified set of past values, defined by the feedbackDelays parameter. In other words, before it can make its first prediction, the network needs max(feedbackDelays) previous time steps as input.
For example, if you set feedbackDelays = 1:5, the network needs the last 5 values to predict the next one. Therefore:
  • If your test data has 30 rows, preparets uses the first 5 values as the initial delay states, and prediction starts from the 6th value.
  • As a result, you end up with 30 - 5 = 25 predicted values.
  • In your case, getting only 5 predicted values from 30 rows suggests that the largest value in your feedbackDelays is 25, so most of the test data is consumed just to start predicting. (By the same arithmetic, 195 predictions from 230 rows would correspond to a largest delay of 35.)
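The arithmetic above can be checked with a few lines of MATLAB. This is only a sketch: the post does not state the actual feedbackDelays, so the value 1:25 below is an assumption chosen to reproduce the reported 30 rows in and 5 rows out.

```matlab
% Illustrative values only: the post does not state feedbackDelays,
% so 1:25 is an assumption chosen to match 30 test rows -> 5 predictions.
feedbackDelays = 1:25;
numTestRows = 30;

% The first max(feedbackDelays) rows only fill the delay line.
maxDelay = max(feedbackDelays);
numPredictions = numTestRows - maxDelay;

fprintf('Rows used as initial delay states: %d\n', maxDelay);
fprintf('Rows that receive a prediction:    %d\n', numPredictions);
```

Running the same calculation with your own feedbackDelays and test size should tell you how many rows to expect in y_ts.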
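To get more predictions, you can try the following:
  1. Check your feedbackDelays parameter. A smaller maximum delay leaves more test rows to predict, but it changes the model, so the network has to be retrained.
  2. Enlarge your test data so that it includes the max(feedbackDelays) rows needed for the initial delay states.
  3. Consider padding the start of the test series, for example with the last max(feedbackDelays) values of the training data, so that every original test row receives a prediction.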
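One way to apply the padding idea in step 3 is sketched below. It assumes that trainSeries and testSeries are 1-by-N cell arrays (the form preparets expects), uses placeholder numbers instead of real data, and assumes a largest delay of 25; none of these values come from the original post. The preparets and net calls are left as comments because they need the Deep Learning Toolbox and the trained network from the question.

```matlab
% Placeholder data (assumption): 1200 training and 30 test time steps,
% stored as 1-by-N cell arrays.
trainSeries = num2cell(1:1200);
testSeries  = num2cell(1201:1230);
maxDelay    = 25;   % assumed value of max(feedbackDelays)

% Prepend the last maxDelay training values so that they, not the
% first test rows, are used as the initial delay states.
warmup       = trainSeries(end-maxDelay+1:end);
extendedTest = [warmup, testSeries];

% Time steps that remain as targets after the delay states are removed.
targets        = extendedTest(maxDelay+1:end);
numPredictions = numel(targets);

% With the trained net from the question, the test step would become:
% [x_ts,xi_ts,ai_ts,t_ts] = preparets(net,{},{},extendedTest);
% y_ts = net(x_ts,xi_ts,ai_ts);
```

With this padding the targets line up exactly with the original 30 test rows, so the MAPE is computed over the whole test set rather than only its last 5 rows.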
For more information, see the MATLAB documentation for narnet.
