Training a CNN model with Numerical Data for Binary Classification

Question

Emmanuel on 10 Jul 2023


https://es.mathworks.com/matlabcentral/answers/1994033-training-a-cnn-model-with-numerical-data-for-binary-classification

Edited: Emmanuel on 12 Jul 2023

I want to train a CNN model for binary classification on numeric datasets extracted from levels 1-4 and the approximation coefficient level of a Discrete Wavelet Transform decomposition. The data have been partitioned into training, validation and test sets and stored as separate CSV files with corresponding labels.

Input data, reshaped as 4-D double:

Train data size: 5x30660

Test data size: 5x6570

Validation data size: 5x6570

The responses (categorical) are:

Train label size: 5x30660

Test label size: 5x6570

Validation label size: 5x6570

I used imageInputLayer as the input layer and convolution1dLayer for the convolutions, as seen in the code provided.

This is the Error message from this code:

"Error in another (line 107)

net = trainNetwork(XTrain, categorical(YTrain), layers, options);

Caused by:

Layer 'Conv_1': Input data must have one spatial dimension only, one temporal dimension only, or one of each. Instead, it has 2 spatial dimensions and 0 temporal dimensions."

This is the code:

% Initializing empty arrays for data and labels
allTrainData = cell(1, 5);
allTrainLabels = cell(1, 5);
allValidationData = cell(1, 5);
allValidationLabels = cell(1, 5);
allTestData = cell(1, 5);
allTestLabels = cell(1, 5);

% Loading and concatenating the training datasets
trainDataFiles = ["activetrain.csv", "ambienttrain.csv", "generatedtrain.csv", "moduletrain.csv", "radiationtrain.csv"];
trainLabelFiles = ["labeltrainactive.csv", "labeltrainambient.csv", "labeltraingenerated.csv", "labeltrainmodule.csv", "labeltrainradiation.csv"];
for i = 1:5
    trainData = load(trainDataFiles(i));
    trainLabels = load(trainLabelFiles(i));
    % Extracting the numeric arrays from the structure arrays
    allTrainData{i} = trainData; % Store the numeric arrays directly
    allTrainLabels{i} = trainLabels;
end

% Loading and concatenating the validation datasets
validationDataFiles = ["activevalid.csv", "ambientvalid.csv", "generatedvalid.csv", "modulevalid.csv", "radiationvalid.csv"];
validationLabelFiles = ["labelvalidactive.csv", "labelvalidambient.csv", "labelvalidgenerated.csv", "labelvalidmodule.csv", "labelvalidradiation.csv"];
for i = 1:5
    validationData = load(validationDataFiles(i));
    validationLabels = load(validationLabelFiles(i));
    % Extracting the numeric arrays from the structure arrays
    allValidationData{i} = validationData; % Store the numeric arrays directly
    allValidationLabels{i} = validationLabels;
end

% Loading and concatenating the test datasets
testDataFiles = ["activetest.csv", "ambienttest.csv", "generatedtest.csv", "moduletest.csv", "radiationtest.csv"];
testLabelFiles = ["labeltestactive.csv", "labeltestambient.csv", "labeltestgenerated.csv", "labeltestmodule.csv", "labeltestradiation.csv"];
for i = 1:5
    testData = load(testDataFiles(i));
    testLabels = load(testLabelFiles(i));
    % Extract the numeric arrays from the structure arrays
    allTestData{i} = testData; % Store the numeric arrays directly
    allTestLabels{i} = testLabels;
end

% Reshaping the input data to 4D tensor: [height, width, channels, samples]
inputHeight = 1;
inputWidth = 5; % length of your input data
numChannels = 5; % 5 coefficient levels
numTrainSamples = size(allTrainData{i}, 2);
numTestSamples = size(allTestData{i}, 2);
numValidationSamples = size(allValidationData{i}, 2);
XTrain = reshape(allTrainData{i}, inputHeight, inputWidth, numChannels, numTrainSamples);
XTest = reshape(allTestData{i}, inputHeight, inputWidth, numChannels, numTestSamples);
XValidation = reshape(allValidationData{i}, inputHeight, inputWidth, numChannels, numValidationSamples);

% Normalizing the input data
XTrain = normalize(XTrain);
XTest = normalize(XTest);
XValidation = normalize(XValidation);

% Converting the labels to categorical format
YTrain = categorical(cell2mat(allTrainLabels));
YTest = categorical(cell2mat(allTestLabels));
YValidation = categorical(cell2mat(allValidationLabels));

% Defining the CNN architecture
layers = [
    imageInputLayer([1 5 5],"Name","Input","Normalization","zscore")
    convolution1dLayer(3,8,"Name","Conv_1","Padding","same")
    batchNormalizationLayer("Name","Bnorm")
    reluLayer("Name","relu_1")
    maxPooling1dLayer(2, "Padding", "same", "Stride", 2)
    convolution1dLayer(3,16,"Name","Conv_2","Padding","same")
    batchNormalizationLayer("Name","Bnorm_2")
    reluLayer("Name","relu_2")
    maxPooling1dLayer(2,"Padding","same","Stride",2)
    convolution1dLayer(3,32,"Name","Conv_3","Padding","same")
    batchNormalizationLayer("Name","Bnorn_3")
    reluLayer("Name","relu_3")
    maxPooling1dLayer(2,"Padding","same","Stride",2)
    convolution2dLayer(3,64,"Name","Conv_4","Padding","same")
    batchNormalizationLayer("Name","BNorm_4")
    reluLayer("Name","relu_4")
    fullyConnectedLayer(8,"Name","FC_1","WeightLearnRateFactor",0.01)
    reluLayer("Name","relu_5")
    fullyConnectedLayer(2,"Name","FC_2","WeightLearnRateFactor",0.01)
    softmaxLayer("Name","Softmax_layer")
    classificationLayer("Classes","auto")];
plot(layerGraph(layers));

% Setting the training options
options = trainingOptions('adam', ...
    'InitialLearnRate', 0.01, ...
    'MaxEpochs', 5, ...
    'MiniBatchSize', 32, ...
    'ValidationData', {XValidation, YValidation}, ...
    'ValidationFrequency', 10, ...
    'Verbose', true, ...
    'Plots', 'training-progress');

% Training the CNN model
net = trainNetwork(XTrain, categorical(YTrain), layers, options);

% Perform anomaly detection on the test dataset
YTestPred = classify(net, XTest);

% Evaluating the performance
accuracy = sum(YTestPred == YTest) / numel(YTest);
disp(['Test Accuracy: ', num2str(accuracy)]);

Kindly share your thoughts. Thanks.


Answer 1

ProblemSolver on 10 Jul 2023


https://es.mathworks.com/matlabcentral/answers/1994033-training-a-cnn-model-with-numerical-data-for-binary-classification#answer_1270468

Edited: ProblemSolver on 10 Jul 2023

Hello @Emmanuel:

There are a couple of things you have overlooked that are causing the errors:

Since you are working with numerical data only, I would suggest preallocating with "zeros" instead of "cell" to optimize the code:

allTrainData = zeros(5, 30660);
allTrainLabels = zeros(5, 30660);
allValidationData = zeros(5, 6570);
allValidationLabels = zeros(5, 6570);
allTestData = zeros(5, 6570);
allTestLabels = zeros(5, 6570);

You have not properly concatenated the loaded data into your variables "allTrainData" and "allTrainLabels". Instead of storing each dataset in a separate cell, you can combine them in a single loop, for example:

trainDataFiles = ["activetrain.csv", "ambienttrain.csv", "generatedtrain.csv", "moduletrain.csv", "radiationtrain.csv"];
trainLabelFiles = ["labeltrainactive.csv", "labeltrainambient.csv", "labeltraingenerated.csv", "labeltrainmodule.csv", "labeltrainradiation.csv"];
for i = 1:5
    trainData = load(trainDataFiles(i));
    trainLabels = load(trainLabelFiles(i));
    
    allTrainData(i, :) = trainData(:)';
    allTrainLabels(i, :) = trainLabels(:)';
end
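
The row assignment in the loop above only works if each file holds exactly as many values as the preallocated row. A minimal sketch of a size check before the assignment, assuming the CSVs are purely numeric; `demoFile`, `expectedCount` and `allDemoData` are hypothetical names, and a small temporary file stands in for the real data:

```matlab
% Hypothetical demo file standing in for one of the real CSVs.
demoFile = fullfile(tempdir, 'demo_train.csv');
writematrix(rand(8, 1), demoFile);      % 8 values, one per row

expectedCount = 8;                      % would be 30660 for the training files in this thread
allDemoData = zeros(1, expectedCount);

values = readmatrix(demoFile);          % returns a numeric matrix
% Fail early with a clear message if the file is not the expected size.
assert(numel(values) == expectedCount, ...
    '%s holds %d values, expected %d.', demoFile, numel(values), expectedCount);
allDemoData(1, :) = values(:).';        % flatten to a row, as in the loop above

delete(demoFile);                       % clean up the temporary file
```

If the assertion fires for a real file, the reported count shows how that file is actually laid out, which is worth knowing before any reshape.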

The dimensions used when reshaping the input data are wrong:

inputHeight = 1;
inputWidth = 5; % length of your input data
numChannels = 5; % 5 coefficient levels
numTrainSamples = size(allTrainData, 2);
numTestSamples = size(allTestData, 2);
numValidationSamples = size(allValidationData, 2);
XTrain = reshape(allTrainData, inputHeight, inputWidth, numChannels, numTrainSamples);
XTest = reshape(allTestData, inputHeight, inputWidth, numChannels, numTestSamples);
XValidation = reshape(allValidationData, inputHeight, inputWidth, numChannels, numValidationSamples);

Your original code did account for normalizing the dataset. I am not sure whether it is required, but I use it for my own data:

XTrain = normalize(XTrain);
XTest = normalize(XTest);
XValidation = normalize(XValidation);
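
Two details may be worth checking here. `normalize` without a dimension argument works along the first dimension whose size is not 1, so on a 1x5x5xN array it standardizes along dimension 2, and each split is scaled with its own statistics. The input layer's "zscore" option also normalizes again during training. A minimal sketch of one alternative, assuming the rows of the 5-by-N matrices are the coefficient levels; `trainMat` and `testMat` are hypothetical toy stand-ins:

```matlab
% Toy stand-ins for the real 5-by-N matrices (rows = coefficient levels).
rng(0);
trainMat = 3 + 2 * randn(5, 1000);
testMat  = 3 + 2 * randn(5, 200);

% Per-row statistics taken from the training split only.
mu    = mean(trainMat, 2);
sigma = std(trainMat, 0, 2);

% Apply the same shift and scale to every split (implicit expansion).
trainNorm = (trainMat - mu) ./ sigma;
testNorm  = (testMat  - mu) ./ sigma;
```

With this approach the input layer's "Normalization" option could be set to "none" to avoid standardizing twice; whether that helps on this dataset is not something the thread establishes.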

Finally, I see that the error is generated because the input shape of the 'imageInputLayer' is incorrect. Adjust it to match the reshaped input data, something like this:

layers = [
    imageInputLayer([inputHeight, inputWidth, numChannels], "Name", "Input", "Normalization", "zscore")
    % Rest of the layers...
];

I hope these suggestions help you solve the error.


Emmanuel on 11 Jul 2023

Edited: Emmanuel on 11 Jul 2023


@ProblemSolver, Thank you for your detailed response and input. It is highly appreciated.

However, I got the error below from this reshaping:

XTrain = reshape(allTrainData, inputHeight, inputWidth, numChannels, numTrainSamples);
XTest = reshape(allTestData, inputHeight, inputWidth, numChannels, numTestSamples);
XValidation = reshape(allValidationData, inputHeight, inputWidth, numChannels, numValidationSamples);

Error using reshape

Number of elements must not change. Use [ ] as one of the size inputs to automatically calculate the appropriate size for that dimension.

Error in (line 60)

XTrain = reshape(allTrainData, inputHeight, inputWidth, numChannels, numTrainSamples)

When I instead used [ ] for the last dimension, as below, the original error came back:

XTrain = reshape(allTrainData, inputHeight, inputWidth, numChannels, []);
XTest = reshape(allTestData, inputHeight, inputWidth, numChannels, []);
XValidation = reshape(allValidationData, inputHeight, inputWidth, numChannels, []);

Error in (line 117)

net = trainNetwork(XTrain, categorical(YTrain), layers, options);

Caused by:

Layer 'Conv_1': Input data must have one spatial dimension only, one temporal dimension only, or one of each. Instead, it has 2 spatial dimensions and 0 temporal dimensions.
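
The reshape error quoted above comes down to element counts, which can be checked directly. This is a small sketch assuming `allTrainData` is the 5-by-30660 matrix from the answer; zeros stand in for the real values:

```matlab
allTrainData = zeros(5, 30660);          % shape used in the answer above

available = numel(allTrainData);         % 5 * 30660 = 153300
requested = prod([1 5 5 30660]);         % 1 * 5 * 5 * 30660 = 766500

fprintf('available: %d, requested: %d\n', available, requested);

% With [] MATLAB solves for the last dimension instead of raising an error.
X = reshape(allTrainData, 1, 5, 5, []);
disp(size(X));                           % last dimension is 153300 / 25 = 6132
```

So the `[ ]` form silences the reshape error but yields 6132 samples rather than 30660, which no longer lines up with one label per observation. It also leaves the Conv_1 error untouched, since an image input layer still hands two spatial dimensions to `convolution1dLayer`.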

ProblemSolver on 11 Jul 2023

@Emmanuel: I need to check your .csv files. Please send me some sample data, including the structure of the tables, so that I know what I am dealing with.

Emmanuel on 11 Jul 2023

Edited: Emmanuel on 12 Jul 2023

@ProblemSolver, I was able to find a way around it using this:

% Reshaping the input data to 4D tensor: [height, width, channels, samples]
inputHeight = 1;
inputWidth = 1;
numChannels = 1;
numTrainSamples = size(allTrainData, 2);
numTestSamples = size(allTestData, 2);
numValidationSamples = size(allValidationData, 2);
XTrain = reshape(allTrainData, inputHeight, inputWidth, numChannels, []);
XValidation = reshape(allValidationData, inputHeight, inputWidth, numChannels, []);

% Normalize Input Data.......

% Convert response to categorical vector........

% Define the CNN architecture
layers = [
    imageInputLayer([inputHeight, inputWidth, numChannels], "Name", "Input", "Normalization", "zscore")

Thank you very much for your initial response.
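
Note that a 1x1x1 input turns every single value into its own sample. For readers who want to keep the five coefficient levels together, one possible layout (not the one used in this thread) is sketched below. It assumes each column of the 5-by-N matrix is one observation with one label, and it swaps `convolution1dLayer` for `convolution2dLayer` with [3 1] filters so that the layer types match `imageInputLayer`. `numLevels` and the toy data are hypothetical:

```matlab
% Assumption: each column of the 5-by-N matrix is one observation made of
% 5 coefficient-level values, and there is one label per column.
numLevels = 5;
allTrainData = rand(numLevels, 12);      % toy stand-in for the real data

% Height = 5 levels, width = 1, channels = 1, one sample per column.
XTrain = reshape(allTrainData, numLevels, 1, 1, []);

layers = [
    imageInputLayer([numLevels 1 1], "Name", "Input", "Normalization", "zscore")
    convolution2dLayer([3 1], 8, "Name", "Conv_1", "Padding", "same")
    batchNormalizationLayer("Name", "BNorm_1")
    reluLayer("Name", "relu_1")
    maxPooling2dLayer([2 1], "Stride", [2 1], "Padding", "same")
    convolution2dLayer([3 1], 16, "Name", "Conv_2", "Padding", "same")
    batchNormalizationLayer("Name", "BNorm_2")
    reluLayer("Name", "relu_2")
    fullyConnectedLayer(2, "Name", "FC")
    softmaxLayer("Name", "Softmax")
    classificationLayer("Name", "Output")];

% Training would then follow the original call, given an N-by-1 categorical YTrain:
% net = trainNetwork(XTrain, YTrain, layers, options);
```

A [3 1] filter slides only along the height, so it behaves like a 1-D convolution over the five levels while every layer still sees image-shaped data.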
