How to provide input without datastore to multiple input deep neural network?

Question

Shilpa Sonawane on 11 Apr 2023

Direct link to this question

https://es.mathworks.com/matlabcentral/answers/1944889-how-to-provide-input-without-datastore-to-multiple-input-deep-neural-network

Commented: Shilpa Sonawane on 14 Apr 2023

I am using the network shown in the figure, which takes two inputs: video frames (a set of images) and the MFCCs of the audio signal corresponding to each frame. I currently use fileDatastore to store the training and validation data. Could you please explain how to provide the training and validation data without a file datastore? I already have the data in 4-D arrays.

Please provide a solution.

My aim is to generate MFCCs from lip images. I train the network with lip images and the corresponding MFCCs; the outputs of the two branches are added together and passed to a third network, as shown in the figure. I have trained the network, but I am unable to obtain its output, i.e. the generated MFCCs.

Please explain how to obtain the MFCCs from the network output.

Also, I have combined the frames of all videos and supplied them as image input. Can I instead provide the input as a video signal?

clear all;
close all;
clc;
files={'AVDIGITS_S1_0_01.mp4';'AVDIGITS_S1_0_02.mp4';'AVDIGITS_S1_0_03.mp4';'AVDIGITS_S1_0_04.mp4';'AVDIGITS_S1_0_05.mp4';...
    'AVDIGITS_S1_1_02.mp4';'AVDIGITS_S1_1_03.mp4';'AVDIGITS_S1_1_04.mp4';'AVDIGITS_S1_1_05.mp4';
    };
mfcc_files={'S1_0_01_mfcc.mp4.avi';'S1_0_02_mfcc.mp4.avi';'S1_0_03_mfcc.mp4.avi';'S1_0_04_mfcc.mp4.avi';'S1_0_05_mfcc.mp4.avi'; ...
    'S1_1_02_mfcc.mp4.avi';'S1_1_03_mfcc.mp4.avi';'S1_1_04_mfcc.mp4.avi';'S1_1_05_mfcc.mp4.avi'};
numFiles = numel(files);
index2=1;
for mm=1:numFiles
    video = readVideo(files{mm});
    fprintf("Reading Video file %d of %d...\n", mm, numFiles)        
    [v1 v2 v3 v4]=size(video); 
    audio = readVideo(mfcc_files{mm});
    fprintf("Reading Audio file %d of %d...\n", mm, numFiles)        
    frame_cnt(mm)=v4;
    for ii=1:v4
        comb_video=video(:,:,:,ii); 
        comb_audio=audio(:,:,ii);
        all_vid_frames(:,:,:,index2)=uint8(comb_video);
        all_audio_frames(:,:,:,index2)=comb_audio;
        index2=index2+1;       
    end  
end
labels1=categorical([zeros(1,209) ones(1,196)]);
idxTrain =[1:121 357:405];
for kk=1:length(idxTrain)
    ind1=idxTrain(kk);
    vid_sequencesTrain(:,:,:,kk) = all_vid_frames(:,:,:,ind1);
    vid_labelsTrain(kk) = labels1(ind1);
    audio_sequencesTrain(:,:,:,kk) = all_audio_frames(:,:,:,ind1);
    audio_labelsTrain(kk) = labels1(ind1);  % index with kk so every label is kept, not only the last one
end
idxValidation = [122:356];
for kk=1:length(idxValidation)
    ind2=idxValidation(kk);  
    vid_sequencesValidation(:,:,:,kk) = all_vid_frames(:,:,:,ind2);
    vid_labelsValidation(kk) = labels1(ind2);
    audio_sequencesValidation(:,:,:,kk) = all_audio_frames(:,:,:,ind2);
    audio_labelsValidation(kk) = labels1(ind2);
end
[v1 v2 v3 v4]=size(vid_sequencesTrain)
[a1 a2 a3 a4]=size(audio_sequencesTrain)
imgCells = mat2cell(vid_sequencesTrain,v1,v2,v3,ones(v4,1));
imgCells2 = reshape(imgCells,[v4 1 1]);
audioCells = mat2cell(audio_sequencesTrain,a1,a2,a3,ones(a4,1));
audioCells2 = reshape(audioCells,[a4 1 1]);
labelCells = arrayfun(@(x)x,vid_labelsTrain,'UniformOutput',false);
combinedCells = [imgCells2 audioCells2 labelCells'];
%% validation
[vv1 vv2 vv3 vv4]=size(vid_sequencesValidation)
[aa1 aa2 aa3 aa4]=size(audio_sequencesValidation)
imgCellsvald = mat2cell(vid_sequencesValidation,vv1,vv2,vv3,ones(vv4,1));
imgCells2vald = reshape(imgCellsvald,[vv4 1 1]);
audioCellsvald = mat2cell(audio_sequencesValidation,aa1,aa2,aa3,ones(aa4,1));
audioCells2vald = reshape(audioCellsvald,[aa4 1 1]);
labelCells2vald = arrayfun(@(x)x,audio_labelsValidation,'UniformOutput',false);
combinedCellsvald = [imgCells2vald audioCells2vald labelCells2vald'];
%
save('traingData_10April_2023.mat','combinedCells', 'combinedCellsvald');
filedatastore = fileDatastore('traingData_10April_2023.mat','ReadFcn',@load);
trainingDatastore = transform(filedatastore,@rearrangeData);
layers1 = [
    imageInputLayer([v1 v2 3],'Name','imageinput')  
    convolution2dLayer(3,16,'Padding','same','Name','conv_1')
    batchNormalizationLayer('Name','BN_1')
    reluLayer('Name','relu_1')
    fullyConnectedLayer(2,'Name','fc11')
    additionLayer(2,'Name','add')
    transposedConv2dLayer(3,16,'Name','deconv1');
    batchNormalizationLayer('Name','BN_2')
    reluLayer('Name','relu_2')
    transposedConv2dLayer(3,16,'Name','deconv2');
    batchNormalizationLayer('Name','BN_3')
    reluLayer('Name','relu_3')
    averagePooling2dLayer(2,'Stride',2)
    fullyConnectedLayer(2,'Name','fc12')
    softmaxLayer('Name','softmax')
    classificationLayer('Name','classOutput')];
lgraph = layerGraph(layers1);
layers2 = [imageInputLayer([a1 a2 a3],'Name','vinput')
    fullyConnectedLayer(2,'Name','fc21')];
lgraph = addLayers(lgraph,layers2);
lgraph = connectLayers(lgraph,'fc21','add/in2');
plot(lgraph)
options = trainingOptions('adam', ...
    'InitialLearnRate',0.005, ...
    'LearnRateSchedule','piecewise',...
    'MaxEpochs',100, ...
    'MiniBatchSize',512, ...
    'Verbose',false, ...
    'Plots','training-progress',...
    'Shuffle','never',...
    'ValidationData',trainingDatastore, ... % NOTE: validation reuses the training datastore here
    'ValidationFrequency',1);
net = trainNetwork(trainingDatastore,lgraph,options);

Answer 1

Vinayak Choyyan on 14 Apr 2023

Direct link to this answer

https://es.mathworks.com/matlabcentral/answers/1944889-how-to-provide-input-without-datastore-to-multiple-input-deep-neural-network#answer_1215618

Hi Shilpa,

As per my understanding, you are trying to input video data into a multi-input model.

Please try using ‘sequenceInputLayer’ to input video to a model. The documentation for ‘sequenceInputLayer’ can be found here: https://www.mathworks.com/help/deeplearning/ref/nnet.cnn.layer.sequenceinputlayer.html.
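As a rough illustration of this suggestion, the sketch below builds a small video branch that starts from a sequenceInputLayer. It follows the folding/unfolding pattern used for image sequences with trainNetwork. The frame size [64 64 3], the layer names, and the layer sizes are assumptions made for illustration and are not taken from the question. With this kind of input, each observation would be one whole video stored as an h-by-w-by-c-by-numFrames array inside a cell array, rather than a single frame.

```matlab
% Minimal sketch of a video branch (assumed frame size, hypothetical layer names).
frameSize = [64 64 3];   % assumption: height, width, channels of each frame

layers = [
    sequenceInputLayer(frameSize,'Name','video')
    sequenceFoldingLayer('Name','fold')              % lets 2-D layers act on every frame
    convolution2dLayer(3,16,'Padding','same','Name','conv_1')
    batchNormalizationLayer('Name','BN_1')
    reluLayer('Name','relu_1')
    maxPooling2dLayer(4,'Stride',4,'Name','pool_1')  % shrink the feature maps
    sequenceUnfoldingLayer('Name','unfold')          % restore the sequence structure
    flattenLayer('Name','flatten')
    lstmLayer(32,'OutputMode','last','Name','lstm')  % one output per video
    fullyConnectedLayer(2,'Name','fc')
    softmaxLayer('Name','softmax')
    classificationLayer('Name','classOutput')];

lgraph = layerGraph(layers);
% The unfolding layer needs the mini-batch size reported by the folding layer.
lgraph = connectLayers(lgraph,'fold/miniBatchSize','unfold/miniBatchSize');
```

Calling plot(lgraph) or analyzeNetwork(lgraph) on the result is a quick way to check the extra fold-to-unfold connection before merging such a branch into the two-input network.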

You can also refer to this example to learn more about how to input video to a deep learning model: https://www.mathworks.com/help/deeplearning/ug/classify-videos-using-deep-learning-with-custom-training-loop.html.

Please also refer to the MATLAB documentation on how multi-input models can be built and trained.

I advise continuing to use ‘fileDatastore’, as it helps with memory management and simplifies data input and output.
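If the 4-D arrays already fit in memory, as the question states, one possible alternative to saving a MAT-file and reading it back through fileDatastore is to wrap each array in an arrayDatastore and merge them with combine. trainNetwork still receives a datastore for a multiple-input network, but no file is involved. The sketch below uses small random arrays as hypothetical stand-ins for vid_sequencesTrain, audio_sequencesTrain, and vid_labelsTrain; the sizes are assumptions.

```matlab
% Hypothetical stand-ins for the arrays built in the question (sizes are assumed).
numObs = 8;
XVid = rand(32,32,3,numObs,'single');           % image input, one frame per observation
XAud = rand(13,20,1,numObs,'single');           % MFCC "image" input
T    = categorical([zeros(4,1); ones(4,1)]);    % labels as a column vector

% Observations lie along dimension 4 of the 4-D arrays.
dsVid = arrayDatastore(XVid,'IterationDimension',4);
dsAud = arrayDatastore(XAud,'IterationDimension',4);
dsT   = arrayDatastore(T);                      % iterates over rows by default

% Column order should follow the network's input order (see lgraph.InputNames),
% with the responses in the last column.
dsTrain = combine(dsVid,dsAud,dsT);

sample = read(dsTrain);                         % 1-by-3 cell: {frame, mfcc, label}
reset(dsTrain);

% Training would then look like this (lgraph and options as in the question):
% net = trainNetwork(dsTrain,lgraph,options);
```

A validation datastore can be built the same way from the validation arrays and passed through the 'ValidationData' option, so that validation does not run on the training data.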

I hope this resolves the issue you were facing. If you would like to have quicker responses from MathWorks or have our Technical Support Team look at your cases, feel free to make a Technical Service request at https://www.mathworks.com/support/contact_us.html and we would be happy to help you.

Shilpa Sonawane on 14 Apr 2023

Sir,

I will refer to the given link.

Thank you Sir.
