Multilayer Perceptron with More Than One Output and Data Interpretation?
23 views (last 30 days)
Show older comments
Matthew
on 6 Aug 2024
Edited: Joss Knight
on 7 Aug 2024
I haven't used MATLAB's ML or deep learning toolboxes since maybe 2022. In the last year, it seems like MathWorks intentionally made the ML portion of MATLAB unusable. I'm hoping someone here can help me out, because rolling back my MATLAB two years would destabilize a lot of my recent work. I've done my best to document my process here; error messages are given in italics for ease of reading.
I have some data that is represented by four doubles that are normalized to between 0 and 1. I would like to use these four doubles to predict two doubles that are also between 0 and 1. In other words, I would like to make a multilayer perceptron that takes in the four values and spits out the two values. However, createMLPNetwork doesn't appear to support multiple outputs - at least, its documentation doesn't explain how to do so. So I have tried to make an MLP from scratch using the following code:
%% Prep NN architecture
layers = [
    inputLayer([4, 1], "CU", Name="in")
    flattenLayer
    fullyConnectedLayer(20, Name="fc1")
    reluLayer
    fullyConnectedLayer(16, Name="fc2")
    reluLayer
    fullyConnectedLayer(12, Name="fc3")
    reluLayer
    fullyConnectedLayer(numel(data{1,1}), Name="output")
    softmaxLayer
    ];
net = dlnetwork(layers);
[trainedNet, info] = trainnet(train, layers, "mse", opts);
Attempting to run this gives me the following results:
Error using trainnet (line 46)
Error forming mini-batch for network input "in". Data interpreted with format "CU". To specify a different format, use the InputDataFormats option.
Error in trainRutileModel (line 107)
[trainedNet, info] = trainnet(train, layers, "mse", opts);
Caused by:
The data format must have a batch dimension.
I tried to get around the generic inputLayer function by using one of the more specific functions like featureInputLayer, but replacing the inputLayer with a featureInputLayer throws the following:
Error using trainnet (line 46)
Number of observations in predictors (4) and targets (2) must match. Check that the data and network are consistent.
Error in trainRutileModel (line 107)
[trainedNet, info] = trainnet(train, layers, "mse", opts);
So that won't work, because I only have two output data points. The same happened when I tried imageInputLayer. Then I tried replacing "CU" with other format strings - the following error is for "BU", but the errors for the other formats were the same:
Error using dlnetwork/initialize (line 558)
Invalid network.
Error in dlnetwork (line 167)
net = initialize(net, dlX{:});
Error in trainRutileModel (line 104)
net = dlnetwork(layers);
Caused by:
Layer 'fc1': Invalid input data for fully connected layer. The input data must have exactly one channel dimension.
So then I tried flattening the output and got the following error:
Error using dlnetwork/initialize (line 558)
Invalid network.
Error in dlnetwork (line 167)
net = initialize(net, dlX{:});
Error in trainRutileModel (line 105)
net = dlnetwork(layers);
Caused by:
Layer 'flatten': Invalid input data. Layer expects data with a channel dimension, but received input data with format "BU".
I'm really not sure what to do here. I have no idea why MathWorks would make this so much more difficult to use while providing less, and more opaque, documentation. If anyone has any ideas on how to make this work, I'd be happy to hear them. In the meantime, I'm going to take another crack at the createMLPNetwork function and hope that when MathWorks says "a black-box continuous-time or discrete-time neural state-space model with identifiable (estimable) network weights and bias," by "states" they mean "outputs."
0 comments
Accepted Answer
LeoAiE
on 6 Aug 2024
Hi there!
Here are a few code examples that may help you get started!
% Define network architecture
layers = [
    featureInputLayer(4, "Name", "input")
    fullyConnectedLayer(20, "Name", "fc1")
    reluLayer("Name", "relu1")
    fullyConnectedLayer(16, "Name", "fc2")
    reluLayer("Name", "relu2")
    fullyConnectedLayer(12, "Name", "fc3")
    reluLayer("Name", "relu3")
    fullyConnectedLayer(2, "Name", "output")  % 2 outputs for regression
    regressionLayer("Name", "regression")];   % trainNetwork requires an output layer
% Training options
options = trainingOptions('sgdm', ...
    'MiniBatchSize', 32, ...
    'MaxEpochs', 100, ...
    'InitialLearnRate', 1e-3, ...
    'Shuffle', 'every-epoch', ...
    'Verbose', true, ...
    'Plots', 'training-progress');
% Assuming trainData is your input data matrix of size [numObservations, 4]
% and trainTargets is your target data matrix of size [numObservations, 2]
% Split data into training and validation sets if needed
% [trainInd, valInd] = dividerand(size(trainData, 1), 0.8, 0.2);
% Prepare a datastore if the data is large
% ds = arrayDatastore({trainData, trainTargets}, 'IterationDimension', 1);
% Train the network
[trainedNet, info] = trainNetwork(trainData, trainTargets, layers, options);
% Data preparation
trainData = rand(1000, 4);    % example data; replace with your actual data
trainTargets = rand(1000, 2); % example data; replace with your actual data
% Define network architecture
layers = [
    featureInputLayer(4, "Name", "input")
    fullyConnectedLayer(20, "Name", "fc1")
    reluLayer("Name", "relu1")
    fullyConnectedLayer(16, "Name", "fc2")
    reluLayer("Name", "relu2")
    fullyConnectedLayer(12, "Name", "fc3")
    reluLayer("Name", "relu3")
    fullyConnectedLayer(2, "Name", "output")  % 2 outputs for regression
    regressionLayer("Name", "regression")];   % trainNetwork requires an output layer
% Training options
options = trainingOptions('sgdm', ...
    'MiniBatchSize', 32, ...
    'MaxEpochs', 100, ...
    'InitialLearnRate', 1e-3, ...
    'Shuffle', 'every-epoch', ...
    'Verbose', true, ...
    'Plots', 'training-progress');
% Train the network
[trainedNet, info] = trainNetwork(trainData, trainTargets, layers, options);
% Inspect the architecture
analyzeNetwork(layers);
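Once trained this way, inference is straightforward (a sketch; newData is a placeholder name, and the network returned by trainNetwork expects one observation per row, matching the training layout):

```matlab
% Predict on new observations: rows are observations, columns are the 4 features.
newData = rand(5, 4);                  % placeholder; replace with real inputs
pred = predict(trainedNet, newData);   % returns a [5 x 2] matrix of predicted pairs
```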
More Answers (1)
Joss Knight
on 7 Aug 2024
You can continue to use trainNetwork if you don't want to use dlnetwork. dlnetwork obviously provides much more flexibility as well as the ability to format your data however you like (which was tripping you up), but you don't have to use it.
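For completeness, here is a minimal sketch of the trainnet/dlnetwork route for this 4-in, 2-out regression (names and hyperparameters are illustrative; it assumes predictors and targets are laid out one observation per row, which is the orientation featureInputLayer and trainnet expect - the "number of observations must match" error above is typical of the transposed layout):

```matlab
% Sketch: 4-feature input, 2-value regression output with dlnetwork + trainnet.
% Assumes X is [numObs x 4] and T is [numObs x 2], one row per observation.
X = rand(1000, 4);   % placeholder predictors
T = rand(1000, 2);   % placeholder targets

layers = [
    featureInputLayer(4, Name="in")      % feature data; no manual format needed
    fullyConnectedLayer(20)
    reluLayer
    fullyConnectedLayer(16)
    reluLayer
    fullyConnectedLayer(12)
    reluLayer
    fullyConnectedLayer(2, Name="out")]; % two regression outputs; no softmax for mse

net  = dlnetwork(layers);
opts = trainingOptions("adam", MaxEpochs=100, MiniBatchSize=32, Verbose=false);
trainedNet = trainnet(X, T, net, "mse", opts);

% Inference (minibatchpredict is available in R2024a and later):
Y = minibatchpredict(trainedNet, X(1:5, :));   % [5 x 2] predictions
```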
2 comments
Joss Knight
on 7 Aug 2024
Edited: Joss Knight
on 7 Aug 2024
That's totally fair. A future version will remove the requirement to describe your data format, making it more like other frameworks, where certain input layouts are expected for certain types of network.
dlnetwork has the same requirement as other frameworks that you define the correct layout for your data, but instead of permuting your data into a required format, you label which dimension is which. To use an inputLayer with fullyConnectedLayer you need at least to say which dimension holds the channels of your data and which is the batch dimension, in the same way that PyTorch or TensorFlow might require your data laid out with each row a different observation and each column a different channel. The documentation tries to explain as completely as it can what each label means.
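As an illustrative sketch of that labeling idea (sizes and names here are assumptions, not taken from the thread): with four features, the channel dimension has size 4, the batch size is left unknown, and the data is given a matching dlarray label instead of being permuted.

```matlab
% "CB": first dimension is Channel (the 4 features), second is Batch.
layers = [
    inputLayer([4 NaN], "CB", Name="in")   % NaN = batch size unknown until runtime
    fullyConnectedLayer(8)
    reluLayer
    fullyConnectedLayer(2)];
net = dlnetwork(layers);

X = dlarray(rand(4, 16), "CB");            % label the data the same way
Y = predict(net, X);                       % 2 channels x 16 batch, no permuting needed
```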