Building a transformer for sorting numbers

4 views (last 30 days)
MOHAMMADREZA on 25 Feb 2025
Answered: Gayathri on 15 Apr 2025
Hi,
I am trying to build a transformer to sort some input numbers, but it gives an error. First it asked me to have an output layer, which is already there as a fully connected (FC) layer; I have seen many codes use an FC layer as the output head. Here is the code, can somebody help me?
% a complete transformer for sorting numbers
% Clear workspace
clear; clc;
% Check for GPU availability
if canUseGPU
    disp('GPU is available. Training on GPU.');
    executionEnvironment = 'gpu';
else
    disp('GPU is not available. Training on CPU.');
    executionEnvironment = 'cpu';
end
GPU is not available. Training on CPU.
% Hyperparameters
numHeads = 1; % Number of attention heads
numLayers = 1; % Number of encoder and decoder layers
embeddingSize = 64; % Embedding size
hiddenSize = 128; % Hidden layer size
maxSequenceLength = 10; % Maximum sequence length
batchSize = 64; % Batch size
numEpochs = 10; % Number of epochs
learningRate = 0.001; % Learning rate
% Generate synthetic dataset
numSamples = 10000;
inputData = rand(numSamples, maxSequenceLength); % Random numbers between 0 and 1
outputData = sort(inputData, 2); % Sorted version of input data
% Split into training and validation sets
splitRatio = 0.8;
numTrain = floor(splitRatio * numSamples);
trainInput = inputData(1:numTrain, :);
trainOutput = outputData(1:numTrain, :);
valInput = inputData(numTrain+1:end, :);
valOutput = outputData(numTrain+1:end, :);
%% Convert data to cell arrays
% trainNetwork expects sequence data as a cell array with one sequence per cell
trainInput_cell = cell(numTrain, 1);
trainOutput_cell = cell(numTrain, 1);
for i = 1:numTrain
    trainInput_cell{i,1} = trainInput(i,:);
    trainOutput_cell{i,1} = trainOutput(i,:);
end
numVal = numSamples - numTrain;
valInput_cell = cell(numVal, 1);
valOutput_cell = cell(numVal, 1);
for i = 1:numVal
    valInput_cell{i,1} = valInput(i,:);
    valOutput_cell{i,1} = valOutput(i,:);
end
%% Defining networks
% Define the full model
inputLayer = sequenceInputLayer(1, 'Name', 'input'); % Input is a sequence of scalars
embeddingLayer = fullyConnectedLayer(embeddingSize, 'Name', 'embedding');
positionalEncoding = positionalEncodingLayer(maxSequenceLength, embeddingSize, 'positionalEncoding');
Unrecognized function or variable 'positionalEncodingLayer'.
%% ****************************************************************
encoderLayers = [];
for i = 1:numLayers
    encoderLayers = [
        encoderLayers
        multiHeadAttentionLayer(numHeads, embeddingSize, ['encoderAttention', num2str(i)])
        additionLayer(2, 'Name', ['encoderAdd' num2str(i)])
        layerNormalizationLayer('Name', ['encoderNorm1' num2str(i)])
        fullyConnectedLayer(hiddenSize, 'Name', ['encoderFC1' num2str(i)])
        reluLayer('Name', ['encoderRelu' num2str(i)])
        fullyConnectedLayer(embeddingSize, 'Name', ['encoderFC2' num2str(i)])
        additionLayer(2, 'Name', ['encoderAdd2' num2str(i)])
        layerNormalizationLayer('Name', ['encoderNorm2' num2str(i)])
        ];
end
% Define custom transformer decoder layer
decoderLayers = [];
for i = 1:numLayers
    decoderLayers = [
        decoderLayers
        multiHeadAttentionLayer(numHeads, embeddingSize, ['decoderAttention1' num2str(i)])
        additionLayer(2, 'Name', ['decoderAdd1' num2str(i)])
        layerNormalizationLayer('Name', ['decoderNorm1' num2str(i)])
        multiHeadAttentionLayer(numHeads, embeddingSize, ['decoderAttention2' num2str(i)])
        additionLayer(2, 'Name', ['decoderAdd2' num2str(i)])
        layerNormalizationLayer('Name', ['decoderNorm2' num2str(i)])
        fullyConnectedLayer(hiddenSize, 'Name', ['decoderFC1' num2str(i)])
        reluLayer('Name', ['decoderRelu' num2str(i)])
        fullyConnectedLayer(embeddingSize, 'Name', ['decoderFC2' num2str(i)])
        additionLayer(2, 'Name', ['decoderAdd3' num2str(i)])
        layerNormalizationLayer('Name', ['decoderNorm3' num2str(i)])
        ];
end
%% ****************************************************************
% Assemble the encoder
encoder = [
    inputLayer
    embeddingLayer
    % positionalEncoding
    encoderLayers
    ];
% Assemble the decoder
decoder = [
    % inputLayer
    % embeddingLayer
    % positionalEncoding
    decoderLayers
    ];
% Output layer
outputLayer = fullyConnectedLayer(1, 'Name', 'output'); % Predicts a scalar at each time step
% Assemble the full model
layers = [
    encoder
    decoder
    outputLayer
    % regressionLayer('Name', 'regression') % use a regression loss for continuous output
    ];
% Convert to a layerGraph for visualization
net = layerGraph(layers);
net = connectLayers(net,"embedding","encoderAdd1/in2");
net = connectLayers(net,"encoderNorm11","encoderAdd21/in2");
net = connectLayers(net,"encoderNorm21","decoderAdd11/in2");
net = connectLayers(net,"decoderNorm11","decoderAdd21/in2");
net = connectLayers(net,"decoderNorm21","decoderAdd31/in2");
% analyzeNetwork(net)
% plot(net)
% Training options
options = trainingOptions('adam', ...
    'MaxEpochs', numEpochs, ...
    'MiniBatchSize', batchSize, ...
    'InitialLearnRate', learningRate, ...
    'Shuffle', 'every-epoch', ...
    'ValidationData', {valInput_cell, valOutput_cell}, ... % use the cell arrays built above
    'ValidationFrequency', 30, ...
    'ExecutionEnvironment', executionEnvironment, ...
    'Plots', 'training-progress', ...
    'Verbose', false);
% analyzeNetwork(net)
% Train the model (pass the cell arrays so each row is treated as one sequence)
trained_net = trainNetwork(trainInput_cell, trainOutput_cell, net, options);

Answers (1)

Gayathri on 15 Apr 2025
The error occurs because "positionalEncodingLayer" is not a predefined function in MATLAB. You will need to create this custom layer yourself, or use the definition provided with the example where you found the code.
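For illustration, a minimal sketch of such a custom layer is shown below. This is not the definition from the original example, just the standard sinusoidal encoding (Vaswani et al., 2017) wrapped in a custom layer; it assumes the layer receives sequence data as a channels-by-batch-by-time array, which you should verify with "analyzeNetwork" before training:
classdef positionalEncodingLayer < nnet.layer.Layer
    % Illustrative sketch: sinusoidal position encodings added to the
    % incoming embedding. Assumes an even EmbeddingSize.
    properties
        MaxSequenceLength
        EmbeddingSize
    end
    methods
        function layer = positionalEncodingLayer(maxSeqLen, embeddingSize, name)
            layer.Name = name;
            layer.MaxSequenceLength = maxSeqLen;
            layer.EmbeddingSize = embeddingSize;
        end
        function Z = predict(layer, X)
            % Assumption: X is EmbeddingSize-by-batch-by-time for sequence data
            seqLen = size(X, 3);
            d = layer.EmbeddingSize;
            pos = 0:seqLen-1;                    % time-step indices
            k = (0:2:d-1)';                      % even channel indices
            angles = pos ./ (10000 .^ (k / d));  % (d/2)-by-seqLen
            pe = zeros(d, 1, seqLen, 'like', X);
            pe(1:2:end, 1, :) = reshape(sin(angles), [], 1, seqLen); % even rows: sine
            pe(2:2:end, 1, :) = reshape(cos(angles), [], 1, seqLen); % odd rows: cosine
            Z = X + pe;                          % broadcast over the batch dimension
        end
    end
end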
Alternatively, you can use "sinusoidalPositionEncodingLayer" or "positionEmbeddingLayer" to encode position information; predefined functions for both are available within MATLAB.
Please refer to the MATLAB documentation to understand the above mentioned functions.
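As a sketch of the second option (assuming a recent release, around R2023b or later, where these layers are available): the built-in layer outputs only the encodings, so they are summed with the embedding through an "additionLayer". The layer names here are just examples:
% Sketch: replace the custom positionalEncodingLayer with the built-in layer
embeddingSize = 64;   % as in your script

layers = [
    sequenceInputLayer(1, 'Name', 'input')
    fullyConnectedLayer(embeddingSize, 'Name', 'embedding')
    sinusoidalPositionEncodingLayer(embeddingSize, 'Name', 'positionalEncoding')
    additionLayer(2, 'Name', 'embedAdd')   % embedding + position encoding
    ];

net = layerGraph(layers);
net = connectLayers(net, 'embedding', 'embedAdd/in2');
analyzeNetwork(net)   % check shapes before attaching the attention blocks
If you prefer learned position embeddings instead of fixed sinusoids, "positionEmbeddingLayer(embeddingSize, maxSequenceLength)" should slot into the same position in the layer array.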
