Borrar filtros
Borrar filtros

How to reduce large Computational time

4 visualizaciones (últimos 30 días)
Navdeep Sony
Navdeep Sony el 11 de Mzo. de 2017
Comentada: Navdeep Sony el 13 de Mzo. de 2017
Please find the code attached below. Also please find three .mat files that have been attached. The problem is that the code takes very long time to execute. Kindly help what can be done to reduce time. Thanks.
clc,clear;
load('E');
load('R');
load('P');
sparse=zeros(14596,2209);
final=zeros(144,2209);
parfor ii=1:size(tem4,2)
dictionary=tem2;
t=5;
s=tem4(:,ii);
r=s;
atoms=zeros(size(dictionary,1),size(dictionary,2));
coefs=zeros(size(dictionary,2),1);
%Normalize the dictionary
for index=1:size(dictionary,2)
dictionary(:,index)=dictionary(:,index)./norm(dictionary(:,index));
end
D=dictionary;
% plotv(dictionary);
% hold on
% plotv(s);
index=[];
while(t>1e-15 && sum(dictionary(:)~=0))%Process while (Eucledian norm > 10^-15)
inner_product=dictionary'*r; %Dot Product
% inner_product=dot(dictionary,r');
[m,ind]=max(abs(inner_product));
index=[index ind];
atoms(:,ind)=dictionary(:,ind); %Select atom which has maximum inner product
dictionary(:,ind)=0;
x=(atoms(:,index)'*atoms(:,index))\(atoms(:,index)'*r);
coefs(index)=x;
r=r-atoms(:,index)*x;
t=norm(r);
% plotv(r);
% hold on
end
sparse(:,ii)=coefs;
sig=D*coefs;
ff=uint8(((((max(s))-min(s))/((max(sig)-min(sig))))*(sig-min(sig)))+min(s));
final(:,ii)=ff;
end

Respuesta aceptada

Jan
Jan el 12 de Mzo. de 2017
Editada: Jan el 12 de Mzo. de 2017
function final = testfcn
% data1 = load('E'); % Not used
data2 = load('R');
data3 = load('P');
% tem1 = data1.tem1;
tem2 = data2.tem2;
tem4 = data3.tem4;
sparse = zeros(14596, 2209);
final = zeros(144, 2209, 'uint8');
tic
% Normalize the dictionary - outside the ii loop:
D = zeros(size(tem2));
for index = 1:size(D, 2)
D(:,index) = tem2(:,index) ./ norm(tem2(:,index));
% v = dictionary(:, index);
% dictionary(:,index) = v ./ sqrt(v' * v); % Not exactly the same!!!
end
warning('OFF', 'MATLAB:nearlySingularMatrix');
parfor ii = 1:15 % size(tem4,2)
t = 5;
s = tem4(:, ii);
r = s;
atoms = zeros(size(D));
coefs = zeros(size(D, 2),1);
dictionary = D;
index = [];
while t > 1e-15 && any(dictionary(:))
% inner_product = dictionary' * r; % Dot Product
% [m, ind] = max(abs(inner_product));
ind = GetMaxIndex(r, dictionary);
index = [index, ind]; % Ugly: growing vector!
atoms(:, ind) = dictionary(:, ind); % Select maximum atom
dictionary(:, ind) = 0;
x = (atoms(:, index).' * atoms(:, index)) \ (atoms(:, index).' * r);
r = r - atoms(:, index) * x;
% v = atoms(:, index); % Not exactly the same!!!
% x = (v.' * v) \ (v.' * r);
% r = r - v * x;
t = norm(r);
end
coefs(index) = x; % Outside the loop
sparse(:,ii) = coefs;
sig = D * coefs;
final(:, ii) = (((max(s)) - min(s)) / (max(sig) - min(sig))) * ...
(sig - min(sig)) + min(s);
end
warning('ON', 'MATLAB:nearlySingularMatrix');
toc
And GetMaxIndex is a C-mex function:
// Author: Jan Simon 2017, License: CC BY-SA 3.0
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
// GetMaxIndex(r, dict)
// [~, ind] = max(abs(dict' * r));
double *r, *dict; // Inputs
size_t nr, ndict;
double *rEnd, *s, dot, maxValue; // Working
size_t i, maxIndex;
r = mxGetPr(prhs[0]); // Obtain inputs
nr = mxGetNumberOfElements(prhs[0]);
rEnd = r + nr;
dict = mxGetPr(prhs[1]);
ndict = mxGetN(prhs[1]);
maxValue = 0; // Calculation
maxIndex = 0;
for (i = 0; i < ndict; i++) {
dot = 0.0; // Dot product
for (s = r; s < rEnd; dot += *s++ * *dict++) ; // empty loop
dot = fabs(dot);
if (dot > maxValue) { // Remember maximum index
maxValue = dot;
maxIndex = i;
}
}
plhs[0] = mxCreateDoubleScalar((double) (maxIndex + 1));
return;
}
Please note the 2 comments "Not exactly the same!!!". I've commented the sections out, although they are faster and differ in the magnitude of eps() only in the resutls. But this changes the output final already, which means, that your algorithm is instable: The result depends on tiny rounding effects. Does this satisfy your needs?
For the first 15 iterations for ii this needs 10.8 sec instead of 31.18 sec - R2016b/64/Win7 without Parallel Computing Toolbox. The full problem will need 150 times longer divided by the number of cores.
  1 comentario
Navdeep Sony
Navdeep Sony el 13 de Mzo. de 2017
Thanks Jan. Your suggestions are really helpful though I have not tried mex files but surely will now.

Iniciar sesión para comentar.

Más respuestas (1)

Jan
Jan el 11 de Mzo. de 2017
Editada: Jan el 12 de Mzo. de 2017
  • Start with replacing the inefficient sum(dictionary(:)~=0) to:
while(t>1e-15 && any(dictionary(:)))
This works, if dictionary does not contain negative values. Is this assumption correct? This saves 60% processing time in my R2016b/64/Core2Duo without Parallel Computing Toolbox.
  • Slightly faster for the normalization:
for index = 1:size(dictionary, 2)
v = dictionary(:, index);
dictionary(:, index) = v ./ sqrt(v' * v);
end
To my surprise this changes the result, although the changes in the variable dictionary are in the maginitude of eps()! This means, that your algorithm is instable. Keep care.
Most of all the normalization can be moved before the loop instead of calculating it repeatedly.
  • Create "final" with the correct type, if you store uint8 values in it:
final = zeros(144, 2209, 'uint8');
  • But now inspect the loop:
while t > 1e-15 && any(dictionary(:)) % Process while norm > 1e-15
inner_product = r' * dictionary; % Dot Product
[m, ind] = max(abs(inner_product));
index = [index, ind];
atoms(:, ind) = dictionary(:, ind); % Select maximum atom
dictionary(:, ind) = 0;
v = atoms(:, index);
x = (v' * v) \ (v' * r); % Abbreviated, further 10% faster
coefs(index) = x;
r = r - v * x;
t = norm(r);
end
The bottleneck is the dot product. About 50% of the columns are set to 0 in average, such that the code might be twice as fast, when these columns are ignored in the search of the maximum. This can be done in a C-Mex function easily. But do you have a C-compiler installed?
  • Move "coefs(index) = x;" behind the loop, because only the last value matters.
[EDITED] It is less efficient than I thought to ignore the columns set to 0 before. I post my final version in a second answer to increase the readability.

Categorías

Más información sobre Historical Contests en Help Center y File Exchange.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by