Hi,
I would like to make my curve fitting faster by using GPU. Is it possible? If yes how?
Thanks

1 comment

Matt J on 25 Mar 2019
Marek's problem description copied here for reference:
I have MRI data of the brain (each subject has approx. 0.5 million voxels) with 10 different parameters, and in each voxel I want to fit these 10 points with a specific biexponential curve (so that I obtain parametric maps of the brain). I tried to optimize the script before I used parallel computing (1 brain ~4 hours), but I only used parfor (~1.25 hours). Because I am still a beginner with Matlab, I was curious whether other users have experience with GPU and can help me make this faster than parfor.
Thanks
Marek


Accepted Answer

Matt J on 26 Nov 2018
Edited: Matt J on 21 Jun 2019

0 votes

in each voxel I want to fit these 10 points with a specific biexponential curve (so that I obtain parametric maps of the brain)
@Marek,
Do not fit one voxel at a time in a loop. Instead, write the model as an N-dimensional curve where N is the number of voxels and apply lsqcurvefit to this model. Do not let lsqcurvefit use its default finite difference derivative calculations. Instead, supply your own Jacobian*x operator using the JacobianMultiplyFcn option (EDIT: or provide a sparse block diagonal Jacobian matrix computation).
Your problem is a good example of what I was discussing with Joss: both the model function and the JacobianMultiplyFcn could be GPU-optimized if lsqcurvefit could work with gpuArray variables without constantly doing GPU/CPU transfers. However, I expect the fit will still be reasonably fast if you use appropriate Matlab vectorization techniques.
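As a rough illustration of the vectorized setup (a minimal sketch, assuming a toy monoexponential model y = a*exp(-b*t); the variable names are illustrative, not from Marek's data):

```matlab
% Fit K independent curves in ONE lsqcurvefit call: the "model" maps a
% K-by-2 parameter matrix to a K-by-10 matrix of predicted curves.
K = 500;                                  % number of voxels (curves)
t = linspace(0, 1, 10);                   % 10 sample points per voxel
ptrue = [1 + rand(K,1), 2 + rand(K,1)];   % true [a, b] for each voxel
ydata = ptrue(:,1).*exp(-ptrue(:,2).*t);  % K-by-10 data (implicit expansion, R2016b+)

model = @(p, t) p(:,1).*exp(-p(:,2).*t);  % evaluates all K curves at once
p0    = ones(K, 2);                       % common initial guess
opts  = optimoptions('lsqcurvefit', 'Display', 'off');
pfit  = lsqcurvefit(model, p0, t, ydata, [], [], opts);
```

Without a supplied Jacobian, though, the solver finite-differences over all 2K parameters as if they were coupled, which is exactly why a JacobianMultiplyFcn (or a sparse block-diagonal Jacobian) matters at 0.5 million voxels.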

16 comments

Marek Dostal on 27 Nov 2018
Thank you Matt,
I will need more time to fully understand your answer and write a script :) but I can try it.
sarah goldberg on 25 Mar 2019
@Marek, @Matt,
I am trying to implement Matt J's suggestion of vectorizing over N fits using lsqcurvefit (see code below for a loop over 1D fits vs. the vectorized fit - the vectorized version with the Jacobian multiplication function is not working yet).
My main problems:
1) I do not understand what the dimensions of the different products returned by the Jacobian multiplication function should be. How can both C*q and C'*q be valid if C is not square?
2) I do not understand how to pass the Jacobian multiplication function (first attempt below).
Does anyone have a working example that also compares the runtime to the looped 1D case?
Thanks!
Sarah
clear all; close all; fclose all;
% model is a*exp(-0.5*((x-m)/s)^2), with different a, m, s for each row
num_g=10; % number of gaussians (rows)
a0=4;
a_std=1;
mu0=40;
mu_std=15;
sigma0=3;
sigma_std=1;
xx=[1:1:100]; % these are the pixel values at which we will have data
% generate data
mus=mu0 + mu_std*randn(1,num_g);
sigmas=sigma0 + sigma_std*randn(1,num_g);
as=a0 + a_std*randn(1,num_g);
sigmas(sigmas==0)=sigma0; % to avoid division by 0
% 1D data - each column is a different object
mu_mat=ones(size(xx,2),1)*mus;
sigma_mat=ones(size(xx,2),1)*sigmas;
a_mat=ones(size(xx,2),1)*as;
xx_mat=xx'*ones(1,size(mus,2));
yy_mat=a_mat.*exp( - (((xx_mat-mu_mat)./sigma_mat).^2)/2 );
%% 1D fit, looped over rows (for comparison)
fun_1D=@(x,xdata)x(3)*exp( - (((xdata-x(1))./x(2)).^2)/2 ); % gaussian model
opts1=optimset('display','off');
p_guess=nan*ones(num_g,3); % guess
p_fit=nan*ones(num_g,3+2); % fit results
tic
for ii=1:num_g
[maxval,maxpos]=max(yy_mat(:,ii));
[~,minpos]=min(abs(yy_mat(:,ii) - maxval/2));
p0=[xx_mat(maxpos,ii),abs(minpos-maxpos)+1,maxval];
p_guess(ii,:)=p0;
try
[x,resnorm,residual,exitflag,output]=lsqcurvefit(fun_1D,p0,xx_mat(:,ii)',yy_mat(:,ii)',[],[],opts1);
p_fit(ii,:)=[x,resnorm,exitflag];
catch
continue
end
end
fprintf(1,['\n1D, n_g=' num2str(num_g) '\n']);
toc
% plot first 10 to check
figure('name','1D');clf;hold on;
xx_dense=[min(xx_mat):0.1:max(xx_mat)];
legs={};
for ii=1:min(num_g,10)
ph=plot(xx_mat(:,ii),yy_mat(:,ii),'.'); % data
plot(xx_dense,fun_1D(p_guess(ii,1:3),xx_dense),':','color',get(ph,'color')); % guess
plot(xx_dense,fun_1D(p_fit(ii,1:3),xx_dense),'-','color',get(ph,'color')); % fit
end
xlabel('xx [pixel]');
ylabel('yy [intensity]');
%% ND without Jacobian multiplication function
fun_ND=@(x,xdata)x(:,3)*ones(1,size(xdata,2)).*exp(- (((xdata - (x(:,1)*ones(1,size(xdata,2))))./ (x(:,2)*ones(1,size(xdata,2)))).^2)/2 ); % gaussian model
opts1=optimset('display','off');
tic
p0_ND=nan*ones(num_g,3); % guess
p_fit=nan*ones(num_g,3+2); % fit results
% build guess matrix
for ii=1:num_g
[maxval,maxpos]=max(yy_mat(:,ii));
[~,minpos]=min(abs(yy_mat(:,ii) - maxval/2));
p0_ND(ii,:)=[xx_mat(maxpos,ii),abs(minpos-maxpos)+1,maxval]; % each row is 1 object
end
[pfit_ND,resnorm,residual,exitflag,output]=lsqcurvefit(fun_ND,p0_ND,xx_mat',yy_mat',[],[],opts1); % each row is 1 object
fprintf(1,['\nND, no J supplied, n_g=' num2str(num_g) '\n']);
toc
% plot first 10 to check
figure('name','ND');clf;hold on;
%figure(fh);hold on;
maxind=min(10,num_g);
yy_fit=fun_ND(pfit_ND,xx_mat'); yy_fit=yy_fit';
yy_guess=fun_ND(p0_ND,xx_mat');yy_guess=yy_guess';
plot(xx_mat(:,1:maxind),yy_guess(:,1:maxind),':','color',0.5*[1, 1, 1]);
plot(xx_mat(:,1:maxind),yy_fit(:,1:maxind),'.-');
xlabel('xx [pixel]');
ylabel('yy [intensity]');
%% ND fit with Jacobian multiplication function
if(0)
fun_ND=@(x,xdata)x(:,3)*ones(1,size(xdata,2)).*exp(- (((xdata - (x(:,1)*ones(1,size(xdata,2))))./ (x(:,2)*ones(1,size(xdata,2)))).^2)/2 ); % gaussian model
Jinfo = [speye(n);speye(n)]; % sparse matrix for preconditioning - required input, not used
opts2 = optimoptions('lsqcurvefit','JacobianMultiplyFcn',@(Jinfo,Y,flag)myJ(Jinfo,Y,flag,p));
p0_ND=nan*ones(num_g,3); % guess
p_fit=nan*ones(num_g,3+2); % fit results
tic
% build guess matrix
for ii=1:num_g
[maxval,maxpos]=max(yy_mat(:,ii));
[~,minpos]=min(abs(yy_mat(:,ii) - maxval/2));
p0_ND(ii,:)=[xx_mat(maxpos,ii),abs(minpos-maxpos)+1,maxval]; % each row is 1 object
end
[pfit_ND,resnorm,residual,exitflag,output]=lsqcurvefit(fun_ND,p0_ND,xx_mat',yy_mat',[],[],opts2); % each row is 1 object
fprintf(1,['\nND, J multiplication supplied, n_g=' num2str(num_g) '\n']);
toc
% print to check
figure(3);clf;hold on;
maxind=min(10,num_g);
yy_fit=fun_ND(pfit_ND,xx_mat'); yy_fit=yy_fit';
yy_guess=fun_ND(p0_ND,xx_mat');yy_guess=yy_guess';
plot(xx_mat(:,1:maxind),yy_guess(:,1:maxind),'.-');
xlabel('xx [pixel]');
ylabel('yy [intensity]');
end
% compute Jacobian multiply functions
function w = myJ(Jinfo,Y,flag,p)
if flag > 0
w = Jpositive(Y);
elseif flag < 0
w = Jnegative(Y);
else
w = Jnegative(Jpositive(Y));
end
function a = Jpositive(q)
% Calculate C*q
a = zeros(len,size(q,2)); % not clear what size a should be
% something here
end
function a = Jnegative(q)
% Calculate C'*q
a = zeros(len,size(q,2)); % not clear what size a should be
% something here
end
end
Matt J on 25 Mar 2019
Edited: Matt J on 25 Mar 2019
1) I do not understand what the dimensions of the different products returned by the Jacobian multiplication function should be. How can both C*q and C'*q be valid if C is not square?
The dimensions of q are different in each case. If C is the Jacobian, then it should always be treated as an NxP matrix where N is the total number of elements in the output of your model function and P is the total number of unknown fit parameters. In other words, Jpositive must accept input that is P-dimensional and return a result that is N-dimensional, while for Jnegative it is the reverse.
2) I do not understand how to pass the Jacobian multiplication function (first attempt below).
It looks fine, but you also need to set SpecifyObjectiveGradient to true.
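To make those shapes concrete, here is a toy check (a sketch assuming k independent fits of n points and m parameters each, so N = k*n and P = k*m; the random block-diagonal C stands in for the true Jacobian):

```matlab
k = 4; n = 10; m = 3;                    % 4 fits, 10 points, 3 params each
N = k*n; P = k*m;
blocks = arrayfun(@(b) sprand(n, m, 1), 1:k, 'UniformOutput', false);
C = blkdiag(blocks{:});                  % N-by-P sparse block-diagonal Jacobian

w1 = C *rand(P,1);                       % flag > 0 : C*q,   q is P-by-1, w is N-by-1
w2 = C'*rand(N,1);                       % flag < 0 : C'*q,  q is N-by-1, w is P-by-1
w3 = C'*(C*rand(P,1));                   % flag == 0: C'*C*q, P-by-1 in, P-by-1 out
disp([numel(w1), numel(w2), numel(w3)])  % N, P, P
```

So the same C serves all three flag cases; only the length of q changes.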
sarah goldberg on 27 Mar 2019
Edited: sarah goldberg on 27 Mar 2019
Thanks Matt for the quick response.
I filled in the missing parts of the Jacobian multiply myJ, assuming a certain ordering of the parameters. My questions:
1) What is the ordering of the fit parameters for the Jacobian index j? I assumed that for num_g fits, 3 parameters [mu, sigma, a] per fit, the order is:
[mu1,mu2,...,mu_num_g, sigma1,sigma2,...,sigma_num_g, a1,a2,...,a_num_g] .
Is this correct?
2) I do not know how to pass additional arguments to the Jacobian multiplication function myJ. I'm attaching the current code, which does NOT compile.
Thanks for your help.
Sarah
%fun_ND=@(x,xdata)x(:,3)*ones(1,size(xdata,2)).*exp(- (((xdata - (x(:,1)*ones(1,size(xdata,2))))./ (x(:,2)*ones(1,size(xdata,2)))).^2)/2 ); % gaussian model
opts2 = optimoptions('lsqcurvefit','JacobianMultiplyFcn',@(Jinfo,Y,flag,f,P,n,m,k)myJ(Jinfo,Y,flag,f,P,n,m,k),'SpecifyObjectiveGradient',true);
p0_ND=nan*ones(num_g,3); % guess
p_fit=nan*ones(num_g,3+2); % fit results
tic
% build guess matrix
for ii=1:num_g
[maxval,maxpos]=max(yy_mat(:,ii));
[~,minpos]=min(abs(yy_mat(:,ii) - maxval/2));
p0_ND(ii,:)=[xx_mat(maxpos,ii),abs(minpos-maxpos)+1,maxval]; % each row is 1 object
end
[pfit_ND,resnorm,residual,exitflag,output]=lsqcurvefit(@(x,xdata)fun_ND_withJ(p0_ND,xx_mat'), p0_ND, xx_mat', yy_mat',[],[],opts2); % each row is 1 object
fprintf(1,['\nND, J multiplication supplied, n_g=' num2str(num_g) '\n']);
toc
% plot to check
figure('name','ND with Jacobian');clf;hold on;
maxind=min(10,num_g);
yy_fit=fun_ND(pfit_ND,xx_mat'); yy_fit=yy_fit';
yy_guess=fun_ND(p0_ND,xx_mat');yy_guess=yy_guess';
plot(xx_mat(:,1:maxind),yy_guess(:,1:maxind),'.-');
xlabel('xx [pixel]');
ylabel('yy [intensity]');
function [f,Jinfo,fcopy,x,n,m,k] = fun_ND_withJ(x,xdata)
% x is num_g x num_params_per_fit (10x3)
% xdata is num_g x num_points_per_fit (10x100)
% f is num_g x num_points_per_fit (10x100)
f=x(:,3)*ones(1,size(xdata,2)).*exp(- (((xdata - (x(:,1)*ones(1,size(xdata,2))))./ (x(:,2)*ones(1,size(xdata,2)))).^2)/2 ); % gaussian model
fcopy=f; % cannot have 2 outputs with the same name, do not know how to pass f to myJ
n=size(xdata,2);
m=size(x,2);
k=size(x,1);
% Jinfo sparse matrix for preconditioning - required input.
% size should equal the size of the Jacobian C.
Jinfo = speye( prod(size(xdata)), prod(size(x)));
end
% compute Jacobian multiply functions.
% note: Jacobian C is sparse. the only nonzero elements are
% derivatives of values fi belonging to fit j
% with respect to their own fitting parameters [mj,sj,aj].
% for xi "connected" to params Pj=[mj,sj,aj], we have:
% for mu==mj: dfi/dmj = -fi*(xi-mj)/sj^2
% for sigma==sj: dfi/dsj = fi*(xi-mj)^2/sj^3
% for a==aj: dfi/daj = fi/aj
% we need f=[xdata(1);xdata(2);...;xdata(num_g)] and parameters Pj for all j
function w = myJ(Jinfo,Y,flag,f,P,n,m,k)
if flag > 0
w = Jpositive(Y);
elseif flag < 0
w = Jnegative(Y);
else
w = Jnegative(Jpositive(Y));
end
function a = Jpositive(q,f,P,n,m,k)
% Calculate C*q (dimensions NxP, Px1 => Nx1)
a = zeros(size(f,1));
for ff=1:k % loop over fits
p=P(m*(ff-1)+1:m*ff); % params for this fit (m)
fp=f(n*(ff-1)+1:n*ff); % points for this fit (n)
qp=q(n*(ff-1)+1, n*(ff-1)+1+m , n*(ff-1)+1+2*m); % part of input vector that contributes (m)
a(fp)=fp.*( -qp(1)*(fp-p(1))/p(2)^2 + qp(2)*(fp-p(1)).^2/p(2)^3+ qp(3)/p(3));
end
end
function a = Jnegative(q,f,P,n,m,k)
% Calculate C'*q (dimensions PxN, Nx1 => Px1)
a = zeros(size(P),1);
for ff=1:k % loop over fits
p=P(m*(ff-1)+1:m*ff); % params for this fit (m)
fp=f(n*(ff-1)+1:n*ff); % points for this fit (n)
qp=q(n*(ff-1)+1:n*ff); % part of input vector that contributes (n)
a(ff)= sum(-fp.*qp.*(fp-p(1))) /p(2)^2; % mus, at positions [1:k]
a(ff+k)= sum( fp.*qp.*(fp-p(1)).^2)/p(3)^3; % sigmas, at [k+1:2*k]
a(ff+2*k)=sum( fp.*qp) /p(3); % as, at [2*k+1:3*k]
end
end
end
Matt J on 27 Mar 2019
Edited: Matt J on 27 Mar 2019
1) What is the ordering of the fit parameters for the Jacobian index j?
The same as the ordering you chose for your initial guess p0_ND
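Since p0_ND is a num_g-by-3 matrix and MATLAB linearizes arrays column-major, the parameter vector the solver works with is [all mu; all sigma; all a], which matches Sarah's assumed ordering. A quick check (toy values, 3 fits):

```matlab
p0 = [11 21 31;     % each row is one fit: [mu, sigma, a]
      12 22 32;
      13 23 33];
disp(p0(:).')       % column-major: 11 12 13 21 22 23 31 32 33
```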
2) I do not know how to pass additional arguments to the Jacobian multiplication function myJ.
@(Jinfo,Y,flag) myJ(Jinfo,Y,flag,f,P,n,m,k)
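In other words, the extra arguments are baked into the anonymous function as a closure. One caveat worth noting (a generic illustration, not the thread's actual variables): the handle captures the workspace values at creation time, so if f and P change at every solver iteration, a handle created once when the options are set keeps the stale copies - which is presumably why the later versions of the code resort to globals refreshed inside the model function.

```matlab
scale = 10;             % value captured when the handle is created
h = @(x) x * scale;     % the caller supplies only x
scale = 99;             % later reassignment does NOT affect h
disp(h(3))              % still 30, using the captured scale = 10
```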
sarah goldberg on 27 Mar 2019
Edited: sarah goldberg on 27 Mar 2019
Hi Matt,
Still not working. I am getting an error
Error using NDfit_JacobianMult>@(Jinfo,Y,flag)myJ(Jinfo,Y,flag,f,P,n,m,k)
Too many input arguments.
Error in lsqcurvefit/jacobmult (line 277)
W = feval(mtxmpy_xdata,Jinfo,Y,flag,XDATA,varargin{:});
Error in snls (line 204)
g = feval(mtxmpy,A,fvec(:,1),-1,varargin{:});
Error in lsqncommon (line 166)
snls(funfcn,xC,lb,ub,flags.verbosity,options,defaultopt,initVals.F,initVals.J,caller, ...
Error in lsqcurvefit (line 257)
lsqncommon(funfcn,xCurrent,lb,ub,options,defaultopt,caller,...
Error in NDfit_JacobianMult (line 148)
[pfit_ND_J,resnorm,residual,exitflag,output]=lsqcurvefit(@fun_ND_withJ, p0_ND_J, xx_mat', yy_mat',[],[],opts2); % each row is 1 object
from the attached code (below).
Replacing opts2 with opts1 works, so the problem is definitely with the definition of the Jacobian Multiply function.
I tried using global parameters instead of passing them, did not work.
I tried having the fit function return the parameters, did not work.
Apparently myJ is never entered.
How does lsqcurvefit pass the parameters to myJ? This is where the problem seems to be.
Thanks,
Sarah
clear all; close all; fclose all;
% model is a*exp(-0.5*((x-m)/s)^2), with different a, m, s for each row
num_g=20; % number of gaussians (rows)
a0=4;
a_std=1;
mu0=40;
mu_std=15;
sigma0=3;
sigma_std=1;
xx=[1:1:100]; % these are the pixel values at which we will have data
% generate data
mus=mu0 + mu_std*randn(1,num_g);
sigmas=sigma0 + sigma_std*randn(1,num_g);
as=a0 + a_std*randn(1,num_g);
sigmas(sigmas==0)=sigma0; % to avoid division by 0
% 1D data - each column is a different object
mu_mat=ones(size(xx,2),1)*mus;
sigma_mat=ones(size(xx,2),1)*sigmas;
a_mat=ones(size(xx,2),1)*as;
xx_mat=xx'*ones(1,size(mus,2));
yy_mat=a_mat.*exp( - (((xx_mat-mu_mat)./sigma_mat).^2)/2 );
%% ND fit with Jacobian multiplication function
%global G_f G_P G_m G_n G_k;
opts1=optimset('display','off');
opts2 = optimoptions('lsqcurvefit','SpecifyObjectiveGradient',true,'JacobianMultiplyFcn',@(Jinfo,Y,flag)myJ(Jinfo,Y,flag,f,P,n,m,k));
p0_ND_J=nan*ones(num_g,3); % guess
p_fit_J=nan*ones(num_g,3+2); % fit results
tic
% build guess matrix
for ii=1:num_g
[maxval,maxpos]=max(yy_mat(:,ii));
[~,minpos]=min(abs(yy_mat(:,ii) - maxval/2));
p0_ND_J(ii,:)=[xx_mat(maxpos,ii),abs(minpos-maxpos)+1,maxval]; % each row is 1 object
end
[pfit_ND_J,resnorm,residual,exitflag,output]=lsqcurvefit(@fun_ND_withJ, p0_ND_J, xx_mat', yy_mat',[],[],opts2); % each row is 1 object
fprintf(1,['\nND, J multiplication supplied, n_g=' num2str(num_g) '\n']);
toc
% plot to check
figure('name','ND with Jacobian');clf;hold on;
maxind=min(10,num_g);
yy_fit=fun_ND_withJ(pfit_ND_J,xx_mat');yy_fit=yy_fit';
yy_guess=fun_ND_withJ(p0_ND_J,xx_mat');yy_guess=yy_guess';
plot(xx_mat(:,1:maxind),yy_guess(:,1:maxind),':','color',0.5*[1, 1, 1]);hold on;
plot(xx_mat(:,1:maxind),yy_fit(:,1:maxind),'.-');
xlabel('xx [pixel]');
ylabel('yy [intensity]');
% model for ND fit with Jacobian multiplication
function [f,Jinfo] = fun_ND_withJ(x,xdata)
%global G_f G_P G_m G_n G_k;
% x is num_g x num_params_per_fit (10x3)
% xdata is num_g x num_points_per_fit (10x100)
% f is num_g x num_points_per_fit (10x100)
f=x(:,3)*ones(1,size(xdata,2)).*exp(- (((xdata - (x(:,1)*ones(1,size(xdata,2))))./ (x(:,2)*ones(1,size(xdata,2)))).^2)/2 ); % gaussian model
%fcopy=f; % cannot have 2 outputs with the same name, do not know how to pass f to myJ
% G_f=f;
% G_P=x;
% G_n=size(xdata,2);
% G_m=size(x,2);
% G_k=size(x,1);
% Jinfo sparse matrix for preconditioning - required input.
% size should equal the size of the Jacobian C.
Jinfo = speye( prod(size(xdata)), prod(size(x)));
end
% compute Jacobian multiply functions.
% note: Jacobian C is sparse. the only nonzero elements are
% derivatives of points xi belonging to fit j
% with respect to their own fitting parameters [mj,sj,aj].
% for xi "connected" to params Pj=[mj,sj,aj], we have:
% for mu==mj: dxi/dmj = -fi*(xi-mj)/sj^2
% for sigma==sj: dxi/dsj = fi*(xi-mj)^2/sj^3
% for a==aj: dxi/daj = fi/aj
% we need f=[xdata(1);xdata(2);...;xdata(num_g)] and parameters Pj for all j
function w = myJ(Jinfo,Y,flag,f,P,n,m,k)
%global G_f G_P G_m G_n G_k;
'here'
% f=G_f; %Y;
% P=G_P;
% n=G_n;
% m=G_m;
% k=G_k;
if flag > 0
w = Jpositive(Y,f,P,n,m,k);
elseif flag < 0
w = Jnegative(Y,f,P,n,m,k);
else
w = Jnegative(Jpositive(Y,f,P,n,m,k));
end
function a = Jpositive(q,f,P,n,m,k)
% Calculate C*q (dimensions NxP, Px1 => Nx1)
a = zeros(size(f,1));
for ff=1:k % loop over fits
p=P(m*(ff-1)+1:m*ff); % params for this fit (m)
fp=f(n*(ff-1)+1:n*ff); % points for this fit (n)
qp=q(n*(ff-1)+1, n*(ff-1)+1+m , n*(ff-1)+1+2*m); % part of input vector that contributes (m)
a(fp)=fp.*( -qp(1)*(fp-p(1))/p(2)^2 + qp(2)*(fp-p(1)).^2/p(2)^3+ qp(3)/p(3));
end
end
function a = Jnegative(q,f,P,n,m,k)
% Calculate C'*q (dimensions PxN, Nx1 => Px1)
a = zeros(size(P),1);
for ff=1:k % loop over fits
p=P(m*(ff-1)+1:m*ff); % params for this fit (m)
fp=f(n*(ff-1)+1:n*ff); % points for this fit (n)
qp=q(n*(ff-1)+1:n*ff); % part of input vector that contributes (n)
a(ff)= sum(-fp.*qp.*(fp-p(1))) /p(2)^2; % mus, at positions [1:k]
a(ff+k)= sum( fp.*qp.*(fp-p(1)).^2)/p(3)^3; % sigmas, at [k+1:2*k]
a(ff+2*k)=sum( fp.*qp) /p(3); % as, at [2*k+1:3*k]
end
end
end
Matt J on 29 Mar 2019
Edited: Matt J on 29 Mar 2019
f,P,n,m,k need to be provided in this line
opts2 = optimoptions(___,
'JacobianMultiplyFcn',@(Jinfo,Y,flag)myJ(Jinfo,Y,flag,f,P,n,m,k));
Catalytic on 29 Mar 2019
Edited: Matt J on 29 Mar 2019
Even with this fixed, though, it does appear that lsqcurvefit is trying to call the JacobianMultiplyFcn with more than 3 input arguments in this line inside lsqcurvefit
function W = jacobmult(Jinfo,Y,flag,varargin)
W = feval(mtxmpy_xdata,Jinfo,Y,flag,XDATA,varargin{:});
end
I agree it is very strange.
Matt J on 29 Mar 2019
That is curious. For now, I recommend defining the JacobMult function with a 4th argument, even if you aren't going to use it.
opts2 = ...
optimoptions(___,'JacobianMultiplyFcn',@(Jinfo,Y,flag,xdata)myJ(Jinfo,Y,flag,f,P,n,m,k));
sarah goldberg on 3 Apr 2019
Edited: sarah goldberg on 3 Apr 2019
Matt J, Catalytic -
Thanks for the tips. The following code now runs, and the vectorized version is faster for large num_g, but it does not fit as well as the non-vectorized version (the fits are certainly not good enough). Any ideas why?
Sarah
(Edited to correct the Jacobian; surprisingly, it had little effect on the fit.)
Results (attached figure Capture.PNG) were generated by this code:
clear all; close all; fclose all;
% model is a*exp(-0.5*((x-m)/s)^2), with different a, m, s for each row
num_g=10; % number of gaussians (rows)
maxind=min(10,num_g); % number of gaussians we will plot
a0=4;
a_std=5;
mu0=40;
mu_std=30;
sigma0=3;
sigma_std=1;
xx=[1:1:100]; % these are the pixel values at which we will have data
% % generate random data
% mus=mu0 + mu_std*randn(1,num_g);
% sigmas=sigma0 + sigma_std*randn(1,num_g);
% as=a0 + a_std*randn(1,num_g);
% sigmas(sigmas==0)=sigma0; % to avoid division by 0
% generate ordered data (for debugging)
mus=mu0 + mu_std*[1:num_g]/num_g;
sigmas=sigma0 + sigma_std*[1:num_g]/num_g;
as=a0 + a_std*[1:num_g]/num_g;
% 1D data - each column is a different object
mu_mat=ones(size(xx,2),1)*mus;
sigma_mat=ones(size(xx,2),1)*sigmas;
a_mat=ones(size(xx,2),1)*as;
xx_mat=xx'*ones(1,size(mus,2));
yy_mat=a_mat.*exp( - (((xx_mat-mu_mat)./sigma_mat).^2)/2 );
%% 1D fit, looped over rows (for comparison)
fun_1D=@(x,xdata)x(3)*exp( - (((xdata-x(1))./x(2)).^2)/2 ); % gaussian model
opts1=optimset('display','off');
p_guess=nan*ones(num_g,3); % guess
pfit=nan*ones(num_g,3+2); % fit results
% find starting points==guesses
for ii=1:num_g
[maxval,maxpos]=max(yy_mat(:,ii));
[~,minpos]=min(abs(yy_mat(:,ii) - maxval/2));
p_guess(ii,:)=[xx_mat(maxpos,ii),abs(minpos-maxpos)+1,maxval];
end
% fit
tic
for ii=1:num_g
[x,resnorm,residual,exitflag,output]=lsqcurvefit(fun_1D,p_guess(ii,:),xx_mat(:,ii)',yy_mat(:,ii)',[],[],opts1);
pfit(ii,:)=[x,resnorm,exitflag];
end
fprintf(1,['\n1D, n_g=' num2str(num_g) '\n']);
toc
% plot first 10 to check
figure(1);clf;hold on;set(gcf,'position',[ 680 134 907 844]);
subplot(311);hold on;
title('1D');
xx_dense=[min(xx_mat):0.1:max(xx_mat)];
for ii=1:min(num_g,10)
ph=plot(xx_mat(:,ii),yy_mat(:,ii),'.'); % data
plot(xx_dense,fun_1D(p_guess(ii,1:3),xx_dense),':','color',get(ph,'color')); % guess
plot(xx_dense,fun_1D(pfit(ii,1:3),xx_dense),'-','color',get(ph,'color')); % fit
end
plot(xx_mat(:,1:maxind),yy_mat(:,1:maxind),'.','color',0*[1, 1, 1]);hold on;
xlabel('xx [pixel]');
ylabel('yy [intensity]');
%% ND without Jacobian multiplication function
fun_ND=@(x,xdata)x(:,3)*ones(1,size(xdata,2)).*exp(- (((xdata - (x(:,1)*ones(1,size(xdata,2))))./ (x(:,2)*ones(1,size(xdata,2)))).^2)/2 ); % gaussian model
opts1=optimset('display','off');
p0_ND=nan*ones(num_g,3); % guess
p0_ND=p_guess; % same as 1D
pfit_ND=nan*ones(num_g,3+2); % fit results
tic
[pfit_ND,resnorm,residual,exitflag,output]=lsqcurvefit(fun_ND,p0_ND,xx_mat',yy_mat',[],[],opts1); % each row is 1 object
fprintf(1,['\nND, no J supplied, n_g=' num2str(num_g) '\n']);
toc
% plot first 10 to check
figure(1);
subplot(312);hold on;
title('ND');
yy_fit=fun_ND(pfit_ND,xx_mat'); yy_fit=yy_fit';
yy_guess=fun_ND(p0_ND,xx_mat');yy_guess=yy_guess';
plot(xx_mat(:,1:maxind),yy_guess(:,1:maxind),':','color',0.5*[1, 1, 1]);
plot(xx_mat(:,1:maxind),yy_fit(:,1:maxind),'-');
plot(xx_mat(:,1:maxind),yy_mat(:,1:maxind),'.','color',0*[1, 1, 1]);hold on;
xlabel('xx [pixel]');
ylabel('yy [intensity]');
%% ND fit with Jacobian multiplication function
global G_f G_P G_n G_k;
opts2 = optimoptions('lsqcurvefit','display','iter','SpecifyObjectiveGradient',true,'JacobianMultiplyFcn',@(Jinfo,Y,flag,xdata)myJ(Jinfo,Y,flag));
p0_ND_J=nan*ones(num_g,3); % guess
p0_ND_J=p_guess; % same as 1D
pfit_ND_J=nan*ones(num_g,3+2); % fit results
p0_ND_J_vec=[p0_ND_J(:,1) p0_ND_J(:,2) p0_ND_J(:,3)]; % vectorized to define order: all mu, then all sigma, then all a
tic
[pfit_ND_J,resnorm,residual,exitflag,output]=lsqcurvefit(@fun_ND_withJ, p0_ND_J_vec, xx_mat', yy_mat',[],[],opts2); % each row is 1 object
fprintf(1,['\nND, J multiplication supplied, n_g=' num2str(num_g) '\n']);
toc
% plot to check
figure(1);
subplot(313);hold on;
title('ND with Jacobian');
maxind=min(10,num_g);
yy_fit=fun_ND_withJ(pfit_ND_J,xx_mat');yy_fit=yy_fit';
yy_guess=fun_ND_withJ(p0_ND_J,xx_mat');yy_guess=yy_guess';
plot(xx_mat(:,1:maxind),yy_guess(:,1:maxind),':','color',0.5*[1, 1, 1]);hold on;
plot(xx_mat(:,1:maxind),yy_fit(:,1:maxind),'-');
plot(xx_mat(:,1:maxind),yy_mat(:,1:maxind),'.','color',0*[1, 1, 1]);hold on;
xlabel('xx [pixel]');
ylabel('yy [intensity]');
% model for ND fit with Jacobian multiplication
% x is num_g x num_params_per_fit (num_g x 3)
% xdata is num_g x num_points_per_fit (num_g x 100)
% f is num_g x num_points_per_fit (num_g x 100)
function [f,Jinfo] = fun_ND_withJ(x,xdata)
global G_f G_P G_n G_k G_xd;
% update globals
G_P=x;
G_xd=xdata;
G_n=size(xdata,2); % number of data points
G_k=size(xdata,1); % number of fits
% Jinfo sparse matrix for preconditioning - required input.
% size should equal the size of the Jacobian C.
Jinfo = speye(G_n*G_k, size(x,1)*G_k);
f=x(:,3)*ones(1,size(xdata,2)).*exp(- (((xdata - (x(:,1)*ones(1,size(xdata,2))))./ (x(:,2)*ones(1,size(xdata,2)))).^2)/2 ); % gaussian model
G_f=f;
end
% compute Jacobian multiply functions.
% note: Jacobian C is sparse. the only nonzero elements are
% derivatives of points xi belonging to fit j
% with respect to their own fitting parameters [mj,sj,aj].
% for xi "connected" to params Pj=[mj,sj,aj], we have:
% for mu==mj: dxi/dmj = -fi*(xi-mj)/sj^2
% for sigma==sj: dxi/dsj = fi*(xi-mj)^2/sj^3
% for a==aj: dxi/daj = fi/aj
% we need f=f(xdata(1);xdata(2);...;xdata(num_g);P) and parameters Pj for all j
function w = myJ(Jinfo,Y,flag)
global G_f G_P G_n G_k G_xd;
P=G_P;
f=G_f;
n=G_n;
k=G_k;
xd=G_xd;
if flag > 0
w = Jpositive(Y,f,P,n,k,xd);
elseif flag < 0
w = Jnegative(Y,f,P,n,k,xd);
else
w1= Jpositive(Y,f,P,n,k,xd);
w = Jnegative(w1,f,P,n,k,xd);
end
function a = Jpositive(q,f,P,n,k,xd)
% Calculate C*q (dimensions NxP, Px1 => Nx1)
a = zeros(prod(size(f)),1);
for ff=1:k % loop over fits, calculate n sums for each fit
fp=f(ff,:); % f of points for this fit (n)
xp=xd(ff,:); % points for this fit (n)
p=P(ff,:); % params for this fit (m)
qp=q([ff + (size(P,1))*[0, 1, 2]] ); % part of input vector q that contributes (m)
a([(n*(ff-1)+1):n*ff])=fp.*( -qp(1)*(xp-p(1))/p(2)^2 + qp(2)*(xp-p(1)).^2/p(2)^3+ qp(3)/p(3)); % enter n derivatives
end
end
function a = Jnegative(q,f,P,n,k,xd)
% Calculate C'*q (dimensions PxN, Nx1 => Px1)
a = zeros(prod(size(P)),1); % Px1
for ff=1:k % loop over fits, calculate 3 sums for 3 fit parameters
p=P(ff,:); % params for this fit (m)
fp=f(ff,:); % points for this fit (n)
xp=xd(ff,:); % points for this fit (n)
qp=q(n*(ff-1)+1:n*ff); % part of input vector that contributes (n)
% enter m derivatives
a(ff)= -sum( fp(:).*qp(:).*(xp(:)-p(1))) /p(2)^2; % mus, at positions [1:k]
a(ff+k)= sum( fp(:).*qp(:).*(xp(:)-p(1)).^2)/p(2)^3; % sigmas, at [k+1:2*k]
a(ff+2*k)=sum( fp(:).*qp(:)) /p(3); % as, at [2*k+1:3*k]
end
end
end
Matt J on 3 Apr 2019
A mistake in the Jacobian implementation, I would guess. Maybe use the CheckGradients option to verify.
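A minimal sketch of exercising that option on a small problem (assumes a model that returns the explicit Jacobian as its second output; with only a multiply function there is no formed matrix for the finite-difference comparison to inspect, so the check is easiest on a small version of the problem):

```matlab
% CheckGradients compares the user-supplied Jacobian against a
% finite-difference estimate and reports any mismatch.
t  = linspace(0, 1, 20);
y  = 3*exp(-2*t);
p0 = [1; 1];
opts = optimoptions('lsqcurvefit', ...
    'SpecifyObjectiveGradient', true, 'CheckGradients', true);
pfit = lsqcurvefit(@expWithJac, p0, t, y, [], [], opts);

function [f, J] = expWithJac(p, t)
f = p(1)*exp(-p(2)*t);                               % model a*exp(-b*t)
J = [exp(-p(2)*t(:)), -p(1)*t(:).*exp(-p(2)*t(:))];  % columns [df/da, df/db]
end
```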
sarah goldberg on 4 Apr 2019
I rechecked, fixed a sign, and reran (updated code below).
1) For more than 1 Gaussian, it is still definitely off, so I assume I have a mistake with the indexing (mixing of derivatives between fits). Is there a way to get the residual of the fit at each iteration? It is hard to compare the ND to the loop-over-1D result without that information.
2) For a single gaussian, the fit looks ok visually
but the numbers look different (1D, ND, ND + Jacobian). Should I expect the convergence to be faster with the multiplication function supplied?
Norm of First-order
Iteration Func-count f(x) step optimality
0 4 84.1924 37.1
1 8 1.6395 1.90159 2.76
2 12 0.00633328 0.609387 0.405
3 16 5.7509e-09 0.0147914 0.000164
4 20 6.22034e-20 3.6403e-05 1.27e-09
Norm of First-order
Iteration Func-count f(x) step optimality
0 4 84.1924 37.1
1 8 1.6395 1.90159 2.76
2 12 0.00633328 0.609387 0.405
3 16 5.7509e-09 0.0147914 0.000164
4 20 6.22034e-20 3.6403e-05 1.27e-09
Norm of First-order
Iteration Func-count f(x) step optimality
0 1 84.1924 37.1
1 2 4.17 1.8473 3.69
2 3 0.479198 0.924628 3.4
3 4 0.015558 0.12554 0.237
4 5 0.000387441 0.0587371 0.0974
5 6 8.56577e-06 0.00357684 0.00558
6 7 1.86572e-07 0.00138134 0.00214
7 8 4.05201e-09 7.85018e-05 0.000121
3) I also saw that the Jacobian returned by lsqcurvefit is of the wrong dimension. For the 1D and ND, I get the expected dimension
size(jacobian)=(100,3)
and for the ND+Jacobian case I get
size(jacobian)=(100,1)
Is this part of the same problem? Or is it because the full Jacobian is not evaluated?
Again, Thanks.
Sarah
(most recent version of the code)
clear all; close all; fclose all;
% model is a*exp(-0.5*((x-m)/s)^2), with different a, m, s for each row
num_g=1; % number of gaussians (rows)
maxind=min(10,num_g); % number of gaussians we will plot
a0=4;
a_std=5;
mu0=40;
mu_std=30;
sigma0=3;
sigma_std=1;
xx=[1:1:100]; % these are the pixel values at which we will have data
alg= 'trust-region-reflective' ; % 'trust-region-reflective' or 'levenberg-marquardt'
displaySetting='off'; % 'iter', 'off', etc.
% % generate random data
% mus=mu0 + mu_std*randn(1,num_g);
% sigmas=sigma0 + sigma_std*randn(1,num_g);
% as=a0 + a_std*randn(1,num_g);
% sigmas(sigmas==0)=sigma0; % to avoid division by 0
% generate ordered data (for debugging)
mus=mu0 + mu_std*[1:num_g]/num_g;
sigmas=sigma0 + sigma_std*[1:num_g]/num_g;
as=a0 + a_std*[1:num_g]/num_g;
% 1D data - each column is a different object
mu_mat=ones(size(xx,2),1)*mus;
sigma_mat=ones(size(xx,2),1)*sigmas;
a_mat=ones(size(xx,2),1)*as;
xx_mat=xx'*ones(1,size(mus,2));
yy_mat=a_mat.*exp( - (((xx_mat-mu_mat)./sigma_mat).^2)/2 );
%% 1D fit, looped over rows (for comparison)
fun_1D=@(x,xdata)x(3)*exp( - (((xdata-x(1))./x(2)).^2)/2 ); % gaussian model
opts1=optimset('display',displaySetting,'Algorithm',alg);
p_guess=nan*ones(num_g,3); % guess
pfit=nan*ones(num_g,3+2); % fit results
% find starting points==guesses
for ii=1:num_g
[maxval,maxpos]=max(yy_mat(:,ii));
[~,minpos]=min(abs(yy_mat(:,ii) - maxval/2));
p_guess(ii,:)=[xx_mat(maxpos,ii),abs(minpos-maxpos)+1,maxval];
end
% fit
tic
for ii=1:num_g
[x,resnorm,residual,exitflag,output1,lambda1,jacobian1]=lsqcurvefit(fun_1D,p_guess(ii,:),xx_mat(:,ii)',yy_mat(:,ii)',[],[],opts1);
pfit(ii,:)=[x,resnorm,exitflag];
end
fprintf(1,['\n1D, n_g=' num2str(num_g) '\n']);
toc
% plot first 10 to check
figure(1);clf;hold on;set(gcf,'position',[ 680 134 907 844]);
subplot(311);hold on;
title('1D');
xx_dense=[min(xx_mat):0.1:max(xx_mat)];
for ii=1:min(num_g,10)
ph=plot(xx_mat(:,ii),yy_mat(:,ii),'.'); % data
plot(xx_dense,fun_1D(p_guess(ii,1:3),xx_dense),':','color',get(ph,'color')); % guess
plot(xx_dense,fun_1D(pfit(ii,1:3),xx_dense),'-','color',get(ph,'color')); % fit
end
plot(xx_mat(:,1:maxind),yy_mat(:,1:maxind),'.','color',0*[1, 1, 1]);hold on;
xlabel('xx [pixel]');
ylabel('yy [intensity]');
%% ND without Jacobian multiplication function
fun_ND=@(x,xdata)x(:,3)*ones(1,size(xdata,2)).*exp(- (((xdata - (x(:,1)*ones(1,size(xdata,2))))./ (x(:,2)*ones(1,size(xdata,2)))).^2)/2 ); % gaussian model
opts1=optimset('display',displaySetting,'Algorithm',alg);
p0_ND=nan*ones(num_g,3); % guess
p0_ND=p_guess; % same as 1D
pfit_ND=nan*ones(num_g,3+2); % fit results
tic
[pfit_ND,resnormN,residualN,exitflagN,outputN,lambdaN,jacobianN]=lsqcurvefit(fun_ND,p0_ND,xx_mat',yy_mat',[],[],opts1); % each row is 1 object
fprintf(1,['\nND, no J supplied, n_g=' num2str(num_g) '\n']);
toc
% plot first 10 to check
figure(1);
subplot(312);hold on;
title('ND');
yy_fit=fun_ND(pfit_ND,xx_mat'); yy_fit=yy_fit';
yy_guess=fun_ND(p0_ND,xx_mat');yy_guess=yy_guess';
plot(xx_mat(:,1:maxind),yy_guess(:,1:maxind),':','color',0.5*[1, 1, 1]);
plot(xx_mat(:,1:maxind),yy_fit(:,1:maxind),'-');
plot(xx_mat(:,1:maxind),yy_mat(:,1:maxind),'.','color',0*[1, 1, 1]);hold on;
xlabel('xx [pixel]');
ylabel('yy [intensity]');
%% ND fit with Jacobian multiplication function
global G_f G_P G_n G_k;
opts2 = optimoptions('lsqcurvefit','display',displaySetting,'Algorithm',alg,'tolfun',1e-15,'SpecifyObjectiveGradient',true,'JacobianMultiplyFcn',@(Jinfo,Y,flag,xdata)myJ(Jinfo,Y,flag));
p0_ND_J=nan*ones(num_g,3); % guess
p0_ND_J=p_guess; % same as 1D
pfit_ND_J=nan*ones(num_g,3+2); % fit results
p0_ND_J_vec=[p0_ND_J(:,1) p0_ND_J(:,2) p0_ND_J(:,3)]; % vectorized to
%define order: all mu, then all sigma, then all a. can be entered as guess
%instead of p0_ND_J, does not make a difference
tic
[pfit_ND_J,resnorm,residual,exitflag,outputJM,lambdaJM,jacobianJM]=lsqcurvefit(@fun_ND_withJ, p0_ND_J_vec, xx_mat', yy_mat',[],[],opts2); % each row is 1 object
fprintf(1,['\nND, J multiplication supplied, n_g=' num2str(num_g) '\n']);
toc
% plot to check
figure(1);
subplot(313);hold on;
title('ND with Jacobian');
maxind=min(10,num_g);
yy_fit=fun_ND_withJ(pfit_ND_J,xx_mat');yy_fit=yy_fit';
yy_guess=fun_ND_withJ(p0_ND_J,xx_mat');yy_guess=yy_guess';
plot(xx_mat(:,1:maxind),yy_guess(:,1:maxind),':','color',0.5*[1, 1, 1]);hold on;
plot(xx_mat(:,1:maxind),yy_fit(:,1:maxind),'-');
plot(xx_mat(:,1:maxind),yy_mat(:,1:maxind),'.','color',0*[1, 1, 1]);hold on;
xlabel('xx [pixel]');
ylabel('yy [intensity]');
% model for ND fit with Jacobian multiplication
% x is num_g x num_params_per_fit (num_g x 3)
% xdata is num_g x num_points_per_fit (num_g x 100)
% f is num_g x num_points_per_fit (num_g x 100)
function [f,Jinfo] = fun_ND_withJ(x,xdata)
global G_f G_P G_n G_k G_xd;
% update globals
G_P=x;
G_xd=xdata;
G_n=size(xdata,2); % number of data points
G_k=size(xdata,1); % number of fits
% Jinfo sparse matrix for preconditioning - required input.
% size should equal the size of the Jacobian C.
Jinfo = speye(G_n*G_k, size(x,1)*G_k);
%fun_ND=@(x,xdata)x(:,3)*ones(1,size(xdata,2)).*exp(- (((xdata - (x(:,1)*ones(1,size(xdata,2))))./ (x(:,2)*ones(1,size(xdata,2)))).^2)/2 ); % gaussian model
f= x(:,3)*ones(1,size(xdata,2)).*exp(- (((xdata - (x(:,1)*ones(1,size(xdata,2))))./ (x(:,2)*ones(1,size(xdata,2)))).^2)/2 ); % gaussian model
G_f=f;
end
% compute Jacobian multiply functions.
% note: Jacobian C is sparse. the only nonzero elements are
% derivatives of points xi belonging to fit j
% with respect to their own fitting parameters [mj,sj,aj].
% for xi "connected" to params Pj=[mj,sj,aj], we have:
% for mu==mj: dfi/dmj = fi*(xi-mj)/sj^2
% for sigma==sj: dfi/dsj = fi*(xi-mj)^2/sj^3
% for a==aj: dfi/daj = fi/aj
% we need f=f(xdata(1);xdata(2);...;xdata(num_g);P) and parameters Pj for all j
function w = myJ(Jinfo,Y,flag)
global G_f G_P G_n G_k G_xd;
P=G_P;
f=G_f;
n=G_n;
k=G_k;
xd=G_xd;
if flag > 0
%'pos'
w = Jpositive(Y,f,P,n,k,xd);
elseif flag < 0
%'neg'
w = Jnegative(Y,f,P,n,k,xd);
else
%'comp'
w1= Jpositive(Y,f,P,n,k,xd);
w = Jnegative(w1,f,P,n,k,xd);
end
function a = Jpositive(q,f,P,n,k,xd)
% Calculate C*q (dimensions NxP, Px1 => Nx1)
a = zeros(prod(size(f)),1);
for ff=1:k % loop over fits, calculate n sums for each fit
fp=f(ff,:); % f of points for this fit (n)
xp=xd(ff,:); % points for this fit (n)
p=P(ff,:); % params for this fit (m)
qp=q([ff + (size(P,1))*[0, 1, 2]] ); % part of input vector q that contributes (m)
a([(n*(ff-1)+1):n*ff])=fp.*( qp(1)*(xp-p(1))/p(2)^2 + qp(2)*(xp-p(1)).^2/p(2)^3+ qp(3)/p(3)); % enter n derivatives
end
end
function a = Jnegative(q,f,P,n,k,xd)
% Calculate C'*q (dimensions PxN, Nx1 => Px1)
a = zeros(prod(size(P)),1); % Px1
for ff=1:k % loop over fits, calculate 3 sums for 3 fit parameters
p=P(ff,:) ; % params for this fit (m)
fp=f(ff,:); % points for this fit (n)
xp=xd(ff,:); % points for this fit (n)
qp=q(n*(ff-1)+1:n*ff); % part of input vector that contributes (n)
% enter m derivatives
a(ff)= sum( fp(:).*qp(:).*(xp(:)-p(1))) /p(2)^2; % mus, at positions [1:k]
a(ff+k)= sum( fp(:).*qp(:).*(xp(:)-p(1)).^2)/p(2)^3; % sigmas, at [k+1:2*k]
a(ff+2*k)=sum( fp(:).*qp(:)) /p(3); % as, at [2*k+1:3*k]
end
end
end
Matt J
Matt J on 4 Apr 2019
Edited: Matt J on 4 Apr 2019
Is there a way to get the residual of the fit at each iteration?
Use the OutputFcn option as in this example.
Should I expect the convergence to be faster with the multiplication function supplied?
Everything should be at least as good with the multiplication function supplied.
Is this part of the same problem? Or is it because the full Jacobian is not evaluated?
The latter, I imagine.
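The example linked above is not reproduced here, but a minimal sketch of the OutputFcn idea (assuming the documented optimValues.resnorm field for the least-squares solvers) could look like this:
```matlab
% Sketch only: record the residual norm at every iteration via OutputFcn.
% Assumes optimValues.resnorm is populated for lsqcurvefit, as documented
% for the least-squares solvers.
function resnorms = fitWithResnormHistory(fun,p0,xdata,ydata)
resnorms = [];                       % one entry per iteration
opts = optimoptions('lsqcurvefit','Display','off','OutputFcn',@record);
lsqcurvefit(fun,p0,xdata,ydata,[],[],opts);
    function stop = record(~,optimValues,state)
        stop = false;
        if strcmp(state,'iter')
            resnorms(end+1) = optimValues.resnorm; %#ok<AGROW>
        end
    end
end
```
The nested function shares the parent workspace, so no globals are needed to accumulate the history.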
sarah goldberg
sarah goldberg on 11 Apr 2019
Hi Matt,
I got the multiple-Gaussian example with Jacobian multiply function working (code below) and think the example might be useful to others. Feel free to edit the Q and A.
My only issues, please comment:
1) The error is larger when the Jacobian multiply function is supplied. I do not think there is an error in the derivative or in the indexing.
2) The Jacobian multiply version is slower than supplying the exact sparse Jacobian.
3) What would you recommend for N fits that do not have the same number of data points?
Thanks for your help!
Sarah
Visual fits for the different options (looped 1D, looped 1D+Jacobian, ND, ND + Jacobian, ND + Jacobian multiply function):
[attached figure: Capture.PNG]
Times
Elapsed time is 0.467018 seconds.
1D, n_g=20
resnorm1 =
0
Elapsed time is 0.180804 seconds.
1D + J, n_g=20
resnormJ =
0
Elapsed time is 0.094740 seconds.
ND, n_g=20
resnormN =
1.1678e-30
Elapsed time is 0.033412 seconds.
ND+J, n_g=20
resnormNJ =
9.0389e-31
Elapsed time is 0.319316 seconds.
ND + J mult. func., n_g=20
resnormJM =
2.1279e-10
code
clear all; close all; fclose all; clc;
global history; % for storing fit data
% model params. model is
% a*exp(-0.5*((x-m)/s)^2)
% with different a, m, s for each row
num_g=20; % number of gaussians (rows)
maxind=min(10,num_g); % number of gaussians we will plot
npoints=30;
a0=1;
a_std=5;
mu0=npoints/4;
mu_std=0.6*npoints;
sigma0=1;
sigma_std=2;
xx=[1:1:npoints]; % these are the pixel values at which we will have data
% fit params
alg='trust-region-reflective' ; % 'trust-region-reflective' or 'levenberg-marquardt'
displaySetting='off'; % 'iter', 'off', 'final', etc.
maxIter=1e3; % for debugging
maxFunEvals=1e3;
tolfun=1e-10;
% % generate random data
% mus=mu0 + mu_std*randn(1,num_g);
% sigmas=sigma0 + sigma_std*randn(1,num_g);
% as=a0 + a_std*randn(1,num_g);
% sigmas(sigmas==0)=sigma0; % to avoid division by 0
% generate ordered data (for debugging)
mus=mu0 + mu_std*[1:num_g]/num_g;
sigmas=sigma0 + sigma_std*[1:num_g]/num_g;
as=a0 + a_std*[1:num_g]/num_g;
% 1D data - each column is a different object
mu_mat=ones(size(xx,2),1)*mus;
sigma_mat=ones(size(xx,2),1)*sigmas;
a_mat=ones(size(xx,2),1)*as;
xx_mat=xx'*ones(1,size(mus,2));
yy_mat=a_mat.*exp( - (((xx_mat-mu_mat)./sigma_mat).^2)/2 );
%% 1D fit, looped over rows (for comparison)
opts1=optimoptions('lsqcurvefit','display',displaySetting,'Algorithm',alg,'tolfun',tolfun,'MaxIter',maxIter,'MaxFunEvals',maxFunEvals,'OutputFcn',@myoutput);
p_guess=nan*ones(num_g,3); % guess
pfit=nan*ones(num_g,3+2); % fit results
% find starting points==guesses
for ii=1:num_g
[maxval,maxpos]=max(yy_mat(:,ii));
[~,minpos]=min(abs(yy_mat(:,ii) - maxval/2));
p_guess(ii,:)=[xx_mat(maxpos,ii),abs(minpos-maxpos)+1,maxval];
end
history=[];
history1=history;
% fit
tic
for ii=1:num_g
[x1,resnorm1,residual1,exitflag1,output1,lambda1,jacobian1]=lsqcurvefit(@fun_1D,p_guess(ii,:),xx_mat(:,ii)',yy_mat(:,ii)',[],[],opts1);
pfit(ii,:)=[x1,resnorm1,exitflag1];
history1=[history1; history];
end
toc
fprintf(1,['1D, n_g=' num2str(num_g) '\n']);
resnorm1
% plot first 10 to check
figure(1);clf;hold on;set(gcf,'position',[ 680 134 907 844]);
subplot(511);hold on;
title('1D');
xx_dense=[min(xx_mat):0.1:max(xx_mat)];
for ii=1:min(num_g,10)
ph=plot(xx_mat(:,ii),yy_mat(:,ii),'.'); % data
plot(xx_dense,fun_1D(p_guess(ii,1:3),xx_dense),':','color',get(ph,'color')); % guess
plot(xx_dense,fun_1D(pfit(ii,1:3),xx_dense),'-','color',get(ph,'color')); % fit
end
plot(xx_mat(:,1:maxind),yy_mat(:,1:maxind),'.','color',0*[1, 1, 1]);hold on;
xlabel('xx [pixel]');
ylabel('yy [intensity]');
% find starting points==guesses
for ii=1:num_g
[maxval,maxpos]=max(yy_mat(:,ii));
[~,minpos]=min(abs(yy_mat(:,ii) - maxval/2));
p_guess(ii,:)=[xx_mat(maxpos,ii),abs(minpos-maxpos)+1,maxval];
end
%% 1D + J
opts1J=optimoptions('lsqcurvefit','SpecifyObjectiveGradient',true,'tolfun',tolfun,'Jacobian','on','display',displaySetting,'Algorithm',alg,'MaxIter',maxIter,'MaxFunEvals',maxFunEvals,'OutputFcn',@myoutput);
history=[];
history1J=history;
fprintf(1,'\n');
% fit
tic
for ii=1:num_g
[x,resnormJ,residual,exitflagJ,output1J,lambda1J,jacobian1J]=lsqcurvefit(@fun_1D_J,p_guess(ii,:),xx_mat(:,ii)',yy_mat(:,ii)',[],[],opts1J);
pfit(ii,:)=[x,resnormJ,exitflagJ];
history1J=[history1J; history];
end
toc
fprintf(1,['1D + J, n_g=' num2str(num_g) '\n']);
resnormJ
% plot first 10 to check
figure(1);hold on;set(gcf,'position',[ 680 134 907 844]);
subplot(512);hold on;
title('1D + J');
xx_dense=[min(xx_mat):0.1:max(xx_mat)];
for ii=1:min(num_g,10)
ph=plot(xx_mat(:,ii),yy_mat(:,ii),'.'); % data
plot(xx_dense,fun_1D_J(p_guess(ii,1:3),xx_dense),':','color',get(ph,'color')); % guess
plot(xx_dense,fun_1D_J(pfit(ii,1:3),xx_dense),'-','color',get(ph,'color')); % fit
end
plot(xx_mat(:,1:maxind),yy_mat(:,1:maxind),'.','color',0*[1, 1, 1]);hold on;
xlabel('xx [pixel]');
ylabel('yy [intensity]');
%% ND without Jacobian or Jacobian multiplication function
optsN=optimoptions('lsqcurvefit','display',displaySetting,'Algorithm',alg,'tolfun',tolfun,'MaxIter',maxIter,'MaxFunEvals',maxFunEvals,'OutputFcn', @myoutput);
p0_ND=nan*ones(num_g,3); % guess
p0_ND=p_guess; % same as 1D
pfit_ND=nan*ones(num_g,3+2); % fit results
history=[];
fprintf(1,'\n');
tic
[pfit_ND,resnormN,residualN,exitflagN,outputN,lambdaN,jacobianN]=lsqcurvefit(@fun_ND,p0_ND,xx_mat',yy_mat',[],[],optsN); % each row is 1 object
toc
fprintf(1,['ND, n_g=' num2str(num_g) '\n']);
resnormN
historyN=history;
% plot first 10 to check
figure(1);
subplot(513);hold on;
title('ND');
yy_fit=fun_ND(pfit_ND,xx_mat'); yy_fit=yy_fit';
yy_guess=fun_ND(p0_ND,xx_mat');yy_guess=yy_guess';
plot(xx_mat(:,1:maxind),yy_guess(:,1:maxind),':','color',0.5*[1, 1, 1]);
plot(xx_mat(:,1:maxind),yy_fit(:,1:maxind),'-');
plot(xx_mat(:,1:maxind),yy_mat(:,1:maxind),'.','color',0*[1, 1, 1]);hold on;
xlabel('xx [pixel]');
ylabel('yy [intensity]');
%% ND with Jacobian
optsNJ=optimoptions('lsqcurvefit','SpecifyObjectiveGradient',true,'Jacobian','on','display',displaySetting,'Algorithm',alg,'tolfun',tolfun,'MaxIter',maxIter,'MaxFunEvals',maxFunEvals,'OutputFcn',@myoutput);
p0_ND_J=nan*ones(num_g,3); % guess
p0_ND_J=p_guess; % same as 1D
pfit_ND_J=nan*ones(num_g,3+2); % fit results
history=[];
fprintf(1,'\n');
tic
[pfit_ND_J,resnormNJ,residualNJ,exitflagNJ,outputNJ,lambdaNJ,jacobianNJ]=lsqcurvefit(@fun_ND_J,p0_ND_J,xx_mat',yy_mat',[],[],optsNJ); % each row is 1 object
toc
fprintf(1,['ND+J, n_g=' num2str(num_g) '\n']);
resnormNJ
historyNJ=history;
% plot first 10 to check
figure(1);
subplot(514);hold on;
title('ND+J');
yy_fit=fun_ND_J(pfit_ND_J,xx_mat'); yy_fit=yy_fit';
yy_guess=fun_ND_J(p0_ND_J,xx_mat');yy_guess=yy_guess';
plot(xx_mat(:,1:maxind),yy_guess(:,1:maxind),':','color',0.5*[1, 1, 1]);
plot(xx_mat(:,1:maxind),yy_fit(:,1:maxind),'-');
plot(xx_mat(:,1:maxind),yy_mat(:,1:maxind),'.','color',0*[1, 1, 1]);hold on;
xlabel('xx [pixel]');
ylabel('yy [intensity]');
%% ND fit with Jacobian multiplication function
opts2 = optimoptions('lsqcurvefit','MaxIter',maxIter,'MaxFunEvals',maxFunEvals,'OutputFcn', @myoutput,'display',displaySetting,'Algorithm',alg,'tolfun',tolfun,'SpecifyObjectiveGradient',true,'JacobianMultiplyFcn',@(Jinfo,Y,flag,xdata)myJ(Jinfo,Y,flag));
p0_ND_JM=nan*ones(num_g,3); % guess
p0_ND_JM=p_guess; % same as 1D
pfit_ND_JM=nan*ones(num_g,3+2); % fit results
p0_ND_J_vec=[p0_ND_J(:,1) p0_ND_J(:,2) p0_ND_J(:,3)]; % note: this is identical
%to p0_ND_J (the columns are simply re-concatenated), so it can be entered as
%the guess instead of p0_ND_J and makes no difference
history=[];
fprintf(1,'\n');
tic
[pfit_ND_JM,resnormJM,residualJM,exitflagJM,outputJM,lambdaJM,jacobianJM]=lsqcurvefit(@fun_ND_Jmult, p0_ND_JM, xx_mat', yy_mat',[],[],opts2); % each row is 1 object
toc
fprintf(1,['ND + J mult. func., n_g=' num2str(num_g) '\n']);
resnormJM
historyNJmult=history;
% plot to check
figure(1);
subplot(515);hold on;
title('ND + Jacobian Multiply');
maxind=min(10,num_g);
yy_fit=fun_ND_Jmult(pfit_ND_JM,xx_mat');yy_fit=yy_fit'; % use the JM results, not the ND+J ones
yy_guess=fun_ND_Jmult(p0_ND_JM,xx_mat');yy_guess=yy_guess';
plot(xx_mat(:,1:maxind),yy_guess(:,1:maxind),':','color',0.5*[1, 1, 1]);hold on;
plot(xx_mat(:,1:maxind),yy_fit(:,1:maxind),'-');
plot(xx_mat(:,1:maxind),yy_mat(:,1:maxind),'.','color',0*[1, 1, 1]);hold on;
xlabel('xx [pixel]');
ylabel('yy [intensity]');
%%%%%%% functions below %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% 1D
function [f]=fun_1D(x,xdata)
f=x(3)*exp( - (((xdata-x(1))./x(2)).^2)/2 ); % gaussian model
end
% 1D + jacobian
function [f,J]=fun_1D_J(x,xdata)
G_n=size(xdata,2); % number of data points
G_k=size(xdata,1); % number of fits
f = x(3)*exp( - (((xdata-x(1))./x(2)).^2)/2 ); % gaussian model
J = spalloc(G_n*G_k, numel(x), G_n*G_k*3); % n x 3 for a single 1D fit
for ff=1:G_k % loop over fits, calculate n sums for each fit
fp=f(ff,:); % f of points for this fit (n)
xp=xdata(ff,:); % points for this fit (n)
p=x(ff,:); % params for this fit (m)
J( ff:G_k:end ,1)=fp.*(xp-p(1))/p(2)^2;
J( ff:G_k:end ,2)=fp.*(xp-p(1)).^2/p(2)^3;
J( ff:G_k:end ,3)=fp/p(3);
end
end
% ND
function [f]=fun_ND(x,xdata)
f=x(:,3)*ones(1,size(xdata,2)).*exp(- (((xdata - (x(:,1)*ones(1,size(xdata,2))))./ (x(:,2)*ones(1,size(xdata,2)))).^2)/2 ); % gaussian model
end
% ND + jacobian
function [f,J]=fun_ND_J(x,xdata)
f=x(:,3)*ones(1,size(xdata,2)).*exp(- (((xdata - (x(:,1)*ones(1,size(xdata,2))))./ (x(:,2)*ones(1,size(xdata,2)))).^2)/2 ); % gaussian model
G_n=size(xdata,2); % number of data points
G_k=size(xdata,1); % number of fits
G_p=numel(x)/G_k; % number of params per fit
J = spalloc(G_n*G_k, G_p*G_k, G_n*G_k*3);
for ff=1:G_k % loop over fits, calculate 3*n derivatives for each fit
fp=f(ff,:); % f of points for this fit (n)
xp=xdata(ff,:); % points for this fit (n)
p=x(ff,:); % params for this fit (m)
J(ff:G_k:end, ff)=fp.*(xp-p(1))/p(2)^2; % df/dmu (may be negative)
J(ff:G_k:end, G_k+ ff)=fp.*(xp-p(1)).^2/p(2)^3; % df/dsigma (non-negative)
J(ff:G_k:end, 2*G_k+ ff)=fp/p(3); % df/da (non-negative)
end
end
% model for ND fit with Jacobian multiplication
% x is num_g x num_params_per_fit (num_g x 3)
% xdata is num_g x num_points_per_fit (num_g x 100)
% f is num_g x num_points_per_fit (num_g x 100)
function [f,Jinfo] = fun_ND_Jmult(x,xdata)
global G_f G_P G_n G_k G_xd; % needed for jacobian multiply function
f=x(:,3)*ones(1,size(xdata,2)).*exp(- (((xdata - (x(:,1)*ones(1,size(xdata,2))))./ (x(:,2)*ones(1,size(xdata,2)))).^2)/2 ); % gaussian model
% update globals
G_P=x;
G_xd=xdata;
G_n=size(xdata,2); % number of data points
G_k=size(xdata,1); % number of fits
G_p=numel(x)/G_k; % number of params per fit
G_f=f;
% Jinfo sparse matrix for preconditioning, required even when jacobian
% multiply is used. size should equal the size of the jacobian.
Jinfo = spalloc(G_n*G_k, G_p*G_k, G_n*G_k*3); % all zeros
end
% compute Jacobian multiply functions.
% note: Jacobian C is sparse. the only nonzero elements are
% derivatives of points xi belonging to fit j
% with respect to their own fitting parameters [mj,sj,aj].
% for xi "connected" to params Pj=[mj,sj,aj], we have:
% for mu==mj: dfi/dmj = fi*(xi-mj)/sj^2
% for sigma==sj: dfi/dsj = fi*(xi-mj)^2/sj^3
% for a==aj: dfi/daj = fi/aj
% we need f=f(xdata(1);xdata(2);...;xdata(num_g);P) and parameters Pj for all j
function w = myJ(Jinfo,Y,flag)
global G_f G_P G_n G_k G_xd;
P=G_P;
f=G_f;
n=G_n;
k=G_k;
xd=G_xd;
if flag > 0
w = Jpositive(Y,f,P,n,k,xd);
elseif flag < 0
w = Jnegative(Y,f,P,n,k,xd);
else
w1= Jpositive(Y,f,P,n,k,xd);
w = Jnegative(w1,f,P,n,k,xd);
end
function a = Jpositive(q,f,P,n,k,xd)
% Calculate C*q (dimensions NxP, Px1 => Nx1)
a = zeros(prod(size(f)),1);
for ff=1:k % loop over fits, calculate n sums for each fit
fp=f(ff,:); % f of points for this fit (n)
xp=xd(ff,:); % points for this fit (n)
p=P(ff,:); % params for this fit (m)
qp=q([ff + k*[0:(numel(p)-1)]]); % part of input vector q that contributes (m)
a(ff:k:end)=fp.*( qp(1)*(xp-p(1))/p(2)^2 + qp(2)*(xp-p(1)).^2/p(2)^3+ qp(3)/p(3)); % enter n derivatives
end
end
function a = Jnegative(q,f,P,n,k,xd)
% Calculate C'*q (dimensions PxN, Nx1 => Px1)
a = zeros(prod(size(P)),1); % Px1
for ff=1:k % loop over fits, calculate 3 sums for 3 fit parameters
p=P(ff,:); % params for this fit (m)
fp=f(ff,:); % points for this fit (n)
xp=xd(ff,:); % points for this fit (n)
qp=q(ff:k:end); % part of input vector that contributes (n)
% enter m derivatives
a(ff)= sum(fp(:).*qp(:).*(xp(:)-p(1))) /p(2)^2 ; % mus, at positions [1:k]
a(ff+k)= sum(fp(:).*qp(:).*(xp(:)-p(1)).^2)/p(2)^3 ; % sigmas, at [k+1:2*k]
a(ff+2*k)=sum(fp(:).*qp(:)) /p(3) ; % as, at [2*k+1:3*k]
end
end
end
% append history if state=='iter'
function stop = myoutput(x,optimvalues,state)
global history;
stop = false;
if isequal(state,'iter')
history = [history; x];
end
end
Matt J
Matt J on 11 Apr 2019
Edited: Matt J on 11 Apr 2019
1) The error is larger when the Jacobian multiply function is supplied. I do not think there is an error in the derivative or in the indexing.
Assuming you're right, I would say the resnorm differences seen here are just "stopping noise": the iterations of the different computation methods take different paths toward the solution and stop wherever the stopping tolerances happen to be satisfied. The resnorm is not uniform over the range of these different possible stopping points, so you see essentially random differences in the resnorm between methods. Of course, it may be worth taking a look at the exitflag in each case...
In any event, it doesn't appear that JacobMult was ever capable of doing better in terms of accuracy than the default method. The worry that SpecifyObjectiveGradient tries to address, aside from speed, is that default finite difference calculations may limit the accuracy of the fit in some cases. But your tests show that this wasn't the case here. The finite differencing derivative calculations were good enough to reach an essentially zero resnorm, so there was no real room for improvement.
2) The time for the Jacobian multiply is slower than for supplying the exact sparse Jacobian.
Your Jacobian multiply code has various inefficiencies in it: for-looping, quantities like fp(:).*qp(:) or xp(:)-p(1) that are computed multiple times per fit when they could be computed only once, subsref operations like fp=f(ff,:) which necessitate repeated memory allocations, etc...
3) What would you recommend for N fits that do not have the same number of data points?
You may find that it is not an issue once you start optimizing your computations and getting rid of your for-loops.
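For the Gaussian model above, the two multiply routines can in principle be vectorized along the lines below. This is a sketch, not tested against the full script; it assumes implicit expansion (R2016b+) and the same interleaved ordering in which point i of fit ff sits at index ff + (i-1)*k:
```matlab
% C*q without the loop over fits. f, xd are k-by-n; P is k-by-3; q is 3k-by-1.
function a = Jpositive_vec(q,f,P,n,k,xd) %#ok<INUSL>
Q = reshape(q,k,3);                              % [q_mu q_sigma q_a], one row per fit
A = f.*( Q(:,1).*(xd-P(:,1))./P(:,2).^2 ...
       + Q(:,2).*(xd-P(:,1)).^2./P(:,2).^3 ...
       + Q(:,3)./P(:,3) );                       % k-by-n, via implicit expansion
a = A(:);                                        % column-major stacking matches a(ff:k:end)
end
% C'*q without the loop. q is n*k-by-1, interleaved the same way.
function a = Jnegative_vec(q,f,P,n,k,xd) %#ok<INUSL>
Qm = reshape(q,k,n);                             % row ff is q(ff:k:end)
fq = f.*Qm;
a = [sum(fq.*(xd-P(:,1)),2)./P(:,2).^2;          % mus, positions 1:k
     sum(fq.*(xd-P(:,1)).^2,2)./P(:,2).^3;       % sigmas, positions k+1:2k
     sum(fq,2)./P(:,3)];                         % as, positions 2k+1:3k
end
```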
sarah goldberg
sarah goldberg on 6 May 2019
Hi Matt,
I've been trying to generalize the ND Jacobian solution in three variants: no Jacobian, an explicit Jacobian, and a Jacobian multiply function. The Jacobian multiply option is currently not working.
Here k is the number of 1D fits with n points and p parameters, N=n x k, P=p x k. Jacobian size is indeed (N x P) - I checked the output of lsqcurvefit.
I'm running into the following issue with the Jacobian multiply function (myJ): the size of the input vector Y is either (Nx1) or (Px1) or (Px2). How is this possible?
Code below. I am simply looking at
size(Y)
Thanks,
Sarah
clear all; close all; fclose all; clc;
s=rng('shuffle');
rng(828016308,'twister');
global history;
history=[];
% plot
doPlot=false; %true;
num_plot=4; % how many fits to plot
% fit params
alg='trust-region-reflective' ; % 'trust-region-reflective' or 'levenberg-marquardt'
displaySetting='off'; % 'iter', 'off', 'final', etc.
maxIter=1e5;
maxFunEvals=1e5;
tolfun=1e-6;
% options
opts = optimoptions('lsqcurvefit','MaxIter',maxIter,'MaxFunEvals',maxFunEvals,'OutputFcn', @myoutput,'display',displaySetting,'Algorithm',alg,'tolfun',tolfun);
optsJ = optimoptions('lsqcurvefit','MaxIter',maxIter,'MaxFunEvals',maxFunEvals,'OutputFcn', @myoutput,'display',displaySetting,'Algorithm',alg,'tolfun',tolfun,'Jacobian','on');
optsJm = optimoptions('lsqcurvefit','MaxIter',maxIter,'MaxFunEvals',maxFunEvals,'OutputFcn', @myoutput,'display',displaySetting,'Algorithm',alg,'tolfun',tolfun,'SpecifyObjectiveGradient',true,'JacobianMultiplyFcn',@(Jinfo,Y,flag,xdata)myJ(Jinfo,Y,flag));
% THE MODEL. generate function handles for evaluating f (f_func) and its partial derivatives (dfdp_func).
[f_func,dfdp_func,num_params]=sym_model_derivatives();
% general 2D model
num_g=2; % number of gaussians
npoints=[3,5]; % image size [r,c]=[y,x], ROI intentionally not symmetric for debugging
min_width=1; % so we don't have sigma==0
a0=30;
a_std=1;
mu_std=0.1;
sigma0=[2,1]; % be careful, do not make these too small!
sigma_std=0.1;
theta=0*(pi/180); % in radians
theta_std=10*(pi/180);
bg=10;
bg_std=5;
mu0=npoints/2; % place gaussians roughly at center
im=zeros(npoints);
% generate data: num_g images of size(im)
ims=zeros(num_g,size(im,1),size(im,2)); % image set size
mus=mu0'*ones(1,num_g) + mu_std*randn(2,num_g); % 1 per dimension
sigmas=sigma0'*ones(1,num_g) + sigma_std*randn(2,num_g); % 1 per dimension
as=a0 + a_std*randn(1,num_g); % 1 per gaussian
sigmas(sigmas<0.1)=min_width; % to avoid division by 0
thetas=theta+theta_std*randn(1,num_g);
bgs=bg+bg_std*randn(1,num_g);
[X,Y]=meshgrid(1:size(im,2),1:size(im,1)); % x-y coordinates - weird order so that data can be reshaped correctly
x_ind=X(:); % x, column vector
y_ind=Y(:); % y, column vector
for mm=1:num_g
% parameter order: %Amplitude % Row (y) % Col (x) % Theta -3* 3* % Sigma_row (y) % Sigma_col (x) % Background
c0s(mm,:)=[as(mm) mus(2,mm) mus(1,mm) thetas(mm) sigmas(2,mm) sigmas(1,mm) bgs(mm)];
ims(mm,:)=model_func(c0s(mm,:),x_ind,y_ind,f_func); % 1D vector per 2D plot
end
% guesses
guesses=zeros(num_g,num_params);
% parameter order: %Amplitude % Row % Col % Theta -3* 3* % Sigma_row % Sigma_col % Background
for mm=1:num_g
[maxval,~]=max(ims(mm,:));
tmp_im=reshape(ims(mm,:)',size(im,1),size(im,2));
[xpos,ypos]=find(tmp_im==maxval);
[~,half_x]=min(abs(tmp_im(ypos,:) - maxval/2));
[~,half_y]=min(abs(tmp_im(:,xpos) - maxval/2));
guesses(mm,:)=[maxval xpos ypos 0 (abs(half_x(1)-xpos)+min_width) (abs(half_y(1)-ypos)+min_width) 0]; % adding something small to avoid 0
end
%% N x 2D - setup
im_data=zeros(num_g,prod(size(im)));
xdata=kron(ones(1,num_g),x_ind)';
ydata=kron(ones(1,num_g),y_ind)';
guesses_ND=zeros(num_g,num_params);
% images
for mm=1:num_g
im_data(mm,:)=ims(mm,:);
end
% guesses
guesses_ND=guesses;
%% N x 2D
history=[];
results_ND=zeros(num_g,num_params);
fprintf(1,['\n']);
tic
f_ND = @(c,xdata)model_func_ND(c,xdata,ydata,f_func); % need to pass x_ind, y_ind to model_func
[results_ND,resnormN,residualN,exitflagN,outputN,lambdaN,jacobianN]=lsqcurvefit(f_ND,guesses_ND,xdata,im_data,[],[],opts); % each row is 1 object
toc
fprintf(1,['Nx2D\n']);
resnormN
historyN=history;
exitflagN
%% N x 2D + Jacobian
results_ND_J=zeros(num_g,num_params);
fprintf(1,['\n']);
tic
f_ND = @(c,xdata)model_func_ND_J(c,xdata,ydata,f_func,dfdp_func); % need to pass args x_ind, y_ind, etc. to model_func
[results_ND_J,resnormN_J,residualN_J,exitflagN_J,outputN_J,lambdaN_J,jacobianN_J]=lsqcurvefit(f_ND,guesses_ND,xdata,im_data,[],[],optsJ); % each row is 1 object
toc
fprintf(1,['Nx2D + Jacobian\n']);
resnormN_J
exitflagN_J
%% N x 2D + Jacobian multiply
history=[];
results_ND_J=zeros(num_g,num_params);
fprintf(1,['\n']);
tic
f_ND = @(c,xdata)model_func_ND_Jm(c,xdata,ydata,f_func,dfdp_func); % need to pass args x_ind, y_ind, etc. to model_func
[results_ND_Jm,resnormN_Jm,residualN_Jm,exitflagN_Jm,outputN_Jm,lambdaN_Jm,jacobianN_Jm]=lsqcurvefit(f_ND,guesses_ND,xdata,im_data,[],[],optsJm); % each row is 1 object
toc
fprintf(1,['Nx2D + Jacobian multiply\n']);
resnormN_Jm
historyNm=history;
exitflagN_Jm
return
%% ----- functions -------
% model:
% f = Amplitude*exp(- ( A*(x-Col).^2 + 2*B*(x-Col).*(y-Row) + C*(y-Row).^2 ) ) + Background
% A = cos(Theta)^2/(2*Sigma_col^2) + sin(Theta)^2/(2*Sigma_row^2)
% B = sin(2*Theta) * (1/Sigma_row^2 - 1/Sigma_col^2)/4;
% C = sin(Theta)^2/(2*Sigma_col^2) + cos(Theta)^2/(2*Sigma_row^2);
% parameter order: %Amplitude % Row % Col % Theta -3* 3* % Sigma_row % Sigma_col % Background
function [f_func,df,num_params]=sym_model_derivatives()
num_params=7;
p=sym('p',[1 num_params]);
syms f x y A B C tmp_df;
% model definition HERE
A=cos(p(4))^2/(2*p(6)^2) + sin(p(4))^2/(2*p(5)^2);
B=sin(2*p(4)) * (1/p(5)^2 - 1/p(6)^2)/4;
C=sin(p(4))^2/(2*p(6)^2) + cos(p(4))^2/(2*p(5)^2);
f = p(1)*exp(- ( A*(x-p(3))^2 + 2*B*(x-p(3))*(y-p(2)) + C*(y-p(2))^2 ) ) + p(7);
% end of model definition
vars=[p x y];
for ii=1:length(p)
tmp_df=diff(f,p(ii));
% convert dfdp to matlab function
df{ii}=matlabFunction(tmp_df,'vars',vars);
end
% convert f to matlab function
f_func=matlabFunction(f,'vars',vars);
end
% 2D model - used to generate data
function [f]=model_func(p,x,y,f_func)
p_list=num2cell(p); % to avoid p(1),p(2),...,p(n) argument passing
f=f_func(p_list{:},x,y);
end
% Nx2D model
function [f]=model_func_ND(c,x,y,f_func)
G_n=size(x,2); % number of data points
G_k=size(x,1); % number of fits
f = zeros(G_k, G_n);
for ff=1:G_k
xp=x(ff,:); % points for this fit (n)
yp=y(ff,:);
cp=c(ff,:); % params for this fit (m)
cp_list=num2cell(cp);
% f
f(ff,:)=f_func(cp_list{:},xp,yp);
end
end
% Nx2D + Jinfo model
function [f,Jinfo]=model_func_ND_info(c,x,y,f_func)
G_n=size(x,2); % number of data points
G_k=size(x,1); % number of fits
f = zeros(G_k, G_n);
for ff=1:G_k
xp=x(ff,:); % points for this fit (n)
yp=y(ff,:);
cp=c(ff,:); % params for this fit (m)
cp_list=num2cell(cp);
% f
f(ff,:)=f_func(cp_list{:},xp,yp);
end
G_p=numel(c)/G_k; % number of parameters per fit
Jinfo = spalloc(G_n*G_k, G_p*G_k, 0);
end
% Nx2D + Jacobian
function [f,J]=model_func_ND_J(c,x,y,f_func,dfdp_func)
G_n=size(x,2); % number of data points
G_k=size(x,1); % number of fits
G_p=numel(dfdp_func); % number of parameters per fit
f = zeros(G_k, G_n);
J = spalloc(G_n*G_k, G_p*G_k, G_n*G_k*G_p);
% calculate fit values and partial derivatives for each fit
for ff=1:G_k
xp=x(ff,:); % points for this fit (n)
yp=y(ff,:);
cp=c(ff,:); % params for this fit (m)
cp_list=num2cell(cp);
% f
f(ff,:)=f_func(cp_list{:},xp,yp);
% J
for pp=1:length(dfdp_func)
df=dfdp_func{pp}; % function handle of this partial derivative
J(ff:G_k:end , (pp-1)*G_k + ff) = df(cp_list{:},xp,yp);
% full(J)
end
%size(J)
end
end
% Nx2D + Jacobian multiply
function [f,Jinfo] = model_func_ND_Jm(c,xdata,ydata,f_func,dfdp_func)
global G_df_func G_P G_n G_k G_xd G_yd; % needed for jacobian multiply function
G_n=size(xdata,2); % number of data points
G_k=size(xdata,1); % number of fits
f = zeros(G_k, G_n);
for ff=1:G_k
xp=xdata(ff,:); % points for this fit (n)
yp=ydata(ff,:);
cp=c(ff,:); % params for this fit (m)
cp_list=num2cell(cp);
% f
f(ff,:)=f_func(cp_list{:},xp,yp);
end
% update globals
G_P=c; % parameters
G_p=numel(dfdp_func);% number of parameters per fit
G_xd=xdata; % indices
G_yd=ydata;
G_df_func=dfdp_func; % partial derivative function handles
% Jinfo sparse matrix for preconditioning, required even when jacobian
% multiply is used. size should equal the size of the jacobian.
Jinfo = spalloc(G_n*G_k, G_p*G_k, G_n*G_k*G_p); % all zeros
end
% compute Jacobian multiply functions.
% note: Jacobian C is sparse. the only nonzero elements are
% derivatives of points xi belonging to fit j
% with respect to their own fitting parameters. their positions within the
% jacobian can be viewed by looking at the jacobian generated in model_func_ND_J
% using the command full(J).
function w = myJ(Jinfo,Y,flag)
size(Y) %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
global G_df_func G_P G_n G_k G_xd G_yd;
P=G_P;
n=G_n;
k=G_k;
xd=G_xd;
yd=G_yd;
df_func=G_df_func;
if flag > 0
w = Jpositive(Y, df_func,P,n,k,xd,yd);
elseif flag < 0
w = Jnegative(Y, df_func,P,n,k,xd,yd);
else
w1= Jpositive(Y, df_func,P,n,k,xd,yd);
w = Jnegative(w1,df_func,P,n,k,xd,yd);
end
function a = Jpositive(q,df_func,P,n,k,xd,yd)
% Calculate C*q (dimensions NxP, Px1 => Nx1)
a = zeros(n*k,1);
for ff=1:k % loop over fits, calculate n sums for each fit
xp=xd(ff,:); % points for this fit (n)
yp=yd(ff,:);
p=P(ff,:); % params for this fit (m)
qp=q(ff:k:end); % values of q for this fit (m)
sump=zeros(1,n);
for pp=1:numel(df_func)
df=df_func{pp}; % 1 of m function handles
p_list=num2cell(p);
dfp=df(p_list{:},xp,yp); % n partial derivatives (may have 1 value if derivative is a const)
sump=sump+qp(pp)*dfp; % add n terms for each of m params
end
a(ff:k:end)=sump'; % n terms
end
end
function a = Jnegative(q,df_func,P,n,k,xd,yd)
% Calculate C'*q (dimensions PxN, Nx1 => Px1)
a = zeros(prod(size(P)),1); % Px1
for ff=1:k % loop over fits, calculate 3 sums for 3 fit parameters
xp=xd(ff,:); % points for this fit (n)
yp=yd(ff,:);
p=P(ff,:); % params for this fit (m)
qp=q(ff:k:end); % part of input vector q that contributes for this fit (n)
for pp=1:numel(df_func)
df=df_func{pp}; % 1 of m function handles
p_list=num2cell(p);
dfp=df(p_list{:},xp,yp); % n partial derivatives
a(ff+(pp-1)*k)=sum(qp'.*dfp);
end
end
end
end
% append history if state=='iter', useful for checking convergence
function stop = myoutput(x,optimvalues,state)
global history;
stop = false;
if isequal(state,'iter')
history = [history; x];
end
end
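One possible explanation for the (Px2) case, going by the documented JacobianMultiplyFcn contract: Y is described there as a matrix, not necessarily a single column, and the multiply function must return J*Y, J'*Y, or J'*(J*Y) accordingly. If that is what is happening here, a multiply function written for vectors can be wrapped column by column; in the sketch below, myJ_vec is a hypothetical placeholder for such a vector-only implementation:
```matlab
% Sketch: adapt a vector-only Jacobian multiply to matrix-valued Y.
% myJ_vec is a hypothetical vector-only version of myJ above.
function W = myJ_matrix(Jinfo,Y,flag)
ncols = size(Y,2);
for c = ncols:-1:1                   % reverse order so W is preallocated
    W(:,c) = myJ_vec(Jinfo,Y(:,c),flag);
end
end
```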


More Answers (1)

Joss Knight
Joss Knight on 24 Nov 2018
Edited: Joss Knight on 24 Nov 2018

0 votes

It does rather depend on what you're doing. The functions polyfit and interp1 work with gpuArray inputs.
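For example, a sketch assuming Parallel Computing Toolbox and a supported device:
```matlab
% Fit a quadratic on the GPU; polyfit is among the gpuArray-enabled functions.
x = gpuArray.linspace(0,1,1e6);
y = 2*x.^2 - 3*x + 1 + 0.01*randn(1,1e6,'gpuArray');
p = polyfit(x,y,2);   % computed on the device
p = gather(p);        % bring the coefficients back to the CPU
```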

14 comments

Matt J
Matt J on 24 Nov 2018
Edited: Matt J on 26 Aug 2022
Joss,
Similar to the OP, I'd be curious to know if there are plans for gpuArray support in the Optimization Toolbox solvers (both the curve fitters and others). It seems like it should be easy to do. I'm pretty sure all the numeric operations used internally by PCT algorithms are already defined for gpuArray.
Joss Knight
Joss Knight on 24 Nov 2018
I think you're right, although I'm not sure that curve fitting is commonly a performance bottleneck is it? Can you give an example of a typical workflow you hope to benefit from GPU support?
Matt J
Matt J on 24 Nov 2018
Edited: Matt J on 24 Nov 2018
People are always looking for more speed out of the Optimization Toolbox. This list of Answers posts and this present post are ready indications of that.
I think TMW already recognizes this. If curve fitting couldn't benefit significantly from parallel computing, then why would lsqcurvefit already offer a UseParallel option? Why do all of the Optimization Toolbox solvers offer this? There's no reason to offer CPU acceleration, but not GPU acceleration, that I can see.
Joss Knight
Joss Knight on 24 Nov 2018
Okay, I'm getting confused now, we seem to be talking about Optimization not just Curve Fitting. I can see that in the case of lsqcurvefit it's the same thing.
Not everything that parallelizes well on multicore parallelizes on the GPU. The objective functions for optimization are often completely general (or user-specified), and typically non-scalar. Optimization often requires repeated independent evaluations of the objective per iteration, and so this can be done on a parallel pool. But usually we're talking dozens to hundreds of evaluations, not the millions of parallel operations that a GPU needs in order to give an advantage. And unlike a worker in a parallel pool, a GPU thread doesn't have its own MATLAB interpreter which it can use to execute any given objective.
Anyway, this is a simplification - obviously where there's need and opportunity we'll try to do more.
Matt J
Matt J on 25 Nov 2018
Edited: Matt J on 26 Nov 2018
Not everything that parallelizes well on multicore parallelizes on the GPU. The objective functions for optimization are often completely general (or user-specified), and typically non-scalar.
That's true, but there are usually opportunities to accelerate operations done inside and in between the user-specified objective function calls with gpuArray variables. The problem is that there are currently barriers to getting full benefit from that. I've tried to expand on this a bit using the code below (an over-simplified example with a simple linear least squares objective).
The difficulty right now is, if I want to incorporate gpuArray functionality into my objective function (or constraints) in any of the Optimization Toolbox solvers, I have to transfer data to and from the GPU with every call to the objective. It would be better if I could just leave everything on the GPU throughout the optimization. Not only would that spare me the CPU/GPU data transfers, but also the matrix algebra operations done by fmincon with the grad and HessCPU output between calls to objectiveLeastSquares would be done on the GPU and would be faster as well.
A=gpuArray(A);
HessGPU=A.'*A;
y=gpuArray(y);
fmincon(@(x)objectiveLeastSquares(x,A,y,HessGPU), x0, ____);
function [fval,grad,HessCPU] = objectiveLeastSquares(x,A,y,HessGPU)
err=A*x-y; % implicitly sends x to the GPU?
fval=gather( norm(err)^2 );
if nargout>1, grad=gather( A.'*err ); end
if nargout>2, HessCPU = gather(HessGPU); end
end
Marek Dostal
Marek Dostal on 26 Nov 2018
Thank you for your answers.
I have MRI data of the brain (each subject has approx. 0.5 million voxels) with 10 different parameters, and I want to fit, in each voxel, these 10 points with a specific biexponential curve (so I receive parametric maps of the brain). I tried to optimize the script before I used parallel computing (1 brain ~4 hours), but so far I have only used parfor (~1.25 hours). Because I am still a beginner with Matlab, I was curious whether other users have experience with GPUs and can help me make this faster than parfor.
Thanks
Marek
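As a rough illustration of the vectorized approach from the accepted answer, here is a minimal sketch. The biexponential form y(t) = a*exp(-b*t) + c*exp(-d*t), the parameter names, and all numbers are assumptions for the demo, not Marek's actual model:

```matlab
% Hedged sketch (not Marek's actual model): fit a biexponential curve in all
% voxels simultaneously by treating the stacked parameters as one unknown.
N  = 1000;                               % voxels in this batch (0.5M in practice)
t  = linspace(0, 1, 10);                 % the 10 measurement points (placeholder)
model = @(P, t) P(:,1).*exp(-P(:,2).*t) + P(:,3).*exp(-P(:,4).*t);  % N x 10
Ptrue = [1 + rand(N,1), 1 + rand(N,1), rand(N,1), 3 + rand(N,1)];   % synthetic truth
ydata = model(Ptrue, t) + 0.01*randn(N, 10);                        % synthetic data
P0    = repmat([1 1 0.5 3], N, 1);       % same initial guess in every voxel
opts  = optimoptions('lsqcurvefit', 'Display', 'off');
Pfit  = lsqcurvefit(model, P0, t, ydata, [], [], opts);             % N x 4 result
```

Note that with the default finite-difference derivatives this implies a 10N-by-4N Jacobian, so at 0.5 million voxels you would need to supply a sparse block-diagonal Jacobian or a JacobianMultiplyFcn, as the accepted answer advises.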
Joss Knight
Joss Knight on 27 Nov 2018
Matt, you make a fair point and I will capture your request in our system.
Joss
Matt J
Matt J on 27 Nov 2018
Thanks, Joss.
Martin Ryba
Martin Ryba on 26 Aug 2022
Moved: Matt J on 26 Aug 2022
Hey Matt, one trick to avoid CPU/GPU transfers in a Jacobian multiply function: since these functions are often nested within your main routine, use variables (gpuArrays in your case) that are scoped outside the individual functions. I'm not using a GPU, but I have a very big optimization (~2M measurements and 10k unknowns), and I store some relevant large matrices as external arrays that are updated by the primary function and then used by the JM functions (pos and neg). I don't see why that couldn't work for gpuArrays as well.
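A minimal sketch of this shared-scope trick, with hypothetical helper names (computeResiduals, computeJacobianData, applyJ, applyJt stand in for whatever your model needs):

```matlab
% Hedged sketch: keep large arrays -- gpuArrays included -- in the enclosing
% function's workspace so both the objective and the Jacobian-multiply
% function see them without passing them through lsqnonlin.
function x = fitWithSharedState(x0, ydata)
    Jstate = [];   % filled by objFun, read by jmfun
    opts = optimoptions('lsqnonlin', ...
        'Algorithm', 'trust-region-reflective', ...
        'SpecifyObjectiveGradient', true, ...
        'JacobianMultiplyFcn', @jmfun);
    x = lsqnonlin(@objFun, x0, [], [], opts);

    function [F, Jinfo] = objFun(x)
        F = computeResiduals(x, ydata);   % hypothetical residual function
        Jstate = computeJacobianData(x);  % hypothetical; can stay on the GPU
        Jinfo = 1;                        % dummy; the real data lives in Jstate
    end

    function W = jmfun(~, Y, flag)
        if flag > 0                       % J*Y
            W = applyJ(Jstate, Y);        % hypothetical helpers
        elseif flag < 0                   % J'*Y
            W = applyJt(Jstate, Y);
        else                              % J'*(J*Y)
            W = applyJt(Jstate, applyJ(Jstate, Y));
        end
        W = gather(W);                    % the solver expects CPU output
    end
end
```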
Matt J
Matt J on 26 Aug 2022
Edited: Matt J on 26 Aug 2022
@Martin Ryba That is interesting. I can see how that would allow you, when using lsqnonlin, to leave constant data like your measurement array on the GPU. However, there still appear to be problems.
(1) The result of any call to the Jacobian Multiply Function will be an array equal in size to your measurements. There is no way to avoid transferring that result back to the CPU, because that's where lsqnonlin will be looking for it.
(2) It doesn't seem to apply gracefully to lsqcurvefit. When using lsqcurvefit, you want your measurements to reside in the ydata input array, and lsqcurvefit will not allow ydata or xdata to be gpuArrays. Not as big a problem as (1), but still irksome.
Martin Ryba
Martin Ryba on 26 Aug 2022
The output array of the JMfun depends on Y and the flag. If you have N measurements and K model parameters, instead of returning the entire NxK Jacobian you're often returning either something 1xN, 2xN or KxK I think. It's been a while since I last ran mine in the debugger to get it working. Either way, it's often a more limited array than what you may be storing in the GPU.
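For reference, the convention documented for JacobianMultiplyFcn is, as best I recall it (worth double-checking against the current doc):

```matlab
% W = jmfun(Jinfo, Y, flag), for N residuals and K parameters:
%   flag  > 0 :  W = J*Y        -> N x size(Y,2)
%   flag  < 0 :  W = J'*Y       -> K x size(Y,2)
%   flag == 0 :  W = J'*(J*Y)   -> K x size(Y,2)
```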
Matt J
Matt J on 26 Aug 2022
Edited: Matt J on 26 Aug 2022
Sure, I agree transferring an Nx1 vector is certainly better than transferring the complete Jacobian. However, as your case illustrates, N can still be quite large. In my world N=60e6,K=40e6 wouldn't be that unusual.
Also, for people who don't supply an analytical Jacobian, relying instead on finite differences, it's not just a matter of a single transfer of an NxK Jacobian matrix. Each iteration will require K transfers of an Nx1 vector (which I suspect would be a lot worse).
Emiliano Rosso
Emiliano Rosso on 10 Nov 2022
Can we modify the Matlab code, or is it p-coded?
I mean, to step through the routine and eliminate errors, type mismatches, etc.
Thanks!
Bruno Luong
Bruno Luong on 10 Nov 2022
Edited: Bruno Luong on 10 Nov 2022
Just to chime in on the discussion about Matt's request to enable optimization functions on the GPU.
I think the feature would be most interesting for people working on image (2D/3D) processing: denoising and deblurring, where the number of parameters is the number of pixels. The cost function is a sum of local operators, so it is very well suited to the GPU architecture.
If the data transfer between CPU and GPU is NOT required at each iteration or each gradient computation, it would benefit the run time.
There may be a new class of optimization methods still being developed by researchers, but I think one must think about better exploiting the GPU architecture in the TMW Optimization Toolbox.
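A minimal sketch of the kind of cost Bruno describes: data fidelity plus a local (finite-difference) smoothness penalty, with all arithmetic on the GPU. The image, penalty weight, and cost form are illustrative assumptions:

```matlab
% Hedged sketch: f(x) = ||x - y||^2 + lambda*||grad x||^2 built from local
% operators, evaluated entirely on the GPU; only the scalar leaves it.
y      = gpuArray(rand(256, 256, 'single'));     % noisy image (synthetic)
lambda = single(0.1);                            % smoothness weight (assumed)
cost = @(x) sum(reshape((x - y).^2, [], 1)) + lambda * ( ...
        sum(reshape(diff(x, 1, 1).^2, [], 1)) + ...
        sum(reshape(diff(x, 1, 2).^2, [], 1)) );
fval = gather(cost(y));                          % single CPU transfer per call
```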

Asked: 24 Nov 2018
Edited: 10 Nov 2022
