Vectorizing the Spline Function
I'm trying to make my code more efficient by vectorizing it, since I'm running it on multiple variables of a dataset with 2M+ rows. For the life of me, I can't figure out how to vectorize a MATLAB spline function. The inputs are vectors, and the output is a structure. I can't figure out how to make it work...
%This function is intended to take a variable from the dataset I'm working with
function [output] = Splines(Var,GroupLineNum)
    output = array2table(zeros(height(Var),6));
    output.Properties.VariableNames = {'spx' 'pcx' 'mkx' 'spxx' 'pcxx' 'mkxx'};
    x = array2table(zeros(height(Var),3));
    x.Properties.VariableNames = {'x1' 'x2' 'x3'};
    y = array2table(zeros(height(Var),3));
    y.Properties.VariableNames = {'y1' 'y2' 'y3'};
    dummy_table = table(0);
    dummy_table2 = [dummy_table;dummy_table];
    Variable = array2table(Var);
    a_sym=3:height(Var);
    a_var=1:height(Var)-2;
    b_sym=2:height(Var);
    b_var=1:height(Var)-1;
    c=1:height(Var);
    if GroupLineNum{c,:}<3
        output{c,'spx'} = missing;
        output{c,'pcx'} = missing;
        output{c,'mkx'} = missing;
        output{c,'spxx'} = missing;
        output{c,'pcxx'} = missing;
        output{c,'mkxx'} = missing;
    end
    %Spline will have three points, x just set as 1/2/3
    x(:,'x1') = {1};
    x(:,'x2') = {2};
    x(:,'x3') = {3};
    %The Variables are from a time series, so I'm trying to grab the t-2, t-1, t
    %observations to go in the y vector. It needed to be the same size as the total
    %variable dataset, so I added dummy rows to the beginning.
    y(:,'y1') = [dummy_table2;array2table( Variable{a_var,'Var'}.*(GroupLineNum{a_sym,:}>=3)+0.*(GroupLineNum{a_sym,:}<3))];
    y(:,'y2') = [dummy_table;array2table(Variable{b_var,'Var'}.*(GroupLineNum{b_sym,:}>=3)+0.*(GroupLineNum{b_sym,:}<3))];
    y(:,'y3') = array2table( Variable{c,'Var'}.*(GroupLineNum{c,:}>=3)+0.*(GroupLineNum{c,:}<3));
    sp(z,:) = struct2table(spline(z,y{c,:})); % This is where I'm stuck! I can't get this to vectorize.
    % The rest of the code is similar, and basically takes the first/second derivative of the
    % spline/pchip/makima fitted lines, then puts it all back together. If I can figure this bit
    % out, hopefully I can finish vectorizing the rest of the code.
    pc = pchip(x,y);
    mk = makima(x,y);
    spx = fnder(sp,1);
    pcx = fnder(pc,1);
    mkx = fnder(mk,1);
    spxx = fnder(sp,2);
    pcxx = fnder(pc,2);
    mkxx = fnder(mk,2);
    output{c,'spx'} = ppval(spx,3);
    output{c,'pcx'} = ppval(pcx,3);
    output{c,'mkx'} = ppval(mkx,3);
    output{c,'spxx'} = ppval(spxx,3);
    output{c,'pcxx'} = ppval(pcxx,3);
    output{c,'mkxx'} = ppval(mkxx,3);
end
Answers (1)
Vinayak
May 14, 2024
Hi Hayley,
Vectorizing spline functions, especially under various conditions, might not always be straightforward. It might be worth exploring alternative approaches for optimization or vectorization in your case.
For instance, if we keep the loop “for c = 1:height(Var)”, we can define “xValues” once as “[1,2,3]”, since the x values never change. This saves memory and also simplifies building “yValues” for the rows where the group line number is at least 3.
The splines only need to be computed for those rows; every other row can simply be assigned missing values. I also noticed the "dummy_tables" used for padding; consider adding them back only if they serve a purpose outside the code shown here.
output = array2table(zeros(height(Var), 6), 'VariableNames', {'spx', 'pcx', 'mkx', 'spxx', 'pcxx', 'mkxx'});
% Constant x values shared by every fit
xValues = [1, 2, 3];
% Loop through each observation
for c = 1:height(Var)
    % GroupLineNum is a table, so brace indexing needs both a row and a column subscript
    if c < 3 || GroupLineNum{c, 1} < 3
        % Fewer than three usable points: mark the whole row as missing (NaN)
        output{c, :} = NaN;
    else
        % y values at t-2, t-1, t. The mask in the question depends only on row c,
        % which the branch above has already checked, so no extra masking is needed.
        yValues = reshape(Var(c-2:c), 1, []);
        % Fit the three interpolants
        sp = spline(xValues, yValues);
        pc = pchip(xValues, yValues);
        mk = makima(xValues, yValues);
        % First and second derivatives (fnder requires Curve Fitting Toolbox)
        spx = fnder(sp, 1);
        pcx = fnder(pc, 1);
        mkx = fnder(mk, 1);
        spxx = fnder(sp, 2);
        pcxx = fnder(pc, 2);
        mkxx = fnder(mk, 2);
        % Evaluate derivatives at x = 3
        output{c, 'spx'} = ppval(spx, 3);
        output{c, 'pcx'} = ppval(pcx, 3);
        output{c, 'mkx'} = ppval(mkx, 3);
        output{c, 'spxx'} = ppval(spxx, 3);
        output{c, 'pcxx'} = ppval(pcxx, 3);
        output{c, 'mkxx'} = ppval(mkxx, 3);
    end
end
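If the loop itself turns out to be the bottleneck, one alternative worth trying is to fit all rows in a single call. The sketch below relies on the documented behavior that spline, pchip, and makima accept a matrix of y values whose last dimension matches x, and it assumes fnder handles the resulting vector-valued piecewise polynomials in the same way. The vectors y and g are hypothetical stand-ins for Var and the GroupLineNum column, and at least one row is assumed to have three usable points.

```matlab
% Hypothetical sample data (assumption): y stands in for Var, g for GroupLineNum
y = [2; 3; 5; 4; 7; 1; 6];
g = [1; 2; 3; 4; 1; 2; 3];
n = numel(y);
x = 1:3;

% Rows that have two predecessors and a group line number of at least 3
t = (3:n).';
t = t(g(t) >= 3);

% One row per fit: [y(t-2), y(t-1), y(t)]
Y = [y(t-2), y(t-1), y(t)];

% Fit every row at once; each result is a vector-valued piecewise polynomial
fits = {spline(x, Y), pchip(x, Y), makima(x, Y)};

% First and second derivatives at x = 3; all other rows stay NaN (missing)
out = NaN(n, 6);
for k = 1:3
    out(t, k)     = ppval(fnder(fits{k}, 1), 3);
    out(t, k + 3) = ppval(fnder(fits{k}, 2), 3);
end
output = array2table(out, 'VariableNames', {'spx', 'pcx', 'mkx', 'spxx', 'pcxx', 'mkxx'});
```

As a sanity check, with three points spline should, as far as I understand its not-a-knot handling, reduce to the parabola through them, so the spx column can be compared against (Y(:,1) - 4*Y(:,2) + 3*Y(:,3))/2 and the spxx column against Y(:,1) - 2*Y(:,2) + Y(:,3).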
Compared with the original, the loop above avoids building padded tables and fits the interpolants only for the rows that need them, which should help at the scale you mentioned (2M+ data points). If performance is still a concern, running the loop with “parfor” for parallel execution might improve it further.
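A rough sketch of what a parfor version could look like is below. It assumes the data is first moved out of tables into plain numeric vectors (y and g are hypothetical stand-ins for Var and the GroupLineNum column), because parfor needs the output to be indexed in a simple sliced form. To keep it short, it computes only the first derivatives of the pchip and makima fits, and endSlope is a hypothetical helper that differentiates the last polynomial piece by hand instead of calling fnder.

```matlab
% Hypothetical sample data as plain numeric column vectors (assumption)
y = [2; 3; 5; 4; 7; 1; 6];   % stands in for Var
g = [1; 2; 3; 4; 1; 2; 3];   % stands in for the GroupLineNum column
n = numel(y);
xValues = [1, 2, 3];

% Slope at the last break (x = 3) of a piecewise polynomial, computed from the
% local coefficients of its final piece
endSlope = @(pp) polyval(polyder(pp.coefs(end, :)), pp.breaks(end) - pp.breaks(end-1));

% Sliced numeric output: column 1 = pchip slope, column 2 = makima slope
res = NaN(n, 2);
parfor c = 3:n
    if g(c) >= 3
        yValues = [y(c-2), y(c-1), y(c)];
        res(c, :) = [endSlope(pchip(xValues, yValues)), ...
                     endSlope(makima(xValues, yValues))];
    end
end
```

Without Parallel Computing Toolbox, parfor simply runs as an ordinary loop. Each iteration here does very little work, so the parallel overhead may well outweigh the gain; it is worth timing on a subset before committing to it.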
I hope this helps!