Improve performance of my code
I was working on dynamic time warping for sign language classification, and my code takes about an hour to classify around 1000 objects. I have heard it takes about 5 minutes to run for friends of mine.
Note that this is an assignment, which I have solved, but I need to know the source of my performance hit.
The ASL data set format is as below.
object ID: 1121
class label: 8
sign meaning: 4
dominant hand trajectory:
There are multiple such objects and I have read each object into a struct. So, I have a vector of these structs which are created in the format below.
struct('ID', current_id, 'label', current_class_label, 'sign', current_sign, 'data', current_data)
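For concreteness, a vector of such structs could be built like this (the field values below are made-up placeholders, not real data set entries):

```matlab
% Hypothetical example: build a vector of ASL objects (values are placeholders)
train(1) = struct('ID', 1121, 'label', 8, 'sign', 4, 'data', rand(60, 2));
train(2) = struct('ID', 1122, 'label', 3, 'sign', 7, 'data', rand(45, 2));
% train(k).data holds the dominant-hand trajectory, one (x, y) sample per row
```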
The algorithm for time warping is a dynamic programming solution that calculates the overall cost of the match. The MATLAB code corresponding to this algorithm is below.
function cost = cost_calc(pattern1, pattern2)
new_mat = zeros(size(pattern1,1), size(pattern2,1));
new_mat(1,1) = distance_2d(pattern1(1,:), pattern2(1,:));
% first row: can only come from the left
for i = 2:size(pattern2,1)
    new_mat(1,i) = new_mat(1,i-1) + distance_2d(pattern1(1,:), pattern2(i,:));
end
for i = 2:size(pattern1,1)
    % first column: can only come from above
    new_mat(i,1) = new_mat(i-1,1) + distance_2d(pattern1(i,:), pattern2(1,:));
    for j = 2:size(pattern2,1)
        new_mat(i,j) = min([new_mat(i-1,j), new_mat(i-1,j-1), new_mat(i,j-1)]) + distance_2d(pattern1(i,:), pattern2(j,:));
    end
end
cost = new_mat(end,end);
end
Below is the Euclidean distance code that I am using.
function d = distance_2d(p1, p2)
d = norm(p1 - p2);
end
The call to the cost_calc function happens as below, obj being a test object.
distances = arrayfun(@(x) cost_calc(obj.data, x.data), train);
When I run the profiler on my code, I find that distance_2d, which computes the Euclidean distance, takes 2-2.5 s per test object, the min call takes a little over 2 s, and the overall DTW method takes around 4.5 s per test object.
Please note that I can't post the entire code here, as it is against my university's rules.
Any pointers?
Eamonn on 21 Apr 2017
You can do this in under 0.5 seconds.
Use the UCR suite; google "UCR Suite DTW".
Greg Dionne on 24 Apr 2017
There is a version of DTW in the Signal Processing Toolbox that can accommodate multivariate data; it also has a version for searching via time-stretching with normalization (FINDSIGNAL).
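As a quick sketch of the toolbox route (this assumes Signal Processing Toolbox R2016a or later; `dtw` expects multivariate signals with one observation per column, so the row-per-sample trajectories are transposed first):

```matlab
% Sketch: classify one test object with the toolbox dtw function.
% dtw treats each COLUMN as one multivariate sample, hence the transposes.
x = obj.data.';                                     % 2-by-N trajectory
distances = arrayfun(@(t) dtw(x, t.data.'), train); % DTW cost to each training object
[~, best] = min(distances);                         % nearest-neighbor match
predicted_label = train(best).label;
```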
Since it looks like you're implementing the "classic" unconstrained algorithm, you could first compute a point-wise distance matrix (call it "D"), where D(i,j) = norm(x(:,i)-y(:,j)). Note that I've changed the order in which your data appears, to keep elements that you expect to access together in the first index. MATLAB is column-major in its ordering (not row-major), so you'll see this convention frequently in routines that use MATLAB.
function d = getDistanceMatrix(x, y)
d = zeros(size(x,2), size(y,2));
% accumulate squared differences one coordinate (row) at a time
for c = 1:size(x,1)
    [u,v] = meshgrid(y(c,:), x(c,:));
    d = d + conj(u-v).*(u-v);
end
d = sqrt(d);
end
Then you could feed "D" into a routine that computes the cumulative distance matrix, "C". The benefit of splitting out the distance computation from the cumulative computation is now (with a little bit of arithmetic) you can use purely vector operations to do this along each column.
function C = cumulativeDistance(D)
m = size(D,1);
n = size(D,2);
C = zeros(m,n);
Dcum = cumsum(D);
C(:,1) = Dcum(:,1);
for j = 2:n
    % best predecessor from the previous column for each row
    prevmin = [C(1,j-1); min(C(1:end-1,j-1), C(2:end,j-1))];
    % offset by the cumulative sum so a running min handles same-column moves
    delta = prevmin - [0; Dcum(1:end-1,j)];
    C(:,j) = cummin(delta) + Dcum(:,j);
end
end
I think you can take it from there.
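Putting the two pieces together, the original cost_calc call could be replaced along these lines (a sketch assuming the helper names above, with trajectories transposed to coordinates-in-rows as Greg suggests; dtwCost is a hypothetical wrapper name):

```matlab
% Sketch: swap the per-element cost_calc for the split computation.
% x and y are d-by-N matrices: coordinates in rows, samples in columns.
x = obj.data.';
distances = arrayfun(@(t) dtwCost(x, t.data.'), train);

function cost = dtwCost(x, y)
D = getDistanceMatrix(x, y);   % point-wise Euclidean distances
C = cumulativeDistance(D);     % vectorized cumulative DTW matrix
cost = C(end, end);            % same value cost_calc would return
end
```

The win comes from replacing roughly size(x,2)*size(y,2) calls to distance_2d (each with function-call and norm overhead) with a handful of whole-matrix operations.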