33 views (last 30 days)

Show older comments

Hi, I am dealing with "repeated COO format" sparse matrix. I try to "accumulate" it by its index efficiently. I am inspired by the similar problem

COO is one format of sparse Matrix. In Matlab, I can tranfer matrix A to COO format by "find" function. Here, I suppose the matrix A's shape is square.

[ii,jj,val] = find(A);

Acoo.coo = [ii,jj,val];

Acoo.N = size(A,1); % I suppose the matrix A's shape is square.

But my COO format matrix is ill-conditioned. It has many repeated values on each index. I need to sum them by the index(row and column). Here is an mini example.

Acoo.N = 3;

Acoo.coo = [1,1,2;

1,2,3; % repeated

2,1,4;

1,2,5; % repeated

3,1,6;

3,3,2];

A = full(sparse(Acoo.coo(:,1),Acoo.coo(:,2),Acoo.coo(:,3),N,N));

% A =

% 2 8 0

% 4 0 0

% 6 0 2

My matrix A's size is 780000*780000. The nnz of it is nearlly 10,000,000. The size of reapeated COO matrix's size is about 60,000,000*3. It seem each index has 6 repeated elements to accumulate in average.

I had planned two ways to "accumulate" this COO matrix A. One is using "sparse" function. It may use "accumarray" function inside. The other is writing a c++ mex with unordered Hash map. I suppose the mex would be faster. This idea came from a comment of similar Matlab problem. But in fact "sparse" method is 2 times faster than the "mex" method. This makes me very confused. Could I do something more to speed up? Why "sparse" method is so fast?

The method 1. using "sparse"

function A1 = acumCOO(A)

% method 1 using sparse

As = sparse(A.coo(:,1),A.coo(:,2),A.coo(:,3),A.N,A.N);

[ii,jj,val] = find(As);

A1.N = A.N;

A1.coo = [ii,jj,val];

end

% time profile: 9.107291s

The method2. using c++ unordered hash map.

The matlab wrapper

function A2 = acumCOOMex(A)

% c++ method wrapper

N = A.N;

index = sub2ind([N,N],A.coo(:,1),A.coo(:,2));

[key2,val2] = acumMex(index,A.coo(:,3));

[ii,jj] = ind2sub([N,N],key2);

A2.N = N;

A2.coo = [ii,jj,val2];

end

The c++ mex "acumMex.cpp"

// MATLAB related

#include "mex.h"

// c++ related

#include <unordered_map>

#include <time.h>

// Input Arguments

#define KEY1 prhs[0]

#define VAL1 prhs[1]

// Output Arguments

#define KEY2 plhs[0]

#define VAL2 plhs[1]

using namespace std;

void mexFunction(int nlhs, mxArray * plhs[], int nrhs, const mxArray * prhs[])

{

clock_t t0,t1,t2;

t0 = clock();

size_t i;

size_t N1 = mxGetM(KEY1);

mxDouble *key = mxGetPr(KEY1);

mxDouble *val = mxGetPr(VAL1);

unordered_map<size_t,double> coo;

for (i = 0;i< N1 ;i++)

{

coo[key[i]] = coo[key[i]]+val[i];

}

t1 = clock() - t0;

mexPrintf("Map session takes %d clicks (%f seconds).\n",t1,((float)t1)/CLOCKS_PER_SEC);

mexEvalString("drawnow") ;

// output

size_t N2 = coo.size();

KEY2 = mxCreateDoubleMatrix(N2,1,mxREAL);

VAL2 = mxCreateDoubleMatrix(N2,1,mxREAL);

mxDouble *key2 = mxGetPr(KEY2);

mxDouble *val2 = mxGetPr(VAL2);

auto coo_it = coo.cbegin();

for (i = 0;i< N2 ;i++)

{

key2[i] = coo_it->first;

val2[i] = coo_it->second;

++coo_it;

}

t2 = clock() - t0;

mexPrintf("whole session takes %d clicks (%f seconds).\n",t2,((float)t2)/CLOCKS_PER_SEC);

mexEvalString("drawnow");

}

%% Map session takes 20583 clicks (20.583000 seconds).

%% whole session takes 21705 clicks (21.705000 seconds).

Bjorn Gustavsson
on 19 Feb 2021

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!