Hi, I am dealing with "repeated COO format" sparse matrix. I try to "accumulate" it by its index efficiently. I am inspired by the similar problem COO is one format of sparse Matrix. In Matlab, I can tranfer matrix A to COO format by "find" function. Here, I suppose the matrix A's shape is square.
[ii,jj,val] = find(A);
Acoo.coo = [ii,jj,val];
Acoo.N = size(A,1);
But my COO format matrix is ill-conditioned. It has many repeated values on each index. I need to sum them by the index(row and column). Here is an mini example.
Acoo.N = 3;
Acoo.coo = [1,1,2;
1,2,3;
2,1,4;
1,2,5;
3,1,6;
3,3,2];
A = full(sparse(Acoo.coo(:,1),Acoo.coo(:,2),Acoo.coo(:,3),N,N));
My matrix A's size is 780000*780000. The nnz of it is nearlly 10,000,000. The size of reapeated COO matrix's size is about 60,000,000*3. It seem each index has 6 repeated elements to accumulate in average.
I had planned two ways to "accumulate" this COO matrix A. One is using "sparse" function. It may use "accumarray" function inside. The other is writing a c++ mex with unordered Hash map. I suppose the mex would be faster. This idea came from a comment of similar Matlab problem. But in fact "sparse" method is 2 times faster than the "mex" method. This makes me very confused. Could I do something more to speed up? Why "sparse" method is so fast? The method 1. using "sparse"
function A1 = acumCOO(A)
As = sparse(A.coo(:,1),A.coo(:,2),A.coo(:,3),A.N,A.N);
[ii,jj,val] = find(As);
A1.N = A.N;
A1.coo = [ii,jj,val];
end
The method2. using c++ unordered hash map.
The matlab wrapper
function A2 = acumCOOMex(A)
N = A.N;
index = sub2ind([N,N],A.coo(:,1),A.coo(:,2));
[key2,val2] = acumMex(index,A.coo(:,3));
[ii,jj] = ind2sub([N,N],key2);
A2.N = N;
A2.coo = [ii,jj,val2];
end
The c++ mex "acumMex.cpp"
// MATLAB related
#include "mex.h"
// c++ related
#include <unordered_map>
#include <time.h>
// Input Arguments
#define KEY1 prhs[0]
#define VAL1 prhs[1]
// Output Arguments
#define KEY2 plhs[0]
#define VAL2 plhs[1]
using namespace std;
void mexFunction(int nlhs, mxArray * plhs[], int nrhs, const mxArray * prhs[])
{
clock_t t0,t1,t2;
t0 = clock();
size_t i;
size_t N1 = mxGetM(KEY1);
mxDouble *key = mxGetPr(KEY1);
mxDouble *val = mxGetPr(VAL1);
unordered_map<size_t,double> coo;
for (i = 0;i< N1 ;i++)
{
coo[key[i]] = coo[key[i]]+val[i];
}
t1 = clock() - t0;
mexPrintf("Map session takes %d clicks (%f seconds).\n",t1,((float)t1)/CLOCKS_PER_SEC);
mexEvalString("drawnow") ;
// output
size_t N2 = coo.size();
KEY2 = mxCreateDoubleMatrix(N2,1,mxREAL);
VAL2 = mxCreateDoubleMatrix(N2,1,mxREAL);
mxDouble *key2 = mxGetPr(KEY2);
mxDouble *val2 = mxGetPr(VAL2);
auto coo_it = coo.cbegin();
for (i = 0;i< N2 ;i++)
{
key2[i] = coo_it->first;
val2[i] = coo_it->second;
++coo_it;
}
t2 = clock() - t0;
mexPrintf("whole session takes %d clicks (%f seconds).\n",t2,((float)t2)/CLOCKS_PER_SEC);
mexEvalString("drawnow");
}