MATLAB Answers

How to quickly find the column index of the last non-zero element in all rows in a sparse matrix?

41 views (last 30 days)
Benson Gou
Benson Gou on 19 Mar 2020
Edited: Ameer Hamza on 22 Mar 2020
Hi, All,
I have a big sparse matrix A. I want to find out the column index of the Last non-zero element in all rows from the end. Here is my code:
reorderRow = [];
for jRow = 1 : length(A(:,1))
eee = find(A(jRow,:),1,'last');
reorderRow = [reorderRow eee];
end
For example, I have matrix A = [2 0 0;0 5 0;0 0 1;0 3 0;-1 0 -4]. My code gives the following result:
reorderRow = [1 2 3 2 3];
I am wondering if it is possible to obtain reorderRow without iterations. Thanks a lot.
Benson

  0 Comments

Sign in to comment.

Accepted Answer

BobH
BobH on 19 Mar 2020
Edited: BobH on 19 Mar 2020
reorderRow = arrayfun(@(X) find(A(X,:),1,'last'), 1:size(A,1))
reorderRow =
1 2 3 2 3

  4 Comments

Show 1 older comment

Sign in to comment.

More Answers (4)

John D'Errico
John D'Errico on 19 Mar 2020
Edited: John D'Errico on 19 Mar 2020
It is important that if your matrix is sparse, that we use sparse matrix examples to test anything out.
Regardless, I would always do a comparison of other methods to a direct find. For example...
A = sprand(10000,10000,.0005);
So, on average, 5 elements per row. If some rows have no non-zeros, then this will return zero as the index of the last element.
tic
[ir,ic] = find(A);
maxcolind = accumarray(ir,ic,[10000,1],@max);
toc
Elapsed time is 0.026327 seconds.
This will not fail, even if some rows are empty of non-zeros.
sum(maxcolind == 0)
ans =
59
The random example I created had 59 rows that were entirely zero.
If you want to find the array reorderRow as you asked for, then just do this:
r = find(maxcolind);
reorderRow = [r,maxcolind(r)];

  1 Comment

Sign in to comment.


James Tursa
James Tursa on 19 Mar 2020
Edited: James Tursa on 19 Mar 2020
You can use a mex routine for this and avoid all temporary data copies. E.g., a direct approach:
/* File: find_last_nz.c
*
* C = find_last_nz(S)
*
* Finds column number of last non-zero element in each row of sparse matrix
* S = sparse matrix (double or logical)
* C = full column vector containing the columns numbers for each row
*
* Programmer: James Tursa
*/
/* Includes ----------------------------------------------------------- */
#include "mex.h"
/* Gateway ------------------------------------------------------------ */
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
double *Cpr;
mwIndex *Ir, *Jc;
size_t i, j, imax, m, n;
if( nlhs > 1 ) {
mexErrMsgTxt("Too many outputs.");
}
if( nrhs != 1 || !mxIsSparse(prhs[0]) ) {
mexErrMsgTxt("Need exactly one sparse input.");
}
Ir = mxGetIr(prhs[0]);
Jc = mxGetJc(prhs[0]);
m = mxGetM(prhs[0]);
n = mxGetN(prhs[0]);
plhs[0] = mxCreateDoubleMatrix(m,1,mxREAL);
Cpr = (double *) mxGetData(plhs[0]);
for( j=0; j<n; j++ ) {
imax = Jc[j+1] - Jc[j];
for( i=0; i<imax; i++ ) {
Cpr[*Ir++] = j+1;
}
}
}
And a sample run:
>> mex find_last_nz.c
Building with 'Microsoft Windows SDK 7.1 (C)'.
MEX completed successfully.
>> A = rand(10).*(rand(10)<.2)
A =
0.6443 0 0.3111 0 0 0 0.0377 0 0 0
0 0 0 0 0 0 0 0 0 0
0.8116 0 0 0.6028 0.8010 0 0 0.4942 0 0
0.5328 0 0 0 0 0 0 0 0 0
0 0 0.9049 0.2217 0 0 0.0987 0 0 0
0.9390 0 0 0.1174 0 0 0 0.9037 0 0.1679
0 0 0 0 0 0.6791 0 0 0 0
0 0 0 0.3188 0 0.3955 0 0 0 0
0 0.2277 0 0 0 0 0.1366 0 0 0.5005
0.5870 0.4357 0 0 0 0.9880 0 0 0.5767 0.4711
>> find_last_nz(sparse(A))
ans =
7
0
8
1
7
10
6
6
10
10
What is presented above is a one-pass approach that only looks at the internal sparse indexing arrays.
It is possible that it might be faster to transpose the matrix first so that I could pick off the column numbers directly from the internal arrays, but that would require a data copy and I haven't written and tested this approach. That is, picking off the last non-zero row number for each column would be an extremely fast operation in a mex routine, but for your case the matrix would have to be transposed first which would probably dominate the run-time so I haven't bothered to write and test it.
EDIT
A timing comparison with the m-code method posted by John:
>> A = sprand(10000,10000,.001);
>>
>> tic;[ir,ic] = find(A);maxcolind = accumarray(ir,ic,[10000,1],@max);toc
Elapsed time is 0.029285 seconds.
>> tic;[ir,ic] = find(A);maxcolind = accumarray(ir,ic,[10000,1],@max);toc
Elapsed time is 0.030831 seconds.
>> tic;[ir,ic] = find(A);maxcolind = accumarray(ir,ic,[10000,1],@max);toc
Elapsed time is 0.028965 seconds.
>>
>> tic;C=find_last_nz(A);toc
Elapsed time is 0.000260 seconds.
>> tic;C=find_last_nz(A);toc
Elapsed time is 0.000253 seconds.
>> tic;C=find_last_nz(A);toc
Elapsed time is 0.000278 seconds.
>>
>> isequal(maxcolind,C)
ans =
logical
1
So, the mex routine is about 100x faster for this particular example.

  6 Comments

Show 3 older comments
Ameer Hamza
Ameer Hamza on 20 Mar 2020
Benson, you just need to copy the James's code and paste it in a file named find_last_nz.c. Then run
mex find_last_nz.c
After that, you can call this function, like any other MATLAB function
find_last_nz(A)
Matt J
Matt J on 20 Mar 2020
Benson, you just need to copy the Matt's code and paste it in a file named find_last_nz.c. Then run
James's code, I think you mean.

Sign in to comment.


Ameer Hamza
Ameer Hamza on 19 Mar 2020
Edited: Ameer Hamza on 19 Mar 2020
For starters, do preallocation, It can speed up the execution if the matrix A has many rows. The following code compares the execution time for your code, BobH's code, and the code with pre-allocation. I intentionally made the A matrix with 10000 rows to show the difference
A = randi([1 5], 10000, 3);
% your code
tic
reorderRow = [];
for jRow = 1 : length(A(:,1))
eee = find(A(jRow,:),1,'last');
reorderRow = [reorderRow eee];
end
toc
% arrayfun method
tic
reorderRow = arrayfun(@(X) find(A(X,:),1,'last'), 1:length(A));
toc
% pre-allocating the array
tic
reorderRow = zeros(1, size(A,1));
for jRow = 1 : length(A(:,1))
eee = find(A(jRow,:),1,'last');
reorderRow(jRow) = eee;
end
toc
Result
Elapsed time is 0.068404 seconds. % your code (slowest)
Elapsed time is 0.048778 seconds. % array fun (medium)
Elapsed time is 0.015521 seconds. % pre-allocation (fastest)
Pre-allocation is about 3.5 times faster than the original code.

  5 Comments

Show 2 older comments
John D'Errico
John D'Errico on 19 Mar 2020
You also need to watch out for empty rows, as that will cause your code to fail, with this error:
"Unable to perform assignment because the left and right sides have a different number of elements."
And if the matrix is indeed sparse, that is an entirely possible event.

Sign in to comment.


Matt J
Matt J on 19 Mar 2020
Edited: Matt J on 19 Mar 2020
[~,idx]=max(fliplr(logical(A)),[],2);
result=size(A,2)+1-idx;

  6 Comments

Show 3 older comments
Benson Gou
Benson Gou on 20 Mar 2020
Hi, Matt,
I tried to use the transpose of matrix A and apply your code. But after the transposing, we need to find the index of the last non-zero element in each column instead of the first non-zero element. But logical only returns the index of the first non-zero element of each column.
The time does matter because we have to run it for more than 1000 times.
Thanks.
Benson
Matt J
Matt J on 20 Mar 2020
You need to use flipud() so that "first" becomes "last", similar to how fliplr() did this in my solution above for row-wise search.

Sign in to comment.

Sign in to answer this question.


Translated by