Find row in matrix, fast and with special conditions
2 visualizaciones (últimos 30 días)
Mostrar comentarios más antiguos
I have a matrix (S) and one row out of that matrix (s). I want to get the rownumber of (s) in (S).
(S) has unique rows. The row (s) is allways member of (S)-rows.
So far I calculate the row number with:
I = find(all(bsxfun(@eq,S,s),2));
I am looking for a faster method. I call this line alot, it costs me hours of calculation time.
Here's a test example:
% create example, N and n could change
n = 4; % n < 10
N = 25; % range(N) ~ [20 100]
S = nchoosek(1:N,n);
for i = 1:10
s = S(randi(size(S,1)),:);
tic
I = find(all(bsxfun(@eq,S,s),2));
toc
end
[Edit2], poorly expressed "(S) has unique rows". It should express that I is allways scalar.
[Edit], dimensional details.
My example shows the general direction of the sizes.
[d1 d2] = size(S); % its [12650x4] in my example
First dimension will be much larger then the second one. (d1 >> d2)
Dimension two (d2=n) will be smaller then 10. Numel(S) will be limited by Memory.
1 comentario
Jan
el 25 de Mayo de 2013
What is the usual dimension of S? It matters if you are talking about 10x4, 10'000x20 or 20x10'000 matrices.
Respuesta aceptada
Jan
el 25 de Mayo de 2013
Editada: Jan
el 25 de Mayo de 2013
If the first dimension of S is large, the BSXFUN approach checks of a lot of numbers although they are rejected in the former columns already. A small Matlab function can avoid testing already rejected rows:
function Index = FindRow(S, s)
nCol = size(S, 2);
match = S(:, 1) == s(1);
for col = 2:nCol
match(match) = S(match, col) == s(col);
end
Index = find(match);
For Image Analyst's example under Matlab 2009a/64/Win7:
S = randi(3, 200, 3);
s = [2 3 1];
tic, for k=1:1000; m = find(all(bsxfun(@eq,S,s),2)); end; toc
tic; for k=1:1000; m = find(ismember(S, s, 'rows')); end; toc
tic; for k=1:1000; m = FindRow(S, s); end; toc
% 0.032688 sec
% 0.433527 sec
% 0.021643 sec
But for S = randi(30, 2000, 20), s=S(1000, :):
% 0.124566 sec
% 3.175940 sec
% 0.194101 sec
[EDITED] It is easier in C to stop the search after the first matching row is found. This saves 50% processing time in the average. In addition no temporary arrays are required:
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
double *S, *s;
mwSize n1, n2, i1, i2, k;
S = mxGetPr(prhs[0]);
n1 = mxGetM(prhs[0]);
n2 = mxGetN(prhs[0]);
s = mxGetPr(prhs[1]);
for (i1 = 0; i1 < n1; i1++) {
k = i1;
for (i2 = 0; i2 < n2; i2++) {
if (S[k] != s[i2]) {
break;
}
k += n1;
}
if (i2 == n2) { // Matching row found:
plhs[0] = mxCreateDoubleScalar((double) (i1 + 1));
return;
}
}
// No success:
plhs[0] = mxCreateDoubleScalar(mxGetNaN());
}
For S = randi(30, 2000, 20) I get (compiled by MSVC2008):
BSXFUN: 0.123305 sec
FindRow.m: 0.194914 sec
ISMEMBER: 3.172425 sec
FindRow.mex: 0.011098 sec (s is last row)
0.007880 sec (s is 1000th row)
0.004753 sec (s is first row)
The MEX has an average speedup of factor 15. For a [20000 x 200] matrix we get a factor of 260 already.
3 comentarios
Jan
el 27 de Dic. de 2013
Dear Jacob L,
The request you've sent me by email would have been a perfect question for this forum. Therefore I answer here, how to modify this function to work for UINT8 arrays:
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
uint8_T *M, *v;
mwSize m, n, i1, i2, k;
if (nrhs != 2 || !mxIsUint8(prhs[0]) || !mxIsUint8(prhs[1])) {
mexErrMsgIdAndTxt("MEX:FindRow:BadInput",
"2 UINT8 arrays required as input.");
}
M = (uint8_T *) mxGetData(prhs[0]);
m = mxGetM(prhs[0]);
n = mxGetN(prhs[0]);
v = (uint8_T *) mxGetData(prhs[1]);
if (mxGetNumberOfElements(prhs[1]) != n) {
mexErrMsgIdAndTxt("MEX:FindRow:BadSize",
"Sizes of inputs do not match.");
}
for (i1 = 0; i1 < m; i1++) {
k = i1;
for (i2 = 0; i2 < n; i2++) {
if (M[k] != v[i2]) {
break;
}
k += m;
}
if (i2 == n) { // Matching row found:
plhs[0] = mxCreateDoubleScalar((double) (i1 + 1));
return;
}
}
// No success:
plhs[0] = mxCreateDoubleMatrix(0, 0, mxREAL);
}
Mark
el 13 de Mzo. de 2015
Hi,
I used your your original mex file in matlab and tried to change it. I'm also searching for fast ways to find a row matrix s in a large matrix S. However the row matrix s appears a number of times in large matrix S.
if true
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[], mwSize, lastfound)
{
double *S, *s *lastfound;
mwSize n1, n2, i1, i2, k;
S = mxGetPr(prhs[0]);
n1 = mxGetM(prhs[0]);
// n2 = mxGetN(prhs[0]);
n2 = 3;
s = mxGetPr(prhs[1]);
for (i1 = lastfound; i1 < n1; i1++) {
k = i1;
for (i2 = 0; i2 < n2; i2++) {
if (S[k] != s[i2]) {
break;
}
k += n1;
}
if (i2 == n2) { // Matching row found:
plhs[0] = mxCreateDoubleScalar((double) (i1 + 1));
return;
}
}
// No success:
plhs[0] = mxCreateDoubleScalar(mxGetNaN());
}
end
So i tried to use another input 'lastfound' which would enable the code to start at lastfound + 1 (the +1 is added when the function is called), so that the previous row s is found, and the code start searching the second row s. I do this until the mex file return NaN. It however doesn't work, the code doesn't start the at the lastfound + 1 step. I am a C noob, so i think it is a very small error, but i would immensely appreciate it if somebody could take a look at it.
Thanks in advance :)
Más respuestas (2)
Image Analyst
el 25 de Mayo de 2013
Try this:
% Generate some sample data:
m = randi(3, 200, 3)
% Specify what we're looking for:
lookingFor = [2 3 1]; % Let's say we're looking for this
% Find out what rows that pattern occurs in:
rowsItOccursIn = find(ismember(m, lookingFor, 'rows'))
5 comentarios
Image Analyst
el 25 de Mayo de 2013
What are the actual sizes of n, N, and S? For 4 and 25, it DEFINITELY should not take "hours of calculation time" like you said.
Teja Muppirala
el 26 de Mayo de 2013
Not sure if this is faster, but might be worth a try.
d1 = 12650;
d2 = 4;
S = randn(d1,d2);
s = S(randi(d1),:);
c = 1;
r = 1;
while c <= d2
if S(r,c) == s(c)
c = c+1;
else
r = r+1;
c = 1;
end
end
isequal(S(r,:),s)
If the big S is the same every time, you could make a better algorithm by sorting it in advance.
1 comentario
Ver también
Categorías
Más información sobre Loops and Conditional Statements en Help Center y File Exchange.
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!