How can I extract a certain 'cluster' of elements according to a particular condition on the elements?

6 visualizaciones (últimos 30 días)
I have a matrix (about 342 by 342) denoted by C(k,l) and I want to identify all cluster of indices of the original according to the condition C(k,l) > rho. I.e. I want all square matrices C'(a,b) of C(k,l) such that C'(a,b) > rho for all pairs of indices a and b
For example, if I have the matrix C(i,j) as:
C = 1 0.8 0.7
0.8 1 0.5
0.7 0.5 1
And rho = 0.6 then a correct square matrix I want my code to identify is:
C'= 1 0.7
0.7 1
This is not unique of course and the result as given by the example above is not necessarily a submatrix. I am not sure how/the best way to do this is in MATLAB? If possible, I would also like identify what a and b are for each possible matrix e.g. for my example above a and b can be 1 or 3. The matrices are always symmetric and the diagonal entries are always 1.
  8 comentarios
Ansh
Ansh el 22 de En. de 2016
Editada: Ansh el 22 de En. de 2016
Kirby Fears,
Yes each submatrix is required to have 1's on the diagonal. What do you mean exactly by taking submatrices along the diagonal?
Many thanks
Ansh
Ansh el 22 de En. de 2016
Image Analyst,
The matrices being considered are correlation matrices. I am using a clustering procedure to find a cluster of indices (in this case indices are stocks) that are highly correlated to each other. This involves extracting a square submatrix from the original matrix such that of all of its entries are >= rho as described in the problem. In the actual correlation matrices to be used (which are taken from empirical data) it is possible that this may not give a unique cluster (possibly not given the size), hence why I have asked for all such submatrices. Does this help in anyway?

Iniciar sesión para comentar.

Respuestas (2)

Kirby Fears
Kirby Fears el 21 de En. de 2016
Editada: Kirby Fears el 21 de En. de 2016
Assuming you only want to find submatrices along the diagonal of C, the following code extracts all square submatrices (>rho) into a table S. This should be a good starting point for whatever assumptions you end up deciding on.
% make data
sizeC = 342;
rho = 0.6;
c = rand(sizeC);
c(1:(sizeC+1):end) = 1;
% prep
S = cell((sizeC-2)*(sizeC-1),3);
varNames = {'S','sizeS','diagC'};
idxRho = c>rho;
counterS = 1;
% traverse submatrix size
for sizeS = (sizeC-1):-1:2,
% traverse diagonal of c
for d = 1:(sizeC-sizeS),
% store valid submatrix with meta info
if all(idxRho(d:(d+sizeS-1),d:(d+sizeS-1))),
S(counterS,:) = {c(d:(d+sizeS-1),d:(d+sizeS-1)),...
sizeS,d};
counterS = counterS + 1;
end
end
end
% drop extra rows of S
if counterS<=size(S,1),
S(counterS:end,:)=[];
end
% convert S to table
S = array2table(S,'VariableNames',varNames);
Hope this helps.
  9 comentarios
Ansh
Ansh el 27 de En. de 2016
Editada: Ansh el 27 de En. de 2016
Hi Stephen, but this doesn't produce the 3 by 3 cluster I have detailed in the comment to your code. (There could have been a delay in posting that further comment - sorry).
Ansh
Ansh el 27 de En. de 2016
Editada: Ansh el 27 de En. de 2016
Hi Kirby Fears,
I was attempting to use the sets of indices to clarify what it is I wanted, obviously it seems as if it has had the opposite effect. Thank you for taking the time to answer anyway. In hindsight, this question could have been posed better from the start to avoid the confusion caused later. Many thanks for the best wishes.

Iniciar sesión para comentar.


Stephen23
Stephen23 el 23 de En. de 2016
Editada: Stephen23 el 24 de En. de 2016
Assuming that the input matrix is always square and symmetric:
>> D = [1,0.8,0.9,0.5;0.8,1,0.6,0.1;0.9,0.6,1,0.7;0.5,0.1,0.7,1]
D =
1 0.8 0.9 0.5
0.8 1 0.6 0.1
0.9 0.6 1 0.7
0.5 0.1 0.7 1
>> rho = 0.6;
>> [R,C] = find(tril(D,-1)>rho);
>> out = arrayfun(@(r,c)D([r,c],[r,c]),R,C,'UniformOutput',false);
>> out{:}
ans =
1 0.8
0.8 1
ans =
1 0.9
0.9 1
ans =
1 0.7
0.7 1
  5 comentarios
Ansh
Ansh el 28 de En. de 2016
Editada: Ansh el 28 de En. de 2016
Thank you Stephen for your answer, it is partly that reason why I first posted on here. I shall go away and think of an alternative procedure. If I can analyse the data I have and come up with an upper bound on the size of the cluster that could help.

Iniciar sesión para comentar.

Categorías

Más información sobre Descriptive Statistics en Help Center y File Exchange.

Etiquetas

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by