Regarding Mutual information calculation of two binary strings

I am trying to compute the mutual information between two binary strings. I have written the following code for it:
clear all;
S=textread('ecoli_profiles.txt');
data=unique(S,'rows');
for i=1:length(data)
for j=1:length(data)
x=data(i,1:70);
y=data(j,1:70);
P1=0; Q1=0; R1=0; S1=0;
P=sum(~x & ~y);
if P==0;
P1=1;
else
P1=P;
end
Q=sum(~x & y);
if Q==0;
Q1=1;
else
Q1=Q;
end
R=sum(x & ~y);
if R==0;
R1=1;
else
R1=R;
end
S=sum(x & y);
if S==0;
S1=1;
else
S1=S;
end
J = [P1,Q1;R1,S1]/70.0;
MI(i,j) = sum(sum(J.*log2(J./(sum(J,2)*sum(J,1))))); % index by (i,j); i*j would collide for pairs such as (1,2) and (2,1)
display(i);
end
end
B = reshape(MI,length(data),length(data));
csvwrite('MI3121.csv',B);
The problem is that I set P1, Q1, R1, and S1 to 1 whenever the corresponding count comes out as zero. The code runs fine, but the result is ambiguous. Can anyone resolve the problem? I hope a different way of handling the logarithm would help.

Accepted Answer

Alfonso Nieto-Castanon on 16 Jul 2014
When computing mutual information you may assume that 0*log(0) == 0. In your code you could remove all the "if XXX==0" checks (building J from P, Q, R, S directly) and, when computing MI, use something like this instead:
MI(i,j) = sum(sum(J.*log2(max(eps,J./(sum(J,2)*sum(J,1))))));
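As a minimal sketch of that suggestion, the loop below applies the max(eps, ...) guard to a small made-up data matrix (the values in data are purely illustrative and are not taken from ecoli_profiles.txt). It stores each result at MI(i,j), on the assumption that one value per ordered pair of rows is wanted; an index such as i*j would send distinct pairs like (1,2) and (2,1) to the same row.

```matlab
% Toy binary profiles, one per row (illustrative values only).
data = [1 1 0 0 1 0;
        1 1 0 0 1 0;
        0 0 1 1 0 1;
        1 0 1 0 1 0];
[M, N] = size(data);
MI = zeros(M, M);
for i = 1:M
    for j = 1:M
        x = logical(data(i,:));
        y = logical(data(j,:));
        % Counts of the four (x,y) combinations; zeros are left as zeros.
        n00 = sum(~x & ~y);
        n01 = sum(~x & y);
        n10 = sum(x & ~y);
        n11 = sum(x & y);
        J = [n00, n01; n10, n11] / N;      % joint distribution
        Pxy = sum(J,2) * sum(J,1);         % product of marginals (2x2)
        % max(eps, ...) keeps log2 finite, so empty cells contribute 0.
        MI(i,j) = sum(sum(J .* log2(max(eps, J ./ Pxy))));
    end
end
disp(MI);
```

Because the zero counts are no longer replaced by 1, the entries of J sum to 1 and the result is a proper mutual information in bits.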
2 comments
Alfonso Nieto-Castanon on 16 Jul 2014
Edited: Alfonso Nieto-Castanon on 16 Jul 2014
While I am at it, you might as well optimize the code a bit and use something like:
S = textread('ecoli_profiles.txt');
data = unique(S,'rows');
data = double(data(:,1:70)); % your binary data
N = size(data,2);
P = (1-data)*(1-data)'/N; % your binary probs (for all data pairs)
Q = (1-data)*data'/N;
R = Q'; % = data*(1-data)'/N;
S = 1-P-Q-R; % = data*data'/N;
H = @(x)x.*log2(max(eps,x)); % x.*log2(x) terms (negated entropy), with 0*log2(0) treated as 0
MI = H(P)+H(Q)+H(R)+H(S) ... % Mutual information matrix
-H(P+Q)-H(R+S)-H(P+R)-H(Q+S);
csvwrite('MI3121.csv',MI);
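The MI expression in the code above follows from writing the mutual information as a difference of sums of p log2 p terms. As a brief sketch, write h(p) = p log2 p (with h(0) = 0) for what the code calls H, and let P, Q, R, S be the joint probabilities of (x,y) = (0,0), (0,1), (1,0), (1,1), so that the marginals are p_x(0) = P+Q, p_x(1) = R+S, p_y(0) = P+R, p_y(1) = Q+S:

```latex
\begin{aligned}
I(x,y) &= \sum_{a,b\in\{0,1\}} p(a,b)\,\log_2\frac{p(a,b)}{p_x(a)\,p_y(b)} \\
       &= \sum_{a,b} h\bigl(p(a,b)\bigr) - \sum_{a} h\bigl(p_x(a)\bigr) - \sum_{b} h\bigl(p_y(b)\bigr) \\
       &= h(P)+h(Q)+h(R)+h(S) - h(P+Q) - h(R+S) - h(P+R) - h(Q+S).
\end{aligned}
```

The second line uses the fact that summing p(a,b) log2 p_x(a) over b leaves p_x(a) log2 p_x(a), and likewise for y.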
Alfonso Nieto-Castanon on 16 Jul 2014
Edited: Alfonso Nieto-Castanon on 16 Jul 2014
Yes, the mutual information I(x,x) is equal to the entropy H(x), which can take any value between 0 and 1 bit depending on the proportion of 1s and 0s in x.
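A quick numerical check of this identity, as a sketch on a made-up vector (x below is illustrative, and the helper name h is an assumption of this example rather than anything from the original code):

```matlab
% Illustrative binary string with 25% ones.
x = logical([1 0 0 0 1 0 0 0]);
N = numel(x);
h = @(p) p .* log2(max(eps, p));   % p*log2(p), with 0*log2(0) treated as 0

% Joint probabilities of the pair (x,x): the mixed cells are always empty.
P = sum(~x & ~x) / N;   % both 0
Q = sum(~x & x) / N;    % always 0
R = sum(x & ~x) / N;    % always 0
S = sum(x & x) / N;     % both 1

Ixx = h(P)+h(Q)+h(R)+h(S) - h(P+Q) - h(R+S) - h(P+R) - h(Q+S);
Hx  = -h(mean(x)) - h(1 - mean(x));   % entropy of x in bits

fprintf('I(x,x) = %.4f, H(x) = %.4f\n', Ixx, Hx);
```

For a balanced string (half ones, half zeros) both values reach 1 bit, and for a constant string both are 0.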
