How to calculate the conditional probability of an event?
    Myriam Moss
 on 23 Apr 2021
  
    
    
    
    
    Answered: William
 on 25 Apr 2021
            I have an array like array = [A A B A C A B B B C C A A C]. I want to calculate p(C|A), p(C|B), and p(C|C), that is, the probability that C occurs immediately after a previous event A, B, or C. How can I do this with only this information?
Accepted Answer
  William
      
 on 23 Apr 2021
        Hello Myriam. It is not clear whether A, B, and C here are text characters or have numeric values. If we assume they have numeric values, such as A=1, B=2, and C=3, then you could use
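y = diff(array);      % difference between each element and the next
P_AC = sum(y==2);     % number of times A (1) is followed by C (3)
P_BC = sum(y==1);     % difference of 1: B followed by C, but also A followed by B
P_CC = sum(y==0);     % difference of 0: any repeated value, not only C followed by C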
More Answers (2)
  William
      
 on 25 Apr 2021
        Actually, I believe that p(C|A) would be:
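N_AC = length(strfind(array, 'A C'));   % number of times an 'A' is followed by a 'C'
y = strfind(array, 'A');                % positions of every 'A'
N_A = length(y);
p_CA = N_AC/N_A;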
There is one further thing to consider, though. The very last element of array may be an 'A'. I don't think this should be counted in N_A, because we don't know whether it would have been followed by a 'C' or not. So, if the last element of array is 'A', we should reduce N_A by 1.
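N_AC = length(strfind(array, 'A C'));   % number of times an 'A' is followed by a 'C'
y = strfind(array, 'A');
N_A = length(y);
% Guard against an array with no 'A' before indexing y(end)
if ~isempty(y) && (y(end)==length(array) || y(end)==length(array)-1)   % The string might end
    N_A = N_A - 1;                                                     % with an 'A' or an 'A '
end
p_CA = N_AC/N_A;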
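As a hand check of the ratio above, here is the arithmetic for the example string from the question, 'A A B A C A B B B C C A A C'. The hat notation is only an assumption used here to mark an estimate from counts; it does not appear in the original posts.

```latex
\hat{p}(C \mid A) = \frac{N_{AC}}{N_A} = \frac{2}{6} = \frac{1}{3}
```

In that string 'A' occurs six times and is never the last element, so no correction is needed. The six successors are A, B, C, B, A, C, and two of them are 'C'.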
  William
      
 on 25 Apr 2021
        Myriam,
If A, B and C were variables with the values 1, 2 and 3, then in your example:
array = [1, 1, 2, 1, 3, 1, 2, 2, 2, 3, 3, 1, 1, 3]
The diff() function returns the difference between each value and the next value, so
diff(array) = [0, 1, -1, 2, -2, 1, 0, 0, 1, 0, -2, 0, 2]
Every time an A is followed by a C, the difference is 2.  Every time a B is followed by a C, the difference is 1.  So, I was suggesting that you count the number of times A is followed by C by counting the number of times that the value 2 appears in diff(array) with a statement like  c = sum(diff(array) == 2).   Unfortunately, I see now that this does not work correctly for the number of times B is followed by C, because this results in a value of 1 in diff(array), and a value of 1 is also produced when an A is followed by a B.
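For the numeric case, one way around the ambiguity described above is to compare both members of each consecutive pair directly instead of using their difference. The following is a minimal sketch, assuming the encoding A=1, B=2, C=3 from this answer; the names prev, next, p_CA, p_CB and p_CC are illustrative and not part of the original posts.

```matlab
% Assumed encoding: A=1, B=2, C=3
array = [1, 1, 2, 1, 3, 1, 2, 2, 2, 3, 3, 1, 1, 3];

prev = array(1:end-1);   % every element that has a successor
next = array(2:end);     % the element that follows it

% Count each transition by testing both members of the pair
N_AC = sum(prev == 1 & next == 3);
N_BC = sum(prev == 2 & next == 3);
N_CC = sum(prev == 3 & next == 3);

% Divide by how often each value appears as a predecessor.
% The last element is excluded automatically because it is not in prev.
p_CA = N_AC / sum(prev == 1);
p_CB = N_BC / sum(prev == 2);
p_CC = N_CC / sum(prev == 3);
```

If a value never appears as a predecessor, the corresponding division is 0/0 and yields NaN, which may need separate handling.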
Since you have said that A, B and C are characters, I assume that you mean that:
array = 'A A B A C A B B B C C A A C';
In this case, maybe a better solution would be:
    y = strfind(array, 'A C');
    N_AC = length(y);
    y = strfind(array, 'B C');
    N_BC = length(y);
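The strfind approach above can be extended to all three probabilities asked for in the question. This is a rough sketch, assuming single-character symbols separated by exactly one space. The use of strtrim and of the patterns 'A ', 'B ' and 'C ' for the denominators is an assumption added here: after trimming, a symbol followed by a space is a symbol that has a successor, so a trailing symbol is not counted.

```matlab
array = 'A A B A C A B B B C C A A C';
s = strtrim(array);                  % remove any leading or trailing spaces

% Numerators: number of times each symbol is followed by 'C'
N_AC = length(strfind(s, 'A C'));
N_BC = length(strfind(s, 'B C'));
N_CC = length(strfind(s, 'C C'));

% Denominators: occurrences of each symbol that have a successor
N_A = length(strfind(s, 'A '));
N_B = length(strfind(s, 'B '));
N_C = length(strfind(s, 'C '));

p_CA = N_AC / N_A;
p_CB = N_BC / N_B;
p_CC = N_CC / N_C;
```

With a different separator, or with multi-character symbols, the patterns would need to change accordingly.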