How to plot bivariate cumulative probability centred to a point?

Benoit Espinola on 7 Aug 2018
Commented: Benoit Espinola on 7 Aug 2018
Hi,
I am trying to create a probability contour plot from a scatter plot centred in the areas with the largest density.
Basically going from this (the scatter plot):
To this (a hand-drawn sketch of nested coloured regions around the densest area):
Where:
Probability for a point to be in the red area is 68%
Probability for a point to be in the yellow area is 95%
Probability for a point to be in the green area is 99.7%
Probability for a point to be in the blue area is 99.99%
(I am not sure about my colour choices, I should probably have drawn it from blue being 68% to red being 99.99%).
I have tried to use histcounts2 using the following code:
x = randn(1000,1) + randn(1000,1);
y= randn(1000,1) + randn(1000,1);
X = [x,y];
N=histcounts2(x,y);
N1 = histcounts2(x,y,'Normalization','countdensity');
N2 = histcounts2(x,y,'Normalization','cumcount');
N3 = histcounts2(x,y,'Normalization','probability');
N4 = histcounts2(x,y,'Normalization','pdf');
N5 = histcounts2(x,y,'Normalization','cdf');
figure
subplot(3,3,1);
scatter(x,y, '.');
title('Scatter plot of the data');
subplot(3,3,2); %to visualize the density for each bin
h = histogram2(x,y,'DisplayStyle','tile','ShowEmptyBins','on');
title('Density plot from the data - using histogram2');
subplot(3,3,3);
contour(N);
colorbar
title('contour plot - using histcounts2 - default normalization');
subplot(3,3,4);
contour(N1);
colorbar
title('contour plot - using histcounts2 - countdensity normalization');
subplot(3,3,5);
contour(N2);
colorbar
title('contour plot - using histcounts2 - cumcount normalization');
subplot(3,3,6);
contour(N3);
colorbar
title('contour plot - using histcounts2 - probability normalization');
subplot(3,3,7);
contour(N4);
colorbar
title('contour plot - using histcounts2 - pdf normalization');
subplot(3,3,8);
contour(N5);
colorbar
title('contour plot - using histcounts2 - cdf normalization');
I get the following outcome from running this code (note that outcomes might vary due to the random nature of the first two lines of code):
N =
0 0 0 0 0 1 0 0 0 0
0 0 0 1 0 1 0 0 0 0
0 0 1 7 2 3 1 1 0 0
0 1 1 15 15 14 12 0 0 0
2 3 7 31 43 40 32 10 1 0
0 2 22 40 65 64 46 22 4 1
0 3 22 42 71 61 38 8 5 1
0 3 7 21 46 51 25 11 1 1
0 0 3 6 15 11 10 6 0 0
0 0 2 2 5 1 4 2 0 0
0 0 0 1 2 1 0 1 0 0
This result is expected, because 'cumcount' and 'cdf' accumulate from the origin of the matrix (cf. the MATLAB documentation for histcounts2, Output Arguments, N).
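To make the corner-anchored behaviour concrete, here is a minimal sketch (the fixed seed and the variable names Ncum and Nmanual are my own additions) that should reproduce 'cumcount' by running a cumulative sum over the raw counts along both dimensions:

```matlab
rng(0);                                   % fixed seed, only so the check is repeatable
x = randn(1000,1) + randn(1000,1);
y = randn(1000,1) + randn(1000,1);

[N, xe, ye] = histcounts2(x, y);          % raw counts plus the automatically chosen bin edges
Ncum = histcounts2(x, y, xe, ye, 'Normalization', 'cumcount');

% Accumulate down the rows, then across the columns, starting at N(1,1).
Nmanual = cumsum(cumsum(N, 1), 2);

disp(isequal(Ncum, Nmanual));             % compares the built-in result with the manual one
```

Because the accumulation always starts at the first bin in both x and y, the contours of 'cumcount' and 'cdf' grow from that corner rather than from the densest bin.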
Is there any way to make 'cumcount' or 'cdf' accumulate outward from the densest area, so that the output resembles the hand-drawn one? Basically, I would like to centre it on the point (5,7) of the contour plot of N, i.e. N(7,5), as that is its largest value (71).
Thank you in advance,
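One possible way to get a cumulative probability that grows outward from the densest bins, offered as a sketch rather than a definitive method: rank the bins from densest to sparsest, accumulate their probabilities in that order, and write the running totals back to the grid. This swaps the corner-anchored 'cdf' for a highest-density ranking; the names C, order, xc, yc and levels are mine, and bins with equal counts are ranked in whatever order sort returns them.

```matlab
rng(0);                                   % fixed seed, only for repeatability
x = randn(1000,1) + randn(1000,1);
y = randn(1000,1) + randn(1000,1);

% Share of the points that falls in each bin (sums to 1 over the grid).
[P, xe, ye] = histcounts2(x, y, 'Normalization', 'probability');

% Rank the bins from densest to sparsest and accumulate their probability.
[pSorted, order] = sort(P(:), 'descend');
cumSorted = cumsum(pSorted);

% C(i,j) = total probability of all bins ranked at or before bin (i,j).
C = zeros(size(P));
C(order) = cumSorted;

% Bin centres, so the contours are drawn in data coordinates.
xc = (xe(1:end-1) + xe(2:end)) / 2;
yc = (ye(1:end-1) + ye(2:end)) / 2;

levels = [0.68 0.95 0.997];               % assumed levels, taken from the question
figure
scatter(x, y, '.');
hold on
% Rows of C index x, but contour expects rows to index y, hence the transpose.
contour(xc, yc, C.', levels, 'LineWidth', 1.5);
hold off
colorbar
title('Cumulative probability ranked from the densest bin outward');
```

With only 1000 points the 99.99% level is not meaningful (a single point already carries 0.1% of the mass), so it is left out here. A finer grid or more data would give smoother contours.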
2 Comments
Benoit Espinola on 7 Aug 2018
Would it make sense if I made:
x = randn(1000,1) + randn(1000,1);
y= randn(1000,1) + randn(1000,1);
X = [x,y];
N = 1-histcounts2(x,y,'Normalization','pdf');
figure();
contour(N);
colorbar
?
I get this:
Benoit Espinola on 7 Aug 2018
Running the last comment several times, I get this figure:
implying I have two areas with 93% probability... I would think that the two areas combined have a 93% probability to find a point of the scatter plot (and not each area individually).
Am I wrong?
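A note on the two comments above, with a sketch to check it: 'pdf' normalization returns a density (count divided by the total count times the bin area), so 1 minus that value is not itself a probability. Probability comes from summing density times bin area over a set of bins, and that sum covers every bin beyond the threshold whether the bins form one region or several. The threshold below is a hypothetical value chosen only for illustration.

```matlab
rng(0);                                   % fixed seed, only for repeatability
x = randn(1000,1) + randn(1000,1);
y = randn(1000,1) + randn(1000,1);

[D, xe, ye] = histcounts2(x, y, 'Normalization', 'pdf');

% Area of every bin, laid out like D (rows follow x, columns follow y).
binArea = diff(xe(:)) * diff(ye(:)).';

% Density times area, summed over the whole grid, should come to 1.
total = sum(D(:) .* binArea(:));

% Probability held by all bins at or above a density threshold,
% regardless of whether those bins are connected to each other.
threshold = 0.5 * max(D(:));              % hypothetical threshold
inside = D >= threshold;
mass = sum(D(inside) .* binArea(inside));

fprintf('total = %.4f, mass above threshold = %.4f\n', total, mass);
```

Under this reading, two disjoint areas enclosed by the same contour level share a single combined probability rather than each holding that probability separately.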


Answers (0)
