Getting a percentile from a histogram

Gurpreet Kaur on 9 Feb 2023
Commented: Jeff Miller on 10 Feb 2023
Say I have a large dataset that I have plotted as a histogram with a given BinWidth. How would I get, say, the 95th percentile using only the data in that histogram, and NOT the large dataset itself?
I want to use only the histogram data because I wish to test how changing the BinWidth changes the value of the percentile.
Any help is appreciated!

Accepted Answer

Jeff Miller on 9 Feb 2023
Usually this is done by linear interpolation within the relevant bin. To stick with your 95th percentile example, suppose you have a bin of scores from 20 to 30 and you know from the other bins that 90% of scores are less than 20. If 8% of scores fall in the 20-30 bin, then 30 is at the 98th percentile. So the value at the 95th percentile is estimated as x95 = 20 + (30-20)*(95-90)/(98-90) = 26.25.
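To make the arithmetic concrete, here is a minimal sketch of that one-bin interpolation. The variable names are only illustrative, and the numbers are the ones from the example above.

```matlab
% One-bin linear interpolation using the numbers from the example
lowedge  = 20;   % lower edge of the relevant bin
highedge = 30;   % upper edge of the relevant bin
pctbelow = 90;   % percent of scores below the lower edge
pctinbin = 8;    % percent of scores inside the bin
target   = 95;   % percentile we want to estimate

% fraction of the way through the bin at which the target percentile falls
frac = (target - pctbelow) / pctinbin;
x95  = lowedge + (highedge - lowedge) * frac;
fprintf('Estimated 95th percentile: %.2f\n', x95);
```

This assumes scores are spread evenly within the bin, which is why the estimate can shift when the bin edges change.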
2 Comments
Gurpreet Kaur on 9 Feb 2023
Hmm, could you provide some example code so I can understand this a bit better? The "large set of data" can just be generated randomly.
Jeff Miller on 10 Feb 2023
samplesize = 1000;
targetpctile = 65;
lsd = randn(samplesize,1); % large set of normal(0,1) data
binedges = -10:0.01:10; % make sure this covers the whole data range
bincounts = histcounts(lsd,binedges);
% cumulative percent at the upper edge of each bin, minus a small offset
bincumpcts = 100*cumsum(bincounts)/samplesize - 0.5/samplesize;
% first bin whose cumulative percent reaches the target
relevantbin = find(bincumpcts>=targetpctile,1);
% Note that the relevant bin in binedges goes from binedges(relevantbin) to binedges(relevantbin+1)
startxat = binedges(relevantbin);
binsizex = binedges(relevantbin+1) - startxat;
% cumulative percent below the relevant bin (0 if it is the first bin)
if relevantbin > 1
    startpat = bincumpcts(relevantbin-1);
else
    startpat = 0;
end
binsizep = bincumpcts(relevantbin) - startpat;
% linear interpolation within the relevant bin
pctileest = startxat + binsizex * (targetpctile - startpat) / binsizep;
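Since the original goal was to see how BinWidth affects the estimate, the same interpolation can be repeated over several bin widths. The following is only a sketch under some assumptions: the seed, the list of widths, and the variable names are made up for illustration, and it prepends 0 to the cumulative percents instead of using the small offset above.

```matlab
% Sketch: estimate one percentile from histogram counts at several bin widths
rng(0);                          % fixed seed so runs are repeatable (assumption)
data = randn(1000,1);            % stand-in for the large dataset
targetpctile = 95;
binwidths = [0.01 0.1 0.5 1 2];  % hypothetical widths to compare
estimates = zeros(size(binwidths));
for k = 1:numel(binwidths)
    edges  = -10:binwidths(k):10;                   % must cover the whole data range
    counts = histcounts(data, edges);
    cumpcts = [0, 100*cumsum(counts)/sum(counts)];  % percent of data below each edge
    upper = find(cumpcts >= targetpctile, 1);       % edge that closes the relevant bin
    lower = upper - 1;                              % edge that opens the relevant bin
    % linear interpolation between the two edges of the relevant bin
    frac = (targetpctile - cumpcts(lower)) / (cumpcts(upper) - cumpcts(lower));
    estimates(k) = edges(lower) + frac * (edges(upper) - edges(lower));
end
disp([binwidths(:), estimates(:)]);   % column 1: bin width, column 2: estimate
```

With narrow bins the estimate should sit close to the sample percentile; with wide bins the within-bin linearity assumption matters more, so the estimate tends to drift.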


More Answers (1)

Image Analyst on 10 Feb 2023
Try this:
histObj = histogram(data)
Now look at all the properties of histObj. Also use the cumsum function to get the CDF.
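As a rough sketch of that suggestion, the counts and edges can be read from the histogram object and turned into a cumulative percent curve. The random data and the BinWidth value here are placeholders; Values, BinEdges, and BinWidth are properties of the object returned by histogram.

```matlab
% Sketch: read counts and edges back from a histogram object
data = randn(1000,1);                      % placeholder data
histObj = histogram(data, 'BinWidth', 0.25);
counts = histObj.Values;                   % count in each bin
edges  = histObj.BinEdges;                 % numel(counts)+1 bin edges
cdfpct = 100*cumsum(counts)/sum(counts);   % cumulative percent at each bin's upper edge
```

From here, cdfpct and edges are all that is needed for the within-bin interpolation described in the accepted answer.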
1 Comment
Jason on 10 Feb 2023
Hello Professor, I see that you have a deep understanding of MATLAB image processing. Could you please take a look at this problem? I'd appreciate it a lot!
https://ww2.mathworks.cn/matlabcentral/answers/1909550-droplet-size-calculation-of-a-spray-image


Release

R2022b
