Redistribution of histogram type data in specified bins

Arnaud Samie on 25 Sep 2020
Answered: Steven Lord on 25 Sep 2020
I am trying to redistribute histogram data binned with a step of 4 into new bins with a step of 3, as pictured below, with input X = [4,16,8] and desired output Y = [3,9,10,6]. What is the most efficient way to do so? So far I have been using random draws, but that is too slow for my current needs.
  2 Comments
KALYAN ACHARJYA on 25 Sep 2020
From X = [4,16,8] to the desired output Y = [3,9,10,6], is there any redistribution logic? Might the third term be 12?
Arnaud Samie on 25 Sep 2020
Yes, total area must stay the same. sum(X) = sum(Y)
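
For reference, the numbers in the question are consistent with spreading each original count uniformly across its bin and summing the pieces that fall into each new bin. The following is a minimal sketch of that logic, not code from the thread. The bin edges 0:4:12 and 0:3:12 and the names oldEdges, newEdges and W are assumptions inferred from the stated step sizes. It relies on implicit expansion, so it assumes MATLAB R2016b or later.

```matlab
% Assumed inputs: counts from the question, edges inferred from the step sizes
X        = [4 16 8];
oldEdges = 0:4:12;   % step of 4 (assumption)
newEdges = 0:3:12;   % step of 3 (assumption)

% Overlap length between every old bin (rows) and every new bin (columns)
lo      = max(oldEdges(1:end-1).', newEdges(1:end-1));
hi      = min(oldEdges(2:end).',   newEdges(2:end));
overlap = max(hi - lo, 0);

% Fraction of each old bin that falls into each new bin
W = overlap ./ diff(oldEdges).';

% Redistribute the counts
Y = X * W;           % Y = [3 9 10 6]
```

Each row of W sums to 1 when the new edges cover the old range, so sum(Y) equals sum(X), which is the conservation property stated above.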

Answers (2)

Rik on 25 Sep 2020
You should be really careful with this resampling, especially with so few samples.
Since you're assuming a flat distribution within each bar, why not treat your histogram as a probability distribution function? Then you can use the area under the curve to calculate the new heights.
  2 Comments
Arnaud Samie on 25 Sep 2020
My "real" data has a lot more bins, and the desired new binning only slightly differs from the original, so it should be quite "safe". And it is indeed a probability distribution, so I will go with your solution. I assume you are thinking about using interp1?
Rik on 25 Sep 2020
Yes, that is in broad strokes the idea. See the code below for a rough sketch. You can probably do a lot to optimize.
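% Normalize the input histogram so it sums to 1
X=[1 4 2];X=X/sum(X);
N=numel(X);
x_center=get_bin_pos(N);
% Sample the histogram as a piecewise-constant function on [0,1]
xx=linspace(0,1,1000);
yy=interp1(x_center,X,xx,'nearest','extrap');
figure(1),clf(1)
plot(xx,yy)
% Cumulative area under the step function
tmp=cumtrapz(xx,yy);
% New binning: N equal-width bins on the same [0,1] range
N=4;
[x_center,x_right]=get_bin_pos(N);
Y=zeros(1,N);
for n=1:N
    x=x_right(n);
    % Cumulative area up to the right edge of bin n
    Y(n)=tmp(find(xx>=x,1,'first'));
    if n>1
        % Subtract the area already assigned to earlier bins
        Y(n)=Y(n)-sum(Y(1:(n-1)));
    end
end
Y=Y/sum(Y);
% Overlay the new bin heights and the resampled step function
hold on
plot(x_center,Y,'*')
hold off
yy2=interp1(x_center,Y,xx,'nearest','extrap');
hold on
plot(xx,yy2)
hold off
axis([0 1 0 1])

function [x_center,x_right]=get_bin_pos(N)
    % Centers and right edges of N equal-width bins on [0,1]
    x=linspace(0,1,2*N+1);
    x_center=x(2:2:end);%use center to interpolate the histogram
    x_right=x(3:2:end);%integrate up to right edge to find bin count
end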
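
The sketch above approximates the area on a 1000-point grid. Under the same flat-within-bin assumption, the cumulative count is piecewise linear between the original edges, so one possible shortcut is to interpolate it directly at the new edges and take differences. This is a hedged variant, not Rik's code. The edge vectors and variable names below are assumptions, and the new edges are assumed to lie within the original range (with the default linear method, interp1 returns NaN outside it).

```matlab
% Assumed inputs: counts from the question, edges inferred from the step sizes
X        = [4 16 8];
oldEdges = 0:4:12;   % assumption
newEdges = 0:3:12;   % assumption

% Cumulative count at each original edge (0 at the left-most edge)
cumOld = [0 cumsum(X)];

% Linear interpolation of the cumulative count at the new edges
cumNew = interp1(oldEdges, cumOld, newEdges);

% Count in each new bin = increase of the cumulative count across it
Y = diff(cumNew);
```

For the inputs in the question this should reproduce Y = [3 9 10 6] up to floating-point rounding, with no loop and no dense sampling grid.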


Steven Lord on 25 Sep 2020
>> x = randi(12, 1, 1000);
>> h = histogram(x, 0:4:12);
% Look at the histogram before running the next line of code
>> h.BinEdges = 0:3:12;
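
Changing BinEdges re-bins the underlying samples, so this route applies when the raw data x is still available, not when only the bin counts are. A non-graphical sketch of the same idea, assuming the histcounts(x, edges) syntax and the same example edges, could look like this (countsOld and countsNew are illustrative names):

```matlab
% Raw samples, as in the example above
x = randi(12, 1, 1000);

% Count the same samples with the original and the new edges
countsOld = histcounts(x, 0:4:12);   % three bins of width 4
countsNew = histcounts(x, 0:3:12);   % four bins of width 3
```

Both calls count every sample exactly once, so the totals agree without any assumption about how values are distributed inside a bin.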
