How to randomly select the datapoints in a vector based on percentage for each group?
Josephine Bernadette
on 15 Dec 2024
Answered: Walter Roberson
on 16 Dec 2024
Divide the data values into groups using the percent distributions of the data values: Group 1=25%, Group 2=30%, Group 3=20%, Group 4=25%.
0 comments
Answers (3)
DGM
on 15 Dec 2024
There has to be a smarter way than this, but I guess this is one idea.
% some fake data
x = randn(1000,1);
% the percentiles (should be a unit sum)
prct = cumsum([0.25 0.30 0.20 0.25])
% where they lie in data units
pval = prctile(x,prct*100)
% bin the data
nbins = numel(prct);
xbinned = cell(nbins,1);
for k = 1:nbins
    switch k
        case 1
            mask = x <= pval(k);
        case nbins
            mask = x > pval(k-1);
        otherwise
            mask = x > pval(k-1) & x <= pval(k);
    end
    xbinned{k} = x(mask);
end
xbinned
... that's assuming I understand the question correctly.
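If the percentile interpretation is the right one, the loop can probably be replaced by discretize. The following is only a sketch under that assumption: the fixed seed and the variable names frac, cuts, edges and grp are illustrative and not part of the answer above.

```matlab
% Sketch: percentile binning with discretize instead of an explicit loop.
rng(0);                                   % fixed seed, only for reproducibility
x = randn(1000,1);                        % fake data, as above
frac = [0.25 0.30 0.20 0.25];             % group fractions (should sum to 1)
cuts = cumsum(frac(1:end-1)) * 100;       % interior percentiles: 25, 55, 75
pv = reshape(prctile(x, cuts), 1, []);    % cut points in data units, as a row
edges = [-Inf, pv, Inf];                  % open-ended outer bins catch everything
% 'IncludedEdge','right' makes each bin (lo, hi], matching the <= tests above
grp = discretize(x, edges, 'IncludedEdge', 'right');
% collect the members of each group into a cell array
xbinned = accumarray(grp, x, [numel(frac) 1], @(v) {v});
counts = cellfun(@numel, xbinned).'
```

The vector grp holds one group number per element of x, which may be more convenient than the cell array if the groups are only needed as labels.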
1 comment
DGM
on 16 Dec 2024
... I just realized the question said "randomly", so I probably completely misinterpreted the question.
Star Strider
on 15 Dec 2024
I’m not certain if you want to apportion them as they exist in the original vector, or if you want to apportion them by ascending value (essentially their percentile ranks).
Here are two methods of apportioning them:
x = randn(153,1);
L = numel(x);
gv = [25 30 20 25];                                      % Group Percentages (Should Sum To 100)
g1 = diff([0 round(cumsum(gv)*L/100)]);                  % Group Sizes, Rounded Cumulatively So They Always Sum To 'L'
xg1 = mat2cell(x, g1, size(x,2))                         % Option #1: Apportion Without Sorting
[xs, sidx] = sort(x);                                    % Sort Ascending
xg2idx = mat2cell(sidx, g1, size(x,2))                   % Collect Sort Indices
xg2 = cellfun(@(g)x(g), xg2idx, 'UniformOutput',false)   % Option #2: 'x' Apportioned By 'sort' Indices
The first option just apportions them as they exist in the original vector. The second apportions them essentially by their percentile ranks in the vector by first apportioning the indices produced by the sort function.
I tried this with different lengths for ‘x’ and it appears to be robust. Obviously there is a lower limit to the number of elements in ‘x’ that would probably crash it; however, I didn’t do that experiment.
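Since the question says "randomly", a third variant might shuffle the indices with randperm before splitting them, so that each group is a random subset of the requested size. This is a sketch only: the fixed seed and the names sizes, ridx, idxCells and xg3 are assumptions for illustration. The group sizes are taken from rounded cumulative boundaries so that they always add up to numel(x).

```matlab
% Sketch: random assignment with exact (rounded) group sizes.
rng(1);                                   % fixed seed, only for reproducibility
x = randn(153,1);                         % fake data
L = numel(x);
gv = [25 30 20 25];                       % group percentages (should sum to 100)
% round the cumulative boundaries so the sizes always add up to L
sizes = diff([0, round(cumsum(gv) * L / 100)]);
ridx = randperm(L).';                     % random ordering of the indices 1..L
idxCells = mat2cell(ridx, sizes, 1);      % split the shuffled indices by size
xg3 = cellfun(@(g) x(g), idxCells, 'UniformOutput', false);
```

Every element of x lands in exactly one group, and which group it lands in does not depend on its value or its position in the vector.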
0 comments
Walter Roberson
on 16 Dec 2024
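n = 4;                          % number of groups
k = 1000;                       % number of samples to generate (example value; set as needed)
w = [0.25, 0.30, 0.20, 0.25];   % group probabilities
y = randsample(n,k,true,w)      % k random group labels drawn from 1..n with weights w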
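To turn those labels into groups of an existing data vector, one label can be drawn per element. The sketch below assumes a fake data vector x and uses the illustrative names xgroups and counts; none of these come from the answer above. With this approach the group sizes match the percentages only on average, not exactly.

```matlab
% Sketch: one weighted random group label per data point.
rng(2);                                   % fixed seed, only for reproducibility
x = randn(1000,1);                        % fake data (assumption)
n = 4;
w = [0.25, 0.30, 0.20, 0.25];
y = randsample(n, numel(x), true, w);     % group label for each element of x
y = y(:);                                 % force a column so accumarray accepts it
xgroups = accumarray(y, x, [n 1], @(v) {v});
counts = cellfun(@numel, xgroups).'       % near 250 300 200 250, usually not exact
```

If exact group sizes are required, shuffling the indices with randperm and cutting them into blocks of fixed size is likely the better fit.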
0 comments