Calculate confidence interval of data set

Question

Katara So el 17 de Dic. de 2020

0
Enlazar

Enlace directo a esta pregunta

https://es.mathworks.com/matlabcentral/answers/696780-calculate-confidence-interval-of-data-set

Editada: Ive J el 20 de Dic. de 2020

I want to calculate a 95% confidence interval for the difference between the mean of each variable for the species versicolor and virginica in the "fisheriris" data set, but I don't quite understand how to write the code. This is my code so far:

load fisheriris
%Versicolor
ver1 = meas(51:100,1);
ver2 = meas(51:100,2);
ver3 = meas(51:100,3);
ver4 = meas(51:100,4);
ver1_mean = mean(ver1);
ver2_mean = mean(ver2);
ver3_mean = mean(ver3);
ver4_mean = mean(ver4);
%Virginica
vir1 = meas(101:150,1);
vir2 = meas(101:150,2);
vir3 = meas(101:150,3);
vir4 = meas(101:150,4);
vir1_mean = mean(vir1);
vir2_mean = mean(vir2);
vir3_mean = mean(vir3);
vir4_mean = mean(vir4);
q1 = quantile(ver1_mean - vir1_mean, [0.95])
q2 = quantile(ver2_mean - vir2_mean, [0.95])
q3 = quantile(ver3_mean - vir3_mean, [0.95])
q4 = quantile(ver4_mean - vir4_mean, [0.95])

But I assume that I should have some sort of interval instead of just 0.95 in my quantile command, however whatever I put for ex, [0.05 0.95], I get the same value in for both ends of the interval and if I just have 0.95 then I don't get an interval. Could someone explain where I have gone wrong, all help is appreciated!

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Iniciar sesión para comentar.

Iniciar sesión para responder a esta pregunta.

Answer 1

Ive J el 18 de Dic. de 2020

0
Enlazar

Enlace directo a esta respuesta

https://es.mathworks.com/matlabcentral/answers/696780-calculate-confidence-interval-of-data-set#answer_579000

Editada: Ive J el 18 de Dic. de 2020

Abrir en MATLAB Online

I'm not sure if you want to calculate CI or something like reference range. Regardless, as you can see on quantile help page, it says Quantiles of a data set. So, of course when your inputs are scalar the output is same as the input.

data = randn(100, 1);
quantile(data, [0.05 0.95]) % this is not 95% CI
% 95% CI
CI95 = [mean(A) - 1.96*(std(A)./sqrt(numel(A))), mean(A) + 1.96*(std(A)./sqrt(numel(A)))]

2 comentarios
Mostrar NingunoOcultar Ninguno

Katara So el 18 de Dic. de 2020

Abrir en MATLAB Online

I want to calculate CI, however I am having trouble getting the mean values of the variables. In another problem where I had to use bootstrp to calculate CI, I did like this:

%Versicolor
ver1 = meas(51:100,1);
%Virginica
vir1 = meas(101:150,1);
M = 2000;
alpha = 0.05/4;
vers_boot1=bootstrp(M,@mean,ver1);
vir_boot1=bootstrp(M,@mean,vir1);
ci1 = quantile(vers_boot1-vir_boot1, [alpha 1-alpha])

which gave me an that ci1 = [-0.9170 -0.3960].

So now I want to do the same but without using bootstrp and for 95% CI. What I don't understand is how to get a vector of the mean values for the two different variables. With the code in my original question I get that the mean is only one value while in the problem posted above I get it to be a vector.

Ive J el 20 de Dic. de 2020

Editada: Ive J el 20 de Dic. de 2020

Abrir en MATLAB Online

In bootsrapt you simply do random sampling form ver1 and vir1; so, in above example you pick K random subsamples with replication from ver1 and vir1 and calculate the mean of each subsample. So, you have a vector of mean values, for which you can easily calculate CI.

I guess you are looking for something like this:

load fisheriris
%Versicolor
ver = meas(51:100,1);
%Virginica
vir = meas(101:150,1);
% 95% CI using bootstrapping
M = 2000;
alpha = 0.05/2;
ver_boot = bootstrp(M, @mean, ver);
vir_boot = bootstrp(M, @mean, vir);
ci_bootstrap = quantile(ver_boot - vir_boot, [alpha 1-alpha]);
% using Z-value (assuming normal distribution)
A = ver - vir;
ci_z = [mean(A) - 1.96*(std(A)./sqrt(numel(A))), ...
    mean(A) + 1.96*(std(A)./sqrt(numel(A)))];
% using t-distribution 
[~, ~, ci_t] = ttest2(ver, vir);
% compare them
ci_bootstrap
ci_z
ci_t'
ci_bootstrap =
   -0.8880   -0.4230
ci_z =
   -0.8942   -0.4098
ci_t =
   -0.8819   -0.4221
% visualize esitmated mean from bootstrapping
histogram(ver_boot - vir_boot)
line([mean(A), mean(A)], get(gca, 'YLim'), 'color', 'r', 'LineWidth', 1.5)