ttest NaN result is NaN and i am a beginner

Question

Ashraf Abdo on 27 Feb 2017

Direct link to this question

https://es.mathworks.com/matlabcentral/answers/327149-ttest-nan-result-is-nan-and-i-am-a-beginner

Edited: Adam Danz on 16 Feb 2021

Accepted answer: Adam Danz

Hi all,

I am sorry if this question is too basic. I am trying to run ttest, but the h and p results are NaN.

I do not know what the problem is. I would really appreciate it if anyone could give me a hint. Thanks in advance. Best, Ashraf


Answer 1

Adam Danz on 10 Dec 2019

Direct link to this answer

https://es.mathworks.com/matlabcentral/answers/327149-ttest-nan-result-is-nan-and-i-am-a-beginner#answer_405812

Edited: Adam Danz on 16 Feb 2021


Since this topic continues to surface within the forum, here's a summary of reasons why NaN values may appear in a ttest() output and how to diagnose your unexpected results.

Quick review of ttest()

For the test h = ttest(x,y), the null hypothesis is that the distribution formed by x-y is normal with a mean of 0. In fact, you'll get identical results between the following two tests on the same inputs.

x = randi(100,20,1); 
y = randi(100,20,1); 
[h1, p1, ci1, stats1] = ttest(x, y);      
[h2, p2, ci2, stats2] = ttest(x-y, []);   
% test for equality
all([isequal(h1,h2), isequal(p1,p2), isequal(ci1,ci2), isequal(stats1,stats2)])
% ans =
%   logical
%    1

For the test h = ttest(x,m), the null hypothesis is that the distribution in x is normal with a mean of m. When m is omitted, as in ttest(x), m defaults to 0.

Why are NaN values returned when x and y are identical?

% Demo
x = randi(4,20,1);
y = x; 
[h,p,ci,stats] = ttest(x,y)  % h = NaN;  p = NaN

The h and p outputs of [h,p] = ttest(x,y) use the t-value (tval) to determine significance. The tval is computed as

t = (m - mu) / (s / sqrt(n))

where m is the sample mean, mu is the test-mean (input #2 in a 1-sample ttest), s is the sample standard deviation, and n is the sample size. When x and y are identical, x-y is all zeros and the mean (m) is therefore 0. Since mu is also 0 by default, m-mu is 0, and since the standard deviation of a vector of constants is 0, the tval (t) is 0/0, which results in NaN. The NaN value 'infects' all other statistics that follow.
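The 0/0 case can be reproduced by hand with base MATLAB functions. The following is only a sketch: it assumes the standard one-sample t statistic, t = (m - mu)/(s/sqrt(n)), applied to the paired differences, and the variable names are illustrative rather than taken from ttest's source code.

```matlab
% Illustrative data (an assumption): y is an exact copy of x.
x = [1; 2; 3; 4];
y = x;

d  = x - y;        % paired differences, all zeros
mu = 0;            % default test-mean
m  = mean(d);      % sample mean of the differences: 0
s  = std(d);       % sample standard deviation: 0
n  = numel(d);     % sample size: 4

t = (m - mu) / (s / sqrt(n));   % 0/0
disp(isnan(t))                  % true: the t-value is NaN
```

Because every later statistic is derived from t, a NaN here is enough to explain NaN values in h and p.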

Why are NaN values returned when all x values are identical and equal to the test-mean?

% Demo
h = ttest(zeros(1,10))     % NaN
h = ttest(ones(1,10))      % OK
h = ttest(ones(1,10),1)    % NaN

In h = ttest(x,m), the test-mean (m) defaults to 0 when only one input is given. When x is a vector of identical values that all equal m, x-m is a vector of 0s, which causes the same problem explained in the section above: the tval is NaN, which produces NaN values in the first two outputs.

Why are NaN values returned when the inputs contain only one non-NaN value?

% Demo
[h,p,ci,stats] = ttest([nan;nan;5],[nan;nan;6])

With only 1 paired sample, the sample standard deviation is 0 since it is based on a single scalar value. That puts a 0 in the denominator of the t-value formula (shown above), which gives tval = +/-inf. The degrees of freedom (df) are 0, since df is the number of samples minus 1. This results in a critical t value of NaN, which carries through to the ttest outputs. Note that this is treated as a paired t-test, unlike the example below.

% Demo
[h,p,ci,stats] = ttest(5,6)

When each of the first two inputs is a scalar, the call is interpreted as a one-sample t-test against the hypothesis that the mean of x is equal to 6. Since x contains only 1 value, it falls victim to the same problem described above.
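The single-sample case can also be checked by hand. This is a rough sketch using only base MATLAB and the same assumed t statistic, t = (m - mu)/(s/sqrt(n)); it does not call ttest, and the variable names are illustrative.

```matlab
% One usable sample: x = 5 tested against a mean of 6
% (numerically the same as the single paired difference 5 - 6 tested against 0).
x  = 5;
mu = 6;

n  = numel(x);     % 1 sample
s  = std(x);       % the standard deviation of a scalar is 0 in MATLAB
df = n - 1;        % 0 degrees of freedom

t = (mean(x) - mu) / (s / sqrt(n));   % -1/0 = -Inf
disp([s, df, t])
```

A t-value of -Inf with 0 degrees of freedom leaves no valid t distribution to compare against, which is consistent with the NaN outputs described above.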

Note that ttest() ignores NaN values, treating them as missing data. As evidence, you'll find matching results between the two tests below.

[h1,p1,ci1,stats1] = ttest([nan;nan;5;6],[nan;nan;16;18])
[h2,p2,ci2,stats2] = ttest([        5;6],[        16;18])
% test for equality
all([isequal(h1,h2), isequal(p1,p2), isequal(ci1,ci2), isequal(stats1,stats2)])
% ans =
%   logical
%    1

*NaN values are returned when the inputs are all NaN or inf values.

% Demo
x = [randi(10,10,1);nan(10,1)]; 
y = [nan(10,1);randi(10,10,1)]; 
[h,p,ci,stats] = ttest(x,y)

In this demo, x and y contain NaN and non-NaN values but every pair has at least 1 NaN making the comparison impossible. There needs to be at least 2 complete pairs for reasons explained in the previous section.

How to interpret results

In all of these cases, a t-test is an inappropriate test since it requires x or x-y to form an approximately normal distribution. A NaN output indicates that this assumption has not been met (though it is not a test of this assumption).

If a NaN result is unexpected, here's how you can diagnose which of the circumstances above is the cause.

For a paired test of the form h = ttest(x,y)

% Test that x and y are exactly equal
if isequal(x,y)
    warning('X and Y are exactly equal and will return a NaN in ttest(X,Y).')
end
% Test that there is >1 paired sample
if sum(all(~isnan([x(:),y(:)]),2)) < 2
    warning('There are fewer than 2 paired samples in ttest(x,y).')
end

For single-sample tests of the form h = ttest(x,m)

% Determine if all x values are exactly equal to m
if all(x==m)
    warning('All X values are identical to the mean (m) and will return a NaN in ttest(x,m).')
end
% Test that there are at least 2 values in x
if sum(~isnan(x)) < 2
    warning('There are fewer than 2 samples in ttest(x).')
end
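One gap in the paired checks is that isequal(x,y) returns false whenever the inputs contain NaN, even if every complete pair is identical. A possible variant, sketched below under the assumption that ttest drops NaN differences before computing the t-value, works on the NaN-free differences instead. The data and the variable name reason are hypothetical.

```matlab
% Hypothetical data: every complete pair is identical, plus one pair with a NaN.
x  = [3; 7; NaN; 5];
y  = [3; 7; 2;   5];
mu = 0;                  % test-mean for the differences (default)

d = x(:) - y(:);         % NaN wherever either input is NaN
d = d(~isnan(d));        % keep complete pairs only
n = numel(d);

if n < 2
    reason = 'fewer than 2 complete pairs';
elseif all(d == mu)
    reason = 'all differences equal the test-mean (0/0 t-value)';
else
    reason = 'none of the cases above';
end
disp(reason)
```

For a single-sample test, the same logic applies with d = x(:) and mu set to the second input of ttest.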

2 comments

Scott Kilianski on 16 Feb 2021

Edited: Scott Kilianski on 16 Feb 2021

According to the MATLAB documentation, the null hypothesis for the paired t-test is actually that the distribution of x-y (not y-x as Adam stated in the beginning of his answer) is normal with mean 0.

I hate to be nit-picky, but this is important if you are performing one-sided t-tests.

Adam Danz on 16 Feb 2021

Thanks Scott; I'll correct that. The 4th line of code computes it correctly; not sure how I flipped x and y in the preceding text.


Answer 2

Els Crijns on 4 Jan 2018

Direct link to this answer

https://es.mathworks.com/matlabcentral/answers/327149-ttest-nan-result-is-nan-and-i-am-a-beginner#answer_298668

For further answers, see this similar thread: https://nl.mathworks.com/matlabcentral/answers/354626-ttest-returns-nan-even-though-matrices-are-finite-and-does-not-contain-nan#answer_282277
