Statistical Test for decaying signals
I have two decaying relative intensity curves and would like a statistical test to show that they are different. Each time point on each curve is produced by averaging 100 measured data points. Does anyone have any suggestions? The data are:
Data 1: 1 0.914144 0.876253 0.836468 0.806563 0.781585 0.744672 0.727541 0.695955 0.677459 0.630814 0.637396 0.609646 0.569227 0.565882 0.529177 0.520497 0.514375 0.504086 0.474612 0.447513 0.425238 0.432216 0.441622 0.407928 0.381347 0.387921 0.387443 0.380426 0.363821 0.353484
Data 2: 0.984578 0.9664 0.985515 0.98057 1 0.980536 0.930023 0.957503 0.903321 0.886397 0.897744 0.821625 0.85142 0.833694 0.826525 0.81353 0.768527 0.793422 0.81677 0.76768 0.773302 0.777807 0.736474 0.693616 0.694688 0.74992 0.712753 0.700593 0.708191 0.677843 0.720385
Time: 0 60 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800
Answers (2)
‘... each time point on each curve is produced by averaging from 100 taken data points ...’
The statistical test depends on what the data represent, and the characteristics of those data. However, just using the mean values is not going to be of any real value, since you also need to have measures of the dispersion of the data, specifically the variance, and if applicable, the standard deviation (since not all distributions — such as the lognormal distribution — have standard deviations).
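For instance, if the per-time-point means and standard deviations of those 100 raw measurements were available, the two curves could be compared point by point with Welch's t-test computed from summary statistics alone. A minimal sketch in Python/SciPy (the standard deviations below are invented purely for illustration, since the question only gives the means):

```python
from scipy import stats

# Hypothetical summary statistics for a single time point;
# only the means come from the question -- the SDs are made up.
mean1, sd1, n1 = 0.914144, 0.05, 100   # curve 1 at t = 60
mean2, sd2, n2 = 0.966400, 0.05, 100   # curve 2 at t = 60

# Welch's t-test computed directly from summary statistics
t, p = stats.ttest_ind_from_stats(mean1, sd1, n1, mean2, sd2, n2,
                                  equal_var=False)
print(f't = {t:.3f}, p = {p:.2g}')
```

The same idea applies at every time point; with 31 such tests, a multiple-comparison correction (e.g. Bonferroni) would be appropriate.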
If you do not know the underlying distributions of the data, my suggestion would be to use a nonparametric test. Several could work; however, the Friedman test (the friedman function) might be the most appropriate here.
Data_1 = [1 0.914144 0.876253 0.836468 0.806563 0.781585 0.744672 0.727541 0.695955 0.677459 0.630814 0.637396 0.609646 0.569227 0.565882 0.529177 0.520497 0.514375 0.504086 0.474612 0.447513 0.425238 0.432216 0.441622 0.407928 0.381347 0.387921 0.387443 0.380426 0.363821 0.353484];
Data_2 = [0.984578 0.9664 0.985515 0.98057 1 0.980536 0.930023 0.957503 0.903321 0.886397 0.897744 0.821625 0.85142 0.833694 0.826525 0.81353 0.768527 0.793422 0.81677 0.76768 0.773302 0.777807 0.736474 0.693616 0.694688 0.74992 0.712753 0.700593 0.708191 0.677843 0.720385];
Time = [ 0 60 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800];
figure
plot(Time, Data_1, '.-', 'DisplayName','Data_1')
hold on
plot(Time, Data_2, '.-', 'DisplayName','Data_2')
hold off
grid
legend('Location','best')
4 comments
John D'Errico
24 Sept 2024
"(since not all distributions — such as the lognormal distribution — have standard deviations)"
Incorrect. A lognormal distribution DOES indeed have a standard deviation.
The Wikipedia article on the lognormal distribution gives its variance; the standard deviation is the square root of the variance, so if the variance is well defined, then so is the standard deviation. Both the variance and the standard deviation of a lognormal will be large, and not terribly useful in terms of how we usually think about those parameters, just because we tend to think of variances in terms of a normal distribution. For example, we tend to think of a mean plus or minus some number of standard deviations; those habits are naturally burned into our brains when we do any kind of statistics.
However, for a lognormal distribution, defined in terms of the mean (mu) and variance (sigma^2) of the underlying normal, the mean of the lognormal will be:
exp(mu + sigma^2/2)
and the standard deviation is:
sqrt(exp(sigma^2) - 1)*exp(mu + sigma^2/2)
Now we can compute the point where k standard deviations takes you below 0. For a standard lognormal (mu = 0, sigma = 1), the mean is exp(1/2) and the standard deviation is sqrt(exp(1) - 1)*exp(1/2), so
mean/sd = 1/sqrt(exp(1) - 1) = 0.7629 (approximately)
So anything below the mean minus about 0.76 standard deviations for a standard lognormal yields a negative number.
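These formulas are easy to verify numerically; here is a plain Python check (rather than MATLAB) for mu = 0, sigma = 1:

```python
import math

mu, sigma = 0.0, 1.0   # "standard" lognormal

# Mean and standard deviation of the lognormal, in terms of the
# underlying normal parameters mu and sigma
mean = math.exp(mu + sigma**2 / 2)
sd = math.sqrt(math.exp(sigma**2) - 1) * math.exp(mu + sigma**2 / 2)

# Number of standard deviations separating the mean from zero
k = mean / sd   # = 1/sqrt(e - 1)
print(f'mean = {mean:.4f}, sd = {sd:.4f}, k = {k:.4f}')
```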
Had you said that a Cauchy distribution (or certain others; a Cauchy is the one that immediately comes to mind for me) does not have a variance or a standard deviation, you would have been absolutely correct.
Star Strider
24 Sept 2024
I interpreted the standard deviation not being listed among the properties in the Wikipedia article to mean that it was not defined for the lognormal distribution. I tend not to use it, preferring to calculate percentiles with the logninv function when I need them.
Henry Carey-Morgan
8 Oct 2024
To do what I suggested, you need the original data at each point.
That would go something like this —
Data_1 = [1 0.914144 0.876253 0.836468 0.806563 0.781585 0.744672 0.727541 0.695955 0.677459 0.630814 0.637396 0.609646 0.569227 0.565882 0.529177 0.520497 0.514375 0.504086 0.474612 0.447513 0.425238 0.432216 0.441622 0.407928 0.381347 0.387921 0.387443 0.380426 0.363821 0.353484];
Data_2 = [0.984578 0.9664 0.985515 0.98057 1 0.980536 0.930023 0.957503 0.903321 0.886397 0.897744 0.821625 0.85142 0.833694 0.826525 0.81353 0.768527 0.793422 0.81677 0.76768 0.773302 0.777807 0.736474 0.693616 0.694688 0.74992 0.712753 0.700593 0.708191 0.677843 0.720385];
Time = [ 0 60 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800];
nPoints = numel(Time)
% Simulate 10 raw observations around each time point (for
% illustration only -- the question describes 100 per point)
Data_Orig_1 = Data_1 + (rand(10, numel(Data_1)) - 0.5);
Data_Orig_2 = Data_2 + (rand(10, numel(Data_2)) - 0.5);
orng = [0.9 0.5 0.2];
friedman_data = [Data_Orig_1(:) Data_Orig_2(:)]
[p,T,S] = friedman(friedman_data, size(Data_Orig_1,1))
figure
hp1 = plot(Time, Data_1, '.-b', 'DisplayName','Data_1', 'LineWidth',1.5);
hold on
plot(Time, Data_Orig_1, '.b')
hp2 = plot(Time, Data_2, '.-', 'DisplayName','Data_2', 'Color',orng, 'LineWidth',1.5);
plot(Time, Data_Orig_2, '.', 'Color',orng)
hold off
grid
xlabel('Time')
ylabel('Value')
legend([hp1 hp2],'Location','best')
Here, the matrix for the Friedman test consists of vertically concatenated columns of the data around the original points, of which there are uniformly 10 per time point (the row-group size passed as the second argument to friedman), creating a (310x2) matrix. The friedman function then compares the two columns and determines that the difference between them is statistically significant (in this instance). I also considered using multcompare; however, with only two groups it is likely not necessary here.
I have never done anything even remotely like this, nor seen it done. (I have only compared two models fitted to the same data with a likelihood ratio test.) I believe the Friedman test is appropriate for this problem. In any event, I cannot envision any other way to approach it.
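As an independent sanity check (not the Friedman procedure above): with only two paired groups, a Wilcoxon signed-rank test on the 31 paired time-point means is a common nonparametric alternative. In Python/SciPy it would look like:

```python
from scipy import stats

# Per-time-point means from the question
data_1 = [1, 0.914144, 0.876253, 0.836468, 0.806563, 0.781585,
          0.744672, 0.727541, 0.695955, 0.677459, 0.630814, 0.637396,
          0.609646, 0.569227, 0.565882, 0.529177, 0.520497, 0.514375,
          0.504086, 0.474612, 0.447513, 0.425238, 0.432216, 0.441622,
          0.407928, 0.381347, 0.387921, 0.387443, 0.380426, 0.363821,
          0.353484]
data_2 = [0.984578, 0.9664, 0.985515, 0.98057, 1, 0.980536, 0.930023,
          0.957503, 0.903321, 0.886397, 0.897744, 0.821625, 0.85142,
          0.833694, 0.826525, 0.81353, 0.768527, 0.793422, 0.81677,
          0.76768, 0.773302, 0.777807, 0.736474, 0.693616, 0.694688,
          0.74992, 0.712753, 0.700593, 0.708191, 0.677843, 0.720385]

# Paired nonparametric test on the per-time-point differences
stat, p = stats.wilcoxon(data_1, data_2)
print(f'W = {stat}, p = {p:.2g}')
```

Note this treats each time-point mean as a single paired observation and ignores the within-point variability, so it is weaker than a test that uses the 100 raw values per point.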
Jeff Miller
25 Sept 2024
One simple approach is to fit a straight line to each dataset and show that the slopes are statistically different. For example,
% I'm dividing Time by 1000 to get more readable slope values--i.e.,
% decrease per 1000 time units.
mdl1 = fitlm(Time/1000,Data_1);
ci1 = coefCI(mdl1);
mdl2 = fitlm(Time/1000,Data_2);
ci2 = coefCI(mdl2);
fprintf('Slope for data 1 = %f with 95 pct confidence interval %f to %f\n',mdl1.Coefficients.Estimate(2),ci1(2,1),ci1(2,2));
fprintf('Slope for data 2 = %f with 95 pct confidence interval %f to %f\n',mdl2.Coefficients.Estimate(2),ci2(2,1),ci2(2,2));
% Slope for data 1 = -0.326349 with 95 pct confidence interval -0.355400 to -0.297297
% Slope for data 2 = -0.185232 with 95 pct confidence interval -0.204295 to -0.166170
Since the confidence intervals don't overlap (and it's not even close), you are statistically justified in concluding that the decrease is steeper for data 1 than for data 2.
If you need an actual p value for a test of the difference in slopes, you'll need to do a bit more work.
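One way to get that p value (a sketch using the large-sample normal approximation for the difference of two independent slope estimates, built from the standard errors reported by the fits; here in Python/SciPy rather than MATLAB):

```python
import math
from scipy import stats

time = [60 * k for k in range(31)]   # 0, 60, ..., 1800
data_1 = [1, 0.914144, 0.876253, 0.836468, 0.806563, 0.781585,
          0.744672, 0.727541, 0.695955, 0.677459, 0.630814, 0.637396,
          0.609646, 0.569227, 0.565882, 0.529177, 0.520497, 0.514375,
          0.504086, 0.474612, 0.447513, 0.425238, 0.432216, 0.441622,
          0.407928, 0.381347, 0.387921, 0.387443, 0.380426, 0.363821,
          0.353484]
data_2 = [0.984578, 0.9664, 0.985515, 0.98057, 1, 0.980536, 0.930023,
          0.957503, 0.903321, 0.886397, 0.897744, 0.821625, 0.85142,
          0.833694, 0.826525, 0.81353, 0.768527, 0.793422, 0.81677,
          0.76768, 0.773302, 0.777807, 0.736474, 0.693616, 0.694688,
          0.74992, 0.712753, 0.700593, 0.708191, 0.677843, 0.720385]

t_k = [t / 1000 for t in time]   # same rescaling as above
fit1 = stats.linregress(t_k, data_1)
fit2 = stats.linregress(t_k, data_2)

# z-test on the slope difference, treating the two fits as independent
z = (fit1.slope - fit2.slope) / math.hypot(fit1.stderr, fit2.stderr)
p = 2 * stats.norm.sf(abs(z))
print(f'slope diff = {fit1.slope - fit2.slope:.4f}, z = {z:.2f}, p = {p:.2g}')
```

The slopes match the fitlm values above; the normal approximation is reasonable here given the 29 residual degrees of freedom per fit.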