How can I scale CDF normal distribution values to match actual data? Calculating R^2?

Question

Macy on 15 Feb 2023


Direct link to this question

https://es.mathworks.com/matlabcentral/answers/1912745-how-can-i-scale-cdf-normal-distribution-values-to-match-actual-data-calculating-r-2

Commented: Oguz Kaan Hancioglu on 15 Feb 2023

practice3.xlsx

Hi everyone, how can I calculate R^2 for the actual data and the fitted normal distribution? The problem I am having is that my normal-fit CDF values are on a scale of 0 to 1, and I would like to scale them so that they match the scale of the actual data (0 to 2310). I need this because, in the third-to-last step, I must find the difference between the actual and the normal-predicted data.

Table = readtable("practice3.xlsx");

actual_values = Table.values;

actual_values = sort(actual_values)

actual_values = 10×1

50 80 350 370 450 700 1060 1100 2000 2310

hold on

cdfplot(actual_values); % Plot the empirical CDF

normalfit = fitdist(actual_values,'Normal'); % fit the normal distribution to the data

cdf_normal = cdf('Normal', actual_values, normalfit.mu, normalfit.sigma); % generate CDF values for each of the fitted distributions

plot(actual_values,cdf_normal) % plot the normal distribution

hold off

grid on

predicted_values = cdf_normal %HERE IS THE PROBLEM: cdf_normal ranges from 0 to 1, how can I scale cdf_normal to match the scale of the actual data, which has a max of 2310?

predicted_values = 10×1

0.1530 0.1623 0.2616 0.2701 0.3051 0.4251 0.6078 0.6274 0.9307 0.9699

% Compute R^2, which is 1 - (sum of squared residuals/total sum of squares)

SSR = sum(predicted_values - actual_values).^2;

TSS = sum(((actual_values - mean(actual_values)).^2));

Rsquared = 1 - SSR/TSS % Results in incorrect R value (R should be less than 1)

Rsquared = -12.1334


Answer 1

Oguz Kaan Hancioglu on 15 Feb 2023


Direct link to this answer

https://es.mathworks.com/matlabcentral/answers/1912745-how-can-i-scale-cdf-normal-distribution-values-to-match-actual-data-calculating-r-2#answer_1171920


I think there is a problem in your R^2 calculation. It subtracts the x values of the actual data (50 to 2310) from the F(x) values of the predicted data (0 to 1), so the two sides are on different scales.
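As a reminder of what the formula needs, here is the usual definition of R^2 written out. The symbols are notation chosen here, not taken from the original post: y_i are the observed values, \hat{y}_i the model values, and \bar{y} the mean of the observed values.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
```

Both y_i and \hat{y}_i have to be on the same scale: either both are probabilities in [0, 1], or both are data values.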

cdfplot(actual_values); % Plot the empirical CDF

cdfplot plots the empirical CDF using your data as the x-axis values. If you keep the handle that cdfplot returns, you can read the F(x) values of your data from it. Change the call to:

[h,stats] = cdfplot(actual_values); % Plot the empirical CDF
% don't close the cdfplot to use its handle
Fx = h.YData; 

Afterwards, you can use this Fx value in your R^2 calculation.

% Compute R^2, which is 1 - (sum of squared residuals/total sum of squares)
SSR = sum((predicted_values - Fx).^2); % square each residual, then sum
TSS = sum((Fx - mean(Fx)).^2);
Rsquared = 1 - SSR/TSS % both sides are now on the 0 to 1 probability scale

2 comments

Macy on 15 Feb 2023

I could not get this to work; I am getting an array of 22 Rsquared values.

Oguz Kaan Hancioglu on 15 Feb 2023


That's caused by the cdfplot function. When you pass actual_values to cdfplot, it does not store them unchanged: it generates its own XData for the plotted line. Examine h.XData and you will see that cdfplot writes each element twice and adds -Inf and +Inf at the ends, which gives 22 points for your 10 values.
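To make the 22 values concrete, here is a small sketch that rebuilds the layout described above in base MATLAB. It mimics the staircase arrays for distinct data values and is not cdfplot's actual source; `column_stand_in` is a hypothetical placeholder for `predicted_values`. It also shows a second ingredient: the plotted YData is a row vector, so subtracting it from a column vector expands to a matrix (implicit expansion, R2016b and later).

```matlab
% Data from the question as a column vector, like predicted_values
actual_values = [50 80 350 370 450 700 1060 1100 2000 2310]';
n = numel(actual_values);

% Staircase layout: each point appears at the bottom and at the top of its
% step, with -Inf and +Inf added at the ends (assumes distinct data values)
XData = [-Inf, repelem(actual_values', 2), Inf];   % 1x22 row vector
YData = [0, 0, repelem((1:n) / n, 2)];             % 1x22 row vector

% Column (10x1) minus row (1x22) expands to a 10x22 matrix,
% so sum() returns one value per column: 22 numbers instead of one
column_stand_in = zeros(n, 1);   % hypothetical stand-in for predicted_values
D = column_stand_in - YData;
size(sum(D))

% Even positions 2:2:20 hold the step height just before each data point
F_before = YData(2:2:2*n);       % 0, 0.1, ..., 0.9
% Odd positions 3:2:21 hold the step height at each data point
F_at = YData(3:2:2*n+1);         % 0.1, 0.2, ..., 1.0
```

Each data value is stored twice because a step function needs two points per jump: one where the step starts and one where it ends.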

You can get your values by manual indexing.

Fxx = Fx(2:2:20);

The vector then has the same length as actual_values, and its entries correspond to them. Now you can calculate R^2 as follows.

Fxx = Fx(2:2:20)'; % every second point, transposed to a column like predicted_values
% Compute R^2, which is 1 - (sum of squared residuals/total sum of squares)
SSR = sum((predicted_values - Fxx).^2); % square each residual, then sum
TSS = sum((Fxx - mean(Fxx)).^2);
Rsquared = 1 - SSR/TSS

I calculated 0.9450 with the SSR line as posted in the question, sum(...).^2, which squares the summed residuals. With the squared residuals summed as above, the value comes out somewhat lower. It worked. However, I have no idea why cdfplot uses the same element twice.
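The original question asked for the opposite direction: predictions on the data scale (0 to 2310) rather than the probability scale. One way to sketch that, as an alternative and not as the method used in this thread, is to push a set of probabilities through the inverse of the fitted normal CDF. The plotting positions `p` below are an assumed midpoint convention; other conventions give slightly different numbers. The fit uses the sample mean and sample standard deviation, which reproduces the fitted CDF values shown in the question, and the inverse CDF is written with base MATLAB's `erfcinv` (with the toolbox, `icdf(normalfit, p)` should give the same values).

```matlab
% Data from the question, sorted ascending (column vector)
actual_values = [50 80 350 370 450 700 1060 1100 2000 2310]';
n = numel(actual_values);

% Normal fit: sample mean and sample standard deviation (n-1 normalization)
mu = mean(actual_values);
sigma = std(actual_values);

% Assumed plotting positions (midpoint convention); not from the original post
p = ((1:n)' - 0.5) / n;

% Inverse normal CDF: x = mu + sigma * z, with z = -sqrt(2) * erfcinv(2p)
predicted_values = mu - sigma * sqrt(2) * erfcinv(2 * p);

% R^2 on the data scale: square each residual, then sum
SSR = sum((actual_values - predicted_values).^2);
TSS = sum((actual_values - mean(actual_values)).^2);
Rsquared = 1 - SSR / TSS
```

Here both vectors are 10x1 columns in data units, so the residuals are meaningful without any rescaling of the CDF.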

Best regards
