Regression with dependent independent variables

Brian Scannell on 23 Aug 2017
Edited: John D'Errico on 24 Aug 2017
When undertaking a linear regression evaluating y as a function of x and x^3, is there a specific function within MATLAB that takes account of the mutual dependence of the independent variables (x and x^3)?
I've tried searching the documentation but haven't found anything that specifically addresses this issue.
Regards, Brian
  1 comment
Torsten on 24 Aug 2017
Seen as functions, x and x^3 are linearly independent, since there are no constants a and b (not both zero) that make a*x + b*x^3 identically zero over a given interval.
Additionally, I don't understand what such a tool would be good for. Do you mean from the point of view of performance?
Best wishes
Torsten.


Accepted Answer

John D'Errico on 24 Aug 2017
Edited: John D'Errico on 24 Aug 2017
Um, yes. ANY regression computation takes into account the relationship between the variables.
So backslash, regress, lscov, lsqr, fit, etc. all take that into account.
I think your issue is that you don't understand how regression works. For that, there are entire courses that are taught.
Yes, it is true that x and x^3 are correlated with each other. Note that mathematical linear independence is not the same as saying the two variables are not related. That is, it is true that no linear combination of a*x + b*x^3 is zero EXCEPT for the case where a=b=0. So x and x^3 provide different information to the problem. Yet at the same time, it is not true that x and x^3 are orthogonal. There is essentially some overlap in what they do.
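To make the distinction concrete, here is a small sketch (synthetic data, not from the thread): the two columns are strongly correlated, yet the design matrix still has full rank, so each column contributes its own information.

```matlab
% Correlated but linearly independent: x and x^3
rng(0);                    % reproducibility
x = randn(100,1);
A = [x, x.^3];             % design matrix with both terms
R = corrcoef(x, x.^3)      % large off-diagonal entry: strong correlation
rank(A)                    % returns 2: the columns are linearly independent
```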
Yes, it is also true that there may be numerical issues. But that relationship between the variables is factored in when the regression is done. I really cannot say much more without specifics, or without writing a complete text on linear regression myself. Better that you read one, since many have been published. Perhaps a classic like that by Draper and Smith would be a good choice.
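For concreteness, a minimal sketch of fitting the model in question with backslash (the data and the "true" coefficients a = 2, b = -0.5 are assumed here purely for illustration):

```matlab
% Ordinary least squares for y = a*x + b*x^3 via backslash
rng(0);
x = randn(100,1);
y = 2*x - 0.5*x.^3 + 0.1*randn(100,1);  % assumed model plus noise
A = [x, x.^3];    % design matrix; the solver accounts for any
                  % correlation between the columns
coeffs = A \ y    % least-squares estimates, close to [2; -0.5]
```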
  2 comments
Brian Scannell on 24 Aug 2017
Thanks for answering. Comments and suggestions much appreciated.
John D'Errico on 24 Aug 2017
Edited: John D'Errico on 24 Aug 2017
I think you have gotten confused about regression, probably by a comment from a colleague. I seem to recall you saying that a colleague had said something about x and x^3, and now you seem to be worried.
While you should always beware of problems, odds are the inter-relationship between x and x^3 is not going to be an issue, at least if there are only two variables, and if you see no warning messages.
One test is to compute
cond([x(:),x(:).^3])
If you have other terms in the problem, they need to be included there too. The best possible value here is 1. If you were seeing large numbers, REALLY large, on the order of 1e15 or so, you would start to get quite worried. Even 1e8 would be pretty bad. But for example, let's try it on some sample data.
x = randn(100,1);
cond([x(:),x(:).^3])
ans =
5.9516
So only 5.9. On the scale of how worried I would get here, 5.9 is laughably small.
Compare that to a different problem, with a much more complex model. Here, one with 16 polynomial terms in it.
x = rand(100,1);
cond(x.^(0:15))
ans =
2.9659e+11
That is large enough that the coefficients may have little real value. Any polynomial coefficients you estimate from that model would arguably be almost useless.
But for the model you have described? There is probably no big issue.
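If conditioning ever did become a problem, one standard remedy (not discussed above; offered as a sketch) is to rescale the data before building the design matrix, since monomials behave much better on [-1, 1] than on a wide or offset interval:

```matlab
% Rescaling x to [-1, 1] reduces the condition number of a
% polynomial design matrix
x = rand(100,1)*10;                          % poorly scaled data
cond(x.^(0:15))                              % enormous condition number
xs = 2*(x - min(x))/(max(x) - min(x)) - 1;   % map to [-1, 1]
cond(xs.^(0:15))                             % dramatically smaller
```

This is the same idea behind the centering-and-scaling (mu) output of polyfit.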

