# ridge

## Syntax

b = ridge(y,X,k)
b = ridge(y,X,k,scaled)

## Description

b = ridge(y,X,k) returns a vector b of coefficient estimates for a multilinear ridge regression of the responses in y on the predictors in X. X is an n-by-p matrix of p predictors at each of n observations. y is an n-by-1 vector of observed responses. k is a vector of ridge parameters. If k has m elements, b is p-by-m. By default, b is computed after centering and scaling the predictors to have mean 0 and standard deviation 1. The model does not include a constant term, and X should not contain a column of 1s.

b = ridge(y,X,k,scaled) uses the {0,1}-valued flag scaled to determine if the coefficient estimates in b are restored to the scale of the original data. ridge(y,X,k,0) performs this additional transformation. In this case, b contains p+1 coefficients for each value of k, with the first row corresponding to a constant term in the model. ridge(y,X,k,1) is the same as ridge(y,X,k). In this case, b contains p coefficients, without a coefficient for a constant term.

The relationship between b0 = ridge(y,X,k,0) and b1 = ridge(y,X,k,1) is given by

m = mean(X);
s = std(X,0,1)';
b1_scaled = b1./s;
b0 = [mean(y)-m*b1_scaled; b1_scaled]

This can be seen by replacing the $x_i$ ($i = 1, \dots, p$) in the multilinear model $y = b_0^0 + b_1^0 x_1 + \dots + b_p^0 x_p$ with the z-scores $z_i = (x_i - \mu_i)/\sigma_i$, and replacing $y$ with $y - \mu_y$.
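Written out, the substitution can be sketched as follows. The notation $b_i^1$ for the standardized coefficients in b1 is introduced here only for illustration, and $p$ is the number of predictors (columns of X):

$$
y - \mu_y = \sum_{i=1}^{p} b_i^1 \, \frac{x_i - \mu_i}{\sigma_i}
\quad\Longrightarrow\quad
y = \left( \mu_y - \sum_{i=1}^{p} \frac{\mu_i \, b_i^1}{\sigma_i} \right) + \sum_{i=1}^{p} \frac{b_i^1}{\sigma_i} \, x_i
$$

Matching terms against the original-scale model gives $b_0^0 = \mu_y - \sum_{i} \mu_i b_i^1 / \sigma_i$ and $b_i^0 = b_i^1 / \sigma_i$, which is what the code above computes with b1_scaled = b1./s and mean(y)-m*b1_scaled.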

In general, b1 is more useful for producing plots in which the coefficients are to be displayed on the same scale, such as a ridge trace (a plot of the regression coefficients as a function of the ridge parameter). b0 is more useful for making predictions.
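As a minimal sketch of why b0 is the convenient form for prediction, the following applies the conversion above to a small made-up data set. The data and the values in b1 are hypothetical stand-ins for the output of ridge(y,X,k,1); ridge itself is not called. Predictions from b0 on the original predictors should agree with predictions from b1 on the z-scored predictors.

```matlab
% Illustrative data only; b1 stands in for the output of ridge(y,X,k,1).
X  = [1 2; 2 1; 3 5; 4 3; 5 6];
y  = [2; 3; 7; 6; 10];
b1 = [1.5; 0.8];                  % hypothetical standardized coefficients

% Conversion from standardized to original-scale coefficients.
m = mean(X);                      % 1-by-p row of predictor means
s = std(X,0,1)';                  % p-by-1 column of predictor standard deviations
b1_scaled = b1./s;
b0 = [mean(y) - m*b1_scaled; b1_scaled];

% Predict on the original scale: b0(1) is the constant term.
n = size(X,1);
yhat0 = [ones(n,1) X]*b0;

% Predict from z-scored predictors with the standardized coefficients.
Z = (X - repmat(m,n,1)) ./ repmat(s',n,1);
yhat1 = mean(y) + Z*b1;

% The two sets of predictions should differ only by rounding error.
maxDiff = max(abs(yhat0 - yhat1));
```

With b0, new observations can be used as they are. With b1, they first have to be standardized using the means and standard deviations of the original X.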

Coefficient estimates for multiple linear regression models rely on the independence of the model terms. When terms are correlated and the columns of the design matrix X have an approximate linear dependence, the matrix $X^T X$ becomes close to singular. As a result, the least-squares estimate

$\hat{\beta} = (X^T X)^{-1} X^T y$

becomes highly sensitive to random errors in the observed response y, producing a large variance. This situation of multicollinearity can arise, for example, when data are collected without an experimental design.

Ridge regression addresses the problem by estimating regression coefficients using

$\hat{\beta} = (X^T X + kI)^{-1} X^T y$

where k is the ridge parameter and I is the identity matrix. Small positive values of k improve the conditioning of the problem and reduce the variance of the estimates. Although ridge estimates are biased, their reduced variance often results in a smaller mean squared error than that of least-squares estimates.
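The effect of k on conditioning can be illustrated with a short sketch that applies the formula directly to z-scored predictors. This is an illustration under stated assumptions, not a description of how ridge is implemented internally: the data are synthetic and deterministic, the value of k is arbitrary, and only base MATLAB functions are used.

```matlab
% Deterministic, nearly collinear predictors (illustrative data only).
t = (1:20)';
X = [t, t + 0.01*sin(3*t), cos(t)];
y = 1 + 2*X(:,1) - X(:,3) + 0.1*sin(7*t);

% Center and scale the predictors to mean 0 and standard deviation 1.
n = size(X,1);
p = size(X,2);
Z = (X - repmat(mean(X),n,1)) ./ repmat(std(X,0,1),n,1);

k = 1e-2;                          % ridge parameter (arbitrary choice)
A = Z'*Z;

% Least-squares estimate (k = 0) and ridge estimate from the formula.
bOLS   = A \ (Z'*y);
bRidge = (A + k*eye(p)) \ (Z'*y);

% Condition numbers before and after adding k*I.
condOLS   = cond(A);
condRidge = cond(A + k*eye(p));
```

Because the first two columns of X are almost identical, condOLS is large, while adding k*I raises the smallest eigenvalue and lowers the condition number. The ridge coefficient vector is also no longer than the least-squares one in Euclidean norm, which reflects the shrinkage that introduces bias.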

## Examples

Load the sample data.

load acetylene 

acetylene has observations for the predictor variables x1, x2, x3, and the response variable y.

Plot the predictor variables against each other.

subplot(1,3,1)
plot(x1,x2,'.')
xlabel('x1'); ylabel('x2'); grid on; axis square

subplot(1,3,2)
plot(x1,x3,'.')
xlabel('x1'); ylabel('x3'); grid on; axis square

subplot(1,3,3)
plot(x2,x3,'.')
xlabel('x2'); ylabel('x3'); grid on; axis square

Note the correlation between x1 and the other two predictor variables.

Compute coefficient estimates for a multilinear model with interaction terms, for a range of ridge parameters, using ridge and x2fx.

X = [x1 x2 x3];
D = x2fx(X,'interaction');
D(:,1) = []; % No constant term
k = 0:1e-5:5e-3;
b = ridge(y,D,k);

Plot the ridge trace.

figure
plot(k,b,'LineWidth',2)
ylim([-100 100])
grid on
xlabel('Ridge Parameter')
ylabel('Standardized Coefficient')
title('{\bf Ridge Trace}')
legend('x1','x2','x3','x1x2','x1x3','x2x3')

The estimates stabilize to the right of the plot. Note that the coefficient of the x2x3 interaction term changes sign within the plotted range of the ridge parameter.

## References

 Hoerl, A. E., and R. W. Kennard. “Ridge Regression: Biased Estimation for Nonorthogonal Problems.” Technometrics. Vol. 12, No. 1, 1970, pp. 55–67.

 Hoerl, A. E., and R. W. Kennard. “Ridge Regression: Applications to Nonorthogonal Problems.” Technometrics. Vol. 12, No. 1, 1970, pp. 69–82.

 Marquardt, D.W. “Generalized Inverses, Ridge Regression, Biased Linear Estimation, and Nonlinear Estimation.” Technometrics. Vol. 12, No. 3, 1970, pp. 591–612.

 Marquardt, D. W., and R.D. Snee. “Ridge Regression in Practice.” The American Statistician. Vol. 29, No. 1, 1975, pp. 3–20.
