Deviance is a generalization of the residual sum of squares. It measures the
goodness of fit compared to a saturated model.
The deviance of a model M1 is twice the difference between the loglikelihood of the saturated model Ms and the loglikelihood of the model M1. A saturated model is a model with the maximum number of parameters that you can estimate. For example, if you have n observations (yi, i = 1, 2, ..., n) with potentially different values for the linear predictor Xi'β, then you can define a saturated model with n parameters. Let
L(b,y) denote the maximum value of the likelihood function for a model with the parameters b. Then the deviance of the model M1 is

D = 2(log L(bs,y) – log L(b1,y)),

where b1 and bs contain the estimated parameters for the model M1 and the saturated model, respectively. The deviance has a chi-squared distribution with n – p degrees of freedom, where n is the number of parameters in the saturated model and p is the number of parameters in the model M1.
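As a concrete sketch, the deviance can be computed directly from the two loglikelihoods. The example below assumes a Poisson model (an assumption made here for illustration; the function names are not from any particular library), where the saturated model sets each fitted mean equal to its observation:

```python
import math

def poisson_loglik(y, mu):
    # Loglikelihood of independent Poisson observations y with means mu.
    # (Observations here are positive; y = 0 would need the usual
    # convention 0*log(0) = 0.)
    return sum(yi * math.log(mi) - mi - math.lgamma(yi + 1)
               for yi, mi in zip(y, mu))

def deviance(y, mu):
    # D = 2 * (loglik of the saturated model - loglik of the fitted model).
    # For a Poisson model the saturated fit uses one parameter per
    # observation: each fitted mean equals the observation itself.
    return 2.0 * (poisson_loglik(y, y) - poisson_loglik(y, mu))

# Illustrative data: observed counts and fitted means from some model M1.
y = [2.0, 3.0, 6.0, 1.0]
mu = [2.5, 2.5, 5.0, 2.0]
D = deviance(y, mu)
```

For the Poisson case this reduces to the familiar closed form D = 2 Σ [yi log(yi/μi) – (yi – μi)], since the lgamma terms cancel between the two loglikelihoods.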
Assume you have two different generalized linear regression models M1 and M2, where M1 has a subset of the terms in M2. You can assess the fit of the models by comparing their deviances D1 and D2. The difference of the deviances is

D = D1 – D2 = 2(log L(b2,y) – log L(b1,y)).
Asymptotically, the difference D has a chi-squared distribution with degrees of freedom v equal to the difference in the number of parameters estimated in M1 and M2. You can obtain the p-value for this test by using 1 – chi2cdf(D,v).
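In MATLAB the p-value is 1 – chi2cdf(D,v); outside MATLAB you can compute the same upper-tail probability yourself. The sketch below is pure Python and, as a simplifying assumption, handles only even degrees of freedom, where the chi-squared survival function has a closed form:

```python
import math

def chi2_upper_tail(x, v):
    # Upper-tail probability P(X > x) for a chi-squared variable with v
    # degrees of freedom, using the closed form available when v is even:
    #   exp(-x/2) * sum_{k=0}^{v/2 - 1} (x/2)^k / k!
    # (For odd v you would need the regularized incomplete gamma function.)
    if v % 2 != 0:
        raise ValueError("this sketch handles even degrees of freedom only")
    term, total = 1.0, 0.0
    for k in range(v // 2):
        if k > 0:
            term *= (x / 2.0) / k
        total += term
    return math.exp(-x / 2.0) * total

# Example: deviance difference D = 7.8 with v = 2 extra parameters in M2.
p_value = chi2_upper_tail(7.8, 2)
```

A small p-value indicates that the extra terms in M2 significantly improve the fit over M1.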
Typically, you examine D by taking M1 to be a model with only a constant term and no predictors, and M2 to be the fitted model with p parameters. Then D has a chi-squared distribution with p – 1 degrees of freedom. If the dispersion is estimated, the difference divided
by the estimated dispersion has an F distribution with p – 1 numerator degrees of freedom and n – p denominator degrees of freedom.
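As a numeric sketch of the estimated-dispersion case (all values below are made up for illustration), the F statistic is the deviance difference scaled by its degrees of freedom and the dispersion estimate:

```python
# Hypothetical values, for illustration only.
D = 12.4          # deviance difference between constant-only and full model
p = 4             # parameters in the full model, including the intercept
n = 100           # number of observations
dispersion = 1.3  # estimated dispersion, e.g. Pearson chi-squared / (n - p)

# Compare F to an F distribution with p - 1 numerator and
# n - p denominator degrees of freedom.
F = (D / dispersion) / (p - 1)
```

The scaling by p – 1 and by the dispersion estimate is what turns the chi-squared comparison into an F comparison, in the same way that an F test arises in ordinary linear regression when the error variance is estimated.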