glmval

Generalized linear model values

Syntax

ypred = glmval(b,Xnew,Link)

[ypred,yLo,yUp] = glmval(b,Xnew,Link,stats)

[___] = glmval(___,Name=Value)

Description

ypred = glmval(b,Xnew,Link) computes predicted values for the generalized linear model with the link function Link and new predictor input values Xnew. b is a vector of coefficient estimates returned by the glmfit function.

example

[ypred,yLo,yUp] = glmval(b,Xnew,Link,stats) also computes 95% confidence bounds for the predicted values. stats is a structure returned by the glmfit function. yLo and yUp define a lower confidence bound of ypred-yLo, and an upper confidence bound of ypred+yUp. Confidence bounds are nonsimultaneous by default, and apply to the fitted curve, not to a new observation.

example

[___] = glmval(___,Name=Value) specifies options using one or more name-value arguments in addition to any of the input argument combinations in previous syntaxes. For example, you can specify the confidence level for the confidence bounds.

example

Examples

collapse all

Predict Response Values

Open Live Script

Generate sample data using Poisson random numbers with two underlying predictors X(:,1) and X(:,2).

rng(0,"twister") % For reproducibility
rndvars = randn(100,2);
X = [2 + rndvars(:,1),rndvars(:,2)];
mu = exp(1 + X*[1;2]);
y = poissrnd(mu);

Fit a generalized linear regression model to the sample data.

b  = glmfit(X,y,"poisson");

b is a 3-by-1 vector of regression coefficients.

Create data points for prediction.

[Xtest1,Xtest2] = meshgrid(-1:.5:3,-2:.5:2);
Xnew = [Xtest1(:),Xtest2(:)];

Predict the responses at the data points using the fitted model. Use the log link function, which is appropriate for a Poisson distribution.

ypred = glmval(b,Xnew,"log");

Plot the predictions.

surf(Xtest1,Xtest2,reshape(ypred,9,9))
xlabel("Xtest1")
ylabel("Xtest2")
zlabel("ypred")

Figure contains an axes object. The axes object with xlabel Xtest1, ylabel Xtest2 contains an object of type surface.

Fit Generalized Linear Model with Probit Link Function

Open Live Script

Fit a generalized linear regression model, and compute predicted (estimated) values for the predictor data using the fitted model.

Create a sample data set.

x = [2100 2300 2500 2700 2900 3100 ...
     3300 3500 3700 3900 4100 4300]';
n = [48 42 31 34 31 21 23 23 21 16 17 21]';
y = [1 2 0 3 8 8 14 17 19 15 17 21]';

x contains the predictor variable values. Each y value is the number of successes in the corresponding number of trials in n.

Fit a probit regression model for y on x using a binomial distribution for y.

b = glmfit(x,[y n],"binomial",Link="probit");

b is a 2-by-1 vector of regression coefficients.

Compute the estimated number of successes.

ypred = glmval(b,x,"probit",BinomialSize=n);

Plot the observed and estimated percent success fractions versus the x values.

plot(x,y./n,"o",x,ypred./n,"-")
xlabel("x")
ylabel("Success Fraction")

Figure contains an axes object. The axes object with xlabel x, ylabel Success Fraction contains 2 objects of type line. One or more of the lines displays its values using only markers

Use Custom-Defined Link Function

Open Live Script

Create a sample data set.

x = [2100 2300 2500 2700 2900 3100 ...
     3300 3500 3700 3900 4100 4300]';
n = [48 42 31 34 31 21 23 23 21 16 17 21]';
y = [1 2 0 3 8 8 14 17 19 15 17 21]';

x contains the predictor variable values. Each y value is the number of successes in the corresponding number of trials in n.

Define three function handles, created by using @, that define the link, the derivative of the link, and the inverse link for a probit link function. Store the function handles in a structure.

link = @(mu) norminv(mu);
derlink = @(mu) 1 ./ normpdf(norminv(mu));
invlink = @(resp) normcdf(resp);
F = struct(Link=link,Derivative=derlink,Inverse=invlink);

Fit a generalized linear model for y on x using a binomial distribution for y and the link function that you defined.

b = glmfit(x,[y n],"binomial",F);

b is a 2-by-1 vector of regression coefficients.

Compute the estimated number of successes.

ypred = glmval(b,x,F,BinomialSize=n);

Plot the observed and estimated percent success fractions versus the x values.

plot(x, y./n,"o",x,ypred./n,"-")
xlabel("x")
ylabel("Success fraction")

$Figure contains an axes object. The axes object with xlabel x, ylabel Success fraction contains 2 objects of type line. One or more of the lines displays its values using only markers$

Generate Code from Function That Predicts Responses Given New Data

This example uses:

Open Script

Train a generalized linear model, and then generate code from a function that classifies new observations based on the model. This example is based on the Use Custom-Defined Link Function example.

Enter the sample data.

x = [2100 2300 2500 2700 2900 3100 ...
     3300 3500 3700 3900 4100 4300]';
n = [48 42 31 34 31 21 23 23 21 16 17 21]';
y = [1 2 0 3 8 8 14 17 19 15 17 21]';

Suppose that the inverse normal pdf is an appropriate link function for the problem.

Define a function named myInvNorm.m that accepts values of $X\beta$ and returns corresponding values of the inverse of the standard normal cdf.

function in = myInvNorm(mu) %#codegen
%myInvNorm Inverse of standard normal cdf for code generation
%   myInvNorm is a GLM link function that accepts a numeric vector mu, and
%   returns in, which is a numeric vector of corresponding values of the
%   inverse of the standard normal cdf.
%   
in = norminv(mu);
end

Define another function named myDInvNorm.m that accepts values of $X\beta$ and returns corresponding values of the derivative of the link function.

function din = myDInvNorm(mu) %#codegen
%myDInvNorm Derivative of inverse of standard normal cdf for code
%generation
%   myDInvNorm corresponds to the derivative of the GLM link function
%   myInvNorm. myDInvNorm accepts a numeric vector mu, and returns din,
%   which is a numeric vector of corresponding derivatives of the inverse
%   of the standard normal cdf.
%   
din = 1./normpdf(norminv(mu));
end

Define another function named myInvInvNorm.m that accepts values of $X\beta$ and returns corresponding values of the inverse of the link function.

function iin = myInvInvNorm(mu) %#codegen
%myInvInvNorm Standard normal cdf for code generation
%   myInvInvNorm is the inverse of the GLM link function myInvNorm.
%   myInvInvNorm accepts a numeric vector mu, and returns iin, which is a
%   numeric vector of corresponding values of the standard normal cdf.
% 
iin = normcdf(mu);
end

Create a structure array that specifies each of the link functions. Specifically, the structure array contains fields named 'Link', 'Derivative', and 'Inverse'. The corresponding values are the names of the functions.

link = struct('Link','myInvNorm','Derivative','myDInvNorm',...
    'Inverse','myInvInvNorm')

link = 

  struct with fields:

          Link: 'myInvNorm'
    Derivative: 'myDInvNorm'
       Inverse: 'myInvInvNorm'

Fit a GLM for y on x using the link function link. Return the structure array of statistics.

[b,~,stats] = glmfit(x,[y n],'binomial','link',link);

b is a 2-by-1 vector of regression coefficients.

In your current working folder, define a function called classifyGLM.m that:

Accepts measurements with columns corresponding to those in x, regression coefficients whose dimensions correspond to b, a link function, the structure of GLM statistics, and any valid glmval name-value pair argument
Returns predictions and confidence interval margins of error

function [yhat,lo,hi] = classifyGLM(b,x,link,varargin) %#codegen
%CLASSIFYGLM Classify measurements using GLM model 
%   CLASSIFYGLM classifies the n observations in the n-by-1 vector x using
%   the GLM model with regression coefficients b and link function link,
%   and then returns the n-by-1 vector of predicted values in yhat.
%   CLASSIFYGLM also returns margins of error for the predictions using
%   additional information in the GLM statistics structure stats.
narginchk(3,Inf);
if(isstruct(varargin{1}))
    stats = varargin{1};
    [yhat,lo,hi] = glmval(b,x,link,stats,varargin{2:end});
else
    yhat = glmval(b,x,link,varargin{:});
end
end

Generate a MEX function from classifyGLM.m. Because C uses static typing, codegen must determine the properties of all variables in MATLAB® files at compile time. To ensure that the MEX function can use the same inputs, use the -args argument to specify the following in the order given:

Regression coefficients b as a compile-time constant
In-sample observations x
Link function as a compile-time constant
Resulting GLM statistics as a compile-time constant
Name 'Confidence' as a compile-time constant
Confidence level 0.9

To designate arguments as compile-time constants, use coder.Constant.

codegen -config:mex classifyGLM -args {coder.Constant(b),x,coder.Constant(link),coder.Constant(stats),coder.Constant('Confidence'),0.9}

Code generation successful.

codegen generates the MEX file classifyGLM_mex.mexw64 in your current folder. The file extension depends on your system platform.

Compare predictions by using glmval and classifyGLM_mex. Specify name-value pair arguments in the same order as in the -args argument in the call to codegen.

[yhat1,melo1,mehi1] = glmval(b,x,link,stats,'Confidence',0.9);
[yhat2,melo2,mehi2] = classifyGLM_mex(b,x,link,stats,'Confidence',0.9);

comp1 = (yhat1 - yhat2)'*(yhat1 - yhat2);
agree1 = comp1 < eps
comp2 = (melo1 - melo2)'*(melo1 - melo2);
agree2 = comp2 < eps
comp3 = (mehi1 - mehi2)'*(mehi1 - mehi2);
agree3 = comp3 < eps

agree1 =

  logical

   1


agree2 =

  logical

   1


agree3 =

  logical

   1

The generated MEX function produces the same predictions as predict.

Input Arguments

collapse all

`b` — Coefficient estimates
numeric column vector

Coefficient estimates, specified as a numeric column vector returned by the glmfit function.

If Constant is "on" (default), then b must contain p+1 rows, where p is the number of columns in Xnew. The first element of b corresponds to the coefficient of the constant term (intercept) in the model.
If Constant is "off", then b must contain p rows. glmval omits the constant term from the model.

Data Types: single | double

`Xnew` — New predictor input values
numeric matrix

New predictor input values, specified as a numeric matrix. Each row of Xnew corresponds to one observation, and each column corresponds to one predictor variable.

By default, glmval includes a constant term in the model. Do not add a column of 1s directly to Xnew. You can change the default behavior of glmval by specifying the Constant name-value argument.

Data Types: single | double

`Link` — Link function
scalar value | string | structure

Link function, specified as one of the following values.

Link Function Name	Link Function	Mean (Inverse) Function
`"identity"`	f(μ) = μ	μ = Xb
`"log"`	f(μ) = log(μ)	μ = exp(Xb)
`"logit"`	f(μ) = log(μ/(1–μ))	μ = exp(Xb) / (1 + exp(Xb))
`"probit"`	f(μ) = Φ^–1(μ), where Φ is the cumulative distribution function of the standard normal distribution.	μ = Φ(Xb)
`"comploglog"`	f(μ) = log(–log(1 – μ))	μ = 1 – exp(–exp(Xb))
`"reciprocal"`	f(μ) = 1/μ	μ = 1/(Xb)
`p` (a number)	f(μ) = μ^p	μ = Xb^1/p
`S` (a structure) with three fields. Each field holds a function handle that accepts a vector of inputs and returns a vector of the same size: `S.Link` — The link function `S.Inverse` — The inverse link function `S.Derivative` — The derivative of the link function	f(μ) = `S.Link`(μ)	μ = `S.Inverse`(Xb)

The link function defines the relationship f(μ) = Xnew*b between the mean response μ and the linear combination of predictors Xnew*b.

Data Types: char | string | single | double | struct

`stats` — Model statistics
structure

Model statistics, specified as a structure returned by the glmfit function. You must specify stats to return the confidence bounds yLo and yUp.

Name-Value Arguments

collapse all

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: glmval(b,X,Link,Confidence=0.99) specifies a confidence level of 99% for the confidence bounds.

`BinomialSize` — Binomial size
numeric scalar | numeric column vector

Binomial size, specified as a numeric scalar which represents the number of trials in a binomial model. You can also specify BinomialSize as a numeric column vector with one value for each row of Xnew.

Example: BinomialSize=[10 20 10 40]'

Data Types: single | double

`Confidence` — Confidence level
0.95 (default) | value from 0 to 1

Confidence level of the confidence bounds for ypred, specified as a numeric value from 0 to 1. The default value of 0.95 corresponds to the 95% confidence level.

Example: Confidence=0.99

Data Types: single | double

`Constant` — Flag to include constant term (intercept) in model
`"on"` (default) | `"off"`

Flag to include a constant term (intercept) in the model, specified as either "on" or "off".

"on" (default) — glmval includes a constant term in the model. The coefficient of the constant term is the first element of b.
"off" — glmval omits the constant term in the model.

Example: Constant="off"

Data Types: char | string

`Offset` — Offset variable in model
[ ] (default) | numeric column vector

Offset variable in the model, specified as a numeric column vector with one value for each row of Xnew.

glmval uses Offset as an additional predictor with a coefficient value fixed at 1. In other words, the model formula is

f(μ)= Offset + XNew*b,

where f is the link function, μ is the mean response, and XNew*b is the linear combination of predictors XNew.

Example: Offset=ones(50,1)

Data Types: single | double

`Simultaneous` — Flag to compute simultaneous confidence bounds
`false` or `0` (default) | `true` or `1`

Flag to compute simultaneous confidence bounds, specified as a numeric or logical 1 (true) or 0 (false).

true — glmval calculates confidence bounds for the curve of response values corresponding to all predictor values in Xnew, using Schefféʼs method. The range between the upper and lower bounds contains the curve that consists of true response values with 100*Confidence% confidence.
false — glmval calculates confidence bounds for the response value at each observation in Xnew. The confidence interval for a response value at a specific predictor value contains the true response value with 100*Confidence% confidence.

With simultaneous bounds, the entire curve of true response values is within the bounds at high confidence. By contrast, nonsimultaneous bounds require only the response value at a single predictor value to be within the bounds at high confidence. Therefore, simultaneous bounds are wider than nonsimultaneous bounds.

Example: Simultaneous=true

Data Types: char | string | logical

Output Arguments

collapse all

`ypred` — Predicted values
numeric column vector

Predicted generalized linear model values, returned as a numeric column vector with the same number of rows as Xnew.

`yLo` — Lower confidence bound for `ypred`
numeric column vector

Lower confidence bound for ypred, returned as a numeric column vector with the same number of rows as Xnew.

`yUp` — Upper confidence bound for `ypred`
numeric column vector

Upper confidence bound for ypred, returned as a numeric column vector with the same number of rows as Xnew.

References

[1] Dobson, A. J. An Introduction to Generalized Linear Models. New York: Chapman & Hall, 1990.

[2] McCullagh, P., and J. A. Nelder. Generalized Linear Models. New York: Chapman & Hall, 1990.

[3] Collett, D. Modeling Binary Data. New York: Chapman & Hall, 2002.

Extended Capabilities

expand all

C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.

Usage notes and limitations:

Argument	Notes and Limitations
`b`	`b` must be a compile-time constant (the value of `b` cannot change while `codegen` (MATLAB Coder) generates the C/C++ code). You can designate variables as compile-time constants using `coder.Constant` (MATLAB Coder).
`Xnew`	Must be a double- or single-precision numeric matrix
`Link`	Does not support function handles, that is, functions created using `@`. Must be a compile-time constant.
`stats`	Must be a compile-time constant
`Constant` name-value argument	Must be a compile-time constant
Name-value arguments	Names in name-value arguments must be compile-time constants. For example, to specify a confidence level of 0.9, include `{coder.Constant("Confidence"),coder.Constant(0.9)}` in the `-args` value of `codegen`.

For more information on code generation, see Introduction to Code Generation for Statistics and Machine Learning Functions and Overview of Code Generation Using MATLAB Coder (MATLAB Coder).

GPU Arrays
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.

This function fully supports GPU arrays. For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).

Version History

Introduced before R2006a

glmval

Syntax

Description

Examples

Predict Response Values

Fit Generalized Linear Model with Probit Link Function

Use Custom-Defined Link Function

Generate Code from Function That Predicts Responses Given New Data

Input Arguments

`b` — Coefficient estimates
numeric column vector

`Xnew` — New predictor input values
numeric matrix

`Link` — Link function
scalar value | string | structure

`stats` — Model statistics
structure

Name-Value Arguments

`BinomialSize` — Binomial size
numeric scalar | numeric column vector

`Confidence` — Confidence level
0.95 (default) | value from 0 to 1

`Constant` — Flag to include constant term (intercept) in model
`"on"` (default) | `"off"`

`Offset` — Offset variable in model
[ ] (default) | numeric column vector

`Simultaneous` — Flag to compute simultaneous confidence bounds
`false` or `0` (default) | `true` or `1`

Output Arguments

`ypred` — Predicted values
numeric column vector

`yLo` — Lower confidence bound for `ypred`
numeric column vector

`yUp` — Upper confidence bound for `ypred`
numeric column vector

References

Extended Capabilities

C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.

GPU Arrays
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.

Version History

See Also

Topics

glmval

Syntax

Description

Examples

Predict Response Values

Fit Generalized Linear Model with Probit Link Function

Use Custom-Defined Link Function

Generate Code from Function That Predicts Responses Given New Data

Input Arguments

b — Coefficient estimates numeric column vector

Xnew — New predictor input values numeric matrix

Link — Link function scalar value | string | structure

stats — Model statistics structure

Name-Value Arguments

BinomialSize — Binomial size numeric scalar | numeric column vector

Confidence — Confidence level 0.95 (default) | value from 0 to 1

Constant — Flag to include constant term (intercept) in model "on" (default) | "off"

Offset — Offset variable in model [ ] (default) | numeric column vector

Simultaneous — Flag to compute simultaneous confidence bounds false or 0 (default) | true or 1

Output Arguments

ypred — Predicted values numeric column vector

yLo — Lower confidence bound for ypred numeric column vector

yUp — Upper confidence bound for ypred numeric column vector

References

Extended Capabilities

C/C++ Code Generation Generate C and C++ code using MATLAB® Coder™.

GPU Arrays Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.

Version History

See Also

Topics

`b` — Coefficient estimates
numeric column vector

`Xnew` — New predictor input values
numeric matrix

`Link` — Link function
scalar value | string | structure

`stats` — Model statistics
structure

`BinomialSize` — Binomial size
numeric scalar | numeric column vector

`Confidence` — Confidence level
0.95 (default) | value from 0 to 1

`Constant` — Flag to include constant term (intercept) in model
`"on"` (default) | `"off"`

`Offset` — Offset variable in model
[ ] (default) | numeric column vector

`Simultaneous` — Flag to compute simultaneous confidence bounds
`false` or `0` (default) | `true` or `1`

`ypred` — Predicted values
numeric column vector

`yLo` — Lower confidence bound for `ypred`
numeric column vector

`yUp` — Upper confidence bound for `ypred`
numeric column vector

C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.

GPU Arrays
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.