rowfun

Apply function to table or timetable rows

Description

B = rowfun(func,A) applies the function func to each row of the table or timetable A and returns the results in the table or timetable B.

func accepts size(A,2) inputs.

If A is a timetable and func aggregates data over groups of rows, then rowfun assigns the first row time from each group of rows in A as the corresponding row time in B. To return B as a table without row times, specify 'OutputFormat' as 'table'.
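The row-time behavior described above can be sketched as follows. The data, the variable names (Time, g, x), and the output name meanX are hypothetical and chosen only for illustration.

```matlab
% Hypothetical data: four daily readings that fall into two groups.
Time = datetime(2024,1,1) + days(0:3)';
g = [1; 2; 1; 2];
x = [10; 20; 30; 40];
TT = timetable(Time,g,x);

% Grouped mean. Because TT is a timetable, B is a timetable by default,
% and each group takes the first row time found in that group.
B = rowfun(@mean,TT,'GroupingVariables','g', ...
    'OutputVariableNames','meanX');

% Same computation, returned as a table without row times.
Bt = rowfun(@mean,TT,'GroupingVariables','g', ...
    'OutputVariableNames','meanX','OutputFormat','table');
```

Here group 1 contains the rows for January 1 and January 3, so its row time in B should be January 1.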

B = rowfun(func,A,Name,Value) applies the function func to each row of the table or timetable A with additional options specified by one or more Name,Value pair arguments.

For example, you can specify which variables to pass to the function func and how to call func.

Examples

Apply the function hypot to each row of the 5-by-2 table A to compute sqrt(x^2 + y^2), the length of the hypotenuse, for each pair of values in the variables x and y.

Create a table, A, with two variables of numeric data.

x = gallery('integerdata',10,[5,1],2);
y = gallery('integerdata',10,[5,1],8);

A = table(x,y)
A=5×2 table
x    y
_    __

9     1
4     5
3     2
7     3
1    10

Apply the function, hypot, to each row of A. The function hypot takes two inputs and returns one output.

B = rowfun(@hypot,A,'OutputVariableNames','z')
B=5×1 table
z
______

9.0554
6.4031
3.6056
7.6158
10.05

B is a table.

Append the function output, B, to the input table, A.

[A B]
ans=5×3 table
x    y       z
_    __    ______

9     1    9.0554
4     5    6.4031
3     2    3.6056
7     3    7.6158
1    10     10.05

Define and apply a geometric Brownian motion model to a range of parameters.

Create a function in a file named gbmSim.m that contains the following code.

% Copyright 2015 The MathWorks, Inc.

function [m,mtrue,s,strue] = gbmSim(mu,sigma)
% Discrete approximation to geometric Brownian motion
%
% [m,mtrue,s,strue] = gbmSim(mu,sigma) computes the
% simulated mean, true mean, simulated standard deviation,
% and true standard deviation based on the parameters mu and sigma.
numReplicates = 1000; numSteps = 100;
y0 = 1;
t1 = 1;
dt = t1 / numSteps;
y1 = y0*prod(1 + mu*dt + sigma*sqrt(dt)*randn(numSteps,numReplicates));
m = mean(y1); s = std(y1);

% Theoretical values
mtrue = y0 * exp(mu*t1); strue = mtrue * sqrt(exp(sigma^2*t1) - 1);
end

gbmSim accepts two inputs, mu and sigma, and returns four outputs, m, mtrue, s, and strue.

Define the table, params, containing the parameters to input to the geometric Brownian motion model.

mu = [-.5; -.25; 0; .25; .5];
sigma = [.1; .2; .3; .2; .1];

params = table(mu,sigma)
params =

5x2 table

     mu      sigma
    _____    _____

     -0.5     0.1
    -0.25     0.2
        0     0.3
     0.25     0.2
      0.5     0.1

Apply the function, gbmSim, to the rows of the table, params.

stats = rowfun(@gbmSim,params,...
'OutputVariableNames',...
{'simulatedMean' 'trueMean' 'simulatedStd' 'trueStd'})
stats =

5x4 table

    simulatedMean    trueMean    simulatedStd    trueStd
    _____________    ________    ____________    ________

       0.60501       0.60653       0.05808       0.060805
       0.77916        0.7788         0.161        0.15733
        1.0024             1        0.3048        0.30688
        1.2795         1.284       0.25851        0.25939
        1.6498        1.6487       0.16285        0.16529

The four variable names specified by the 'OutputVariableNames' name-value pair argument indicate that rowfun should obtain four outputs from gbmSim. You can specify fewer output variable names to return fewer outputs from gbmSim.
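The same idea can be tried without gbmSim.m. The sketch below substitutes the built-in function max, called with one row vector per row, as a stand-in for a multi-output function. The table T and the output names rowMax and whichVar are hypothetical.

```matlab
% Hypothetical input table with two numeric variables.
T = table([9;4;3],[1;5;2],'VariableNames',{'x','y'});

% Two names: rowfun requests two outputs from max (value and index).
both = rowfun(@max,T,'SeparateInputs',false, ...
    'OutputVariableNames',{'rowMax','whichVar'});

% One name: rowfun requests only the first output of max.
one = rowfun(@max,T,'SeparateInputs',false, ...
    'OutputVariableNames','rowMax');
```

The number of names, not the function definition, determines how many outputs rowfun requests.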

Append the function output, stats, to the input, params.

[params stats]
ans =

5x6 table

     mu      sigma    simulatedMean    trueMean    simulatedStd    trueStd
    _____    _____    _____________    ________    ____________    ________

     -0.5     0.1        0.60501       0.60653       0.05808       0.060805
    -0.25     0.2        0.77916        0.7788         0.161        0.15733
        0     0.3         1.0024             1        0.3048        0.30688
     0.25     0.2         1.2795         1.284       0.25851        0.25939
      0.5     0.1         1.6498        1.6487       0.16285        0.16529

Create a table, A, where g is a grouping variable.

g = gallery('integerdata',3,[15,1],1);
x = gallery('uniformdata',[15,1],9);
y = gallery('uniformdata',[15,1],2);

A = table(g,x,y)
A=15×3 table
g       x          y
_    _______    ________

3    0.24756     0.87516
3     0.4358      0.3179
3    0.97755     0.27323
2    0.85995      0.6765
3    0.30063    0.071171
2    0.26589     0.19659
3    0.13338     0.52908
2     0.7425     0.17176
1    0.85692     0.86996
2    0.24286     0.24369
3    0.19492     0.84291
2    0.39076     0.55766
1    0.29683     0.35681
1    0.39031      0.2324
2    0.18726      0.6476

Define the anonymous function, func, to compute the average difference between x and y.

func = @(x,y) mean(x-y);

Find the average difference between variables in groups 1, 2, and 3 defined by the grouping variable, g.

B = rowfun(func,A,...
'GroupingVariables','g',...
'OutputVariableNames','MeanDiff')
B=3×3 table
g    GroupCount    MeanDiff
_    __________    ________

1        3         0.028298
2        6         0.032569
3        6         -0.10327

The variable GroupCount indicates the number of rows in A for each group.

Input Arguments

Function, specified as a function handle. You can define the function in a file or as an anonymous function. If func corresponds to more than one function file (that is, if func represents a set of overloaded functions), MATLAB® determines which function to call based on the class of the input arguments.

func can accept no more than size(A,2) inputs. By default, rowfun returns the first output of func. To return more than one output from func, use the 'NumOutputs' or 'OutputVariableNames' name-value pair arguments.

Example: func = @(x,y) x.^2+y.^2; takes two inputs and finds the sum of the squares.

Input table, A, specified as a table or a timetable.

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'InputVariables',2 uses only the second variable in A as an input to func.

Specifiers for selecting variables of A to pass to func, specified as the comma-separated pair consisting of 'InputVariables' and a positive integer, vector of positive integers, character vector, cell array of character vectors, string array, logical vector, or a function handle.

If you specify 'InputVariables' as a function handle, then it must return a logical scalar, and rowfun passes only the variables in A where the function returns 1 (true).
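As an illustration, the following sketch selects the same two variables first by name and then with a function handle. The table T, its variables (id, x, y), and the output name r are hypothetical.

```matlab
% Hypothetical table: one string variable and two numeric variables.
id = ["a"; "b"];
x = [3; 5];
y = [4; 12];
T = table(id,x,y);

% Select the inputs by name.
byName = rowfun(@hypot,T,'InputVariables',{'x','y'}, ...
    'OutputVariableNames','r');

% Select the inputs with a function handle: isnumeric returns true
% for x and y and false for id, so only x and y are passed to hypot.
byType = rowfun(@hypot,T,'InputVariables',@isnumeric, ...
    'OutputVariableNames','r');
```

Both calls should return the same one-variable table, because id is never passed to hypot.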

One or more variables in A that define groups of rows, specified as the comma-separated pair consisting of 'GroupingVariables' and a positive integer, vector of positive integers, character vector, cell array of character vectors, string array, or logical vector.

The value of 'GroupingVariables' specifies which table variables are the grouping variables, not their data types. A grouping variable can be numeric, or have data type categorical, calendarDuration, datetime, duration, logical, or string.

Rows in A that have the same grouping variable values belong to the same group. rowfun applies func to each group of rows, rather than separately to each row of A. The output, B, contains one row for each group.

If any grouping variable contains NaNs or missing values (such as NaTs, undefined categorical values, or missing strings), then the corresponding rows do not belong to any group, and are excluded from the output.
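A minimal sketch of this exclusion, using a hypothetical table in which one grouping value is NaN:

```matlab
% Hypothetical data: the second row has a missing group value.
g = [1; NaN; 2; 1];
x = [1; 2; 3; 4];
T = table(g,x);

% Sum x within each group. The row with g = NaN belongs to no group,
% so it does not contribute to any row of B.
B = rowfun(@sum,T,'GroupingVariables','g', ...
    'OutputVariableNames','total');
```

B contains one row for group 1 (two rows of T) and one row for group 2 (one row of T).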

Row labels can be grouping variables. You can group on row labels alone, on one or more variables in A, or on row labels and variables together.

• If A is a table, then the labels are row names.

• If A is a timetable, then the labels are row times.

Indicator for calling func with separate inputs, specified as the comma-separated pair consisting of 'SeparateInputs' and either true, false, 1, or 0.

• true: func expects separate inputs. rowfun calls func with size(A,2) inputs, one argument for each data variable. This is the default behavior.

• false: func expects one vector containing all inputs. rowfun creates the input vector to func by concatenating the values in each row of A.
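For instance, a function that takes a single vector can be applied by setting 'SeparateInputs' to false. In this sketch the table T, the anonymous function spread, and the output name range are hypothetical.

```matlab
% Hypothetical table with three numeric variables.
T = table([9;4;3],[1;5;2],[4;4;10],'VariableNames',{'a','b','c'});

% spread expects one vector and returns the difference between its
% largest and smallest elements.
spread = @(v) max(v) - min(v);

% rowfun concatenates each row into a 1-by-3 vector before calling spread.
B = rowfun(spread,T,'SeparateInputs',false, ...
    'OutputVariableNames','range');
```

With the default ('SeparateInputs',true), the same call would fail because spread accepts only one input.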

Indicator to pass values from cell variables to func, specified as the comma-separated pair consisting of 'ExtractCellContents' and either false, true, 0, or 1.

• true: rowfun extracts the contents of a variable in A whose data type is cell and passes the values, rather than the cells, to func. For grouped computation, the values within each group in a cell variable must allow vertical concatenation.

• false: rowfun passes the cells of a variable in A whose data type is cell to func. This is the default behavior.
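The difference between the two settings can be sketched with a hypothetical table that has one cell variable, C:

```matlab
% Hypothetical table whose only variable is a cell array of vectors.
C = {[1 2 3]; [4 5]};
T = table(C);

% With extraction, sum receives the numeric vector stored in each cell.
totals = rowfun(@sum,T,'ExtractCellContents',true, ...
    'OutputVariableNames','total');

% Without extraction (the default), func receives the cell itself.
wrapped = rowfun(@iscell,T,'OutputVariableNames','isCell');
```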

Variable names for outputs of func, specified as the comma-separated pair consisting of 'OutputVariableNames' and a character vector, cell array of character vectors, or string array, with names that are nonempty and distinct. The number of names must equal the number of outputs desired from func.

Furthermore, the variable names must be valid MATLAB identifiers. If valid MATLAB identifiers are not available for use as variable names, MATLAB uses a cell array of N character vectors of the form {'Var1' ... 'VarN'} where N is the number of variables. You can determine valid MATLAB variable names using the function isvarname.
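As a quick check, isvarname can screen candidate names before they are passed to rowfun. The candidate names below are hypothetical.

```matlab
% Hypothetical candidate names for output variables.
candidates = {'simulatedMean','trueStd','2ndMoment','mean diff'};

% isvarname returns true only for valid MATLAB identifiers.
% A name cannot start with a digit or contain a space.
ok = cellfun(@isvarname,candidates);
```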

Number of outputs from func, specified as the comma-separated pair consisting of 'NumOutputs' and 0 or a positive integer. The integer must be less than or equal to the possible number of outputs from func.

Example: 'NumOutputs',2 causes rowfun to call func with two outputs.
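A short sketch of 'NumOutputs', using the built-in function cart2pol, which takes two inputs and can return an angle and a radius. The table T is hypothetical.

```matlab
% Hypothetical Cartesian coordinates, one point per row.
x = [1; 0];
y = [0; 2];
T = table(x,y);

% Request two outputs from cart2pol: theta (first) and rho (second).
% No output names are given, so rowfun assigns default variable names.
P = rowfun(@cart2pol,T,'NumOutputs',2);
```

To control the names of the two variables in P, specify 'OutputVariableNames' instead.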

Format of B, specified as the comma-separated pair consisting of 'OutputFormat' and either the value 'table', 'timetable', 'uniform', or 'cell'.

• 'table': rowfun returns a table with one variable for each output of func. For grouped computation, B also contains the grouping variables. 'table' allows you to use a function that returns values of different sizes or data types. However, for ungrouped computation, all of the outputs from func must have one row each time it is called. For grouped computation, all of the outputs from func must have the same number of rows. This is the default output format.

• 'timetable': rowfun returns a timetable with one variable for each variable in A (or each variable specified with 'InputVariables'). For grouped computation, B also contains the grouping variables. rowfun creates the row times of B from the row times of A. If the row times assigned to B do not make sense in the context of the calculations performed using func, then specify the output format as 'OutputFormat','table'. If A is a timetable, then this is the default output format.

• 'uniform': rowfun concatenates the values returned by func into a vector. All of the outputs from func must be scalars with the same data type.

• 'cell': rowfun returns B as a cell array. 'cell' allows you to use a function that returns values of different sizes or data types.
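The 'uniform' and 'cell' formats can be compared with a small sketch. The table T and the anonymous function are hypothetical.

```matlab
% Hypothetical table with two numeric variables.
x = [3; 5];
y = [4; 12];
T = table(x,y);

% 'uniform': hypot returns one scalar per row, so the result is a
% numeric vector rather than a table.
v = rowfun(@hypot,T,'OutputFormat','uniform');

% 'cell': each row returns a vector of a different length (x:y),
% so the results are collected in a cell array.
c = rowfun(@(a,b) a:b,T,'OutputFormat','cell');
```

The second call would not work with 'uniform', because the outputs are not scalars.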

Function to call if func fails, specified as the comma-separated pair consisting of 'ErrorHandler' and a function handle. Define this function so that it rethrows the error or returns valid outputs for function func.

MATLAB calls the specified error-handling function with two input arguments:

• A structure with these fields:

  identifier: Error identifier.
  message: Error message text.
  index: Row or group index at which the error occurred.

• The set of input arguments to function func at the time of the error.

For example,

function [A, B] = errorFunc(S, varargin)
warning(S.identifier, S.message);
A = NaN; B = NaN;
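An error handler can also be an anonymous function. The sketch below assumes a hypothetical table T in which the second row makes the built-in function nchoosek fail (k is greater than n), and a hypothetical handler that returns NaN for any failing row.

```matlab
% Hypothetical inputs: nchoosek(5,2) succeeds, nchoosek(2,3) errors.
n = [5; 2];
k = [2; 3];
T = table(n,k);

% The handler ignores the error structure S and the original inputs,
% and returns NaN as a valid single output for the failing row.
handler = @(S,varargin) NaN;

B = rowfun(@nchoosek,T,'ErrorHandler',handler, ...
    'OutputVariableNames','count');
```

Without the handler, the error in the second row would stop the whole rowfun call.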

Output Arguments

Output table, returned as a table or a timetable. B can store metadata such as descriptions, variable units, variable names, and row names. For more information, see the Properties sections of table or timetable.
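For example, metadata can be attached to the output through its Properties. The table T, the output name r, and the unit and description text in this sketch are hypothetical.

```matlab
% Hypothetical input table and row-wise computation.
T = table([3;5],[4;12],'VariableNames',{'x','y'});
B = rowfun(@hypot,T,'OutputVariableNames','r');

% Attach metadata to the output table.
B.Properties.Description = 'Row-wise distances';
B.Properties.VariableUnits = {'m'};
B.Properties.VariableDescriptions = {'Euclidean norm of x and y'};
```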