Main Content

stack

Stack data from input table or timetable into one variable in output table or timetable

Description

example

S = stack(U,vars) converts the input table or timetable, U, into the stacked table or timetable, S. The function stacks values from multiple variables in U into one variable in S. The input argument vars specifies which variables from U to stack, either by their names or their positions in U.

If U has n rows and vars specifies m variables in U, then S has m*n rows. The stack function interleaves values from the specified variables in U to create one variable in S. For example, if U has 10 rows and you stack three of the variables from U, then S has 30 rows. In general, S contains fewer variables, but more rows, than U.

The output table or timetable, S, contains a new categorical variable to indicate which variable in U the stacked data in each row of S comes from. The stack function replicates data from the variables in U that are not stacked.

  • If U is a table, then you cannot stack row names.

  • If U is a timetable, then you cannot stack row times.

S = stack(U,{vars1,...,varsN}) stacks values from the variables specified by {vars1,...,varsN} into N variables in S. For more information, see Create Multiple Stacked Variables in Output.

example

S = stack(___,Name,Value) converts the table, U, with additional options specified by one or more name-value arguments.

For example, you can specify variable names for the new and stacked variables in S.

example

[S,iu] = stack(___) also returns an index vector, iu, indicating the correspondence between rows in S and rows in U. You can use any of the previous input arguments.

Examples

collapse all

Create a table containing test scores from three separate tests. The table is in unstacked format.

Test1 = [93;57;87;89];
Test2 = [89;77;92;86];
Test3 = [95;62;89;91];

U = table(Test1,Test2,Test3)
U=4×3 table
    Test1    Test2    Test3
    _____    _____    _____

     93       89       95  
     57       77       62  
     87       92       89  
     89       86       91  

The table contains four rows and three variables.

Stack the test scores into a single variable.

S = stack(U,1:3)
S=12×2 table
    Test1_Test2_Test3_Indicator    Test1_Test2_Test3
    ___________________________    _________________

               Test1                      93        
               Test2                      89        
               Test3                      95        
               Test1                      57        
               Test2                      77        
               Test3                      62        
               Test1                      87        
               Test2                      92        
               Test3                      89        
               Test1                      89        
               Test2                      86        
               Test3                      91        

S contains twelve rows and two variables. S is in stacked format.

The categorical variable, Test1_Test2_Test3_Indicator, identifies which test corresponds to the score in the stacked data variable, Test1_Test2_Test3.

Create a timetable indicating the amount of snowfall in three towns from five different storms. Specify the dates of the storms as datetime values and use them as the row times of the timetable U. Specify the array of storm numbers, Storm, as a categorical array since there is a fixed set of storm numbers in this timetable.

Storm = categorical([1;2;3;4;5]);
Date = datetime({'2011-12-25';'2012-01-02';'2012-01-23';'2012-02-07';'2012-02-15'});
Natick = [20;5;13;0;17];
Boston = [18;9;21;5;12];
Worcester = [26;10;16;3;15];

U = timetable(Date,Storm,Natick,Boston,Worcester)
U=5×4 timetable
       Date        Storm    Natick    Boston    Worcester
    ___________    _____    ______    ______    _________

    25-Dec-2011      1        20        18         26    
    02-Jan-2012      2         5         9         10    
    23-Jan-2012      3        13        21         16    
    07-Feb-2012      4         0         5          3    
    15-Feb-2012      5        17        12         15    

The variables Storm and Date contain data that is constant at each location.

Stack the variables Natick, Boston, and Worcester into a single variable. Name the variable containing the stacked data, Snowfall, and name the new indicator variable, Town.

S = stack(U,{'Natick','Boston','Worcester'},...
          'NewDataVariableName','Snowfall',...
          'IndexVariableName','Town')
S=15×3 timetable
       Date        Storm      Town       Snowfall
    ___________    _____    _________    ________

    25-Dec-2011      1      Natick          20   
    25-Dec-2011      1      Boston          18   
    25-Dec-2011      1      Worcester       26   
    02-Jan-2012      2      Natick           5   
    02-Jan-2012      2      Boston           9   
    02-Jan-2012      2      Worcester       10   
    23-Jan-2012      3      Natick          13   
    23-Jan-2012      3      Boston          21   
    23-Jan-2012      3      Worcester       16   
    07-Feb-2012      4      Natick           0   
    07-Feb-2012      4      Boston           5   
    07-Feb-2012      4      Worcester        3   
    15-Feb-2012      5      Natick          17   
    15-Feb-2012      5      Boston          12   
    15-Feb-2012      5      Worcester       15   

S contains three rows for each storm, and stack repeats the data in the constant variables, Storm and Date, accordingly.

The categorical variable, Town, identifies which variable in U contains the corresponding Snowfall data.

Create a timetable containing estimated influenza rates per month along the east coast of the United States. Create a different variable for the Northeast, Mid Atlantic, and South Atlantic. Data Source: the Google Flu Trends project (since discontinued). Use a datetime array as row times of the timetable.

Month = datetime(2005,10,1,'Format','MMMM yyyy') + calmonths(0:5);
Month = Month';
NE = [1.1902; 1.3610; 1.5003; 1.7772; 2.1350; 2.2345];
MidAtl = [1.1865; 1.4120; 1.6043; 1.8830; 2.1227; 1.9920];
SAtl = [1.2730; 1.5820; 1.8625; 1.9540; 2.4803; 2.0203];

fluU = timetable(Month,NE,MidAtl,SAtl)
fluU=6×3 timetable
        Month          NE      MidAtl     SAtl 
    _____________    ______    ______    ______

    October 2005     1.1902    1.1865     1.273
    November 2005     1.361     1.412     1.582
    December 2005    1.5003    1.6043    1.8625
    January 2006     1.7772     1.883     1.954
    February 2006     2.135    2.1227    2.4803
    March 2006       2.2345     1.992    2.0203

The variable Month contains data that is constant across the row.

Stack the variables NE, MidAtl, and SAtl into a single variable called FluRate. Name the new indicator variable Region and output an index vector, ifluU, to indicate the correspondence between rows in the input unstacked table, fluU, and the output stacked table, fluS.

[fluS,ifluU] = stack(fluU,1:3,...
                     'NewDataVariableName','FluRate',...
                     'IndexVariableName','Region')
fluS=18×2 timetable
        Month        Region    FluRate
    _____________    ______    _______

    October 2005     NE        1.1902 
    October 2005     MidAtl    1.1865 
    October 2005     SAtl       1.273 
    November 2005    NE         1.361 
    November 2005    MidAtl     1.412 
    November 2005    SAtl       1.582 
    December 2005    NE        1.5003 
    December 2005    MidAtl    1.6043 
    December 2005    SAtl      1.8625 
    January 2006     NE        1.7772 
    January 2006     MidAtl     1.883 
    January 2006     SAtl       1.954 
    February 2006    NE         2.135 
    February 2006    MidAtl    2.1227 
    February 2006    SAtl      2.4803 
    March 2006       NE        2.2345 
      ⋮

ifluU = 18×1

     1
     1
     1
     2
     2
     2
     3
     3
     3
     4
      ⋮

ifluU(5) is 2. The fifth row in the output table, fluS, contains data from the second row in the input table fluU.

Input Arguments

collapse all

Input table, specified as a table or a timetable.

Variables in U to stack, specified as a positive integer, vector of positive integers, string array, character vector, cell array of character vectors, or pattern scalar.

Example: S = stack(U,1:4) stacks the first four variables of U into one variable in S.

Example: S = stack(U,{'Var1',Var3','Var5'}) stacks the variables of U that are named Var1, Var3, and Var5.

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: 'NewDataVariableName','StackedData' names the new data variable StackedData.

Variables other than vars to include in the output, specified as the comma-separated pair consisting of 'ConstantVariables' and a positive integer, vector of positive integers, string array, character vector, cell array of character vectors, pattern scalar, or logical vector. stack replicates the data from the constant variables for each stacked entry from a row.

The default is all the variables in U not specified by vars. You can specify the 'ConstantVariables' name-value pair argument to exclude variables not specified by vars or 'ConstantVariables' from the output table, S.

U can have row labels along its first dimension. If U is a table, then it can have row names as the labels. If U is a timetable, then it must have row times as the labels.

  • You can include the row names or row times when you specify the value of 'ConstantVariables'.

  • stack replicates the row names or row times even when you do not include them in 'ConstantVariables'.

Name for the new data variable in S, specified as the comma-separated pair consisting of 'NewDataVariableName' and a character vector, string array, or cell array of character vectors. The default is a concatenation of the names of the variables from U that are stacked up.

Name for the new indicator variable in S, specified as the comma-separated pair consisting of 'IndexVariableName' and a character vector or string scalar. The default is a name based on NewDataVariableName.

Output Arguments

collapse all

Stacked table, returned as a table or a timetable. S contains a stacked data variable, a categorical indicator variable, and any constant variables.

You can store additional metadata such as descriptions, variable units, variable names, and row names in S. For more information, see the Properties sections of table or timetable.

stack assigns the variable units and variable description property values from the first variable listed in vars to the corresponding S.Properties.VariableUnits and S.Properties.VariableDescrisciptions values for the new data variable.

Index to U, returned as a column vector. The index vector, iu, identifies the row in the input table, U, containing the corresponding data. stack creates the jth row in the output table, S, using U(iu(j),vars).

Algorithms

collapse all

Create Multiple Stacked Variables in Output

  • It is possible to stack values from the input into multiple variables in the output. To create multiple stacked variables in S, use a cell array to specify multiple groups of variables from U. You can use a cell array to contain multiple values for vars, and a cell array of character vectors or string array to contain multiple values for the 'NewDataVariableName' name-value pair argument. All groups must contain the same number of variables.

    For example, if U contains four variables, then you can create two stacked variables in S, instead of only one, by using this syntax.

    vars = {[1 2],[3 4]};
    S = stack(U,vars)
    

    As a result, S has a stacked variable with values from the first and second variables of U, and another stacked variable with values from the third and fourth variables of U.

    To specify new variable names in S, use 'NewDataVariableName'. The number of names you specify must equal the number of groups specified in vars.

    vars = {[1 2],[3 4]};
    S = stack(U,vars,'NewDataVariableName',{'Vars1And2','Vars3And4'})
    

  • When you specify vars as a cell array of character vectors, then S contains one stacked variable. To specify multiple stacked variables while using variable names, use a cell array of cell arrays of character vectors, or a cell array of string arrays.

    For example, this syntax creates one stacked variable in S when U is a table with variables named Var1, Var2, Var3, and Var4.

    S = stack(U,{'Var1','Var2','Var3','Var4'})
    

    This syntax creates two stacked variables in S.

    S = stack(U,{{'Var1','Var2'} {'Var3','Var4'}})
    

Extended Capabilities

Version History

Introduced in R2013b