Create Unique Interaction Variables

Efficiently adds N-way, custom-named, *unique* interaction variables to your dataset.
611 Descargas
Actualizado 4 abr 2012

Ver licencia

function data_set = create_interaction_variables(data_set,vars,range_nways,separator,max_varname_length)
% create_interaction_variables
%
% for Matlab R13+
% version 1.1 (April 2012)
% (c) Brian Weidenbaum
% website: http://www.BrianWeidenbaum.com/.
%
%
% OUTPUT: your dataset (or a dataset based on your matrix), updated with new, aptly-named *unique* interaction variables,
% ranging from at least 2 to any number the user specifies
%
% INPUTS (** = OPTIONAL)
% input name: (input datatype/s) -- description
%
% data_set: (dataset OR matrix) -- the data you want to alter
%
% **vars: (cell array of chars/numbers, OR vector of numbers, OR 'ALL') --
% default: 'ALL'
% the names of the variables you want to interact; alternatively, the column numbers of the variables you want to interact
% OR, you can just say 'ALL' to include all variables automatically
%
% **range_nways (vector OR 'MAX') --
% default: 2
% the range of numbers of variables to include in the interaction terms generated by this function
% alternatively, just type 'MAX' to use 2 variables to the maximum possible number of vars
%
% **separator (char) --
% default: '_'
% the separator string you want to use to divide the
% variable names that contributed to a new interaction variable. Default
% is '_'. E.g., by default, an interaction between Var1 and Var2 will be
% named 'Var1_Var2'.
%
% **max_varname_length (number) --
% default: 63
% the maximum length of the newly created interaction terms' variable names.
% Any dynamically generated variable names (e.g. 'Var1_Var2') that exceed this number will be excluded from the new dataset.
% You should set max_varname_length according to the database you plan to use with your data--
% e.g., if you only want to use this data in MATLAB, you should set max_varname_length=63 (the maximum length supported by the dataset class,
% but if you plan to export your data to Oracle, you should set it to around 30
%
%
%
% EXAMPLES
%
% You have a dataset with 3 variables: a, b, and c.
% You want to create interaction terms, up to 3 ways, for all of your variables.
% You type:
% new_dataset = create_interaction_variables(old_dataset,'all','max');
% new_dataset will contain:
% a.*b, named 'a_b'
% a.*c, named 'a_c'
% b.*c, named 'b_c'
% and a.*b.*c, named 'a_b_c',
% plus all original variables.
% It will NOT contain b.*a, c.*a, etc because these are not unique combos.
%
% You have a dataset with 3 variables: a, b, and c.
% You want to create interaction terms, up to 2 ways, only for columns 2 and 3.
% You type:
% new_dataset = create_interaction_variables(old_dataset,[2 3],2);
% new_dataset will contain only b.*c,
% plus all original variables.
%
% You have a dataset with 3 variables: a, b, and c.
% You want to create interaction terms, up to 2 ways, only for 'a' and 'c'.
% You type:
% new_dataset = create_interaction_variables(old_dataset,{'a','c'},2);
% new_dataset will contain only a.*c,
% plus all original variables.
%
% You have a dataset with 3 variables: a, b, and c.
% You want to create interaction terms, ONLY 3 ways, only for all vars.
% You type:
% new_dataset = create_interaction_variables(old_dataset,'all',3);
% new_dataset will contain only a .* b .* c,
% plus all original variables.
%

Citar como

Brian Weidenbaum (2025). Create Unique Interaction Variables (https://www.mathworks.com/matlabcentral/fileexchange/35981-create-unique-interaction-variables), MATLAB Central File Exchange. Recuperado .

Compatibilidad con la versión de MATLAB
Se creó con R2011b
Compatible con cualquier versión
Compatibilidad con las plataformas
Windows macOS Linux
Categorías
Más información sobre Tables en Help Center y MATLAB Answers.
Agradecimientos

Inspirado por: allcomb(varargin)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!
Versión Publicado Notas de la versión
1.2.0.0

Changes between 1.0 and 1.1:
-added 'separator' parameter
-changed 'max_nways' parameter to 'range_nways'
-enforced maximum max_varname_length -reduced min number of arguments to 1
-minor performance tweaks

1.0.0.0