Add and Delete Variables
This example shows how to add and delete variables in a dataset array. You can also edit dataset arrays using the Variables editor.
Load sample data.
Import the data from the first worksheet in hospitalSmall.xlsx into a dataset array.
ds = dataset('XLSFile',fullfile(matlabroot,'help/toolbox/stats/examples','hospitalSmall.xlsx'));
size(ds)
ans =
    14     6
The dataset array, ds, has 14 observations (rows) and 6 variables (columns).
Add variables by concatenating dataset arrays.
The worksheet Heights in hospitalSmall.xlsx has heights for the patients on the first worksheet. Concatenate the data in this spreadsheet with ds.
ds2 = dataset('XLSFile',fullfile(matlabroot,'help/toolbox/stats/examples','hospitalSmall.xlsx'),'Sheet','Heights');
ds = [ds ds2];
size(ds)
ans =
    14     7
The dataset array now has seven variables. You can horizontally concatenate dataset arrays only when their observations are in the same position or have the same observation names.
ds.Properties.VarNames{end}
ans =
hgt
The name of the last variable in ds is hgt, which dataset read from the first row of the imported spreadsheet.
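The concatenation step can also be tried without the spreadsheet. The following is a minimal sketch using small in-memory dataset arrays; the variables age, wgt, and hgt and their values are hypothetical and are not taken from hospitalSmall.xlsx. Neither array has observation names, so the rows are paired by position.

```matlab
% Hypothetical data for illustration only.
age = [38; 43; 29];
wgt = [176; 163; 131];
hgt = [71; 69; 64];

% dataset takes the variable names from the workspace variable names.
a = dataset(age, wgt);
b = dataset(hgt);

% Both arrays have three observations and no observation names,
% so horizontal concatenation matches the rows by position.
c = [a b];
size(c)
c.Properties.VarNames
```

Because the two arrays have the same number of observations, the result has three observations and three variables, with the variables of b appended after those of a.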
Delete variables by variable name.
First, specify the unique identifiers in the variable id as observation names. Then, delete the variable id from the dataset array.
ds.Properties.ObsNames = ds.id;
ds.id = [];
size(ds)
ans =
    14     6
The dataset array now has six variables. List the variable names.
ds.Properties.VarNames(:)
ans =
    'name'
    'sex'
    'age'
    'wgt'
    'smoke'
    'hgt'
There is no longer a variable called id.
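To see why moving the identifiers into the observation names is useful before deleting the variable, consider this small sketch. The identifiers and ages are made up for illustration. Once the identifiers are observation names, they remain available as row subscripts even after the id variable is removed.

```matlab
% Hypothetical data for illustration only.
id  = {'A-1'; 'B-2'; 'C-3'};
age = [38; 43; 29];
t = dataset(id, age);

% Use the identifiers as observation names, then delete the
% now-redundant variable by name.
t.Properties.ObsNames = t.id;
t.id = [];

% The identifiers still select rows, here the observation named 'B-2'.
t('B-2', 'age')
t.Properties.VarNames
```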
Add a new variable by name.
Add a new variable, bmi, which contains the body mass index (BMI) for each patient, to the dataset array. BMI is a function of height and weight. Display the last name, gender, and BMI for each patient.
ds.bmi = ds.wgt*703./ds.hgt.^2;
ds(:,{'name','sex','bmi'})
ans =
               name          sex    bmi
    YPL-320    'SMITH'       'm'    24.544
    GLI-532    'JOHNSON'     'm'    24.068
    PNI-258    'WILLIAMS'    'f'    23.958
    MIJ-579    'JONES'       'f'    25.127
    XLK-030    'BROWN'       'f'    21.078
    TFP-518    'DAVIS'       'f'    27.729
    LPD-746    'MILLER'      'f'    26.828
    ATA-945    'WILSON'      'm'     24.41
    VNL-702    'MOORE'       'm'    27.822
    LQW-768    'TAYLOR'      'f'    22.655
    QFY-472    'ANDERSON'    'f'    23.409
    UJG-627    'THOMAS'      'f'    25.883
    XUE-826    'JACKSON'     'm'    24.265
    TRW-072    'WHITE'       'm'    29.827
The operators ./ and .^ in the calculation of BMI indicate element-wise division and exponentiation, respectively.
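As a sketch of what the element-wise operators do, the same formula can be applied to plain column vectors and compared against an explicit loop. The weights and heights below are hypothetical, and the units (pounds and inches) are an assumption based on the conventional use of the factor 703; the source does not state them.

```matlab
% Hypothetical weights (assumed pounds) and heights (assumed inches).
wgt = [176; 163; 131];
hgt = [71; 69; 64];

% .^ squares each height, and ./ divides each scaled weight by the
% matching squared height, so bmi has one entry per patient.
bmi = wgt*703./hgt.^2;

% The same calculation written one patient at a time.
bmiLoop = zeros(size(wgt));
for k = 1:numel(wgt)
    bmiLoop(k) = wgt(k)*703/hgt(k)^2;
end
```

The vectorized expression and the loop should agree up to floating-point rounding; the vectorized form is the one used for the dataset variables above.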
Delete variables by variable number.
Delete the variable wgt, the fourth variable in the dataset array.
ds(:,4) = [];
ds.Properties.VarNames(:)
ans =
    'name'
    'sex'
    'age'
    'smoke'
    'hgt'
    'bmi'
The variable wgt is deleted from the dataset array.
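Deleting by number depends on knowing a variable's position, which shifts as variables are added or removed. One way to avoid hard-coding the position, sketched here with hypothetical in-memory data, is to look up the column number from the variable names first.

```matlab
% Hypothetical data for illustration only.
age = [38; 43; 29];
wgt = [176; 163; 131];
hgt = [71; 69; 64];
t = dataset(age, wgt, hgt);

% Find the column number of wgt from the list of variable names.
idx = find(strcmp(t.Properties.VarNames, 'wgt'));

% Delete that column by number.
t(:, idx) = [];
t.Properties.VarNames
```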
See Also
Related Examples
- Add and Delete Observations
- Merge Dataset Arrays
- Calculations on Dataset Arrays
- Dataset Arrays in the Variables Editor
- Index and Search Dataset Arrays