Bootstrap with observations not being rows

Fernando on 12 Aug 2014
Commented: Fernando on 14 Aug 2014
Hi,
I'm running an optimization routine and now need to use the bootstrap to compute the standard errors. However, in my data a row is not an observation: each observation spans several rows, and observations do not all have the same number of rows. For example,
ind_obs=[1 1 1 2 2 2 3 3]';
data=[1 2; 3 4; 4 0; 5 6; 2 3; 9 6; 3 4; 7 9];
where ind_obs identifies an observation. In practice, each of the rows with the same observation identifier corresponds to an alternative in a choice set. In this example the number of observations is three.
What I need to do is generate bootstrap samples of three observations drawn with replacement, regardless of the length of the resulting data matrix. That is, the X matrix is 8x2 in the original dataset, while in a bootstrap sample where the third observation happened to be drawn three times, the resulting X matrix would be 6x2.
The objective is to generate those samples, run the optimization routine on each one, and save the parameters that minimize the objective function for each sample. Using the bootstrp function as follows
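To make the resampling scheme described above concrete, here is a minimal sketch that builds one such sample by hand, without bootstrp. The variable names ids, draw, blocks and sample are illustrative and not part of the original post.

```matlab
ind_obs = [1 1 1 2 2 2 3 3]';
data    = [1 2; 3 4; 4 0; 5 6; 2 3; 9 6; 3 4; 7 9];

ids   = unique(ind_obs);   % observation identifiers: [1; 2; 3]
n_obs = numel(ids);        % number of real observations (3)

% Draw n_obs observation identifiers with replacement.
draw = ids(randi(n_obs, n_obs, 1));

% Collect the rows of each drawn observation (duplicates are repeated)
% and stack them into one matrix.
blocks = arrayfun(@(k) data(ind_obs == k, :), draw, 'UniformOutput', false);
sample = vertcat(blocks{:});   % between 6x2 and 9x2 for this data
```

The number of rows in sample changes from draw to draw, which is exactly what resampling row by row cannot reproduce.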
opt_parameters=bootstrp(10,@opt_function,data);
doesn't work, because it resamples individual rows (so every sample is 8x2) rather than whole observations. In the example, opt_function is the function that contains the optimization routine.
Thanks,

Accepted Answer

Christopher Berry on 13 Aug 2014
Fernando,
The function bootstrp accepts a column cell array as well as an ordinary vector or matrix of doubles. Assuming your 3 observations are:
obs1 = [1 2; 3 4; 4 0];
obs2 = [5 6; 2 3; 9 6];
obs3 = [3 4; 7 9];
You can create a cell array to pass to bootstrp like this:
data = {obs1; obs2; obs3};
Now your data has size 3x1, so bootstrp will pass the correct number of samples drawn with replacement (3 rather than 8) into opt_function.
opt_parameters=bootstrp(10,@opt_function,data);
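With more than a handful of observations, typing out obs1, obs2, ... by hand is impractical. As a sketch, the same cell array could be built from the identifier vector in the question; here X is an assumed name for the original 8x2 matrix, and ids is illustrative.

```matlab
ind_obs = [1 1 1 2 2 2 3 3]';
X       = [1 2; 3 4; 4 0; 5 6; 2 3; 9 6; 3 4; 7 9];

% One cell per observation, each holding that observation's rows.
ids  = unique(ind_obs);
data = arrayfun(@(k) X(ind_obs == k, :), ids, 'UniformOutput', false);

% data is now a 3x1 cell array; data{3} is [3 4; 7 9].
```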
The only thing you will need to do is make sure opt_function handles its input as a cell array (i.e. use curly brackets {} to index into the cell array and get the matrices). If you are writing your own opt_function, this is straightforward. However, if you are using a built-in function that expects the data in matrix or vector form, you will need to create a wrapper function to perform the necessary data conversion.
Lastly, make sure that opt_function returns the same number of data elements in each call, since bootstrp expects this when forming its own output.
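A minimal sketch of such a wrapper is shown below. The body of opt_function here is only a hypothetical stand-in (column means) for the real optimization routine; the point is the conversion from a cell array back to a stacked matrix, and the fixed-size output.

```matlab
% Hypothetical stand-in for the real routine: takes a stacked matrix and
% always returns a 1x2 row vector of "parameters".
opt_function = @(X) mean(X, 1);

% Wrapper: receives the resampled cell array and stacks the per-observation
% matrices vertically before calling the matrix-based routine.
wrapper = @(c) opt_function(vertcat(c{:}));

% Example call on a hand-made "sample" of two observations.
demo = wrapper({[1 2; 3 4]; [5 6]});   % column means of [1 2; 3 4; 5 6]
```

It would then be passed as bootstrp(10, wrapper, data). If passing a cell array gives trouble in your release, one alternative is to resample observation indices instead, e.g. bootstrp(10, @(idx) opt_function(cell2mat(data(idx))), (1:numel(data))'), which only relies on bootstrp resampling the rows of an ordinary column vector.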
