How to subset data based on time range

3 visualizaciones (últimos 30 días)
012786534
012786534 el 5 de Mzo. de 2020
Comentada: Star Strider el 5 de Mzo. de 2020
Hi,
I am wondering how to subset data based on time range ? For example I would like to subset the maximum number of rows that fall within an 18 hours time range in the table below:
data = {"2017-01-12 09:50:46" "2017-01-15 13:50:46" "2017-01-15 14:50:50" "2017-01-15 17:52:06" "2017-01-15 18:52:22" "2017-01-16 11:52:22" "2017-01-19 09:52:22"}';
t = cell2table(data);
t.data = datetime(t.data, 'InputFormat', 'yyyy-MM-dd HH:mm:ss')
The thing is, I don’t know what is the start point or the end point. All I know is that the correct answer is the most inclusive one (i.e the one with the largest number of rows). In the example here, the correct answer is t.data(2:5) because all the points fall within 18 hours of each other and because it is the answer with the largest number of rows (4). The data is set up in a way where there can only be one correct answer (only one combination has the largest number of rows). I hope I am being clear.
How would I do that ?
Thank you,

Respuesta aceptada

Star Strider
Star Strider el 5 de Mzo. de 2020
Editada: Star Strider el 5 de Mzo. de 2020
I am not certain what result you want.
This computes all values of ‘t.data’ that are within 18 hours of a particular row value:
data = {"2017-01-12 09:50:46" "2017-01-15 13:50:46" "2017-01-15 14:50:50" "2017-01-15 17:52:06" "2017-01-15 18:52:22" "2017-01-19 09:52:22"}';
t = cell2table(data);
t.data = datetime(t.data, 'InputFormat', 'yyyy-MM-dd HH:mm:ss');
h18 = t.data + hours(18);
for k = 1:size(t.data,1)
within18{k,:} = t.data((t.data >= t.data(k)) & (t.data <= h18(k)));
end
For example:
Within18_3 = [within18{3}]
contains:
Within18_3 =
3×1 datetime array
15-Jan-2017 14:50:50
15-Jan-2017 17:52:06
15-Jan-2017 18:52:22
The first row in ‘Within18_3’ are the third row value and the next two rows are those within 18 hours of it.
EDIT — (5 Mar 2020 at 17:36)
If you want ±9 hours instead, this works:
h18 = [t.data + hours(-9), t.data + hours(9)];
for k = 1:size(t.data,1)
within18{k,:} = t.data((t.data >= h18(k,1)) & (t.data <= h18(k,2)));
end
For example:
Within18_3 = [within18{3}]
now contains:
Within18_3 =
4×1 datetime array
15-Jan-2017 13:50:46
15-Jan-2017 14:50:50
15-Jan-2017 17:52:06
15-Jan-2017 18:52:22
  2 comentarios
012786534
012786534 el 5 de Mzo. de 2020
Hi Star Strider,
Indeed, I should have been clearer. My apologies. I have clarified my question above. Thank you.
Star Strider
Star Strider el 5 de Mzo. de 2020
To get the set with the greatest number of rows:
[~,idx] = max(cellfun(@(x)size(x,1), within18))
Out = [within18{idx}]
That works however you want to define it, however it will return only the first set if there are several with the same maximum number of rows.
To return all that have the maximum number of rows:
rowsizes = cellfun(@(x)size(x,1), within18);
maxrows = max(rowsizes)
idx = find(rowsizes == maxrows)
Out = [within18{idx}]

Iniciar sesión para comentar.

Más respuestas (1)

Guillaume
Guillaume el 5 de Mzo. de 2020
It's not too clear what you mean by subsetting.If you want to bin the dates in ranges of 18 hours then use discretize:
discretize(t.data, 'hours(18)') %split into 18 hours bins. Returns the bin indices

Categorías

Más información sobre Line Plots en Help Center y File Exchange.

Etiquetas

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by