Split table from database into sub-tables dependent from timestamp difference

Hi there,
I'm importing measurement data from a PostgreSQL database into one Matlab table, called data:
%% Make connection to database
conn = database('PostgreSQL','NameOfDatabase','Password');
%Set query to execute on the database
query = ['SELECT * ' ...
'FROM postgres.public.data'];
%% Execute query and fetch results
data = fetch(conn,query);
the data table looks like this:
timestamp ID x y z
'2019-07-22 15:46:24.919' 41034 32.5437012000000 6.59714985000000 0
'2019-07-22 15:46:25.019' 41034 32.5466003000000 6.59952021000000 0
'2019-07-22 15:46:25.119' 41034 32.5657997000000 6.60097980000000 0
'2019-07-22 15:46:25.219' 41034 32.5840988000000 6.59719992000000 0
'2019-07-22 15:46:25.319' 41034 32.5975990000000 6.59470987000000 0
'2019-07-22 15:46:25.419' 41034 32.6068001000000 6.59368992000000 0
'2019-07-22 15:46:25.519' 41034 32.6091003000000 6.59696007000000 0
'2019-07-22 15:46:25.619' 41034 32.6016006000000 6.59864998000000 0
'2019-07-22 15:46:25.719' 41034 32.5957985000000 6.59865999000000 0
Now I want to create a new table for each measurement. I can only distinguish the next measurement from a time difference (of a few seconds) in the timestamp column. This is somehow what I'm trying:
format long
data.timestamp = categorical(data.timestamp);
Difference = caldiff(data.timestamp);
Tablename = ('Mess' + 'Counter');
Counter = 0;
for each Difference > 2 seconds
% create new table (Tablename)
% insertrows from start to difference row
Counter = Counter + 1;
end
Is caldiff the right funktion for the very small time differences?
Thanks

1 comentario

"Now I want to create a new table for each measurement"
Why?
MATLAB tables are designed to make it easy to group data and apply operations on those groups:
In contrast, splitting up data like that usually just makes it harder to work with.
What is your actual goal? How do you wish to process your data?

Iniciar sesión para comentar.

 Respuesta aceptada

Jon
Jon el 24 de Jul. de 2019
Editada: Jon el 24 de Jul. de 2019
First I would recommend making a new column in your table with the time expressed as a MATLAB datetime array, for example you could use the following, (see https://www.mathworks.com/help/matlab/matlab_prog/set-display-format-of-date-and-time-arrays.html#buhb8v6-1 for datetime format conversion)
data.time = datetime(data.timestamp,'InputFormat','yyyy-MM-dd HH:mm:ss.SSS')
Then following up on Stephens comment, you really don't want to split this up into lots of small tables, just add the information you want to this one. So for example you could assign a column called measurement which assigns values 1,2,3 ... according to the time range that the measurements occur.
So for example
% calculate elapsed time in seconds from start of test
% (assumes you have already added the time column as a datetime array
et = seconds(data.time - data.time(1));
% make new column with measurement number, with a new measurement starting every 2 seconds
data.measurement = ceil(et/2)
Now if you want to retrieve some specific measurements, for example the x values for measurement number 3 can just use
x = data.x(data.measurement==3)

14 comentarios

Hi Jon, thanks for your help.
Stephen and you are right, I just need to add the measurement number and then work with groups.
So I added the time column:
% new time column
data.time = datetime(data.timestamp,'Inputformat','yyyy-MM-dd HH:mm:ss.SSSS');
% to display fractional seconds
data.time.Format = 'MMM dd, yyyy HH:mm:ss.SSS'
The elapsed time always refers to the first entry. What I try is to check the elapsed time between each row. So if I start measuring the next Minute or Hour or Day, the time brake is bigger than 2 seconds (or even bigger than a Minute or Hour or Day). So I better work with a diff funktion:
% elapsed time
et = seconds(diff(data.time));
For this part:
% new column with measurement number, with a new measurement starting AFTER a break of 2 seconds in time
data.measurement = ceil(et > 2);
I get an Error message:
To assign to or create a variable in a table, the number of rows must match the height of the table.
The elapsed time double has one row less than the table, maybe because of the diff function. Do I need to add here one starting row in the beginning?
Jon
Jon el 24 de Jul. de 2019
Editada: Jon el 24 de Jul. de 2019
OK, for what you want to do you need to count the jumps that are greater than your threshold (e.g. 2 sec).
So you can do something like
% determine where jumps (new measurements start)
% note we assume a new measurement at the start of the data set
isJump = [true; diff(et)>2]
% assign measurement numbers by counting jumps
data.measurement = cumsum(isJump)
Note that I use the "trick" here that the elements of logical arrays, like isJump above, are zeros and ones and MATLAB lets you add them up as numbers
I get the same error:
To assign to or create a variable in a table, the number of rows must match the height of the table.
My 'et' and 'isJump' is one row short?
The length of the diff of a vector is one less than the length of the original vector. So if you just use diff(et) you will have a problem.
Are you sure you padded your isJump with an initial value of true as I did?
isJump = [true; diff(et)>2]
Yes, I used exactly your code. Maybe I better not use diff?
Can you please attach your code, maybe I can see what is going on.
clear all;
%% Make connection to database
conn = database('PostgreSQL','name','password');
%Set query to execute on the database
query = ['SELECT * ' ...
'FROM postgres.public.data'];
%% Execute query and fetch results
data = fetch(conn,query);
%% Try
% New time
data.time = datetime(data.timestamp,'Inputformat','yyyy-MM-dd HH:mm:ss.SSSS');
data.time.Format = 'MMM dd, yyyy HH:mm:ss.SSS';
% calculate elapsed time in seconds from start of test
et = seconds(diff(data.time));
% make new column with measurement number, with a new measurement starting every 2 seconds
isJump = [true; diff(et)>2];
data.measurement = cumsum(isJump);
%% Close connection to database
close(conn)
%% Clear variables
clear conn query
Sorry, there was a little confusion on the definition of the variable et as the discussion evolved. Originally I had it as the duration from the very start of the test (first row) It then changed to the duration between rows.
As a result of this confusion you calculated diff twice, so just
change the line
isJump = [true; diff(et)>2];
to
isJump = [true; et>2];
and it should work
Thanks a lot Jon, it works!
Excellent! Glad it worked out.
You may just not have gotten to it, but if this worked for you it would be good to accept the answer so other people who may have the same problem will know to look here.
Hi, I have a ongoing question:
I grouped the data in the column data.measurement by its value (id = 1,2,3,4,5). Now I try to plot only the x,y data (from column 3,4) with id == 5 ?
[group, id] = findgroups(data.measurement);
x= data{:,3};
y= data{:,4};
plot(x,y) with id==5
I don't think it is necessary to use the findgroups function for this purpose. Assuming you now have your data in a table called data, with columns named time, x, y, and measurement. You can simply use
plot(data.x(data.measurement==5),data.y(data.measurement==5))
or perhaps to make the code a little more readable and efficient (avoid finding the matching columns twice) you could use
% make logical vector whose elements are set to true on rows with the desired
% measurement number
idl = data.measurement==5;
% plot the x and y values corresponding to the desired measurement using logical indexing
plot(data.x(idl),data.y(idl))

Iniciar sesión para comentar.

Más respuestas (0)

Categorías

Más información sobre Data Type Identification en Centro de ayuda y File Exchange.

Productos

Versión

R2019a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by