How to preserve nanosecond precision in datetime calculations (for large numbers)
Matthew Casiano
on 11 Nov 2021
Edited: James Tursa
about 7 hours ago
Hi, I am working with datetime variables with nanosecond precision. I am trying to maintain this precision throughout my calculations, but I lose it because large values need more digits than double precision provides. For example, 1294871257.002060945 seconds has 3 more digits than a double can hold, and I need to preserve them in the calculations. I realized that I cannot use the vpa function on a datetime variable. Are there any suggestions on how to handle this? The example script below shows the loss of precision by outputting the decimal portions of the original and final numbers.
Thanks,
dt=1294871257.002060945; % nanosecond precision for a large number that exceeds double precision.
epoch = datetime(1980,1,6,'TimeZone','UTCLeapSeconds'); % define the epoch reference time in UTC (and has 0 seconds)
DtUTC = epoch + seconds(dt); % add the duration to the reference time
Time_vec=datevec(DtUTC); % converts final datetime value to a 6-element vector (Y,M,D,H,M,S)
sprintf('%.9f',dt-floor(dt)) % decimal part of original number of seconds
sprintf('%.9f',Time_vec(6)-floor(Time_vec(6))) % decimal part of calculated number of seconds
0 comments
Accepted Answer
Steven Lord
on 11 Nov 2021
Start off with the time as a symbolic object.
dt=sym('1294871257.002060945');
If you don't and just start with dt as a double you've already lost.
eps(double(dt))
fprintf('%0.16f', double(dt))
Now split dt into the integer part and the fractional part symbolically.
frac = dt - floor(dt);
whole = dt - frac;
fprintf('%0.16f', double(frac))
Finally add whole and frac separately to your epoch.
epoch = datetime(1980,1,6,'TimeZone','UTCLeapSeconds');
DtUTC = epoch + seconds(double(whole)) + seconds(double(frac));
DtUTC.Format = 'uuuu-MM-dd''T''HH:mm:ss.SSSSSSSSS''Z'''
Time_vec=datevec(DtUTC);
sprintf('%.9f',dt-floor(dt))
sprintf('%.9f',Time_vec(6)-floor(Time_vec(6)))
13 comments
Peter Perkins
on 30 Nov 2021
Edited: James Tursa
on 22 Nov 2024 at 23:49
Just for the record, this error
The date format for UTCLeapSeconds datetimes must be 'uuuu-MM-dd'T'HH:mm:ss.SSS'Z''
was one of the warts that dpb refers to that was removed in R2021b. Formats for UTCLeapSeconds must still be that ISO form, but may now have 0 to 9 fractional seconds digits.
More Answers (2)
dpb
on 11 Nov 2021
>> t='1294871257.002060945'; % treat long value as string
>> dsec=seconds(str2double(extractBefore(t,'.'))) + ...
seconds(str2double(extractAfter(t,strfind(t,'.')-1))); % combine integer/fractional parts
>> dsec.Format='dd:hh:mm:ss.SSSSSSSSS' % format to show nsec resolution
dsec =
duration
14986:22:27:37.002061035
>>
This is the most straightforward workaround I can think of within the limitations of the (somewhat hampered) duration class, which has limited options for formatted input, and of helper functions such as seconds, which are only aware of the base numeric classes.
The datetime class is still pretty young; it has much left to be worked out in order to make it fully functional for niche usage such as yours.
You'll probably have to build a wrapper class or set of functions to hide all the internal machinations required. The above just illustrates that if you can break the number of seconds into pieces that are each within the precision of a double, the duration object itself can deal with them to that precision, at least when storing an input number. I've not tested how rounding behaves when doing arithmetic with the result.
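A minimal sketch of what the body of such a wrapper might look like, using only base MATLAB. It assumes the input is a char vector containing exactly one decimal point, and the variable names (`wholeSec`, `fracSec`, `fracBack`) are hypothetical. Unlike the snippet above, it adds each piece to the datetime in turn instead of summing them into a single duration first, which is the approach taken in the accepted answer.

```matlab
% Seconds since the epoch, kept as text so no digits are lost on input.
t = '1294871257.002060945';

% Split at the decimal point (assumes exactly one '.' in the string).
parts    = strsplit(t, '.');
wholeSec = str2double(parts{1});           % integer seconds, exact in a double
fracSec  = str2double(['0.' parts{2}]);    % fractional seconds, small magnitude

% Add the two pieces to the epoch one after the other.
epoch = datetime(1980,1,6,'TimeZone','UTCLeapSeconds');
DtUTC = epoch + seconds(wholeSec);
DtUTC = DtUTC + seconds(fracSec);

% Read the fractional part of the seconds field back out of the datetime.
fracBack = DtUTC.Second - floor(DtUTC.Second);
fprintf('%.9f\n', fracBack)
```

Wrapping these lines in a function that takes the epoch and the string as inputs would hide the splitting from the caller.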
5 comments
Steven Lord
on 12 Nov 2021
This:
dt=sym(1294871257.002060945);
evaluates 1294871257.002060945 in double precision and converts the resulting double into symbolic. As I stated in my answer the first of those steps has already caused problems for your approach.
format longg
dt = 1294871257.002060945
% Spacing between dt and the next largest *representable* double
eps(dt)
% This is the closest representable double to the number you entered
fprintf('%0.16f', dt)
This:
dt=sym('1294871257.002060945');
vpa(dt, 20)
doesn't go through double at all.
In some cases sym can "recognize" the floating-point result of an operation that should be of a particular form and compensate for roundoff error. See the description of the flag input argument to the sym function, specifically the row dealing with the (default) 'r' flag, for a list of recognized expressions. So as an example even though 1/3 is not exactly one third, it's close enough for sym. But your number doesn't fall into one of those recognized expression categories.
x = sym(1/3)
James Tursa
on 22 Nov 2024 at 23:42
Edited: James Tursa
on 22 Nov 2024 at 23:56
This question already has an accepted answer, but I would like to add my observations on this topic.
datetime:
- Internally holds the value as a complex double. The real part is the number of milliseconds since Modern UTC Epoch = 1970-Jan-01 UTC. The imaginary part is a "correction" to add to maintain precision.
duration:
- Internally holds the value as a real double representing milliseconds.
You can already see the disconnect. The datetime class is designed to hold values to a higher precision than the duration class. Creating datetime variables with the higher precision is fine, but as soon as you subtract them a duration results and you lose that precision. This is an unfortunate situation and could have been avoided if the duration class held the value the same way that datetime variables do. I have made the suggestion to TMW to change this and make the duration class consistent with the datetime class, but I don't know if they will ever do it.
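To make the "value plus correction" idea concrete, here is a small sketch with plain doubles. It illustrates the general error-free addition technique (often called Fast2Sum), not MATLAB's actual internal datetime code, which is not shown in this thread. The variable names `hi` and `lo` are hypothetical, and the millisecond count is the one from the original question.

```matlab
% Milliseconds from the original question, split into two exact pieces.
wholeMs = 1294871257002;   % integer milliseconds, exactly representable
fracMs  = 0.060945;        % remaining fraction of a millisecond

% A single double cannot hold the sum: it rounds to the nearest
% representable value (spacing 2^-12 ms at this magnitude).
hi = wholeMs + fracMs;

% Fast2Sum: recover the rounding error of that addition.
% Valid here because abs(wholeMs) >= abs(fracMs).
lo = fracMs - (hi - wholeMs);

fprintf('spacing of doubles near hi : %.3e ms\n', eps(hi));
fprintf('fraction kept by hi alone  : %.9f ms\n', hi - wholeMs);
fprintf('fraction from hi and lo    : %.9f ms\n', (hi - wholeMs) + lo);
```

A type that stores only `hi` behaves like the single-double duration described above, while a type that stores both `hi` and `lo` can return the original fraction.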
Example of the problem:
format longg
dt = datetime(2000,1,1,1,1,1.2345678912345)
dt.Second
You can see that the full precision of the original seconds is there. You can even look at the internals:
sdt = struct(dt)
What is the data value? Well, here is a demonstration:
mutc = dt - datetime(1970,1,1)
milliseconds(mutc)
You can see that the data value in the dt variable is in fact milliseconds since Modern UTC epoch. To see the purpose of the imaginary part, note that the duration calculated from a datetime difference does not retain the seconds accuracy:
[h,m,s] = hms(mutc)
The trailing digits of the duration seconds do not match the original. Not good.
But the datetime variable, with the correction, is able to get the original seconds accurately. E.g.,
(mod(real(sdt.data),60000) + imag(sdt.data)) / 1000
When calculating the second value, internally the equivalent of the above is done to maintain precision. This is all good as far as it goes, but as soon as you subtract two datetime variables you lose this! Sigh ...
My advice when you need to retain precision is to AVOID THE DURATION CLASS and to AVOID SUBTRACTING DATETIME VARIABLES. You may have to write your own code to subtract datetime variables piecemeal and maintain the difference in your own format. If you have a string of "extended precision" you need to add to a datetime, separate the string into pieces that can individually be held accurately in doubles, and add them to the datetime sequentially to maintain precision. E.g., see this related thread:
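A minimal sketch of the piecemeal subtraction idea, assuming it is acceptable to keep the difference as two plain doubles (whole seconds and fractional seconds) rather than as a duration. The variable names are hypothetical and the two example datetimes are made up for illustration.

```matlab
% Two example datetimes with fractional seconds (illustrative values only).
t1 = datetime(2000,1,1,0,0,0.25);
t2 = datetime(2000,1,2,0,0,0.75);

% Whole-second parts: truncate each datetime to the start of its second.
% Their difference is an integer number of seconds, so the duration is exact.
w1 = dateshift(t1, 'start', 'second');
w2 = dateshift(t2, 'start', 'second');
wholeSec = seconds(w2 - w1);

% Fractional parts: read them from the Second property, which uses the
% datetime's full internal precision, and subtract them as small doubles.
f1 = t1.Second - floor(t1.Second);
f2 = t2.Second - floor(t2.Second);
fracSec = f2 - f1;

% Borrow one second if the fractional difference went negative.
if fracSec < 0
    fracSec  = fracSec + 1;
    wholeSec = wholeSec - 1;
end

fprintf('%d s + %.9f s\n', wholeSec, fracSec);
```

Keeping the two parts separate avoids ever forming one large value whose trailing digits would be rounded away.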
5 comments
Peter Perkins
about 10 hours ago
Memory and performance.
If your datetimes are at ms precision, unless you need ms precision over more than 284,000 years, durations are exact. How many people need more than that? I'm guessing no one.
If your datetimes are at higher precision, durations provide µs resolution over 284 years and ns resolution over 104 days. There's some round-off involved, and that might affect some calculations that use exact comparisons, but exact numeric comparisons are not a good idea to begin with. How many people need that resolution over those spans? I'm guessing very, very few.
So there was a choice between doubling the footprint and (not exactly) halving the performance for everyone, and losing resolution for probably almost no one.
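A rough way to reproduce spans of this size, assuming they follow from the 2^53 limit on consecutive integers that a double can hold exactly, and assuming a mean Gregorian year of 365.2425 days (both are assumptions of this sketch, not statements from the comment above):

```matlab
% Largest count of consecutive integers a double represents exactly (2^53).
maxExactCount = flintmax;

secPerDay  = 86400;
secPerYear = 365.2425 * secPerDay;   % mean Gregorian year (assumption)

% Span covered when counting in units of 1 ms, 1 us, and 1 ns.
msSpanYears = maxExactCount * 1e-3 / secPerYear;
usSpanYears = maxExactCount * 1e-6 / secPerYear;
nsSpanDays  = maxExactCount * 1e-9 / secPerDay;

fprintf('ms resolution: about %.0f years\n', msSpanYears);
fprintf('us resolution: about %.0f years\n', usSpanYears);
fprintf('ns resolution: about %.0f days\n',  nsSpanDays);
```

The results land close to the quoted figures; the exact year counts depend on the year length assumed.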
James Tursa
about 8 hours ago
Edited: James Tursa
about 7 hours ago
All your points are well taken, but this forum has already seen at least two posts where people needed more precision. What's the big deal with showing them why durations are not appropriate in these cases and also showing them how to do the calculations to maintain precision? I really don't understand the pushback here.
If there were only two people on the planet that cared about this issue, me and the original poster, I would be OK with that because I would have helped that person solve their particular problem.