readtable with datetime, format problem
Mostrar comentarios más antiguos
I'm reading in an excel file that contains the date/time stamp of a recording in one column. The format of the date/time in the original file is day/month/year hour:min:sec PM. The date is sometimes correctly read, but in instances when the day/month is arbitrary (day is <=12), the month and day is swapped
Example:
Recording was made on December 6, 2022 (saved in excel sheet as 6/12/2022 7:56:39 PM)
I run these lines of code:
%%
opts = detectImportOptions(filepath,'Sheet','NRT Peaks');
opts = setvaropts(opts,"RecordingDate_Time",'InputFormat','d/MM/uuuu h:mm:ss aa','DatetimeLocale','de_BE','DatetimeFormat','d/MM/uuuu h:mm:ss aa');
nrtPeaks = readtable(filepath,opts,'Sheet','NRT Peaks');
%%
nrtPeaks.RecordingDate_Time(1) = 12-Jun-2022 19:56:39
How do I fix this?
The code runs correctly in some instances but not others. I played around with readcell and had the same issue.
14 comentarios
Rachel Scheperle
el 22 de Feb. de 2024
"I'm reading in an excel file that contains the date/time stamp of a recording in one column. The format of the date/time in the original file is day/month/year hour:min:sec PM"
Note that Excel (using the contraversial Office Open XML standard, i.e. .XLSX) does not store dates with a format. All dates are stored as serial date numbers (by default the 1st January 1900 is serial number 1). And MATLAB has no problem importing Excel's serial date numbers.
So it really depends what you have in your "Excel Sheet":
- is the cell content really formatted as a date?
- is it really an Excel file or is it really e.g. a .CSV file (which exist quite independently of Excel)?
- ... something else?
We don't know, because you did not upload a sample data file by clicking the paperclip button.
Rachel Scheperle
el 22 de Feb. de 2024
@Rachel Scheperle: thanks for uploading the file. I took a look, and those dates are correctly stored as serial date numbers. For example, here is the first date in cell C2 (line 66):

That is the number of days since 1 Jan 1900, which is very easy to calculate: it is the 12th of June 2022. You can easily check that yourself using Excel, MATLAB, or some online tools.
"...but unfortunately the correct date is supposed to be December 6, 2022."
Then the person or application that created that file really messed up big time. That would mean that all of the dates are incorrectly encoded (most likely because they did not know how to specify MS Excel dates, nor did they check their work). Basically they encoded dates using units which just happen to look like the units that they wanted to have, but were actually not those units. Learn from their mistake! (they obviously didn't)
To fix their mistake you would have to un-swap those unit values somehow.
"If you mean that within a single file some dates are imported correctly"
Actually MATLAB imports the dates 100% correctly.
The problem is (apparently) that the person/application that created that file encoded the dates incorrectly.
Note that the custom format code within Excel is "DD/MM/YYYY HH:MM", which exactly corresponds to those dates being interpreted as 12th of June 2022. So either the dates have been wrongly defined... or the OP is wrong about what they represent. Either is quite possible.
"if you mean that for some files the dates are imported correctly and for other files the dates are not imported correctly"
There is absolutely no way to distinguish from the serial date numbers themselves which ones might have been created incorrectly, and which ones correctly. They are all just numbers.
In such a situation we would have to rely on meta-information.
Voss
el 22 de Feb. de 2024
Yes, I agree. As I said when I said, "Are you sure that the error is not with the software that exports this data?"
Rachel Scheperle
el 22 de Feb. de 2024
"Any thoughts on this?"
If it works, do it.
Note that by importing as text and then using datetime you are essentially un-swapping the units... thus fixing the problem.
"I uploaded another file as an example of one that was importing "correctly" with my original code"
The "dates" in that file are not actually Excel serial date numbers at all, they are simply text. But interestingly they are stored as shared strings within the workbook (rather than individually for each cell):


It is pretty clear that whatever/whoever created those files... does not understand Excel at all.
It also makes it clear, that your approach of importing all of the dates as text is a good workaround/approach to take.
Cris LaPierre
el 22 de Feb. de 2024
Can you change the format of one of your dates to General and confirm that the Excel datenum starts with 44901?
Cris LaPierre
el 22 de Feb. de 2024
Editada: Cris LaPierre
el 22 de Feb. de 2024
About the 6m file, I wanted to confirm that the dates got stored as text, likely because the first number can't be a month. If I change the day to be something less than 13, Excel immediately formats it as a date using month/day/year.
As for reading in as text, I see that when the date is an Excel datenum, the month gets formatted to MMM automatically. This is probably to avoid confusion.
opts = detectImportOptions("P50-DE-test-2w...-IPG7-e3.xlsx",'Sheet','NRT Peaks');
opts = setvartype(opts,"RecordingDate_Time",'char');
T1 = readtable("P50-DE-test-2w...-IPG7-e3.xlsx",opts)
T2 = readtable("P50-DE-test-2w...-IPG7-e3.xlsx",'Sheet','NRT Peaks','DatetimeType','text','TextType','string')
T3 = readtable("P50-DE-test-6m...-IPG7-e3.xlsx",'Sheet','NRT Peaks','DatetimeType','text','TextType','string')
Rachel Scheperle
el 22 de Feb. de 2024
"Can you change the format of one of your dates to General and confirm that the Excel datenum starts with 44901?"
Why would it? I already showed the value of what is stored inside the XLSX file. And I explained why.
Here it is again, formatting that date column as number (to show the serial date number):

So I have shown with two different methods what serial date number is stored in the XLSX file and they exactly agree with each other and what the OP showed above. Why do you expect some other value totally unrelated to that which the file actually contains?
Rachel Scheperle
el 22 de Feb. de 2024
Cris LaPierre
el 22 de Feb. de 2024
@Stephen23, I was trying to convince myself it wasn't due to different regional settings (see here). However, I've since convinced myself that, once it gets converted to a datenum, it doesn't change, even if opened on a computer with different regional settings. Just had to think through it.
Respuesta aceptada
Más respuestas (1)
Cris LaPierre
el 22 de Feb. de 2024
Editada: Cris LaPierre
el 22 de Feb. de 2024
The 'InputFormat' argument is only useful for date strings. Excel applies a display format to dates and times, so what you see is not actually how the data gets imported into MATLAB. In Excel, change the format of your date cells to General to view how the data is actually stored. That date and time should appear as 44901,8310069444
You might want to try setting the Datetime Type to exceldatenum.
opts = detectImportOptions(filepath,'Sheet','NRT Peaks');
opts = setvaropts(opts,"RecordingDate_Time","DatetimeType","exceldatenum",'DatetimeLocale','de_BE','DatetimeFormat','d/MM/uuuu h:mm:ss aa');
nrtPeaks = readtable(filepath,opts,'Sheet','NRT Peaks');
Note that the DatetimeLocale value determines how input text values are interpreted. Since the date is likely imported as a datenum, this setting is not doing anything and can probably be removed without affecting your output.
5 comentarios
Rachel Scheperle
el 22 de Feb. de 2024
@Rachel Scheperle: Here's a modification to Cris's answer that avoids the error and does what I think was intended. However, the dates are still June 12th instead of December 6th.
filepath = 'P50-DE-test-2w-AGF-IPG7-e3.xlsx';
opts = detectImportOptions(filepath,'Sheet','NRT Peaks','DatetimeType','exceldatenum');
opts = setvaropts(opts,"RecordingDate_Time",'Type','datetime','DatetimeLocale','de_BE','DatetimeFormat','d/MM/uuuu h:mm:ss aa');
T = readtable(filepath,opts,'Sheet','NRT Peaks')
month(T.RecordingDate_Time)
Are you sure that the error is not with the software that exports this data? You can see that the RecordingDate_Time data stored in the file really does correspond to June 12th, 2022, if it is 'excel' time type (but maybe it is another type):
opts = detectImportOptions(filepath,'Sheet','NRT Peaks','DatetimeType','exceldatenum');
T = readtable(filepath,opts,'Sheet','NRT Peaks')
datetime(T.RecordingDate_Time,'ConvertFrom','excel')
Cris LaPierre
el 22 de Feb. de 2024
Editada: Cris LaPierre
el 22 de Feb. de 2024
First, are you sure the format is set correctly? It may be due to the settings of my Excel, but when I open your spreadsheet and check the cell format, it is set to m/d/yyyy:

About quotes, when in doublt, use single quotes. They always work.
Sorry about the error. Without your file, I didn't have a quick way to check. While the 'DatetimeType' is a Name,Value pair for readtable, it does not seem to be available when using detectImportOptions. However, it does not appear to do what I was hoping anyway.
Cris LaPierre
el 22 de Feb. de 2024
Editada: Cris LaPierre
el 22 de Feb. de 2024
If I modify the regional settings of the dates on that sheet so that dates are day/month, then it imports correctly. For me, that also meant re-entering the dates, since the Excel datenum was already recorded as Jun 12, 2022.
I modified the display format so that it is obvious the month and day are read in correctly.
opts = detectImportOptions("P50-DE-test-2w-AGF-IPG7-e3_edited.xlsx",'Sheet','NRT Peaks');
opts = setvaropts(opts,"RecordingDate_Time","DatetimeFormat","d/MMM/uuuu h:mm:ss aa");
T = readtable("P50-DE-test-2w-AGF-IPG7-e3_edited.xlsx",opts)
Rachel Scheperle
el 22 de Feb. de 2024
Categorías
Más información sobre Spreadsheets en Centro de ayuda y File Exchange.
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!