Removing Rows from Array Based on Date/Time Value in Cell
Liza Miller
on 29 Jul 2020
Commented: Liza Miller
on 30 Jul 2020
I am trying to remove all rows of data for which the "last order date" (column 17 of my CSV file) is after 8/31/2019. My code was weeding out some, but a few later dates were left behind. I switched all the date values to serial date numbers in hopes that would simplify the process, but it's still not weeding out anything after 7183 (the serial date number for 8/31/19). I am out of ideas and would greatly appreciate any feedback.
Code:
segment = readtable('myFile.csv');
a = 1;
rows = height(segment);
newDate = [];
for f = 1:rows
    addDate = datenum(segment{f, 17});
    newDate(f) = addDate;
end
cutoffDateOG = datetime('08/31/0019','InputFormat','MM/dd/uuuu', 'Format', 'preserveinput');
cutoffDate = datenum(cutoffDateOG);
while a < rows
    if newDate(a)>cutoffDate || isnat(segment{a,17}) == 1
        segment(a,:) = [];
        rows = rows-1;
    end
    a = a+1;
end
end
Accepted Answer
Cris LaPierre
on 30 Jul 2020
Ok, I'll take you through the steps I'd follow. I'll be sharing videos from our Exploratory Data Analysis specialization on Coursera. If you have time, you might find it a great way to learn the latest features in MATLAB for data analysis. You can access all but the graded content for free.
You'll have to get the data into MATLAB first. I suggest using the Import Tool. This video may be helpful for that. If you prefer, you can generate the code used to import the data from the Import Tool. This video on generating and reusing code created by the import tool shows you how.
Make sure to set the data type of the variables you want to use (each column of a table is a variable). Most important is to set the Last Order Date to datetime. You can learn more about working with dates and times here.
Then you'll want to access specific variables from the table for your comparison. This video will explain how to do this. And since you are actually wanting to access a subset of the data in your table, perhaps this video will also be helpful.
% Import data, setting date formats
opts = detectImportOptions("MatLab Test Data.csv");
dateVars = {'x_LastOrderDate','x_FirstShowAttendedDate','x_LastShowAttendedDate', ...
    'x_NextShowAttendingDate','x_LastCompShowDate'};
% Make sure the date columns are read as datetime before setting their input format
opts = setvartype(opts,[dateVars,{'DateCreated'}],'datetime');
opts = setvaropts(opts,dateVars,'InputFormat','MM/dd/yy');
opts = setvaropts(opts,'DateCreated','InputFormat','MM/dd/yy HH:mm');
segment = readtable("MatLab Test Data.csv",opts);
% This is the part that removes the rows of data with Last Order Date after the cutoff date
% Specify the cutoff date (8/31/2019, as stated in the question)
cutoffDate = datetime(2019,08,31);
% Remove rows where Last Order Date is after the cutoff date
segment(segment.x_LastOrderDate>cutoffDate,:)=[];
% Remove rows where Last Order Date is NaT
segment = rmmissing(segment,"DataVariables","x_LastOrderDate");
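As a possible variation on the two removal steps above, here is a minimal sketch that keeps rows with a single logical mask instead of deleting them. It relies on the fact that relational comparisons involving NaT evaluate to false, so NaT rows drop out without a separate rmmissing call. The table demo and its variables OrderID and LastOrderDate are made up for illustration and are not taken from the actual CSV file.

```matlab
% Hypothetical stand-in for the imported table (names are illustrative only)
LastOrderDate = [datetime(2019,8,30); datetime(2019,8,31); datetime(2019,9,1); NaT];
OrderID = (1:4)';
demo = table(OrderID, LastOrderDate);

% Cutoff date from the question
cutoffDate = datetime(2019,8,31);

% Rows to keep: on or before the cutoff.
% Any comparison with NaT is false, so the NaT row is excluded as well.
keep = demo.LastOrderDate <= cutoffDate;
demo = demo(keep, :);
```

With this sample data, only the rows dated 8/30/2019 and 8/31/2019 remain. Whether to keep or delete is mostly a matter of taste; the keep mask simply handles missing dates in the same step.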
More Answers (1)
Cris LaPierre
on 29 Jul 2020
Don't use datenums. I would suggest converting your dates to datetimes (segment{f,17}). Then you can just use regular relational operators (>, <, ==, etc.).
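One likely reason the loop in the question leaves some rows behind, independent of datenum versus datetime, is that it deletes rows while stepping forward through them: after a deletion the next row slides into the current position and is then skipped by the increment. In the original code the separate newDate vector also stops lining up with the table once a row has been removed. The sketch below reproduces the skipping effect on a plain numeric vector; the variable names and values are hypothetical.

```matlab
% Hypothetical data: every value above the cutoff should be removed
vals = [1 5 6 2 7 8 3];
cutoff = 3;

% Forward loop that deletes in place (same pattern as in the question)
loopVals = vals;
n = numel(loopVals);
k = 1;
while k < n
    if loopVals(k) > cutoff
        loopVals(k) = [];   % the next element slides into position k ...
        n = n - 1;
    end
    k = k + 1;              % ... and is skipped by this increment
end

% Vectorized alternative: build the logical mask once, delete once
maskVals = vals;
maskVals(maskVals > cutoff) = [];
```

Here loopVals ends up as [1 6 2 8 3], with 6 and 8 left behind, while maskVals is [1 2 3]. The same logical-indexing idea carries over to table rows and datetime comparisons.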
4 Comments
Cris LaPierre
on 29 Jul 2020
Edited: Cris LaPierre
on 29 Jul 2020
You may be making this more complicated than it needs to be. Without a representative data set, we can only guess, but I suspect something like this should work. Ignore the first few lines. I'm creating a dummy table of various data types for the example.
% Dummy data set including a range of dates including NaT entries.
Dates = [datetime(2019,01,01):days(1):datetime('today') repmat(NaT,1,24)];
segment = table(Dates',zeros(length(Dates),1),zeros(length(Dates),1,'single'),cell(length(Dates),1),strings(length(Dates),1));
segment.Properties.VariableNames=["Dates","Double","Single","Cell","String"];
% Specify the cutoff date
cutoffDate = datetime(2019,08,13);
% Remove rows where Dates has a value of NaT
segment=rmmissing(segment,"DataVariables","Dates")
% Remove rows where Date is after cut off date
segment(segment.Dates>cutoffDate,:)=[];
Since I start Dates on the first day of 2019, and since 8/13/2019 is the 225th day of the year (in 2019), the resulting size of segment should be a table with 225 rows and 5 columns.
day(cutoffDate,'dayofyear')
ans =
225
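To make that row count checkable without depending on today's date, here is a small sketch of the same example with a fixed end date and only two columns. The end date of 12/31/2019 and the reduced table layout are assumptions made for this illustration.

```matlab
% Same idea as the dummy data above, but with a fixed (assumed) end date
Dates = [datetime(2019,1,1):days(1):datetime(2019,12,31), repmat(NaT,1,24)];
segment = table(Dates', zeros(numel(Dates),1), 'VariableNames', {'Dates','Double'});

cutoffDate = datetime(2019,8,13);

% Remove NaT rows, then rows after the cutoff date
segment = rmmissing(segment, "DataVariables", "Dates");
segment(segment.Dates > cutoffDate, :) = [];

% Number of remaining rows and the latest remaining date
nRows = height(segment);
lastDate = max(segment.Dates);
```

After this runs, nRows should equal day(cutoffDate,'dayofyear'), that is 225, and lastDate should equal cutoffDate.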