Join/Append tables and keep particular table variable if duplicate rows

I have a an ecxel table that is updated inside excel after being generated in MATLAB with a column of some user text comments- these comments are not always present. I would like to add new rows and often duplicate rows to this table, but the new/duplicate rows will never contain comments. In the event there is a duplicate row added for the key variables, I would like to preserve the comment from the excel file if it is present, otherwise it will remain or be created as blank.
Not sure the best approach here- my various attempts at joins are not working well.
Here is an example of the excel file with comments and some new data containing duplicate rows:
excel=readtable('commented.xlsx')
excel = 6×4 table
Date Time Value Comment ____________ __________ _____ _____________ {'20220912'} {'015304'} 0.1 {0×0 char } {'20220912'} {'172427'} 0 {0×0 char } {'20220914'} {'002824'} -0.7 {0×0 char } {'20220915'} {'042702'} -0.7 {'Windy Day'} {'20220915'} {'170959'} 0.2 {0×0 char } {'20220916'} {'231015'} -0.9 {0×0 char }
load newdata.mat
newdata
newdata = 6×3 table
Date Time Value ____________ __________ _____ {'20220913'} {'087626'} -0.4 {'20220914'} {'002824'} -0.7 {'20220915'} {'042702'} -0.7 {'20220915'} {'170959'} 0.2 {'20220916'} {'231015'} -0.9 {'20220917'} {'115816'} -0.8
The answer would look like this- all new rows from newdata are addded, but when there are duplicate Date/Time that have a comment, that comment is preserved, ('20220915' '042702' would keep the 'Windy Day' comment) Otherwise, the Comment variable is an empty char cell.
load answer
answer
answer = 8×4 table
Date Time Value Comment ____________ __________ _____ _____________ {'20220912'} {'015304'} 0.1 {0×0 char } {'20220912'} {'172427'} 0 {0×0 char } {'20220913'} {'087626'} -0.4 {0×0 char } {'20220914'} {'002824'} -0.7 {0×0 char } {'20220915'} {'042702'} -0.7 {'Windy Day'} {'20220915'} {'170959'} 0.2 {0×0 char } {'20220916'} {'231015'} -0.9 {0×0 char } {'20220917'} {'115816'} -0.8 {0×0 char }

4 comentarios

Jan
Jan el 20 de Feb. de 2023
Editada: Jan el 20 de Feb. de 2023
What exactly is your question? Do you want the output to be:
...
{'20220915'} {'042702'} -0.7 {'Windy Day'}
{'20220915'} {'170959'} 0.2 {'Windy Day'} % Comment copied
...
You have mentioned joining the tables. You did not do this in the posted code, so the actual question might be: how to remove a duplicate row before joining tables. Please clarify this.
No, I want the output to be as in answer.mat
All entries to the Comment variable should preserve what is in the original excel file. If blank they will remain blank, and if a new row is added, it should also have a blank.
My question is how to join the two tables while preserving the Comment variable when rows with duplicate key entries but no Comment variable can be added in the middle or end.
The blank comment on the second 20220915 observation at 170959 should remain blank- there was not a comment added there. The first entry should preserve the 'Windy Day'.
{'20220915'} {'042702'} -0.7 {'Windy Day'} %comment preserved, duplicate row from newdata ignored
{'20220915'} {'170959'} 0.2 {0×0 char } %original blank comment preserved
@Marcus Glover: The example from your last comment is the result of a standard concatenation. But in the original question joining two tables with 6 elements does not create a table with 12 elements. So what should be removed?
Marcus Glover
Marcus Glover el 20 de Feb. de 2023
Editada: Marcus Glover el 20 de Feb. de 2023
Thanks. Your questions have given me the answer. I do just want to remove the four rows in the newdata table that have duplicate rows in the excel table. The I do not need a join at all and can just concatenate as you suggest after creating a dummy variable 'Comment' for those that are not duplicate rows.

Iniciar sesión para comentar.

 Respuesta aceptada

Jan's question led me to the answer.
commented=readtable('commented.xlsx');
load newdata.mat;
load answer;
% eliminate duplicate rows
idx = ismember(newdata(:,1:3), commented(:,1:3));
newdata(idx,:)=[];
% Create a new variable with class cell and contents 0x0 char
newVar = cell(height(newdata), 1);
newVar(:) = {''};
newdata.Comment = newVar
newdata = 2×4 table
Date Time Value Comment ____________ __________ _____ __________ {'20220913'} {'087626'} -0.4 {0×0 char} {'20220917'} {'115816'} -0.8 {0×0 char}
answer2=sortrows([newdata; commented],[1 2])
answer2 = 8×4 table
Date Time Value Comment ____________ __________ _____ _____________ {'20220912'} {'015304'} 0.1 {0×0 char } {'20220912'} {'172427'} 0 {0×0 char } {'20220913'} {'087626'} -0.4 {0×0 char } {'20220914'} {'002824'} -0.7 {0×0 char } {'20220915'} {'042702'} -0.7 {'Windy Day'} {'20220915'} {'170959'} 0.2 {0×0 char } {'20220916'} {'231015'} -0.9 {0×0 char } {'20220917'} {'115816'} -0.8 {0×0 char }

Más respuestas (0)

Categorías

Etiquetas

Preguntada:

el 20 de Feb. de 2023

Editada:

el 20 de Feb. de 2023

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by