Creating a new matrix based on matching two columns
Mostrar comentarios más antiguos
I have generated two matrices from two different csv files. Matrix 1 contains two columns of information (matrix contain middle names and first name). Matrix 2 only contains one Column of information (middle name). I want to create a loop where I take the middle name from matrix 2 and match it to the middle names in matrix 1. Once the names have been matched I want to read out each matches corresponding first name in a new matrix
3 comentarios
the cyclist
el 17 de Oct. de 2023
Can you upload the data? You can use the paper clip icon in the INSERT section of the toolbar.
One thing that is confusing is that you refer to the data as "matrices", but in MATLAB matrices are numeric. So, it's not clear if you have cell arrays, or string arrays, or tables. Uploading the data is the easiest way for us to make sure a solution works for you.
User
el 26 de Oct. de 2023
Respuestas (1)
T1 = readtable('example1.xlsx');
T2 = readtable('example2.xlsx');
disp(T1);
disp(T2);
"I want to [...] take the [last] name from [array] 2 and match it to the [last] names in [array] 1. Once the names have been matched I want to read out each [match's] corresponding first name in a new [array]"
If you want exact matches:
result = T1{ismember(T1{:,2},T2{:,1}),1}
Or, if you want to split the cells where there are more than one last name separated by commas into multiple separate entries, then something like this:
% make a new table with only one first and last name per row:
C = regexp(T1{:,2},',\s*','split');
idx = arrayfun(@(x,c)x(ones(1,numel(c{1}))),1:numel(C),C.','un',0);
T1_new = table(T1{[idx{:}],1},[C{:}].')
% this part is the same as before with T1 but now using T1_new:
result = T1_new{ismember(T1_new{:,2},T2{:,1}),1}
5 comentarios
User
el 1 de Nov. de 2023
T1 = readtable('example1.xlsx');
T2 = readtable('example2.xlsx');
% exact matches only:
T_out = T1(ismember(T1{:,2},T2{:,1}),:)
result = join(T_out{:,:},' ')
% matches any last name in given row of T1:
C = regexp(T1{:,2},',\s*','split');
idx = arrayfun(@(x,c)x(ones(1,numel(c{1}))),1:numel(C),C.','un',0);
T1_new = table(T1{[idx{:}],1},[C{:}].');
T_out = T1_new(ismember(T1_new{:,2},T2{:,1}),:)
result = join(T_out{:,:},' ')
User
el 1 de Nov. de 2023
Walter Roberson
el 1 de Nov. de 2023
With the original last name is something I posted code for several days ago at https://www.mathworks.com/matlabcentral/answers/2034499-loading-in-a-table-that-has-multiple-values-in-a-single-cell-seperated-by-a-comma#comment_2938846
@User: Does this produce the expected result? The only change is changing the comma to a semicolon in the regexp() call, since the IDs are separated by semicolons in the real data file (it was commas in the example files).
T1 = readtable('allPharm1.csv')
T2 = readtable('drugID_names_matched_CCM.csv')
C = regexp(T1{:,2},';\s*','split');
idx = arrayfun(@(x,c)x(ones(1,numel(c{1}))),1:numel(C),C.','un',0);
T1_new = table(T1{[idx{:}],1},[C{:}].');
T_out = T1_new(ismember(T1_new{:,2},T2{:,1}),:)
result = join(T_out{:,:},' ')
Categorías
Más información sobre Variables en Centro de ayuda y File Exchange.
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!