How do I merge a column from one table into the column of another table that has missing entries?

130 views (last 30 days)
I have two tables with different column names and different number of rows. Column 3 of table A for example is missing some entries that are present in column 2 of table B. I want to add the missing entries from B into A based on the values in another column, say column 2 of table A. However, not all the same entries are present in the two tables for the key.
For example, the two tables look like this:
 
A = table({'John','Jane','Jim','Jerry','Jill'}',[1;2;1;3;1], {'', '', '', 'Mat',''}', ...
    'VariableNames',{'Employee' 'Department', 'Manager'})
 
B = table([1 2]',{'Mary' 'Mike'}',...
    'VariableNames',{'Department' 'Manager'})
 
If I try using the "join" function, I get an error because there are more Departments in A than in B.
  >> join(A,B)
Error using table/join (line 128)
The key variables for A and B cannot contain any missing values.
  How can I do this?

Accepted Answer

MathWorks Support Team
MathWorks Support Team on 30 Oct 2014
There are two possible solutions depending on the output you expect to see:
1. Using "outerjoin"
Using the "outerjoin" function and specifying the variable "Keys" to use to merge the tables produces the output:
 
C = outerjoin(A,B,'Keys','Department','MergeKeys', 1)
 
2. Using "ismember"
Using the "ismember" function to determine the matching rows in A and B and adding the missing entries in A from B produces the output:
  [idxA, idxB] = ismember(A.Department, B.Department);
A(idxA,'Manager') = B(idxB(idxA),'Manager');
 

More Answers (2)

Liqun
Liqun on 30 Apr 2015
I have a similar problem, suppose I have table A and B as follows:
A = table([1;2;3;4;5;6;7;8;9], [2;4;6;8;10;12;14;16;18], 'VariableNames',{'x' 'y'})
A =
x y
_ __
1 2
2 4
3 6
4 8
5 10
6 12
7 14
8 16
9 18
B = table([1;3;5;7], [1;9;25;49], 'VariableNames',{'x' 'z'})
B =
x z
_ __
1 1
3 9
5 25
7 49
How can I get table C as below?
C =
x y z
_ __ ___
1 2 1
2 4
3 6 9
4 8
5 10 25
6 12
7 14 49
8 16
9 18
  2 Comments
Adam Hug
Adam Hug on 13 Jul 2015
Hi Liquen,
Tables need to have values in each entry. In your case, all entries need to have a numeric value associated with them. For those numeric entries that are undefined after performing "outerjoin", MATLAB fills them with NaN values. The main point of using a table data structure is to organize data in a uniform manner. If you do not want NaN values to be added, you may wish to consider using a cell array to store your data.
Adam

Sign in to comment.


Omer Moussaffi
Omer Moussaffi on 10 Oct 2017
There's a generic way to do this in matlab 2017 and on:
outerjoin(A,B,'MergeKeys',true)

Tags

No tags entered yet.

Products


Release

R2014b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by