Dendrogram and Clustergram not matching
I'm running MATLAB R2020a and getting different results from the clustergram function and from linkage followed by dendrogram.
Here are my two snippets; I've also attached screenshots of their respective figures.
Clustergram: g = clustergram(X,'RowLabels', Z, 'ColumnLabels', Y,'ColumnPdist', 'euclidean','RowPdist', 'euclidean','Linkage', 'complete','ImputeFun',@knnimpute);
Linkage/Dendrogram: tree = linkage(X, 'complete','euclidean'); H = dendrogram(tree,'Labels',Y)
The order and linking of my objects differ between clustergram and dendrogram, even though both use linkage and I set both to 'euclidean' distance and 'complete' linkage. The objects are paired similarly, but their arrangement across the groups is different.
For the clustergram, the order is EACDFB, which would make me think that B and E are the most different.
For the dendrogram, the order is CDAEBF, which now makes me think C and F are the most different.
Please let me know if I am missing something; as of now I don't know which one is more correct!
Rajani Mishra on 29 Jul 2020
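clustergram applies optimal leaf ordering, and that is what causes the difference in ordering between the two results. Optimal leaf ordering only flips branches of the tree so that neighboring leaves are as similar as possible; it does not change which objects are linked or at what height.
I have added the optimal leaf order to the code below. It produces the same output for both functions (see the images above):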
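X = rand(10,3);                            % 10 observations, 3 variables
tree = linkage(X,'complete','euclidean');  % complete linkage on Euclidean distances
D = pdist(X);                              % pairwise Euclidean distances (pdist default)
leafOrder = optimalleaforder(tree,D);      % optimal leaf order for this tree
label = ["a","b","c","d","e","f","g","h","i","j"];  % one label per observation
[a,b] = dendrogram(tree,'Reorder',leafOrder,'Labels',label);
% X is transposed so that the 10 observations become the clustered columns
cgo = clustergram(X','ColumnPdist', 'euclidean','RowPdist', 'euclidean','Linkage', 'complete','ColumnLabels',label,'ImputeFun',@knnimpute);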
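On the question of which plot is "more correct": as far as I can tell neither is, because the left-to-right position of a leaf does not encode how different it is from the others. A rough sketch of how one might check this is below. It assumes the Statistics and Machine Learning Toolbox, uses random data and an arbitrary seed (not the data from the question), and reads the merge heights from cophenet rather than from the plot.

```matlab
% Sketch only: random data, arbitrary seed, not the data from the question.
rng(0);
X = rand(10,3);                             % 10 observations, 3 variables
D = pdist(X,'euclidean');                   % pairwise distances as a vector
tree = linkage(X,'complete','euclidean');

% Leaf order that dendrogram picks on its own.
figure;
[~,~,defaultOrder] = dendrogram(tree);

% Leaf order after optimal leaf ordering (same tree, branches flipped).
leafOrder = optimalleaforder(tree,D);
figure;
[~,~,optimalOrder] = dendrogram(tree,'Reorder',leafOrder);

% Cophenetic distance: the height at which two leaves first join.
% It is computed from the tree alone, so it is the same for both plots.
[~,cophDist] = cophenet(tree,D);
cophMatrix = squareform(cophDist);          % cophMatrix(i,j) for leaves i and j

disp(defaultOrder);                         % display order, default
disp(optimalOrder);                         % display order, optimal
% Merge height of the two leaves drawn at the ends of each plot:
disp(cophMatrix(defaultOrder(1),defaultOrder(end)));
disp(cophMatrix(optimalOrder(1),optimalOrder(end)));
```

To judge how different two objects are, look at cophMatrix (or the height of the link that joins them in the plot), not at how far apart they are drawn. If you would rather have clustergram skip the reordering step, its 'OptimalLeafOrder' option can be set to false.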