# How can I reassign clusters based on similarity or any other method?

##### 23 comments

Hi @Med Future,

Can you share your code on this forum?

Also, please elaborate on what you meant when you wrote:

*I have already tried the K means clustering but it does not provide a results*

Hi @Med Future,

I have modified the code you shared on the forum so that it can reassign clusters based on similarity.

    % Define cell1 and cell2 (numeric matrices; each row is one data point)
    cell1 = [1, 2, 3; 4, 5, 6];      % Example data for cell1
    cell2 = [7, 8, 9; 10, 11, 12];   % Example data for cell2

    % Normalize the rows to unit length for cosine similarity
    cell1_norm = cell1 ./ sqrt(sum(cell1.^2, 2));
    cell2_norm = cell2 ./ sqrt(sum(cell2.^2, 2));

    % Compute the cosine similarity matrix
    similarity_matrix = cell1_norm * cell2_norm';

    % Average similarity score
    similarity_score = mean(similarity_matrix(:));

    % Display the similarity score
    fprintf('Average Cosine Similarity Score: %f\n', similarity_score);

    % Define the threshold for similarity to reassign clusters
    similarity_threshold = 0.9;

    if similarity_score > similarity_threshold
        % Combine the data from both cells
        combinedData = [cell1; cell2];

        % Apply K-means clustering
        k = 2; % Define the number of clusters 'k'
        [idx, C] = kmeans(combinedData, k);

        % Calculate centroid distances for cluster reassignment
        centroid_distances = pdist(C);           % Pairwise distances between centroids
        avg_distance = mean(centroid_distances); % Average centroid distance

        % Reassign clusters if centroid distances exceed a certain threshold
        centroid_threshold = 5; % Define a threshold for centroid distances
        if avg_distance > centroid_threshold
            % Calculate the pairwise distances between data points and centroids
            distances = pdist2(combinedData, C);

            % Find the minimum distance for each data point
            [~, min_indices] = min(distances, [], 2);

            % Update the cluster assignments in 'idx' based on the minimum distances
            idx = min_indices;
        end

        % Iterate over the clusters and check for different features
        unique_clusters = unique(idx);         % Get the unique cluster labels
        num_clusters = numel(unique_clusters); % Get the number of clusters
        for i = 1:num_clusters
            cluster_data = combinedData(idx == unique_clusters(i), :); % Data points of the current cluster

            % Check for different features within the cluster.
            % range(..., 1) works along rows even for a single-point cluster,
            % and kmeans with 2 clusters needs at least 2 points.
            if size(cluster_data, 1) >= 2 && any(range(cluster_data, 1) > 1)
                % Split the cluster into subclusters with similar features
                subclusters = kmeans(cluster_data, 2);

                % Update the cluster assignments in 'idx' for the subclusters
                idx(idx == unique_clusters(i)) = subclusters + max(idx);
            end
        end

        % Merge clusters with similar features
        unique_clusters = unique(idx);         % Get the updated unique cluster labels
        num_clusters = numel(unique_clusters); % Get the updated number of clusters
        for i = 1:num_clusters
            cluster_data = combinedData(idx == unique_clusters(i), :); % Data points of the current cluster

            % Check for similar features with other clusters
            for j = i+1:num_clusters
                other_cluster_data = combinedData(idx == unique_clusters(j), :); % Data points of the other cluster

                % Check for similar features using a threshold:
                % the largest point-to-point distance must be below 1
                D = pdist2(cluster_data, other_cluster_data);
                if ~isempty(D) && max(D(:)) < 1
                    % Merge the clusters into a single cluster
                    idx(idx == unique_clusters(j)) = unique_clusters(i);
                end
            end
        end

        % Display the updated clustering results (first two features only)
        figure;
        gscatter(combinedData(:,1), combinedData(:,2), idx);
        title('Modified Clustering Results');

        % Save the modified clustering results
        save('modified_clustered_data.mat', 'idx', 'combinedData');
    else
        fprintf('Similarity score is less than %f, not reassigning clusters.\n', similarity_threshold);
    end

I will go through the code step by step so you can see how it works. First, the code defines two variables, cell1 and cell2, which hold the example data. Despite their names, these are ordinary numeric matrices rather than cell arrays: each row is one data point and each column is one feature. They stand in for the two clusters whose assignments are to be revised.

    cell1 = [1, 2, 3; 4, 5, 6];      % Example data for cell1
    cell2 = [7, 8, 9; 10, 11, 12];   % Example data for cell2

Next, the code divides each row by its Euclidean norm so that every row has unit length. After this step, the dot product of two rows is exactly their cosine similarity.

    cell1_norm = cell1 ./ sqrt(sum(cell1.^2, 2));
    cell2_norm = cell2 ./ sqrt(sum(cell2.^2, 2));
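If you want to convince yourself that the normalization does what it should, a quick check like the one below may help. This is only a sketch and not part of the code above; the variable names X, X_norm and row_lengths are mine, and it assumes R2016b or later so that the row-wise division works through implicit expansion.

```matlab
% Sanity check (illustrative names, not from the code above):
% after row normalization every row should have Euclidean length 1.
X = [1, 2, 3; 4, 5, 6];                  % same values as the example cell1
X_norm = X ./ sqrt(sum(X.^2, 2));        % divide each row by its Euclidean norm
row_lengths = sqrt(sum(X_norm.^2, 2));   % length of each normalized row
disp(row_lengths');                      % both entries should be 1 up to rounding
```

Note that a row consisting only of zeros has norm 0, so the division would produce NaN for that row. If your real data can contain such rows, you would need to handle them before normalizing.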

With the rows normalized, the code multiplies cell1_norm by the transpose of cell2_norm to get the cosine similarity matrix. Entry (i, j) of this matrix is the cosine similarity between row i of cell1 and row j of cell2.

    similarity_matrix = cell1_norm * cell2_norm';

To summarize the similarity between the two clusters in a single number, the code takes the mean of all entries of the similarity matrix.

    similarity_score = mean(similarity_matrix(:));
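To see that the matrix product really gives cosine similarities, you can compare one entry against the textbook formula, dot(a, b) / (norm(a) * norm(b)). The sketch below does this for the example data; the names A, B, S and s11 are just placeholders I picked for the illustration.

```matlab
% Illustrative check (placeholder names): matrix product vs. direct formula.
A = [1, 2, 3; 4, 5, 6];       % same values as the example cell1
B = [7, 8, 9; 10, 11, 12];    % same values as the example cell2

A_norm = A ./ sqrt(sum(A.^2, 2));   % unit-length rows
B_norm = B ./ sqrt(sum(B.^2, 2));
S = A_norm * B_norm';               % S(i,j) = cosine similarity of A(i,:) and B(j,:)

% Direct cosine similarity between the first rows of A and B
s11 = dot(A(1,:), B(1,:)) / (norm(A(1,:)) * norm(B(1,:)));

fprintf('S(1,1) = %.6f, direct formula = %.6f\n', S(1,1), s11);
fprintf('Mean similarity = %.6f\n', mean(S(:)));
```

For this example data all four entries of S are above 0.95, so the mean is above the 0.9 threshold used in the next step and the re-clustering branch runs.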

The code then displays the average cosine similarity score.

    fprintf('Average Cosine Similarity Score: %f\n', similarity_score);

Next, the code defines a similarity threshold of 0.9.

    similarity_threshold = 0.9;

Only if the average similarity score exceeds this threshold does the code go on to combine the two data sets and cluster the combined data with K-means, using k = 2 clusters.

    if similarity_score > similarity_threshold
        % Combine the data from both cells
        combinedData = [cell1; cell2];

        % Apply K-means clustering
        k = 2; % Define the number of clusters 'k'
        [idx, C] = kmeans(combinedData, k);

The code then computes the pairwise distances between the cluster centroids. With k = 2 there is only one pair of centroids, so the average is simply the distance between them. If this average exceeds a threshold (5 in this example), every data point is assigned to its nearest centroid.

        centroid_distances = pdist(C);           % Pairwise distances between centroids
        avg_distance = mean(centroid_distances); % Average centroid distance

        % Reassign clusters if centroid distances exceed a certain threshold
        centroid_threshold = 5; % Define a threshold for centroid distances
        if avg_distance > centroid_threshold
            % Calculate the pairwise distances between data points and centroids
            distances = pdist2(combinedData, C);

            % Find the minimum distance for each data point
            [~, min_indices] = min(distances, [], 2);

            % Update the cluster assignments in 'idx' based on the minimum distances
            idx = min_indices;
        end
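In case it helps to see the nearest-centroid step in isolation, here is a small sketch that does the same thing as pdist2 followed by min, but with base MATLAB only. The data X and the centroids C below are made-up values for illustration (they are not output from kmeans), and the loop assumes implicit expansion (R2016b or later).

```matlab
% Nearest-centroid assignment with base MATLAB only.
% X and C are hypothetical values chosen for this illustration.
X = [1, 2, 3; 4, 5, 6; 7, 8, 9; 10, 11, 12];   % 4 points, 3 features
C = [2.5, 3.5, 4.5; 8.5, 9.5, 10.5];           % 2 assumed centroids

n = size(X, 1);
k = size(C, 1);
D = zeros(n, k);                                % D(i,j): distance from point i to centroid j
for j = 1:k
    D(:, j) = sqrt(sum((X - C(j, :)).^2, 2));   % Euclidean distance to centroid j
end

[~, nearest] = min(D, [], 2);                   % index of the closest centroid per point
disp(nearest');                                 % 1 1 2 2 for these values
```

Keep in mind that kmeans already assigns points to centroids by distance, so on a converged run with the default distance this step will usually leave idx unchanged. It matters more if you move or replace the centroids yourself before reassigning.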

The code then iterates over the clusters and checks how much the features vary within each one. If the range of any feature within a cluster exceeds 1, the cluster is split into two subclusters with K-means, and the subclusters receive new labels above the current maximum label. The check takes the range along the first dimension and requires at least two points, so a single-point cluster is left alone instead of being passed to kmeans.

        unique_clusters = unique(idx);         % Get the unique cluster labels
        num_clusters = numel(unique_clusters); % Get the number of clusters
        for i = 1:num_clusters
            cluster_data = combinedData(idx == unique_clusters(i), :); % Data points of the current cluster

            % Check for different features within the cluster
            if size(cluster_data, 1) >= 2 && any(range(cluster_data, 1) > 1)
                % Split the cluster into subclusters with similar features
                subclusters = kmeans(cluster_data, 2);

                % Update the cluster assignments in 'idx' for the subclusters
                idx(idx == unique_clusters(i)) = subclusters + max(idx);
            end
        end
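One thing to watch in the split test is what happens when a cluster contains a single point. Functions such as max, min and range work along the first non-singleton dimension by default, so a 1-by-3 row is treated as a vector of three values rather than as one point with three features. The sketch below shows the difference using base MATLAB only; the names naive_range, col_range and can_split are mine.

```matlab
% Single-point cluster: default dimension vs. explicit dimension 1.
cluster_data = [10, 11, 12];     % hypothetical cluster with one point, three features

% Default dimension: the row is treated as a vector, giving one scalar
naive_range = max(cluster_data) - min(cluster_data);               % 12 - 10 = 2

% Explicit dimension 1: per-feature range across the points in the cluster
col_range = max(cluster_data, [], 1) - min(cluster_data, [], 1);   % [0 0 0]

% Only split when there are at least two points and some feature varies by more than 1
can_split = size(cluster_data, 1) >= 2 && any(col_range > 1);

fprintf('naive range = %d, can_split = %d\n', naive_range, can_split);
```

With the default dimension the single point would look like it has a spread of 2 and would be sent to kmeans with two clusters, which fails for one point. Working along dimension 1 and checking the number of rows avoids that.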

After splitting, the code merges clusters whose points lie close together. It compares every pair of clusters, and if the largest distance between a point of one cluster and a point of the other is below 1, the second cluster takes the label of the first.

        unique_clusters = unique(idx);         % Get the updated unique cluster labels
        num_clusters = numel(unique_clusters); % Get the updated number of clusters
        for i = 1:num_clusters
            cluster_data = combinedData(idx == unique_clusters(i), :); % Data points of the current cluster

            % Check for similar features with other clusters
            for j = i+1:num_clusters
                other_cluster_data = combinedData(idx == unique_clusters(j), :); % Data points of the other cluster

                % Check for similar features using a threshold:
                % the largest point-to-point distance must be below 1
                D = pdist2(cluster_data, other_cluster_data);
                if ~isempty(D) && max(D(:)) < 1
                    % Merge the clusters into a single cluster
                    idx(idx == unique_clusters(j)) = unique_clusters(i);
                end
            end
        end
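After splitting and merging, the labels in idx are generally no longer consecutive, because split clusters get labels above the old maximum and merged labels disappear. If you prefer labels 1, 2, 3, ... for plotting or further processing, one possible way is the third output of unique, as sketched below. The label vector here is invented for the example.

```matlab
% Renumber cluster labels to 1..K (hypothetical labels with gaps).
idx = [4; 4; 6; 3; 6];

% old_labels: sorted distinct labels; compact_idx: position of each label in old_labels
[old_labels, ~, compact_idx] = unique(idx);

disp(old_labels');     % 3 4 6
disp(compact_idx');    % 2 2 3 1 3
```

The grouping itself is unchanged by this step; only the label values are renamed, and old_labels lets you map back to the original labels if needed.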

Finally, the code plots the data points grouped by their assigned cluster, using the first two features as the axes, and saves the result to a MAT-file. If the similarity score did not exceed the threshold, the code only prints a message.

        % Display the updated clustering results
        figure;
        gscatter(combinedData(:,1), combinedData(:,2), idx);
        title('Modified Clustering Results');

        % Save the modified clustering results
        save('modified_clustered_data.mat', 'idx', 'combinedData');
    else
        fprintf('Similarity score is less than %f, not reassigning clusters.\n', similarity_threshold);
    end

In a nutshell, this modified code reassigns clusters based on similarity. It combines the two data sets when their average cosine similarity is high enough, splits clusters whose features vary too much, and merges clusters whose points lie close together. It relies on K-means clustering and cosine similarity to do this. Please see the attached plot along with the test results.

I hope this answers your question.
