# cluster

## Syntax

```
T = cluster(Z,'cutoff',c)
T = cluster(Z,'cutoff',c,'depth',d)
T = cluster(Z,'cutoff',c,'criterion',criterion)
T = cluster(Z,'maxclust',n)
```

## Description

`T = cluster(Z,'cutoff',c)` constructs clusters from the agglomerative hierarchical cluster tree, `Z`, as generated by the `linkage` function. `Z` is a matrix of size (`m` – 1)-by-3, where `m` is the number of observations in the original data. `c` is a threshold for cutting `Z` into clusters. A cluster is formed when a node and all of its subnodes have an `inconsistent` value less than `c`. All leaves at or below the node are grouped into that cluster. `T` is a vector of size `m` containing the cluster assignment of each observation.

If `c` is a vector, `T` is a matrix of cluster assignments with one column per cutoff value.
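As a rough illustration of the `'cutoff'` syntax, the sketch below builds a tree from five made-up points and checks the shape of the output for a scalar and a vector cutoff. The data and the cutoff values `0.5` and `1.5` are assumptions chosen only for demonstration, not values from this page.

```matlab
% Illustrative data (assumed): five observations in two dimensions.
X = [0 0; 1 0; 10 0; 11 0; 50 0];
Z = linkage(X,'average');             % (m-1)-by-3 tree with m = 5

T1 = cluster(Z,'cutoff',1.5);         % scalar cutoff: one assignment per observation
T2 = cluster(Z,'cutoff',[0.5 1.5]);   % vector cutoff: one column per cutoff value

sizeZ  = size(Z)    % 4-by-3
sizeT1 = size(T1)   % 5-by-1
sizeT2 = size(T2)   % 5-by-2
```

Each column of `T2` is a separate set of assignments, so cluster numbers in one column are not meant to be compared with those in another.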

`T = cluster(Z,'cutoff',c,'depth',d)` evaluates inconsistent values by looking to a depth `d` below each node. The default depth is `2`.
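To see the values that the depth setting influences, one option is to call `inconsistent` with the same depth and look at its fourth column, which holds the inconsistency coefficient for each link. The sketch below assumes a small made-up data set and an arbitrary cutoff of `1`; it is meant only to show how the two calls line up.

```matlab
% Illustrative data (assumed): five observations in two dimensions.
X = [0 0; 1 0; 10 0; 11 0; 50 0];
Z = linkage(X,'average');

d = 3;                                  % look three levels below each node
Y = inconsistent(Z,d);                  % (m-1)-by-4 matrix, one row per link
coeff = Y(:,4);                         % inconsistency coefficient of each link

T = cluster(Z,'cutoff',1,'depth',d);    % form clusters using the same depth
```

Comparing `coeff` with the cutoff can help when choosing `c`, since links whose coefficient is at or above the cutoff are not merged under the default criterion.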

`T = cluster(Z,'cutoff',c,'criterion',criterion)` uses the specified criterion for forming clusters, where `criterion` is `'inconsistent'` (default) or `'distance'`. The `'distance'` criterion uses the distance between the two subnodes merged at a node to measure node height. All leaves at or below a node with height less than `c` are grouped into a cluster.
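A minimal sketch of the `'distance'` criterion, assuming five made-up points whose single-linkage merge heights are easy to work out by hand (1, 1, 9, and 39). With an assumed cutoff of `5`, only the two merges at height 1 fall below the threshold, so three clusters remain.

```matlab
% Illustrative data (assumed): two tight pairs and one distant point.
X = [0 0; 1 0; 10 0; 11 0; 50 0];
Z = linkage(X,'single');       % merge heights in Z(:,3): 1, 1, 9, 39

% Group all leaves under nodes whose height is less than 5.
T = cluster(Z,'cutoff',5,'criterion','distance');

numClusters = numel(unique(T))   % 3: {1,2}, {3,4}, and {5}
```

The cluster numbers themselves are arbitrary labels; only the grouping of observations is meaningful.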

`T = cluster(Z,'maxclust',n)` constructs a maximum of `n` clusters using the `'distance'` criterion. `cluster` finds the smallest height at which a horizontal cut through the tree leaves `n` or fewer clusters.

If `n` is a vector, `T` is a matrix of cluster assignments with one column per maximum value.
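The `'maxclust'` syntax with a vector can be sketched as follows, again on a small made-up data set (the points and the values `[2 3]` are assumptions for illustration). Each column of the result corresponds to one requested maximum.

```matlab
% Illustrative data (assumed): two tight pairs and one distant point.
X = [0 0; 1 0; 10 0; 11 0; 50 0];
Z = linkage(X,'single');            % merge heights in Z(:,3): 1, 1, 9, 39

T = cluster(Z,'maxclust',[2 3]);    % 5-by-2: column 1 for n = 2, column 2 for n = 3

nCol1 = numel(unique(T(:,1)))       % 2 clusters: {1,2,3,4} and {5}
nCol2 = numel(unique(T(:,2)))       % 3 clusters: {1,2}, {3,4}, and {5}
```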

## Examples

Compare cluster assignments of flowers to their species classification.

`load fisheriris`

Compute three clusters of the Fisher iris data using the `'average'` method and the `'chebychev'` metric.

```
Z = linkage(meas,'average','chebychev');
c = cluster(Z,'maxclust',3);
```

Create a dendrogram plot of `Z`. To see the three clusters, use `'ColorThreshold'` with a cutoff halfway between the third-from-last and second-from-last linkages.

```
cutoff = median([Z(end-2,3) Z(end-1,3)]);
dendrogram(Z,'ColorThreshold',cutoff)
```

Display the last two rows of `Z` to see how the three clusters are combined into one. `linkage` combines the 293rd (blue) cluster with the 297th (red) cluster to form the 298th cluster with a linkage of `1.7583`. `linkage` then combines the 296th (green) cluster with the 298th cluster.

`lastTwo = Z(end-1:end,:)`
```
lastTwo = 2×3

  293.0000  297.0000    1.7583
  296.0000  298.0000    3.4445
```

See how the cluster assignments correspond to the three species. For example, one of the clusters contains `50` flowers of the second species and `40` flowers of the third species.

`crosstab(c,species)`
```
ans = 3×3

     0     0    10
     0    50    40
    50     0     0
```

Randomly generate sample data with 20,000 observations.

```
rng('default') % For reproducibility
X = rand(20000,3);
```

Create a hierarchical cluster tree using Ward's linkage. In this case, `'savememory'` is set to `'on'` by default. In general, choose the best value for `'savememory'` based on the dimensions of `X` and the available memory.

`Z = linkage(X,'ward');`

Cluster data into four groups and plot the result.

```
c = cluster(Z,'Maxclust',4);
scatter3(X(:,1),X(:,2),X(:,3),10,c)
```