Esta página aún no se ha traducido para esta versión. Puede ver la versión más reciente de esta página en inglés.

mahal

Mahalanobis distance

Sintaxis

d2 = mahal(Y,X)

Descripción

ejemplo

d2 = mahal(Y,X) returns the squared Mahalanobis distance of each observation in Y to the reference samples in X.

Ejemplos

contraer todo

Generate a correlated bivariate sample data set.

rng('default') % For reproducibility
X = mvnrnd([0;0],[1 .9;.9 1],1000);

Specify four observations that are equidistant from the mean of X in Euclidean distance.

Y = [1 1;1 -1;-1 1;-1 -1];

Compute the Mahalanobis distance of each observation in Y to the reference samples in X.

d2_mahal = mahal(Y,X)
d2_mahal = 4×1

    1.1095
   20.3632
   19.5939
    1.0137

Compute the squared Euclidean distance of each observation in Y from the mean of X .

d2_Euclidean = sum((Y-mean(X)).^2,2)
d2_Euclidean = 4×1

    2.0931
    2.0399
    1.9625
    1.9094

Plot X and Y by using scatter and use marker color to visualize the Mahalanobis distance of Y to the reference samples in X.

scatter(X(:,1),X(:,2),10,'.') % Scatter plot with points of size 10
hold on
scatter(Y(:,1),Y(:,2),100,d2_mahal,'o','filled')
hb = colorbar;
ylabel(hb,'Mahalanobis Distance')
legend('X','Y','Location','best')

All observations in Y ([1,1], [-1,-1,], [1,-1], and [-1,1]) are equidistant from the mean of X in Euclidean distance. However, [1,1] and [-1,-1] are much closer to X than [1,-1] and [-1,1] in Mahalanobis distance. Because Mahalanobis distance considers the covariance of the data and the scales of the different variables, it is useful for detecting outliers.

Argumentos de entrada

contraer todo

Data, specified as an n-by-m numeric matrix, where n is the number of observations and m is the number of variables in each observation.

X and Y must have the same number of columns, but can have different numbers of rows.

Tipos de datos: single | double

Reference samples, specified as a p-by-m numeric matrix, where p is the number of samples and m is the number of variables in each sample.

X and Y must have the same number of columns, but can have different numbers of rows. X must have more rows than columns.

Tipos de datos: single | double

Output Arguments

contraer todo

Squared Mahalanobis distance of each observation in Y to the reference samples in X, returned as an n-by-1 numeric vector, where n is the number of observations in X.

Más acerca de

contraer todo

Mahalanobis Distance

The Mahalanobis distance is a measure between a sample point and a distribution.

The Mahalanobis distance from a vector y to a distribution with mean μ and covariance Σ is

d=(yμ)1(yμ)'.

This distance represents how far y is from the mean in number of standard deviations.

mahal returns the squared Mahalanobis distance d2 from an observation in Y to the reference samples in X. In the mahal function, μ and Σ are the sample mean and covariance of the reference samples, respectively.

Consulte también

|

Introducido antes de R2006a