# rangesearch

Find all neighbors within specified distance using input data

## Syntax

``Idx = rangesearch(X,Y,r)``
``````[Idx,D] = rangesearch(X,Y,r)``````
``````[Idx,D] = rangesearch(X,Y,r,Name,Value)``````

## Description

````Idx = rangesearch(X,Y,r)` finds all the `X` points that are within distance `r` of the `Y` points. The rows of `X` and `Y` correspond to observations, and the columns correspond to variables.```
``````[Idx,D] = rangesearch(X,Y,r)``` also returns the distances between the `Y` points and the `X` points that are within a distance of `r`.```

example

``````[Idx,D] = rangesearch(X,Y,r,Name,Value)``` specifies additional options using one or more name-value pair arguments. For example, you can specify the nearest neighbor search method and the distance metric used in the search.```

example

## Examples

collapse all

Find the `X` points that are within a Euclidean distance `1.5` of each `Y` point. Both `X` and `Y` are samples of five-dimensional normally distributed variables.

```rng('default') % For reproducibility X = randn(100,5); Y = randn(10,5); [Idx,D] = rangesearch(X,Y,1.5)```
```Idx=10×1 cell array {[ 25 62 33 99 87 92 16]} {[ 92 25]} {[ 93 42 31 73 60 28 78 83 48 89 85]} {[ 92 41]} {[44 7 28 78 75 42 69 31 1 26 83 93]} {[ 15 31 89 41 27 17 29 60 34]} {[ 89]} {1x0 double } {1x0 double } {1x0 double } ```
```D=10×1 cell array {[ 0.9546 1.0987 1.2730 1.3981 1.4140 1.4249 1.4822]} {[ 1.4203 1.4558]} {[ 0.7114 0.7552 1.0081 1.1324 1.1424 1.1637 1.2108 1.3824 1.3944 1.4116 1.4605]} {[ 1.1244 1.4672]} {[0.7863 0.9326 0.9773 1.0508 1.1722 1.1934 1.3218 1.3623 1.3869 1.3919 1.4814 1.4978]} {[ 1.2824 1.2843 1.3342 1.3469 1.4154 1.4237 1.4625 1.4626 1.4744]} {[ 1.1739]} {1x0 double } {1x0 double } {1x0 double } ```

In this case, the last three `Y` points are more than `1.5` distance away from any `X` point. `X(89,:)` is `1.1739` distance away from `Y(7,:)`, and no other `X` point is within distance `1.5` of `Y(7,:)`. `X` contains `12` points within distance `1.5` of `Y(5,:)`.

Generate 5000 random points from each of three distinct multivariate normal distributions. Shift the means of the distributions so that the randomly generated points are likely to form three separate clusters.

```rng('default') % For reproducibility N = 5000; dist = 10; X = [mvnrnd([0 0],eye(2),N); mvnrnd(dist*[1 1],eye(2),N); mvnrnd(dist*[-1 -1],eye(2),N)];```

For each point in `X`, find the points in `X` that are within a radius `dist` away from the point. For faster computation, specify to keep the indices of the nearest neighbors unsorted. Select the first point in X, and find its nearest neighbors.

```Idx = rangesearch(X,X,dist,'SortIndices',false); x = X(1,:); nearestPoints = X(Idx{1},:);```

Find the values in `X` that are not the nearest neighbors of `x`. Display those points in one color and the nearest neighbors of `x` in a different color. Label the point `x` with a black, filled circle.

```nonNearestIdx = true(size(X,1),1); nonNearestIdx(Idx{1}) = false; scatter(X(nonNearestIdx,1),X(nonNearestIdx,2)) hold on scatter(nearestPoints(:,1),nearestPoints(:,2)) scatter(x(1),x(2),'black','filled') hold off```

Find the patients in the `patients` data set that are within a certain age and weight range of the patients in `Y`.

Load the `patients` data set. The `Age` values are in years, and the `Weight` values are in pounds.

```load patients X = [Age Weight]; Y = [20 162; 30 169; 40 168]; % New patients```

Create a custom distance function `distfun` that determines the distance between patients in terms of age and weight. For example, according to `distfun`, two patients that are one year apart in age and have the same weight are one distance unit apart. Similarly, two patients that have the same age and are five pounds apart in weight are also one distance unit apart.

`type distfun.m % Display contents of distfun.m file`
```function D2 = distfun(ZI,ZJ) ageDifference = abs(ZI(1)-ZJ(:,1)); weightDifference = abs(ZI(2)-ZJ(:,2)); D2 = ageDifference + 0.2*weightDifference; end ```

Note: If you click the button located in the upper-right section of this example and open the example in MATLAB®, then MATLAB opens the example folder. This folder includes the function file `distfun.m`.

Find the patients in `X` that are within the distance `2` of the patients in `Y`.

`[Idx,D] = rangesearch(X,Y,2,'Distance',@distfun)`
```Idx=3×1 cell array {1x0 double} {1x0 double} {[ 41]} ```
```D=3×1 cell array {1x0 double} {1x0 double} {[ 1.8000]} ```

The third patient in `Y` is the only one to have a patient in `X` within a distance of `2`.

Display the `Age` and `Weight` values for the nearest patient in `X` to the patient with age `40` and weight `168`.

`X(Idx{3},:)`
```ans = 1×2 39 164 ```

## Input Arguments

collapse all

Input data, specified as an mx-by-n numeric matrix, where each row represents one n-dimensional point. The number of columns n must equal the number of columns in `Y`.

Data Types: `single` | `double`

Query points, specified as an my-by-n numeric matrix, where each row represents one n-dimensional point. The number of columns n must equal the number of columns in `X`.

Data Types: `single` | `double`

Search radius around each query point, specified as a nonnegative scalar. `rangesearch` finds all `X` points (rows) that are within distance `r` of each `Y` point. The meaning of distance depends on the `'Distance'` name-value pair argument.

Data Types: `single` | `double`

### Name-Value Arguments

Specify optional pairs of arguments as `Name1=Value1,...,NameN=ValueN`, where `Name` is the argument name and `Value` is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose `Name` in quotes.

Example: `rangesearch(X,Y,1.4,'Distance','seuclidean','Scale',iqr(X))` specifies to find all the observations in `X` within distance `1.4` of each observation in `Y`, using a standardized Euclidean distance scaled by the interquartile range of `X`.

Nearest neighbor search method, specified as the comma-separated pair consisting of `'NSMethod'` and one of these values.

ValueDescription
`'kdtree'`

Create and use a Kd-tree to find nearest neighbors. `'kdtree'` is valid only when the distance metric is one of these options:

• `'chebychev'`

• `'cityblock'`

• `'euclidean'`

• `'minkowski'`

`'exhaustive'`Use the exhaustive search algorithm. The software computes the distances from all `X` points to each `Y` point to find nearest neighbors.

`'kdtree'` is the default value when the number of columns in `X` is less than or equal to `10`, `X` is not sparse, and the distance metric is one of the valid `'kdtree'` metrics. Otherwise, the default value is `'exhaustive'`.

Example: `'NSMethod','exhaustive'`

Distance metric that `rangesearch` uses, specified as the comma-separated pair consisting of `'Distance'` and one of the values in this table.

ValueDescription
`'euclidean'`Euclidean distance.
`'seuclidean'`Standardized Euclidean distance. Each coordinate difference between a row in `X` and a query point is scaled by dividing by the corresponding element of the standard deviation computed from `X`, `std(X,'omitnan')`. To specify another scaling, use the `'Scale'` name-value pair argument.
`'mahalanobis'`Mahalanobis distance, computed using a positive definite covariance matrix `C`. The default value of `C` is the sample covariance matrix of `X`, as computed by `cov(X,'omitrows')`. To specify a different value for `C`, use the `'Cov'` name-value pair argument.
`'cityblock'`City block distance.
`'minkowski'`Minkowski distance. The default exponent is `2`. To specify a different exponent, use the `'P'` name-value pair argument.
`'chebychev'`Chebychev distance (maximum coordinate difference).
`'cosine'`One minus the cosine of the included angle between observations (treated as vectors).
`'correlation'`One minus the sample linear correlation between observations (treated as sequences of values).
`'hamming'`Hamming distance, the percentage of coordinates that differ.
`'jaccard'`One minus the Jaccard coefficient, the percentage of nonzero coordinates that differ.
`'spearman'`One minus the sample Spearman's rank correlation between observations (treated as sequences of values).
`@distfun`

Custom distance function handle. A distance function has the form

```function D2 = distfun(ZI,ZJ) % calculation of distance ...```
where

• `ZI` is a 1-by-n vector containing one row of `X` or `Y`.

• `ZJ` is an m-by-n matrix containing multiple rows of `X` or `Y`.

• `D2` is an m-by-1 vector of distances, and `D2(j)` is the distance between the observations `ZI` and `ZJ(j,:)`.

Example: `'Distance','minkowski'`

Exponent for the Minkowski distance metric, specified as the comma-separated pair consisting of `'P'` and a positive scalar. This argument is valid only if `'Distance'` is `'minkowski'`.

Example: `'P',4`

Data Types: `single` | `double`

Covariance matrix for the Mahalanobis distance metric, specified as the comma-separated pair consisting of `'Cov'` and a positive definite matrix. This argument is valid only when `'Distance'` is `'mahalanobis'`.

Example: `'Cov',eye(4)`

Data Types: `single` | `double`

Scale parameter value for the standardized Euclidean distance metric, specified as the comma-separated pair consisting of `'Scale'` and a nonnegative vector. `Scale` has length equal to the number of columns in `X`. Each coordinate difference between a row in `X` and a query point is scaled by the corresponding element of `Scale`. This argument is valid only when `'Distance'` is `'seuclidean'`.

Example: `'Scale',iqr(X)`

Data Types: `single` | `double`

Maximum number of data points in the leaf node of the Kd-tree, specified as the comma-separated pair consisting of `'BucketSize'` and a positive integer scalar. This argument is valid only when `NSMethod` is `'kdtree'`.

Example: `'BucketSize',20`

Data Types: `single` | `double`

Flag to sort returned indices according to distance, specified as the comma-separated pair consisting of `'SortIndices'` and either `true` (`1`) or `false` (`0`).

For faster performance when `Y` contains many observations that have many nearest points in `X`, you can set `SortIndices` to `false`. In this case, `rangesearch` returns the indices of the nearest points in no particular order. When `SortIndices` is `true`, the function arranges the indices of the nearest points in ascending order by distance.

Example: `'SortIndices',false`

Data Types: `logical`

## Output Arguments

collapse all

Indices of nearest points, returned as a cell array of numeric vectors.

`Idx` is an my-by-`1` cell array, where my is the number of rows in `Y`. The vector `Idx{j}` contains the indices of points (rows) in `X` whose distances to `Y(j,:)` are not greater than `r`. If `SortIndices` is `true`, then `rangesearch` arranges the indices in ascending order by distance.

Distances of the nearest points to the query points, returned as a cell array of numeric vectors.

`D` is an my-by-`1` cell array, where my is the number of rows in `Y`. `D{j}` contains the distance values between `Y(j,:)` and the points (rows) in `X(Idx{j},:)`. If `SortIndices` is `true`, then `rangesearch` arranges the distances in ascending order.

## Alternative Functionality

If you set the `rangesearch` function `'NSMethod'` name-value pair argument to the appropriate value (`'exhaustive'` for an exhaustive search algorithm or `'kdtree'` for a Kd-tree algorithm), then the search results are equivalent to the results obtained by conducting a distance search using the `rangesearch` object function. Unlike the `rangesearch` function, the `rangesearch` object function requires an `ExhaustiveSearcher` or `KDTreeSearcher` model object.

## Version History

Introduced in R2011b