
## Surrogate Optimization Algorithm

### Serial `surrogateopt` Algorithm

#### Serial `surrogateopt` Algorithm Overview

The surrogate optimization algorithm alternates between two phases.

• Construct Surrogate — Create `options.MinSurrogatePoints` random points within the bounds. Evaluate the (expensive) objective function at these points. Construct a surrogate of the objective function by interpolating a radial basis function through these points.

• Search for Minimum — Search for a minimum of the objective function by sampling several thousand random points within the bounds. Evaluate a merit function based on the surrogate value at these points and on the distances between them and points where the (expensive) objective function has been evaluated. Choose the best point as a candidate, as measured by the merit function. Evaluate the objective function at the best candidate point. This point is called an adaptive point. Update the surrogate using this value and search again.

During the Construct Surrogate phase, the algorithm constructs sample points from a quasirandom sequence. Constructing an interpolating radial basis function takes at least `nvars` + 1 sample points, where `nvars` is the number of problem variables. The default value of `options.MinSurrogatePoints` is `2*nvars` or `20`, whichever is larger.

The algorithm stops the Search for Minimum phase when all the search points are too close (closer than the `MinSampleDistance` option) to points where the objective function was previously evaluated. See Search for Minimum Details. This switch out of the Search for Minimum phase is called a surrogate reset.

#### Definitions for Surrogate Optimization

The surrogate optimization algorithm description uses the following definitions.

• Current — The point where the objective function was evaluated most recently.

• Incumbent — The point with the smallest objective function value among all points evaluated since the most recent surrogate reset.

• Best — The point with the smallest objective function value among all points evaluated so far.

• Initial — The points, if any, that you pass to the solver in the `InitialPoints` option.

• Random points — Points in the Construct Surrogate phase where the solver evaluates the objective function. Generally, the solver takes these points from a quasirandom sequence, scaled and shifted to remain within the bounds. A quasirandom sequence is similar to a pseudorandom sequence such as `rand` returns, but is more evenly spaced. See https://en.wikipedia.org/wiki/Low-discrepancy_sequence. However, when the number of variables is above 500, the solver takes points from a Latin hypercube sequence. See https://en.wikipedia.org/wiki/Latin_hypercube_sampling.

• Adaptive points — Points in the Search for Minimum phase where the solver evaluates the objective function.

• Merit function — See Merit Function Definition.

• Evaluated points — All points at which the objective function value is known. These points include initial points, Construct Surrogate points, and Search for Minimum points at which the solver evaluates the objective function.

• Sample points — Pseudorandom points where the solver evaluates the merit function during the Search for Minimum phase. These points are not points at which the solver evaluates the objective function, except as described in Search for Minimum Details.

#### Construct Surrogate Details

To construct the surrogate, the algorithm chooses quasirandom points within the bounds. If you pass an initial set of points in the `InitialPoints` option, the algorithm uses those points and new quasirandom points (if necessary) to reach a total of `options.MinSurrogatePoints`. On subsequent Construct Surrogate phases, the algorithm uses `options.MinSurrogatePoints` quasirandom points. The algorithm evaluates the objective function at these points.
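As an illustration (not the solver's internal code), the point construction above can be sketched in Python. SciPy's Sobol and Latin hypercube generators stand in for the solver's internal quasirandom sequences; the 500-variable threshold follows the description above.

```python
# Illustrative sketch: draw Construct Surrogate points from a quasirandom
# sequence, scaled and shifted into the bounds.
import numpy as np
from scipy.stats import qmc

def construct_points(lb, ub, n_points, seed=0):
    nvars = len(lb)
    if nvars > 500:
        sampler = qmc.LatinHypercube(d=nvars, seed=seed)
    else:
        sampler = qmc.Sobol(d=nvars, scramble=True, seed=seed)
    unit = sampler.random(n_points)    # points in the unit cube [0, 1)^nvars
    return qmc.scale(unit, lb, ub)     # shift and scale into [lb, ub]

lb, ub = np.array([-5.0, 0.0]), np.array([5.0, 10.0])
pts = construct_points(lb, ub, n_points=16)
```

In a real run, `n_points` would be `options.MinSurrogatePoints`, minus the number of initial points in the first phase.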

The algorithm constructs a surrogate as an interpolation of the objective function by using a radial basis function (RBF) interpolator. RBF interpolation has several convenient properties that make it suitable for constructing a surrogate:

• An RBF interpolator is defined using the same formula in any number of dimensions and with any number of points.

• An RBF interpolator takes the prescribed values at the evaluated points.

• Evaluating an RBF interpolator takes little time.

• Adding a point to an existing interpolation takes relatively little time.

• Constructing an RBF interpolator involves solving an N-by-N linear system of equations, where N is the number of surrogate points. As Powell [1] showed, this system has a unique solution for many RBFs.

• `surrogateopt` uses a cubic RBF with a linear tail. This RBF minimizes a measure of bumpiness. See Gutmann [4].

The algorithm uses only initial points and random points in the first Construct Surrogate phase, and uses only random points in subsequent Construct Surrogate phases. In particular, the algorithm does not use any adaptive points from the Search for Minimum phase in this surrogate.
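The RBF construction can be made concrete with a short sketch. This is not the solver's implementation; it simply builds a cubic-RBF-plus-linear-tail interpolant by solving the square linear system described above, with the standard extra conditions that make the tail coefficients well defined.

```python
# Illustrative sketch: cubic RBF interpolant with a linear tail.
# phi(r) = r^3; the tail is c0 + c . x. The augmented system is
# [[Phi, P], [P^T, 0]] [lambda; c] = [f; 0], size N + nvars + 1.
import numpy as np

def fit_cubic_rbf(X, f):
    """X: (N, nvars) evaluated points; f: (N,) objective values."""
    N, nvars = X.shape
    r = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    Phi = r**3                                  # cubic radial basis matrix
    P = np.hstack([np.ones((N, 1)), X])         # linear tail basis [1, x]
    A = np.block([[Phi, P], [P.T, np.zeros((nvars + 1, nvars + 1))]])
    rhs = np.concatenate([f, np.zeros(nvars + 1)])
    coef = np.linalg.solve(A, rhs)
    lam, c = coef[:N], coef[N:]
    def s(x):                                   # evaluate the surrogate at x
        rx = np.linalg.norm(X - x, axis=1)
        return rx**3 @ lam + c[0] + x @ c[1:]
    return s

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
f = np.array([1.0, 2.0, 0.5, 3.0, 1.5])
s = fit_cubic_rbf(X, f)
```

By construction, `s` reproduces the objective values at the evaluated points, which is the interpolation property listed above.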

#### Search for Minimum Details

The solver searches for a minimum of the objective function by following a procedure that is related to local search. The solver initializes a scale for the search with the value 0.2. The scale is like a search region radius or the mesh size in a pattern search. The solver starts from the incumbent point, which is the point with the smallest objective function value since the last surrogate reset. The solver searches for a minimum of a merit function that relates to both the surrogate and to a distance from existing search points, to try to balance minimizing the surrogate and searching the space. See Merit Function Definition.

The solver adds hundreds or thousands of pseudorandom vectors with scaled length to the incumbent point to obtain sample points. These vectors have normal distributions, shifted and scaled by the bounds in each dimension, and multiplied by the scale. If necessary, the solver alters the sample points so that they stay within the bounds. The solver evaluates the merit function at the sample points, but not at any point within `options.MinSampleDistance` of a previously evaluated point. The point with the lowest merit function value is called the adaptive point. The solver evaluates the objective function value at the adaptive point, and updates the surrogate with this value. If the objective function value at the adaptive point is sufficiently lower than the incumbent value, then the solver deems the search successful and sets the adaptive point as the incumbent. Otherwise, the solver deems the search unsuccessful and does not change the incumbent.
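The sample-point generation can be sketched as follows; the sample count and random generator here are illustrative choices, not the solver's internals.

```python
# Sketch of adaptive sampling around the incumbent: add normally
# distributed steps, scaled per dimension by the bound widths and by the
# current scale, then clip to stay within the bounds.
import numpy as np

def sample_points(incumbent, lb, ub, scale, n_samples, rng):
    steps = rng.standard_normal((n_samples, len(lb))) * scale * (ub - lb)
    return np.clip(incumbent + steps, lb, ub)

rng = np.random.default_rng(1)
lb, ub = np.array([-5.0, 0.0]), np.array([5.0, 10.0])
pts = sample_points(np.array([0.5, 2.0]), lb, ub,
                    scale=0.2, n_samples=1000, rng=rng)
```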

The solver changes the scale when the first of these conditions is met:

• Three successful searches occur since the last scale change. In this case, the scale is doubled, up to a maximum scale length of `0.8` times the size of the box specified by the bounds.

• `max(5,nvar)` unsuccessful searches occur since the last scale change, where `nvar` is the number of problem variables. In this case, the scale is halved, down to a minimum scale length of `1e-5` times the size of the box specified by the bounds.

In this way, the random search eventually concentrates near an incumbent point that has a small objective function value. Then the solver geometrically reduces the scale toward the minimum scale length.
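The scale-update rule above can be summarized in a small sketch. The success and failure counters are illustrative bookkeeping; the thresholds and limits come from the description, with the scale expressed as a fraction of the box size.

```python
# Sketch of the scale update: double after 3 successes (capped at 0.8),
# halve after max(5, nvar) failures (floored at 1e-5), reset counters
# whenever the scale changes.
def update_scale(scale, n_success, n_failure, nvar):
    if n_success >= 3:
        return min(2 * scale, 0.8), 0, 0
    if n_failure >= max(5, nvar):
        return max(scale / 2, 1e-5), 0, 0
    return scale, n_success, n_failure
```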

The solver does not evaluate the merit function at points within `options.MinSampleDistance` of an evaluated point (see Definitions for Surrogate Optimization). The solver switches from the Search for Minimum phase to a Construct Surrogate phase (in other words, performs a surrogate reset) when all sample points are within `MinSampleDistance` of evaluated points. Generally, this reset occurs after the solver reduces the scale so that all sample points are tightly clustered around the incumbent.

#### Merit Function Definition

The merit function fmerit(x) is a weighted combination of two terms:

• Scaled surrogate. Define smin as the minimum surrogate value among the sample points, smax as the maximum, and s(x) as the surrogate value at the point x. Then the scaled surrogate S(x) is

$$S(x)=\frac{s(x)-s_{\min}}{s_{\max}-s_{\min}}.$$

S(x) is nonnegative and is zero at points x that have the minimal surrogate value among the sample points.

• Scaled distance. Define xj, j = 1,…,k as the k evaluated points. Define dij as the distance from sample point i to evaluated point xj. Set dmin = min(dij) and dmax = max(dij), where the minimum and maximum are taken over all i and j. The scaled distance D(x) is

$$D(x)=\frac{d_{\max}-d(x)}{d_{\max}-d_{\min}},$$

where d(x) is the minimum distance of the point x to an evaluated point. D(x) is nonnegative and is zero at points x that are maximally far from evaluated points. So, minimizing D(x) leads the algorithm to points that are far from evaluated points.

The merit function is a convex combination of the scaled surrogate and scaled distance. For a weight w with 0 < w < 1, the merit function is

$$f_{\text{merit}}(x)=wS(x)+(1-w)D(x).$$

A large value of w gives importance to the surrogate values, causing the search to minimize the surrogate. A small value of w gives importance to points that are far from evaluated points, leading the search to new regions. During the Search for Minimum phase, the weight w cycles through these four values, as suggested by Regis and Shoemaker [2]: 0.3, 0.5, 0.7, and 0.95.
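Putting the definitions together, the merit computation can be sketched as follows; the surrogate here is a placeholder callable, and the small epsilon guards against division by zero when all values coincide.

```python
# Sketch of the merit function: scaled surrogate value plus scaled
# distance to the nearest evaluated point, combined with weight w.
import numpy as np

def merit_values(samples, evaluated, surrogate, w):
    s = np.array([surrogate(x) for x in samples])
    S = (s - s.min()) / max(s.max() - s.min(), np.finfo(float).eps)
    # d(x): distance from each sample point to its nearest evaluated point
    d = np.min(np.linalg.norm(samples[:, None, :] - evaluated[None, :, :],
                              axis=2), axis=1)
    D = (d.max() - d) / max(d.max() - d.min(), np.finfo(float).eps)
    return w * S + (1 - w) * D

samples = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.5]])
evaluated = np.array([[0.1, 0.0], [1.9, 0.4]])
vals = merit_values(samples, evaluated, surrogate=lambda x: x @ x, w=0.5)
```

The sample point with the smallest merit value would become the adaptive point.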

### Parallel `surrogateopt` Algorithm

The parallel `surrogateopt` algorithm differs from the serial algorithm as follows:

• The parallel algorithm maintains a queue of points on which to evaluate the objective function. This queue is 30% larger than the number of parallel workers, rounded up. The queue is larger than the number of workers to minimize the chance that a worker is idle because no point is available to evaluate.

• The scheduler takes points from the queue in a FIFO fashion and assigns them to workers as they become idle, asynchronously.

• When the algorithm switches between phases (Search for Minimum and Construct Surrogate), the existing points being evaluated remain in service, and any other points in the queue are flushed (discarded from the queue). So, generally, the number of random points that the algorithm creates for the Construct Surrogate phase is at most `options.MinSurrogatePoints + PoolSize`, where `PoolSize` is the number of parallel workers. Similarly, after a surrogate reset, the algorithm still has `PoolSize - 1` adaptive points that its workers are evaluating.

### Note

Currently, parallel surrogate optimization does not necessarily give reproducible results, due to the nonreproducibility of parallel timing, which can lead to different execution paths.

## References

[1] Powell, M. J. D. The Theory of Radial Basis Function Approximation in 1990. In Light, W. A. (editor), Advances in Numerical Analysis, Volume 2: Wavelets, Subdivision Algorithms, and Radial Basis Functions. Clarendon Press, 1992, pp. 105–210.

[2] Regis, R. G., and C. A. Shoemaker. A Stochastic Radial Basis Function Method for the Global Optimization of Expensive Functions. INFORMS J. Computing 19, 2007, pp. 497–509.

[3] Wang, Y., and C. A. Shoemaker. A General Stochastic Algorithm Framework for Minimizing Expensive Black Box Objective Functions Based on Surrogate Models and Sensitivity Analysis. arXiv:1410.6271v1 (2014). Available at https://arxiv.org/pdf/1410.6271.

[4] Gutmann, H.-M. A Radial Basis Function Method for Global Optimization. Journal of Global Optimization 19, March 2001, pp. 201–227.