# ecmninit

Initial mean and covariance

## Syntax

```
[Mean,Covariance] = ecmninit(Data,InitMethod)
```

## Arguments

| Argument | Description |
| --- | --- |
| `Data` | `NUMSAMPLES`-by-`NUMSERIES` matrix with `NUMSAMPLES` samples of a `NUMSERIES`-dimensional random vector. Missing values are indicated by `NaN`s. |
| `InitMethod` | (Optional) Character vector that identifies one of three defined initialization methods to compute initial estimates for the mean and covariance of the data. If `InitMethod` = `[]` or `''`, the default method `nanskip` is used. |

The initialization methods are:

| Method | Description |
| --- | --- |
| `nanskip` | (Default) Skip all records with `NaN`s. |
| `twostage` | Estimate the mean. Fill `NaN`s with the mean. Then estimate the covariance. |
| `diagonal` | Form a diagonal covariance. |

## Description

`[Mean,Covariance] = ecmninit(Data,InitMethod)` creates initial mean and covariance estimates for the function `ecmnmle`. `Mean` is a `NUMSERIES`-by-`1` column vector estimate for the mean of `Data`. `Covariance` is a `NUMSERIES`-by-`NUMSERIES` matrix estimate for the covariance of `Data`.
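As a minimal usage sketch, the following generates a small synthetic data set (the values and the positions of the `NaN`s are made up for illustration), computes initial estimates, and passes them to `ecmnmle`. It assumes Financial Toolbox is installed and that `ecmnmle` accepts its arguments in the order `Data, InitMethod, MaxIterations, Tolerance, Mean0, Covar0`; check the `ecmnmle` reference page for your release before relying on that order.

```matlab
rng(0);                                  % reproducible synthetic data
Data = randn(50, 3);                     % 50 samples of a 3-dimensional series
Data([3 17 40], 2) = NaN;                % illustrative missing values
Data([8 25], 3)    = NaN;

% Initial estimates; [] selects the default method, nanskip.
[Mean0, Covar0] = ecmninit(Data, []);

% The same call with an explicit method name.
[Mean1, Covar1] = ecmninit(Data, 'twostage');

% Pass the initial estimates to ecmnmle; [] keeps its defaults
% for the remaining optional arguments (assumed argument order).
[Mean, Covariance] = ecmnmle(Data, [], [], [], Mean0, Covar0);
```

Comparing `Covar0` with `Covar1` on your own data is a quick way to see how much the choice of initialization method changes the starting point.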

## Algorithms

### Model

The general model is

$$Z \sim N\left(\mathit{Mean},\ \mathit{Covariance}\right),$$

where each row of `Data` is an observation of Z.

The observations of Z are assumed to be iid (independent and identically distributed) multivariate normal, and missing values are assumed to be missing at random (MAR).
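To make the model concrete, the sketch below simulates data that satisfies these assumptions using only base MATLAB. The values of `mu` and `Sigma` and the 10% missing rate are hypothetical. Entries are deleted independently of their values, which is the simplest special case of MAR.

```matlab
rng(1);                                  % reproducible draws
numSamples = 200;
mu    = [1; -2; 0.5];                    % hypothetical mean (NUMSERIES-by-1)
Sigma = [1.0 0.3 0.2;
         0.3 2.0 0.5;
         0.2 0.5 1.5];                   % hypothetical covariance (positive definite)

% chol returns upper-triangular R with R'*R = Sigma, so each row of
% randn(...)*R has covariance Sigma.
R    = chol(Sigma);
Data = repmat(mu', numSamples, 1) + randn(numSamples, numel(mu)) * R;

% Delete about 10% of the entries, independently of their values.
missingMask = rand(size(Data)) < 0.10;
Data(missingMask) = NaN;
```

Each row of `Data` is then one observation of Z, with `NaN` marking the missing components.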

### Initialization Methods

This routine has three initialization methods that cover most cases, each with its advantages and disadvantages.

### nanskip

The `nanskip` method works well with small problems (fewer than 10 series, or monotone missing-data patterns). It skips any record that contains a `NaN` and estimates initial values from the complete-data records only. This initialization method tends to yield the fastest convergence of the ECM algorithm. The routine switches to the `twostage` method if it determines that a significant number of records contain `NaN`s.
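The complete-record idea can be sketched in a few lines of base MATLAB. This is an illustration of the technique, not the toolbox source: the data values are made up, the normalization used by `cov` (dividing by N-1) is an assumption, and the automatic switch to `twostage` is not reproduced.

```matlab
Data = [1  2;
        3  NaN;
        5  6;
        NaN 8;
        7  10];                          % illustrative data

completeRows = ~any(isnan(Data), 2);     % true for records with no NaN
Complete     = Data(completeRows, :);    % keep complete records only

Mean       = mean(Complete, 1)';         % NUMSERIES-by-1 column vector
Covariance = cov(Complete);              % assumption: N-1 normalization
```

Here only rows 1, 3, and 5 contribute to the estimates, so every observed value in a partially missing record is discarded.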

### twostage

The `twostage` method is the best choice for large problems (more than 10 series). It first estimates the mean of each series from all available data for that series. It then estimates the covariance matrix with each missing value replaced by the corresponding series mean rather than left as `NaN`. This initialization method is robust but tends to result in slower convergence of the ECM algorithm.
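The two stages can be sketched in base MATLAB as follows. As with any sketch here, this illustrates the described technique and is not the toolbox source; the data values are made up and the N-1 normalization of `cov` is an assumption.

```matlab
Data = [1  2;
        3  NaN;
        5  6;
        NaN 8;
        7  10];                          % illustrative data

% Stage 1: per-series mean over the observed entries only.
Mean = mean(Data, 1, 'omitnan')';        % NUMSERIES-by-1 column vector

% Stage 2: replace each NaN with the mean of its series,
% then estimate the covariance from the filled matrix.
Filled = Data;
for j = 1:size(Data, 2)
    Filled(isnan(Data(:, j)), j) = Mean(j);
end
Covariance = cov(Filled);                % assumption: N-1 normalization
```

Because a filled entry sits exactly at its series mean, it contributes zero deviation, so the resulting variances and covariances are pulled toward zero relative to the observed data.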

### diagonal

The `diagonal` method is a worst-case approach for problematic data, such as disjoint series or excessive missing data (more than 33% of values missing). Of the three initialization methods, this method causes the slowest convergence of the ECM algorithm.
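The sketch below shows why a diagonal covariance is a reasonable fallback for disjoint series. The text above says only that a diagonal covariance is formed; filling the diagonal with per-series sample variances over the observed entries is an assumption made for illustration, and the data values are made up.

```matlab
Data = [1   NaN;
        3   NaN;
        NaN 6;
        NaN 8;
        5   NaN];                        % disjoint series: no record is complete

fractionMissing = mean(isnan(Data(:)));         % share of missing entries
hasCompleteRow  = any(all(~isnan(Data), 2));    % false: nothing for nanskip to use

% Per-series statistics from whatever is observed in each column.
Mean       = mean(Data, 1, 'omitnan')';         % NUMSERIES-by-1 column vector
Variances  = var(Data, 0, 1, 'omitnan');        % assumption: N-1 normalization
Covariance = diag(Variances);                   % off-diagonal terms set to zero
```

With no record observing both series at once, the data carry no direct information about their covariance, so the off-diagonal entries start at zero and the ECM iterations must do all the work of estimating them.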