The EM Algorithm and its Extensions
Since it is simple and stable,
the EM algorithm
(Dempster, Laird, and Rubin, 1977, JRSS-B)
has been widely used to fit models from
incomplete data.
Our current research program
in this area includes the following.
- Acceleration
- The PX-EM
algorithm (Chuanhai Liu,
Donald B. Rubin, and
Ying Nian Wu, 1998)
shares the simplicity and stability of ordinary EM but
is often much faster. The intuitive idea is to use a covariance
adjustment to correct the M step, capitalizing on
extra information captured in the imputed complete data. This
is accomplished by parameter expansion; we expand
the complete-data model while
preserving the observed-data model and use the expanded
complete-data model in the EM algorithm.
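As an illustration only, here is a minimal sketch for a case often used to explain parameter expansion: fitting the location and scale of a univariate t distribution with known degrees of freedom. The function name `fit_t`, the data, the starting values, and the stopping rule are all assumptions made for this example and are not taken from the paper. In this sketch the only difference between the two variants is the divisor in the scale update: ordinary EM divides by the sample size, while the expanded version divides by the sum of the imputed weights.

```python
def fit_t(y, nu, px=False, tol=1e-10, max_iter=10000):
    """Estimate (mu, sigma2) of a univariate t with known nu by EM or PX-EM.

    Hypothetical helper for illustration; returns (mu, sigma2, iterations).
    """
    n = len(y)
    # Starting values: sample mean and (biased) sample variance.
    mu = sum(y) / n
    sigma2 = sum((v - mu) ** 2 for v in y) / n
    it = 0
    for it in range(1, max_iter + 1):
        # E step: expected latent scale weights given current parameters.
        w = [(nu + 1.0) / (nu + (v - mu) ** 2 / sigma2) for v in y]
        sw = sum(w)
        # M step: weighted mean, then weighted sum of squares.
        mu_new = sum(wi * v for wi, v in zip(w, y)) / sw
        ss = sum(wi * (v - mu_new) ** 2 for wi, v in zip(w, y))
        # Ordinary EM divides by n; the expanded version divides by sum(w),
        # which is the covariance-style adjustment of the M step.
        sigma2_new = ss / (sw if px else n)
        change = abs(mu_new - mu) + abs(sigma2_new - sigma2)
        mu, sigma2 = mu_new, sigma2_new
        if change < tol:
            break
    return mu, sigma2, it


# Made-up data with one outlier, purely for demonstration.
y = [-1.2, -0.5, -0.1, 0.0, 0.3, 0.4, 0.9, 1.1, 1.6, 8.0]
em_fit = fit_t(y, nu=4.0, px=False)
px_fit = fit_t(y, nu=4.0, px=True)
print("EM    :", em_fit)
print("PX-EM :", px_fit)
```

Both variants share the same fixed point, at which the weights sum to the sample size, so the two divisors coincide at convergence; comparing the iteration counts in the returned tuples gives a rough sense of the speed difference on a given data set.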
- Supplementation
- Computing the information matrix
from conditional information via normal approximation (Liu, 1998).
The basic idea is to approximate the likelihood function by a normal density
when maximum likelihood estimates are assumed to be approximately normally
distributed. The method uses two facts: the information
for a one-dimensional parameter can be computed when the loglikelihood
is approximately quadratic over a range that corresponds to
a small positive confidence interval; and the covariance matrix of
a normal distribution can be obtained from a set of one-dimensional
conditional distributions whose sample spaces
span the sample space of the joint distribution.
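The two facts above can be sketched numerically. The code below is a simplified illustration, not the procedure of Liu (1998): it assumes the loglikelihood can be evaluated directly and uses finite differences, and the function names, step size, and example loglikelihood are hypothetical. Under the normal approximation, the loglikelihood along a direction d through the maximum drops by about t^2 (d'Id)/2 at distance t, which gives the one-dimensional information d'Id. Using the coordinate directions and their pairwise sums, which together span the parameter space, recovers every entry of the information matrix I.

```python
def directional_information(loglik, theta_hat, d, step=1e-3):
    """Approximate d' I d from the drop in loglikelihood along direction d."""
    l0 = loglik(theta_hat)
    plus = loglik([t + step * di for t, di in zip(theta_hat, d)])
    minus = loglik([t - step * di for t, di in zip(theta_hat, d)])
    # Quadratic approximation: l0 - l(theta_hat +/- step*d) ~ step^2 * d'Id / 2,
    # so summing both sides gives step^2 * d'Id.
    return (2.0 * l0 - plus - minus) / step ** 2


def information_matrix(loglik, theta_hat, step=1e-3):
    """Assemble the full information matrix from one-dimensional pieces."""
    p = len(theta_hat)

    def unit(i):
        return [1.0 if k == i else 0.0 for k in range(p)]

    info = [[0.0] * p for _ in range(p)]
    # Diagonal entries come from the coordinate directions.
    for i in range(p):
        info[i][i] = directional_information(loglik, theta_hat, unit(i), step)
    # Off-diagonal entries come from the directions e_i + e_j, since
    # (e_i + e_j)' I (e_i + e_j) = I_ii + I_jj + 2 I_ij.
    for i in range(p):
        for j in range(i + 1, p):
            d = [a + b for a, b in zip(unit(i), unit(j))]
            dij = directional_information(loglik, theta_hat, d, step)
            info[i][j] = info[j][i] = 0.5 * (dij - info[i][i] - info[j][j])
    return info


# Hypothetical example: an exactly quadratic loglikelihood with known information.
TRUE_INFO = [[2.0, 0.5], [0.5, 1.0]]
THETA_HAT = [1.0, -2.0]


def example_loglik(theta):
    r = [t - m for t, m in zip(theta, THETA_HAT)]
    return -0.5 * sum(r[i] * TRUE_INFO[i][j] * r[j]
                      for i in range(2) for j in range(2))


estimated_info = information_matrix(example_loglik, THETA_HAT)
```

Inverting the assembled matrix gives the approximate covariance matrix of the estimates. The step size controls where the quadratic approximation is checked, so in practice it would be chosen on the scale of a confidence interval for the parameter rather than fixed in advance.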
- Application
- EM can be used for maximum likelihood estimation of
many models, such as multivariate normal,
multivariate t,
mixed-effects,
general location,
factor analysis, and
mixture
models. For example,
the EM algorithm has been used in understanding and
modeling the relationships among questions/attributes at the company level in
Customer Value Analysis (CVA).
William S. Cleveland and Chuanhai Liu are working on a generalized version
of the time series model that has been used as a component in modeling CVA.
We implemented the EM algorithm for maximum likelihood estimation
of this class of time series models.
- EM also serves as a supplementary tool for
Markov chain Monte Carlo (MCMC)
methods for
Bayesian computation.
The PostScript files of some of
our current papers are also available.