EM Algorithm
- For incomplete data case of Maximum Likelihood
- Auxiliary function
- No learning rate
- Monotonic convergence (each iteration improves L)
- Comparison with other numerical methods
- Gradient Descent: Asymptotic but not monotonic convergence
- Newton’s Method: Monotonic convergence, but highly unstable
Problem

Insight
We can find an auxiliary function that guarantees monotonic convergence


Derivation


Algorithm
- Initialize CPT to any value (non-zero)
- Repeat following until convergence
