A belief network is a DAG plus CPTs (conditional probability tables), but the CPTs are not always known. If experts cannot supply them, we may want to learn them from examples.

#### Maximum Likelihood Estimation (MLE)

- Simplest form of learning in BNs
- Choose (estimate) the model (DAG + CPTs) that maximizes P(observed data | DAG + CPTs)

#### Case I

Fixed DAG, Discrete nodes, Lookup CPTs, Complete data
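
For reference, the standard maximum likelihood result for this case can be written as below. The notation is an assumption of this sketch, not taken from the notes above: $T$ complete examples, $n$ nodes, $x_i^{(t)}$ the value of node $X_i$ in example $t$, and $\pi$ a joint configuration of its parents $\mathrm{pa}_i$. For a node without parents, the denominator is simply $T$.

```latex
\mathcal{L} = \sum_{t=1}^{T} \sum_{i=1}^{n} \log P\left(x_i^{(t)} \mid \mathrm{pa}_i^{(t)}\right),
\qquad
P_{\mathrm{ML}}(X_i = x \mid \mathrm{pa}_i = \pi)
  = \frac{\mathrm{count}(X_i = x,\ \mathrm{pa}_i = \pi)}{\mathrm{count}(\mathrm{pa}_i = \pi)}
```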

#### Properties

- Asymptotically correct
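- Problematic in the non-asymptotic regime (sparse data):
  - The estimate is 0 if the count of a (child value, parent configuration) pair is 0
  - The estimate is undefined (0/0) if the count of a parent configuration is 0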
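
The two sparse-data failure modes can be seen in a minimal counting sketch. The network (Rain and Sprinkler as parents of WetGrass), the toy data, and the helper names `count_tables` and `ml_estimate` are all hypothetical, chosen only for illustration.

```python
from collections import Counter

def count_tables(samples, child, parents):
    """Count (parent configuration, child value) pairs and parent configurations."""
    pair_counts, parent_counts = Counter(), Counter()
    for s in samples:
        pa = tuple(s[p] for p in parents)
        pair_counts[(pa, s[child])] += 1
        parent_counts[pa] += 1
    return pair_counts, parent_counts

def ml_estimate(pair_counts, parent_counts, pa, x):
    """Relative-frequency estimate of P(child = x | parents = pa).

    Returns None when the parent configuration never occurs (0/0).
    """
    if parent_counts[pa] == 0:
        return None
    return pair_counts[(pa, x)] / parent_counts[pa]

# Hypothetical complete data: every node is observed in every example.
data = [
    {"Rain": 1, "Sprinkler": 0, "WetGrass": 1},
    {"Rain": 1, "Sprinkler": 0, "WetGrass": 1},
    {"Rain": 0, "Sprinkler": 0, "WetGrass": 0},
    {"Rain": 0, "Sprinkler": 1, "WetGrass": 1},
]
pairs, pas = count_tables(data, child="WetGrass", parents=["Rain", "Sprinkler"])

p_seen = ml_estimate(pairs, pas, (1, 0), 1)       # pair observed twice out of two
p_zero = ml_estimate(pairs, pas, (1, 0), 0)       # pair never observed: estimate is 0
p_undefined = ml_estimate(pairs, pas, (1, 1), 1)  # parent configuration never observed
```

Here `p_zero` rules out an event only because it did not appear in four examples, and `p_undefined` has no estimate at all.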

#### Case II

Fixed DAG, Parametric CPTs, Complete data

II-A Gaussian CPT (Linear Regression)

Ill-posed (the least-squares solution is not unique) when:

- Input dimensionality exceeds the number of examples, d > T
- Inputs are not in general position

In these cases, one option is to take the minimum-norm solution.
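
A minimal NumPy sketch of the minimum-norm option, assuming the Gaussian CPT has mean `w @ x` so that maximum likelihood for `w` reduces to least squares. The data are random and purely illustrative; `T`, `d`, `X`, `y`, and the other variable names are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 3, 5                      # fewer examples than input dimensions (d > T)
X = rng.normal(size=(T, d))      # each row is one input (parent) vector
y = rng.normal(size=T)           # real-valued child node

# Among all weight vectors that fit the data exactly, the pseudoinverse
# selects the one with the smallest Euclidean norm.
w_min = np.linalg.pinv(X) @ y

# Adding any null-space direction of X leaves the fit unchanged,
# which is why the problem is ill-posed without an extra criterion.
null_dir = np.linalg.svd(X)[2][-1]   # last right singular vector, X @ null_dir ~ 0
w_other = w_min + null_dir
```

Both `w_min` and `w_other` reproduce `y` exactly, but `w_min` has the smaller norm.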

II-B Sigmoid CPT (Logistic Regression)
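
A sigmoid CPT models P(Y = 1 | parents = x) as sigmoid(w · x). Unlike the Gaussian case there is no closed-form maximum likelihood solution, so the weights are found iteratively. The sketch below uses plain gradient ascent on the log-likelihood; the step size, iteration count, bias column, and toy data are all assumptions made for illustration, and other optimizers (for example Newton's method) are equally valid.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood(w, X, y):
    """Log-likelihood of binary labels y under P(Y=1 | x) = sigmoid(w . x)."""
    p = sigmoid(X @ w)
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def fit_sigmoid_cpt(X, y, lr=0.1, steps=500):
    """Gradient ascent on the average log-likelihood, starting from w = 0."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = X.T @ (y - sigmoid(X @ w))   # gradient of the log-likelihood
        w += lr * grad / len(y)
    return w

# Hypothetical data: columns are [bias, parent 1, parent 2]; y is the child.
X = np.array([
    [1, 0, 0], [1, 0, 0], [1, 0, 0],
    [1, 0, 1], [1, 0, 1],
    [1, 1, 0], [1, 1, 0],
    [1, 1, 1], [1, 1, 1], [1, 1, 1],
], dtype=float)
y = np.array([0, 0, 1, 1, 0, 1, 0, 1, 1, 0], dtype=float)

w = fit_sigmoid_cpt(X, y)
p_both_off = sigmoid(np.array([1.0, 0.0, 0.0]) @ w)   # P(Y=1 | parents = 0, 0)
p_both_on = sigmoid(np.array([1.0, 1.0, 1.0]) @ w)    # P(Y=1 | parents = 1, 1)
```

Every parent configuration in the toy data appears with both child values, so the data are not linearly separable and the maximum likelihood weights stay finite.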

#### Case III

Fixed DAG, Discrete nodes, Lookup CPTs, Incomplete data

EM Algorithm (Next post)