Belief Network = DAG + CPTs, but CPTs are not always known. If CPTs cannot be given by the experts, we may want to learn them from examples.
Maximum Likelihood Estimation (MLE)
- Simplest form of learning in BNs
- Choose (estimate) model (DAG + CPTs) to maximize P(observed data|DAG+CPTs)
Case I
Fixed DAG, Discrete nodes, Lookup CPTs, Complete data
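For this case the maximum-likelihood estimate has the standard closed form of relative frequencies. The notation below is introduced here for illustration: count(·) is the number of examples matching a configuration, pa_i is the set of parents of node X_i, and T is the total number of examples.

```latex
P_{\mathrm{ML}}(X_i = x \mid \mathrm{pa}_i = \pi)
  = \frac{\mathrm{count}(X_i = x,\ \mathrm{pa}_i = \pi)}
         {\mathrm{count}(\mathrm{pa}_i = \pi)},
\qquad
P_{\mathrm{ML}}(X_i = x)
  = \frac{\mathrm{count}(X_i = x)}{T}
  \quad \text{(root nodes, no parents)}
```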



Properties
- Asymptotically correct (estimates converge to the true CPTs as the number of examples grows, provided the DAG is right)
- Problematic in the non-asymptotic regime (sparse data)
- Estimate is 0 if the count of the (node, parents) pair is 0
- Estimate is undefined (0/0) if the count of the parent configuration is 0
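The two sparse-data failure modes are easy to see in code. This is a minimal sketch, assuming complete data stored as a list of dicts; the function name `mle_cpt` and the toy Rain/WetGrass network are hypothetical and used only for illustration.

```python
from collections import Counter

def mle_cpt(samples, child, parents):
    """Estimate P(child | parents) from complete data by relative counts.

    samples: list of dicts mapping variable name -> discrete value.
    Returns a function cpt(x, pa) giving count(x, pa) / count(pa),
    or None when count(pa) == 0 (the estimate is undefined).
    """
    pair_counts = Counter()
    parent_counts = Counter()
    for s in samples:
        pa = tuple(s[p] for p in parents)
        pair_counts[(s[child], pa)] += 1
        parent_counts[pa] += 1

    def cpt(x, pa):
        pa = tuple(pa)
        if parent_counts[pa] == 0:
            return None  # parent configuration never observed: 0/0
        return pair_counts[(x, pa)] / parent_counts[pa]

    return cpt

# Hypothetical toy data for a two-node network Rain -> WetGrass.
data = [
    {"Rain": 1, "WetGrass": 1},
    {"Rain": 1, "WetGrass": 1},
    {"Rain": 1, "WetGrass": 1},
]
cpt = mle_cpt(data, "WetGrass", ["Rain"])
p_wet_given_rain = cpt(1, (1,))      # 3/3
p_dry_given_rain = cpt(0, (1,))      # 0/3: pair never observed
p_wet_given_no_rain = cpt(1, (0,))   # 0/0: parent value never observed
```

With only three examples, the model declares dry grass after rain impossible and has no estimate at all for the no-rain case.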
Case II
Fixed DAG, Parametric CPTs, Complete data
II-A Gaussian CPT (Linear Regression)
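A sketch of the standard setup, assuming a scalar child y with a d-dimensional real-valued parent vector x, examples indexed t = 1..T, and CPT parameters w and σ² (this notation is introduced here for illustration). Maximizing the log-likelihood over w is equivalent to least squares, which leads to the normal equations:

```latex
P(y \mid \mathbf{x}) = \frac{1}{\sqrt{2\pi\sigma^2}}
  \exp\!\left( -\frac{(y - \mathbf{w}^\top \mathbf{x})^2}{2\sigma^2} \right)

\mathcal{L}(\mathbf{w}, \sigma^2)
  = -\frac{T}{2}\log(2\pi\sigma^2)
    - \frac{1}{2\sigma^2} \sum_{t=1}^{T} \left( y_t - \mathbf{w}^\top \mathbf{x}_t \right)^2

\frac{\partial \mathcal{L}}{\partial \mathbf{w}} = 0
  \;\Longrightarrow\;
  \left( \sum_{t=1}^{T} \mathbf{x}_t \mathbf{x}_t^\top \right) \mathbf{w}
  = \sum_{t=1}^{T} y_t\, \mathbf{x}_t
```

The solution is unique only when the d × d matrix on the left is invertible.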


Ill-posed when:
- Input dimensionality exceeds number of examples, d > T
- Inputs are not in general position (they are linearly dependent)
- In either case, one option is to take the minimum-norm solution
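The minimum-norm option can be sketched with NumPy's pseudoinverse. The example assumes a Gaussian CPT P(y | x) = N(w·x, σ²) with a scalar child y and a d-dimensional parent vector x; the function name `fit_gaussian_cpt` and the synthetic data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_gaussian_cpt(X, y):
    """Return (w, sigma2): ML weights and ML noise variance.

    np.linalg.pinv gives a least-squares solution and, when X^T X is
    singular, the one with the smallest Euclidean norm.
    """
    w = np.linalg.pinv(X) @ y
    residuals = y - X @ w
    sigma2 = float(np.mean(residuals ** 2))
    return w, sigma2

# Well-posed case: T = 200 examples, d = 3 inputs.
w_true = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(200, 3))
y = X @ w_true + 0.1 * rng.normal(size=200)
w_hat, sigma2_hat = fit_gaussian_cpt(X, y)

# Ill-posed case: d = 5 > T = 2, so infinitely many w fit the data exactly.
X_small = rng.normal(size=(2, 5))
y_small = rng.normal(size=2)
w_min, _ = fit_gaussian_cpt(X_small, y_small)
```

In the ill-posed case `w_min` reproduces the two training targets exactly, so the ML noise variance collapses to zero, which is another symptom of having too little data.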
II-B Sigmoid CPT (Logistic Regression)
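Unlike the Gaussian case, the sigmoid CPT P(Y = 1 | x) = σ(w·x) has no closed-form ML solution, so the log-likelihood is usually maximized iteratively. The following is a minimal sketch using plain gradient ascent, which is one possible choice (Newton's method is another). The function names, learning rate, step count, and synthetic data are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood(w, X, y):
    """Sum over examples of y*log(p) + (1 - y)*log(1 - p), p = sigmoid(w . x)."""
    p = sigmoid(X @ w)
    eps = 1e-12  # guards against log(0)
    return float(np.sum(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))

def fit_sigmoid_cpt(X, y, lr=0.5, steps=2000):
    """Gradient ascent on the log-likelihood averaged over examples."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        # Gradient of the average log-likelihood: mean of (y_t - p_t) * x_t.
        grad = X.T @ (y - sigmoid(X @ w)) / len(y)
        w += lr * grad
    return w

# Hypothetical synthetic data: binary child, two real-valued parents.
rng = np.random.default_rng(0)
w_true = np.array([2.0, -1.0])
X = rng.normal(size=(500, 2))
y = (rng.random(500) < sigmoid(X @ w_true)).astype(float)
w_hat = fit_sigmoid_cpt(X, y)
```

The log-likelihood is concave in w, so gradient ascent with a small enough step size approaches the global maximum whenever one exists (it does not exist if the classes are linearly separable).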

Case III
Fixed DAG, Discrete nodes, Lookup CPTs, Incomplete data
EM Algorithm (Next post)