[Probability] Maximum Likelihood

Belief Network = DAG + CPTs, but CPTs are not always known. If CPTs cannot be given by the experts, we may want to learn them from examples.

Maximum Likelihood Estimation (MLE)

  • Simplest form of learning in BNs
  • Choose (estimate) model (DAG + CPTs) to maximize P(observed data|DAG+CPTs)

Case I

Fixed DAG, Discrete nodes, Lookup CPTs, Complete data
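
As a worked sketch of what maximizing the likelihood means in this case: with a fixed DAG and complete data, the log-likelihood splits into one term per node and example, and maximizing each CPT row subject to normalization gives a ratio of counts. The notation here is an assumption for illustration: T examples, n nodes, x_i^{(t)} the value of node i in example t, and pa_i its parents.

```latex
\mathcal{L}
  = \sum_{t=1}^{T} \log P\big(x_1^{(t)}, \dots, x_n^{(t)}\big)
  = \sum_{t=1}^{T} \sum_{i=1}^{n} \log P\big(x_i^{(t)} \mid \mathrm{pa}_i^{(t)}\big)

P_{\mathrm{ML}}(X_i = x \mid \mathrm{pa}_i = \pi)
  = \frac{\mathrm{count}(X_i = x,\ \mathrm{pa}_i = \pi)}{\mathrm{count}(\mathrm{pa}_i = \pi)}
```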

  • Asymptotically correct
  • Problematic in the non-asymptotic regime (sparse data)
    • Estimate is 0 if the count of the (child, parent) pair is 0
    • Estimate is undefined (0/0) if the count of the parent configuration is 0
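
The two sparse-data failure modes can be seen in a minimal sketch of the count-based estimate, assuming the standard ML result that each CPT entry is count(child = x, parents = π) / count(parents = π). The function name `estimate_cpt`, the node names, and the toy data are hypothetical, chosen only for illustration.

```python
from collections import Counter


def estimate_cpt(samples, child, parents, child_values, parent_configs):
    """ML estimate of P(child | parents) from complete data via count ratios.

    samples: list of dicts mapping node name -> observed value.
    Returns {(parent_config, child_value): probability, or None if undefined}.
    """
    pair_counts = Counter()    # count(child = x, parents = pi)
    parent_counts = Counter()  # count(parents = pi)
    for s in samples:
        pa = tuple(s[p] for p in parents)
        pair_counts[(pa, s[child])] += 1
        parent_counts[pa] += 1

    cpt = {}
    for pa in parent_configs:
        for x in child_values:
            if parent_counts[pa] == 0:
                # Parent configuration never observed: the ratio is 0/0.
                cpt[(pa, x)] = None
            else:
                cpt[(pa, x)] = pair_counts[(pa, x)] / parent_counts[pa]
    return cpt


# Toy data set (hypothetical): Rain = 0 is never observed,
# and Wet = 0 never occurs together with Rain = 1.
data = [
    {"Rain": 1, "Wet": 1},
    {"Rain": 1, "Wet": 1},
    {"Rain": 1, "Wet": 1},
]
cpt = estimate_cpt(data, "Wet", ["Rain"], [0, 1], [(0,), (1,)])
print(cpt)
```

Here the entry for (Rain = 1, Wet = 0) is estimated as exactly 0 because that pair never appears, and both entries for Rain = 0 are left as None because that parent configuration never appears.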

Case II

Fixed DAG, Parametric CPTs, Complete data

II-A Gaussian CPT (Linear Regression)

Ill-posed when:

  • Input dimensionality exceeds number of examples, d > T
  • Inputs are not in general position (e.g. linearly dependent)

In either case the ML solution is not unique; one option is to take the minimum-norm solution.
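
A minimal sketch of the minimum-norm option, assuming a linear-Gaussian CPT P(y | x) = N(w·x, σ²) so that maximizing the likelihood over w is ordinary least squares. The function name `fit_linear_gaussian` and the toy data are hypothetical. The Moore-Penrose pseudoinverse (`np.linalg.pinv`) returns the least-squares solution of smallest norm, so it still gives an answer when d > T.

```python
import numpy as np


def fit_linear_gaussian(X, y):
    """Fit P(y | x) = N(w . x, sigma2) by maximum likelihood.

    X: (T, d) array of inputs, y: (T,) array of targets.
    Uses the pseudoinverse, which yields the minimum-norm least-squares
    weights when the problem is ill-posed (d > T or dependent inputs).
    """
    w = np.linalg.pinv(X) @ y
    residuals = y - X @ w
    sigma2 = float(np.mean(residuals ** 2))  # ML variance: mean squared residual
    return w, sigma2


# Ill-posed toy problem (hypothetical): d = 3 inputs but only T = 2 examples.
X = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
y = np.array([2.0, 3.0])
w, sigma2 = fit_linear_gaussian(X, y)
print(w, sigma2)
```

In this example the third weight is unconstrained by the data, and the minimum-norm choice sets it to zero. Note that the fit is exact, so the ML variance collapses to zero, which is another symptom of having too few examples.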

II-B Sigmoid CPT (Logistic Regression)
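
A minimal sketch of fitting a sigmoid CPT, assuming a binary child with P(Y = 1 | x) = σ(w·x). Unlike the Gaussian case there is no closed-form maximizer, so this sketch uses plain gradient ascent on the log-likelihood; the function names, step size, iteration count, and toy data are all hypothetical choices for illustration, not a prescribed method.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def log_likelihood(w, X, y):
    """Log-likelihood of binary labels y under P(Y = 1 | x) = sigmoid(w . x)."""
    p = sigmoid(X @ w)
    return float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))


def fit_sigmoid_cpt(X, y, lr=0.1, steps=2000):
    """Gradient ascent on the log-likelihood, starting from w = 0."""
    T, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        p = sigmoid(X @ w)
        grad = X.T @ (y - p)  # gradient of the log-likelihood w.r.t. w
        w += lr * grad / T
    return w


# Toy data (hypothetical): first column is a constant bias input.
# The two examples at x = 1.5 carry different labels, so the data are
# not linearly separable and the ML weights stay finite.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0],
              [1.0, 3.0], [1.0, 1.5], [1.0, 1.5]])
y = np.array([0.0, 0.0, 1.0, 1.0, 0.0, 1.0])

w = fit_sigmoid_cpt(X, y)
print(w, log_likelihood(w, X, y))
```

The log-likelihood is concave in w, so with a small enough step size each iteration increases it toward the global maximum.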

Case III

Fixed DAG, Discrete nodes, Lookup CPTs, Incomplete data

EM Algorithm (Next post)
