Approximate Inference Simplified
1 min read
What I worked on
Tried to unpack the meaning of “efficient approximate inference” in probabilistic models with continuous latent variables and intractable posteriors. Dug into VAEs, directed models, i.i.d. datasets, and concepts like maximum likelihood and variational inference. Also looked at ELBO, entropy, marginal likelihood, and related math.
What I noticed
- Directed models describe conditional relationships between variables
- i.i.d. data assumption is common in probabilistic setups
- MCMC can sample from intractable posteriors; plain EM needs the posterior in its E-step, so it requires approximations when that posterior is intractable
- Continuous latent variables represent learned compressed info
- KL divergence measures how one distribution diverges from another (it is not symmetric)
- The ELBO is a lower bound on the log marginal likelihood log p(x) (see the identity after this list)
- The variance of the prior N(0, I_K) sets the radius of the density's contours, i.e. how spread out latent samples are
- Maximum likelihood and maximum entropy principles are linked
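As a quick reference for the KL and ELBO bullets above, here is the standard identity from the VAE setting (notation assumed from the paper, not from my notes: x observed data, z the continuous latent, q_φ(z|x) the approximate posterior, p_θ the generative model):

```latex
\log p_\theta(x)
  \;=\; \underbrace{\mathbb{E}_{q_\phi(z \mid x)}\!\left[\log \frac{p_\theta(x, z)}{q_\phi(z \mid x)}\right]}_{\text{ELBO}\;\mathcal{L}(\theta, \phi; x)}
  \;+\; \underbrace{D_{\mathrm{KL}}\!\bigl(q_\phi(z \mid x) \,\|\, p_\theta(z \mid x)\bigr)}_{\ge\, 0}
```

Since the KL term is nonnegative, the ELBO sits below the log marginal likelihood, and the gap is exactly how far the approximate posterior is from the true one.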
“Aha” Moment
n/a
What still feels messy
Still unclear how the variance affects learning and reconstruction quality. I need deeper intuition for entropy, the marginal likelihood, and how they connect mathematically to the ELBO in real models; a first pass at that connection is sketched below.
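A sketch of that connection, using the same notation as above: the ELBO can be rewritten so that the entropy of the approximate posterior appears explicitly:

```latex
\mathcal{L}(\theta, \phi; x)
  \;=\; \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x, z)\right] \;+\; \mathbb{H}\!\left[q_\phi(z \mid x)\right]
  \;=\; \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right] \;-\; D_{\mathrm{KL}}\!\bigl(q_\phi(z \mid x) \,\|\, p_\theta(z)\bigr)
```

So entropy enters as the H[q] term, reconstruction is the expected log-likelihood, and the marginal likelihood log p_θ(x) is the quantity the whole expression lower-bounds.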
Next step
Revisit the VAE paper equations and visualize how the posterior and prior interact using sample data.
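A minimal sketch for that next step, assuming the usual VAE choices of a diagonal-Gaussian approximate posterior N(μ, diag(σ²)) and a standard-normal prior N(0, I_K); the "encoder outputs" below (mu, log_var) are made-up numbers purely for illustration:

```python
import numpy as np

def kl_diag_gaussian_vs_std_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) ), per data point.

    This is the analytic KL term that appears in the VAE objective.
    """
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)

# --- toy "encoder outputs" (hypothetical, just for visualization) ---
rng = np.random.default_rng(0)
K = 2                                    # latent dimensionality
mu = rng.normal(scale=0.8, size=(5, K))  # pretend per-example posterior means
log_var = np.log(rng.uniform(0.2, 1.5, size=(5, K)))  # per-example variances

kl = kl_diag_gaussian_vs_std_normal(mu, log_var)
print("per-example KL(q || prior):", np.round(kl, 3))

# Draw samples from the prior and from one approximate posterior to eyeball
# how far the posterior sits from N(0, I): larger KL ~ further away / narrower.
prior_samples = rng.normal(size=(500, K))
posterior_samples = mu[0] + np.exp(0.5 * log_var[0]) * rng.normal(size=(500, K))
print("prior mean/std:     ", prior_samples.mean(0).round(2), prior_samples.std(0).round(2))
print("posterior mean/std: ", posterior_samples.mean(0).round(2), posterior_samples.std(0).round(2))
```

Scatter-plotting prior_samples against posterior_samples (K = 2 keeps it 2-D) is probably the quickest way to see how the variance controls how tightly the posterior hugs its mean relative to the prior.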