High Val Loss Diagnosis
What I worked on
Investigated why my validation loss was extremely high in a VAE. Looked at BCE + KLD loss functions, dataset size, normalization, and KLD values across different latent dimensions.
What I noticed
- BCE with reduction='sum' adds up the error over every pixel and every sample in the batch, so it can produce very large loss values
- KLD can dominate and cause latent collapse
- A smaller latent dimension reduces capacity but makes the latent space easier to inspect
- When the KLD term carries too much weight in the total loss, reconstruction loss goes up
- The trend in KLD over training seems more meaningful than its absolute value
- Visualization of latent space helps identify unused dimensions
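
To make the first and last observations concrete, here is a minimal sketch, assuming PyTorch and a standard Gaussian prior (the notes only mention reduction='sum', so the framework is a guess). The function name vae_loss_terms and the fake data are hypothetical. It reports BCE and KLD per sample instead of per batch, and also KLD per latent dimension, where a value near zero suggests a dimension the encoder is not using.

```python
import torch
import torch.nn.functional as F

def vae_loss_terms(recon_x, x, mu, logvar):
    """Return per-sample BCE, per-sample KLD, and per-dimension KLD (batch mean)."""
    batch_size = x.size(0)
    # reduction='sum' adds over pixels AND samples, so it grows with batch size.
    bce_sum = F.binary_cross_entropy(recon_x, x, reduction="sum")
    # Closed-form KL(N(mu, sigma^2) || N(0, 1)), one value per sample per latent dim.
    kld_elem = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp())
    bce_per_sample = bce_sum / batch_size
    kld_per_sample = kld_elem.sum(dim=1).mean()
    kld_per_dim = kld_elem.mean(dim=0)
    return bce_per_sample, kld_per_sample, kld_per_dim

# Fake data for illustration only: 64 binary "images" of 784 pixels, 8 latent dims.
torch.manual_seed(0)
x = torch.rand(64, 784).round()
recon = torch.full_like(x, 0.5)          # an untrained decoder guessing 0.5 everywhere
mu = torch.zeros(64, 8)
mu[:, :2] = torch.randn(64, 2)           # only the first 2 dims carry any signal
logvar = torch.zeros(64, 8)

bce, kld, kld_dim = vae_loss_terms(recon, x, mu, logvar)
print(f"BCE per sample: {bce.item():.1f}, KLD per sample: {kld.item():.3f}")
print("KLD per dim:", kld_dim)
```

Even a decoder that outputs 0.5 everywhere gives a per-sample BCE of about 784 × ln 2, so a summed loss in the hundreds or thousands is not by itself a sign of a bug. Dividing by batch size (and optionally by pixel count) makes runs comparable.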
"Aha" Moment
The KLD term acts as a regularizer, but it can overpower the reconstruction term, which leads to underuse of the latent space.
What still feels messy
I don't know how to balance KLD and reconstruction loss without guessing β. I'm also unsure what a target KLD range should actually mean for a given latent size.
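
One common alternative to hand-picking a fixed β is KL warm-up: ramp β from 0 to its final value over the first training steps so the decoder learns to reconstruct before the regularizer kicks in. I have not tried this yet. The sketch below is plain Python, and the names beta_schedule and total_loss and the default warmup_steps are hypothetical choices, not tuned values.

```python
def beta_schedule(step, warmup_steps, beta_max=1.0):
    """Linearly ramp beta from 0 to beta_max over warmup_steps, then hold it."""
    if warmup_steps <= 0:
        return beta_max
    return beta_max * min(1.0, step / warmup_steps)

def total_loss(recon_loss, kld, step, warmup_steps=1000, beta_max=1.0):
    """Reconstruction loss plus the KLD term scaled by the current beta."""
    return recon_loss + beta_schedule(step, warmup_steps, beta_max) * kld

# Example: halfway through warm-up the KLD term counts at half weight.
print(total_loss(recon_loss=10.0, kld=4.0, step=500))
```

This still leaves beta_max and warmup_steps to choose, so it narrows the guessing more than it removes it.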
Next step
Plot latent activations using PCA and UMAP to check for collapse, then adjust the β weight in the loss.
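
A rough sketch of the PCA half of that check, using only NumPy. It assumes the encoder means for the validation set have been collected into an (N, D) array; the array here is synthetic, and the helper names and the 1e-2 variance threshold are hypothetical. UMAP is left out because it needs the separate umap-learn package, and the plotting itself is omitted.

```python
import numpy as np

def pca_explained_variance(latents):
    """Explained-variance ratio of each principal component of an (N, D) array."""
    centered = latents - latents.mean(axis=0, keepdims=True)
    # Squared singular values of the centered data are proportional to component variances.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()

def inactive_dims(latents, threshold=1e-2):
    """Indices of latent dimensions whose variance across samples is below threshold."""
    return np.flatnonzero(latents.var(axis=0) < threshold)

# Synthetic stand-in for encoder means: 8 latent dims, only the first 2 vary.
rng = np.random.default_rng(0)
mu = np.zeros((500, 8))
mu[:, :2] = rng.normal(size=(500, 2))

ratios = pca_explained_variance(mu)
collapsed = inactive_dims(mu)
print("explained variance ratio:", np.round(ratios, 3))
print("possibly collapsed dims:", collapsed)
```

If almost all variance sits in the first few components, or several dimensions fall under the threshold, that points toward collapse and a β that is too high.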