High Val Loss Diagnosis

What I worked on

Investigated why the validation loss of my VAE was extremely high. I looked at the BCE + KLD loss function, dataset size, input normalization, and how the KLD value changes across different latent dimensions.

What I noticed

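  • BCE with reduction='sum' adds the loss over every pixel and every sample in the batch, so very large raw values partly reflect batch and input size rather than model quality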
  • When the KLD term dominates the loss, it can cause posterior (latent) collapse
  • A smaller latent dimension reduces capacity but makes the latent space easier to interpret
  • Weighting the KLD term heavily pushes reconstruction loss higher
  • The trend in KLD over training seems more meaningful than its absolute value
  • Visualization of latent space helps identify unused dimensions
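
To make the scale effect in the first bullet concrete, here is a minimal sketch in plain Python (so it runs without PyTorch) of a summed BCE and the closed-form KLD against a standard normal prior. The batch size, pixel count, and function names are hypothetical and chosen only for illustration; this is not the exact loss from my training code.

```python
import math

def bce_sum(x, x_hat, eps=1e-7):
    """Binary cross-entropy added up over every element (sum reduction)."""
    total = 0.0
    for row, row_hat in zip(x, x_hat):
        for t, p in zip(row, row_hat):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total

def kld_sum(mu, logvar):
    """Closed-form KL(N(mu, sigma^2) || N(0, 1)), summed over batch and latent dims."""
    return -0.5 * sum(
        1.0 + lv - m * m - math.exp(lv)
        for row_mu, row_lv in zip(mu, logvar)
        for m, lv in zip(row_mu, row_lv)
    )

# Hypothetical batch: 64 flattened images of 784 pixels, every pixel predicted at 0.5.
batch, pixels = 64, 784
x = [[1.0] * pixels for _ in range(batch)]
x_hat = [[0.5] * pixels for _ in range(batch)]

total = bce_sum(x, x_hat)
per_sample = total / batch
per_pixel = per_sample / pixels
print(f"summed: {total:.1f}  per sample: {per_sample:.1f}  per pixel: {per_pixel:.4f}")
```

The summed value is in the tens of thousands, while the per-pixel value is only about ln 2. Dividing by the batch size (and optionally the pixel count) before comparing runs makes the numbers easier to read. If the BCE is rescaled this way, the KLD needs the same rescaling so the balance between the two terms does not change.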

“Aha” moment

The KLD term acts as a regularizer, but it can overpower the reconstruction term, leading to underuse of the latent space.

What still feels messy

How to balance the KLD and reconstruction losses without guessing β. I am also unsure what a target KLD range should actually mean for a given latent size.
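
One option I have not tried yet is a linear warm-up of β (often called KL annealing) instead of a single fixed guess. The sketch below assumes that approach; `beta_schedule`, `vae_loss`, `warmup_epochs`, and `beta_max` are hypothetical names and defaults, not values I have validated.

```python
def beta_schedule(epoch, warmup_epochs=10, beta_max=1.0):
    """Linearly ramp beta from 0 to beta_max over warmup_epochs, then hold."""
    if warmup_epochs <= 0:
        return beta_max
    return beta_max * min(1.0, epoch / warmup_epochs)

def vae_loss(recon, kld, beta):
    """Weighted objective: reconstruction term plus beta times the KLD term."""
    return recon + beta * kld

# Print the weight at a few epochs to see the ramp.
for epoch in (0, 5, 10, 20):
    print(epoch, beta_schedule(epoch))
```

The idea is that the decoder learns to use the latent code while β is small, before the regularizer reaches full strength. This does not remove the need to choose `beta_max`, so it only partly answers the question.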

Next step

Plot latent activations using PCA and UMAP to check for collapse, then adjust the β weight in the loss.
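
As a numeric companion to the plots, the average KLD per latent dimension can flag dimensions that sit at the prior and are likely unused. This is a rough sketch in plain Python: the encoder outputs are made-up numbers and the 0.01 threshold is an arbitrary assumption, not an established cutoff.

```python
import math

def kl_per_dim(mu, logvar):
    """Average KL(q(z|x) || N(0, 1)) for each latent dimension over the batch."""
    n = len(mu)
    dims = len(mu[0])
    result = []
    for d in range(dims):
        total = sum(
            -0.5 * (1.0 + logvar[i][d] - mu[i][d] ** 2 - math.exp(logvar[i][d]))
            for i in range(n)
        )
        result.append(total / n)
    return result

def inactive_dims(mu, logvar, threshold=0.01):
    """Indices of dimensions whose average KL falls below the (assumed) threshold."""
    return [d for d, kl in enumerate(kl_per_dim(mu, logvar)) if kl < threshold]

# Hypothetical encoder outputs for 3 inputs and 3 latent dimensions.
mu = [[1.5, 0.0, -0.8], [-1.2, 0.0, 0.9], [0.7, 0.0, -1.1]]
logvar = [[-2.0, 0.0, -1.5], [-2.0, 0.0, -1.5], [-2.0, 0.0, -1.5]]

print(inactive_dims(mu, logvar))  # dimension 1 matches the prior exactly, so it is flagged
```

A dimension with near-zero KLD carries almost no information about the input, so it should also show up as a flat or noise-only direction in the PCA and UMAP plots.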