
What are good distance metrics applied to latent embeddings as part of a diversity loss function to prevent model collapse?


hell if I know!! Sorry. I've used the vegan package for some analyses, but I've mostly used Manhattan and cosine metrics. I just wanted to bring up the idea that there are a lot of metrics out there that may not be generally appreciated.

Claude Opus says "There are a few good distance metrics commonly used with latent embeddings to promote diversity and prevent model collapse:

1. Euclidean distance (L2 distance)

2. Cosine distance

3. Kullback-Leibler (KL) divergence: KL divergence quantifies how much one probability distribution differs from another. It can be used to measure the difference between the distributions of latent embeddings. Minimizing KL divergence as a diversity loss would encourage the embedding distribution to be more uniform.

4. Maximum Mean Discrepancy (MMD): MMD measures the difference between two distributions by comparing their moments (mean, variance, etc.) in a reproducing kernel Hilbert space. It's useful for comparing high-dimensional distributions like those of embeddings. MMD loss promotes diversity by penalizing embeddings that are too clustered together.

5. Gaussian Annulus Loss: This loss function encourages embeddings to lie within an annulus (ring) in the latent space defined by two Gaussian distributions. It promotes uniformity in the embedding norms while allowing angular diversity. This can be effective at preventing collapse to a single point."

But I haven't checked for hallucinations.
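For concreteness, here's a minimal PyTorch sketch (mine, not Claude's) of the simplest version of the idea: penalizing pairwise cosine similarity within a batch so embeddings spread apart instead of collapsing. The function name and shapes are illustrative assumptions, not any particular library's API.

    import torch
    import torch.nn.functional as F

    def cosine_diversity_loss(z):
        # z: (batch, dim) latent embeddings.
        z = F.normalize(z, dim=1)              # project rows onto the unit sphere
        sim = z @ z.t()                        # (batch, batch) cosine similarities
        n = z.shape[0]
        off_diag = sim - torch.eye(n, device=z.device)  # zero out self-similarity
        # Mean squared off-diagonal similarity; minimizing pushes embeddings apart.
        return off_diag.pow(2).sum() / (n * (n - 1))

    loss = cosine_diversity_loss(torch.randn(32, 128))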


To add further: https://cran.r-project.org/web/packages/vegan/vignettes/dive... The vegan package is very much focused on methods for assessing diversity in ecological communities.

Beta diversity is one metric for examining diversity, defined as the ratio between regional and local species diversity. https://en.wikipedia.org/wiki/Beta_diversity
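To make the definition concrete, here's a toy Python sketch (mine; in practice you'd use vegan in R) of Whittaker's multiplicative beta diversity, i.e. regional richness divided by mean local richness:

    # Three sites, each a set of observed species (made-up data).
    sites = [
        {"oak", "maple", "birch"},
        {"oak", "pine"},
        {"maple", "pine", "spruce"},
    ]
    gamma = len(set().union(*sites))                 # regional richness: 5 species
    alpha = sum(len(s) for s in sites) / len(sites)  # mean local richness: 8/3
    beta = gamma / alpha                             # regional / local = 1.875
    print(gamma, alpha, beta)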


I had no idea that diversity loss functions were a topic in deep learning. I admit, I'm a bit fascinated, as a neuroimaging scientist.


Have a look at section 3.2 of the wav2vec 2.0 paper:

https://arxiv.org/pdf/2006.11477.pdf
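For anyone who doesn't want to open the PDF: section 3.2 describes a codebook diversity loss that maximizes the entropy of the softmax distribution over codebook entries, averaged across a batch, so that all entries get used. A rough sketch of that formula in PyTorch (shapes and names are my assumptions, not the fairseq implementation):

    import torch

    def codebook_diversity_loss(probs):
        # probs: (batch, groups, vocab) softmax over codebook entries.
        p_bar = probs.mean(dim=0)                              # average over the batch
        neg_entropy = (p_bar * torch.log(p_bar + 1e-7)).sum()  # sum_g sum_v p * log p
        G, V = p_bar.shape
        # Minimizing this maximizes entropy, spreading codebook usage.
        return neg_entropy / (G * V)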



