
The scale of word embeddings (e.g., the distance from the origin) mainly measures how common the word is in the training corpus. This is a feature of almost all training objectives since word2vec (though some normalize the vectors).

Uncommon words carry more information content than common words, so common words having a larger embedding scale is a problem here.

If you want to measure similarity, you need a scale-free measure. Cosine similarity (the angle between vectors) gives you this without requiring normalization.
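To make the scale-free point concrete, here's a minimal sketch (names are my own, not from the thread): a vector and a scaled copy of it have cosine similarity 1, regardless of how different their magnitudes are.

```python
import numpy as np

def cosine_similarity(u, v):
    # Scale-free: dividing by both norms removes magnitude
    # (and hence any word-frequency signal) from the comparison.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

u = np.array([1.0, 2.0, 3.0])
v = 10.0 * u  # same direction, much larger scale (a "more frequent word")
sim = cosine_similarity(u, v)  # ~1.0 up to floating-point error
```

Euclidean distance between `u` and `v`, by contrast, is large here, which is exactly why it conflates frequency with meaning in unnormalized embedding spaces.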

If you normalize your vectors, cosine similarity is the same as Euclidean distance. Normalizing your vectors also destroys information, which we'd rather avoid.

To my understanding, there's no real hard theory for why the angle between embeddings is meaningful, beyond this practical knowledge.



> If you normalize your vectors, cosine similarity is the same as Euclidean distance.

If you normalize your vectors, cosine similarity is the same as dot product. Euclidean distance is still different.
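A quick numeric check of the correction (a sketch with my own example vectors): on unit vectors, cosine similarity equals the dot product, but the Euclidean distance is a different number.

```python
import numpy as np

def normalize(x):
    return x / np.linalg.norm(x)

u = normalize(np.array([1.0, 0.0]))
v = normalize(np.array([1.0, 1.0]))

cos = float(np.dot(u, v))            # cosine similarity == dot product on unit vectors
dist = float(np.linalg.norm(u - v))  # Euclidean distance: not the same value
```

Here `cos` is about 0.707 while `dist` is about 0.765, so the two measures agree in ordering (see the identity below in the thread) but are not identical.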


Oh, thanks for the correction.

If all the vectors lie on the unit sphere, then cosine = dot product. But then the dot product is one linear transformation away from the Euclidean distance:

https://math.stackexchange.com/questions/1236465/euclidean-d...

If you're using it in a machine learning model, things that are one linear transform away are more or less the same (the model might just need more parameters/layers/etc.).

If you're using it for classical statistics (analytics), right, they're not equivalent, and it's good to keep that distinction in mind.


To be very explicit: if ||x|| = ||y|| = 1, we have ||x - y||^2 = ||x||^2 - 2(x . y) + ||y||^2 = 2 - 2(x . y) = 2 - 2cos(th). So they are not identical, but minimizing the Euclidean distance between two unit vectors is the same as maximizing the cosine similarity.
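The identity above can be verified numerically on random unit vectors (a sketch; the random seed and dimension are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.normal(size=5)
u /= np.linalg.norm(u)
v = rng.normal(size=5)
v /= np.linalg.norm(v)

lhs = np.linalg.norm(u - v) ** 2   # squared Euclidean distance
rhs = 2 - 2 * np.dot(u, v)         # 2 - 2*cos(theta): dot product = cos on unit vectors
# The two sides agree up to floating-point error.
```

Since `rhs` is a decreasing function of the cosine, minimizing the distance and maximizing the cosine similarity pick out the same vector.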



