Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Presumably this operation:

  def average_embeddings(embeddings):
    return [
      sum([embedding[x] for embedding in embeddings]) / len(embeddings)
      for x in range(len(embeddings))
    ]
Just finding the mid-point in N-dimensional space between all points. Although I believe OpenAI says these are directions rather than locations. Averaging makes sure you don't accidentally pick a target vector that happens to be on the fringes of your desired embedding space.

So if you're querying a document for predatory birds and you are using the string "Eagles" as your target you will be accidentally digging into embedding space for sports teams (go birds). But if your target is `average_embeddings([eagles_embedding, hawks_embedding, falcons_embedding])` you'll end up with less noise.



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: