I mean, it was just a thought I had. Maybe it's a "solution in search of a problem". I generate those a lot! haha. But it seems to me that with some sort of canonical set of training data and a canonical LLM architecture, we'd of course be able to generate consistent embeddings; I'm just not sure what the use cases are.