I’ve been doing this in my private codebase. When Copilot hallucinates a function, I just go and write the thing. It’s usually a good idea, and Copilot will re-hallucinate the same function independently in another file.
The only way this is useful in the context of code is if:
* The LLMs have a sufficient "understanding" of the request and of how to write code that fulfills it
* They have a way to validate the suggestion by actually executing the code (at least during training) and inspecting the output (see the sketch after this list)
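Something like this toy check is what I mean by the second point: run the suggested snippet and compare what it prints. The function name `passes_check` and the expected-output convention are just made up for illustration, assuming the suggestion is a self-contained Python snippet.

```python
import subprocess

def passes_check(snippet: str, expected_output: str, timeout: float = 5.0) -> bool:
    """Run the snippet in a subprocess and check what it prints."""
    try:
        result = subprocess.run(
            ["python", "-c", snippet],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0 and result.stdout.strip() == expected_output.strip()

# e.g. a suggestion for "print the first three squares"
print(passes_check("print([n * n for n in range(1, 4)])", "[1, 4, 9]"))  # True
```

In a training loop this pass/fail signal could act as a reward or filter, rather than trusting the model's output blindly.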
From what I've seen, we are still far away from that; Copilot and GPT-4 seem heavily reliant on very well-commented code and on sources like Stack Overflow.
If I were training a code model, I'd take a snippet of code and have an existing LLM explain it, then use the explanation and the snippet as the test data.
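A rough sketch of that pipeline: explain existing snippets with a current LLM and save the (explanation, snippet) pairs. The prompt wording, the model name, and the use of the OpenAI client here are just one way to do it, not a specific recipe.

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def explain(snippet: str) -> str:
    """Ask an existing model to describe what the snippet does."""
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{
            "role": "user",
            "content": f"Explain concisely what this code does:\n\n{snippet}",
        }],
    )
    return resp.choices[0].message.content

def build_pairs(snippets: list[str], out_path: str = "pairs.jsonl") -> None:
    """Write one {explanation, snippet} record per line."""
    with open(out_path, "w") as f:
        for snippet in snippets:
            f.write(json.dumps({"explanation": explain(snippet),
                                "snippet": snippet}) + "\n")
```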