- GPT style language models end up internally implementing a mini "neural network training algorithm" (gradient descent fine-tuning for given examples): https://arxiv.org/abs/2212.10559
- GPT style language models end up internally implementing a mini "neural network training algorithm" (gradient descent fine-tuning for given examples): https://arxiv.org/abs/2212.10559