
Ideally you train a model right to begin with, and no fine-tuning is necessary.

However, sometimes you can't do that. For example, perhaps you want your model to always talk like a pirate, but you don't have billions of words spoken like a pirate to train on.

So the next best thing is to train a model on all English text (which you have lots of), and then fine-tune on your smaller dataset of pirate speech.

Fine-tuning is simply more training, but with a different dataset and often a different learning rate.

Typically, fine-tuning uses far less data and compute, and can be done by individuals on a home PC, whereas training a large language model from scratch costs somewhere in the $1M - $1B range.
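To make "fine-tuning is just more training" concrete, here's a minimal sketch in pure Python. The "model" is a hypothetical one-parameter function y = w * x (nothing like a real language model), but the shape of the process is the same: run the identical training loop twice, first on lots of generic data, then on a tiny specialized dataset with a different learning rate, starting from the pretrained weight rather than from scratch.

```python
def train(w, data, lr, steps=200):
    """Plain SGD on squared error for the toy model y = w * x."""
    for _ in range(steps):
        for x, y in data:
            grad = 2 * (w * x - y) * x  # d/dw of (w*x - y)**2
            w -= lr * grad
    return w

# "Pretraining": lots of generic data, larger learning rate.
general_data = [(0.1 * i, 0.2 * i) for i in range(1, 11)]  # consistent with w = 2
w = train(0.0, general_data, lr=0.1)

# "Fine-tuning": tiny specialized dataset ("pirate speech"), smaller
# learning rate, starting from the pretrained weight instead of from 0.
pirate_data = [(1.0, 2.1)]  # consistent with w = 2.1
w = train(w, pirate_data, lr=0.01)

print(round(w, 2))  # close to 2.1: the pretrained weight nudged toward the new data
```

The point of the sketch is that nothing structural changes between the two phases; only the data, the starting weights, and the learning rate differ.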


