"Isn't the model size much larger?" Yep.
It will probably be different. Systems will have to download the network model and its weights as new models come in. I don't think we will have a fixed model with fixed weights, because the evolution is too fast. Decoding will take place on the device's AI chip, a.k.a. the "AI accelerator".
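
To make the "download the model as new ones come in" idea concrete, here is a rough sketch of how a device might cache models by ID and only fetch one the first time a stream asks for it. Everything in it is an assumption for illustration: `ModelCache`, `fake_fetch`, the model ID string and the checksum check are all made up, and a real system would hand the weights to the accelerator's runtime rather than just return bytes.

```python
import hashlib
from typing import Callable, Dict


class ModelCache:
    """Hypothetical on-device cache of codec models, keyed by model ID."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch  # assumed download hook, e.g. an HTTP GET
        self._store: Dict[str, bytes] = {}
        self.downloads = 0  # counts how many times we actually fetched

    def get(self, model_id: str, expected_sha256: str) -> bytes:
        """Return the weights for model_id, downloading them on first use."""
        if model_id not in self._store:
            blob = self._fetch(model_id)
            self.downloads += 1
            # Reject a corrupted or wrong download before caching it.
            if hashlib.sha256(blob).hexdigest() != expected_sha256:
                raise ValueError(f"checksum mismatch for {model_id}")
            self._store[model_id] = blob
        return self._store[model_id]


def fake_fetch(model_id: str) -> bytes:
    """Stand-in for a real download; returns dummy bytes."""
    return f"weights-for-{model_id}".encode()


if __name__ == "__main__":
    cache = ModelCache(fake_fetch)
    digest = hashlib.sha256(b"weights-for-codec-v2").hexdigest()
    cache.get("codec-v2", digest)  # first use: triggers a download
    cache.get("codec-v2", digest)  # second use: served from the cache
    print(cache.downloads)
```

The point is just that the bitstream would need to say which model it was encoded with, so the decoder can pull that model once and reuse it afterwards.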