Spin up ollama and run some inference on your 5-year-old Intel MacBook. You won't see a 4000x performance improvement (because performance is bottlenecked outside the GPU), but you might be in the right order of magnitude.
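
If you want to actually try this, here's a minimal sketch using the ollama Python client. The model name, and the assumption that a local Ollama server is already running with that model pulled, are mine for illustration:

    # Minimal sketch, assuming `pip install ollama`, a running local Ollama
    # server, and a small model already pulled (the model name is just an
    # example -- swap in whatever you have).
    import ollama

    resp = ollama.generate(model="llama3.2", prompt="Why is the sky blue?")
    print(resp["response"])

    # The response metadata reports the decode token count and duration (ns),
    # which gives a rough tokens/sec figure for comparing hardware.
    print(f'~{resp["eval_count"] / (resp["eval_duration"] / 1e9):.1f} tokens/sec')
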
Not possible given the anemic memory bandwidth [1]... you can scale up the compute all you want, but if the memory bandwidth doesn't scale with it, you're not going to see anywhere near those numbers.
[1] The memory bandwidth is fine for CPU workloads, but not for GPU / NN workloads.
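
To make the footnote concrete: autoregressive decode has to stream essentially all of the model's weights from memory for every generated token, so tokens/sec is capped at roughly memory bandwidth divided by model size, regardless of compute. A back-of-envelope sketch (the bandwidth figures are rough assumptions for illustration, not measurements):

    # Roofline-style ceiling for decode: every generated token streams ~all
    # weights, so throughput <= memory bandwidth / model size in bytes.
    GiB = 2**30

    def max_tokens_per_sec(bandwidth_gib_s: float, model_bytes: float) -> float:
        """Upper bound on decode tokens/sec if bandwidth is the only limit."""
        return bandwidth_gib_s * GiB / model_bytes

    model_bytes = 7e9 * 0.5  # ~7B params at 4-bit quantization, ~3.5 GB

    # ~40 GiB/s: ballpark dual-channel DDR4 in an Intel MacBook (assumption)
    print(f"Intel MacBook ceiling: ~{max_tokens_per_sec(40, model_bytes):.0f} tok/s")
    # ~400 GiB/s: ballpark unified memory on an M-series Max (assumption)
    print(f"M-series Max ceiling:  ~{max_tokens_per_sec(400, model_bytes):.0f} tok/s")

Note that the ceilings scale with the bandwidth ratio (~10x here), not with whatever compute ratio the spec sheet advertises, which is exactly the footnote's point.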