That confuses basic arithmetic as a user feature with basic arithmetic as an implementation requirement.
I guarantee that computer vision and email clients both use basic arithmetic in implementation. And it would be trivially easy to bolt a calculator into an email app, because the languages used to write email apps include math features.
That's not true of LLMs. There's math at the bottom of the stack, but an LLM runs as a separate, closed, and opaque application of a unique, self-contained type, which isn't easily extensible.
LLMs don't include hooks into the math features of the GPUs they run on, and there's no easy way to add them.
If you want math, you need a separate tool call to conventional code.
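To make the tool-call route concrete, here is a rough sketch of what the "conventional code" side could look like. The tool name `calculator`, the `handle_tool_call` function, and the shape of the call dict are all made up for illustration; real LLM APIs each define their own tool-call format. The point is only that the model emits a request and ordinary code does the arithmetic.

```python
import ast
import operator

# Map AST operator nodes to ordinary Python arithmetic.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _eval_node(node):
    """Evaluate a parsed expression, allowing only numbers and + - * /."""
    if isinstance(node, ast.Constant):
        value = node.value
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            return value
    elif isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        left = _eval_node(node.left)
        right = _eval_node(node.right)
        return _BINARY_OPS[type(node.op)](left, right)
    elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval_node(node.operand))
    raise ValueError("unsupported expression")


def calculator_tool(expression: str):
    """The conventional code: parse the expression and compute it exactly."""
    tree = ast.parse(expression, mode="eval")
    return _eval_node(tree.body)


def handle_tool_call(call: dict):
    """Dispatch a hypothetical tool call of the form
    {"name": "calculator", "arguments": {"expression": "..."}}.
    This shape is an assumption, not any particular vendor's format."""
    if call.get("name") != "calculator":
        raise ValueError("unknown tool")
    return calculator_tool(call["arguments"]["expression"])


if __name__ == "__main__":
    request = {"name": "calculator", "arguments": {"expression": "2 + 3 * 4"}}
    print(handle_tool_call(request))  # prints 14
```

Walking the AST instead of calling `eval` keeps the tool limited to arithmetic, so a model-generated string can't run arbitrary code.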
IMO testing LLMs as if they "should" be able to do arithmetic is bizarre. They can't. They're not designed to. And even if they could, they'd be ridiculously inefficient at it.
> Pretty sure the only thing computer vision does is math.
That is only marginally less pedantic than saying that the only thing computer vision does is run discrete electrical signals through billions of transistors.
Yes, everything a computer does, it does using math. That does not imply that a program running on the computer can do basic arithmetic tasks for the user.