This is correct - it was all on-device, with security guarantees that were instantly proven incorrect. Microsoft withdrew Recall, then brought it back with a newer, more secure implementation that was also proven insecure.
Microsoft also claimed that Recall wasn't going to record sensitive information, but it did, to the point where some apps, like Signal, used existing Windows APIs to set DRM flags on their windows so that Windows wouldn't capture them at all.
What Microsoft could have offered is an easy-to-implement API that application developers opt into (and that users can opt out of), plus a blanket Recall-esque toggle that users can apply to applications without explicit support. Applications like Firefox or Chrome could hook into the API to provide page content along with more metadata than a simple screenshot could provide, while withholding that data when sensitive fields or data are on the page (and possibly providing a way for the HTML to define a 'secure' area that shouldn't be indexed or captured, which would be useful in lots of other circumstances).
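To make that concrete, here is a rough sketch of what the application side of such an API might look like. Everything in it is made up for illustration: `Region`, `IndexPayload`, `UserSettings`, and `build_payload` are hypothetical names, and as far as I know no Windows API like this exists. It only shows the idea that the app decides what is shareable, the user can still opt out per app, and anything marked secure never leaves the app.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Region:
    """A chunk of page content. `secure` is a hypothetical flag that the
    app would set from markup (e.g. a 'secure' area in the HTML)."""
    text: str
    secure: bool = False


@dataclass
class IndexPayload:
    """What the app would hand to the (hypothetical) indexing API."""
    app: str
    title: str
    url: str
    text: str


@dataclass
class UserSettings:
    """Apps the user has opted out of, even if the app supports the API."""
    opted_out_apps: Set[str] = field(default_factory=set)


def build_payload(app: str, title: str, url: str,
                  regions: List[Region],
                  settings: UserSettings) -> Optional[IndexPayload]:
    # The user's opt-out wins over the developer's opt-in.
    if app in settings.opted_out_apps:
        return None
    # Secure regions are dropped before anything is shared.
    shareable = [r.text for r in regions if not r.secure]
    if not shareable:
        return None
    return IndexPayload(app=app, title=title, url=url,
                        text="\n".join(shareable))
```

A browser would call something like `build_payload("browser", page_title, page_url, regions, settings)` on navigation and pass the result along only if it isn't `None`. The point of the design is that the filtering happens in the app, which knows which fields are sensitive, instead of in a screenshot pipeline that has to guess.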
But, as with everything AI, they don't want users to want it; they want users to use it regardless of whether or not they want it. This is the same reason they forced Copilot into everyone's Office 365 plans and then upped the price unless you tried to cancel; they have to justify the billions they're spending and forcing the numbers to go up is the only way to do that.
I have to wonder what edge AI would look like on a laptop. Little super mini Nvidia Jetson? How much added cost? How much more weight for the second and third batteries? And the fourth and fifth batteries to be able to unplug for more than a few minutes?
They're called NPUs, and most recent laptop CPUs from Intel, AMD, and Apple have them. They're actually reasonably power efficient. All flagship smartphones have them, as do several models further down the line.
IIRC Linux drivers are pretty far behind, because no one who works on Linux stuff is particularly interested in running personal info like screenshots or mic captures through a model and uploading the telemetry. While in general I get annoyed when my drivers suck, in this particular case I don't care.
Though storing the data locally could still make a compromise from a targeted attack more dangerous.