Zooming big blobs of text at 60fps without caching remains an open problem: it requires a GPU shader, and every approach so far makes one lacklustre trade-off or another.
It's simply that drawing a non-trivial amount of nice-looking generic text accurately seems simple but actually takes a lot of logic and a lot of processing time relative to a rendering deadline. There are a lot of reasons for this, but it largely comes down to having to do a lot of individual rendering every time something changes: when one thing needs to be re-rendered (e.g. the user zoomed), it likely means you have to bulk render everything again within the next 16 ms or less.
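To put rough numbers on that deadline, here's a back-of-the-envelope sketch. The function names and the 10,000-glyph figure are made up for illustration, and it assumes the whole frame budget goes to text, which is optimistic since layout, compositing and everything else share the same frame.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given refresh rate."""
    return 1000.0 / fps


def per_glyph_budget_us(fps: float, glyph_count: int) -> float:
    """Microseconds available per glyph if every glyph must be redrawn
    inside a single frame (assumes nothing else uses the frame time)."""
    return frame_budget_ms(fps) * 1000.0 / glyph_count


if __name__ == "__main__":
    # Hypothetical workload: a screenful of 10,000 glyphs at 60fps.
    print(f"frame budget: {frame_budget_ms(60):.2f} ms")
    print(f"per glyph:    {per_glyph_budget_us(60, 10_000):.2f} us")
```

At 60fps the frame budget is about 16.67 ms, so 10,000 glyphs get under 2 microseconds each. That is why re-rendering everything on every zoom step gets tight without caching.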
https://gankra.github.io/blah/text-hates-you/ has a decent description of many of the problems seen with rendering text in the browser setting (which is on the more complex side of text rendering).