It's not the first such claim. There was an earlier one that large CPUs can beat GPUs by using spatial hashing to make the matrix very sparse: the gradient is preserved, but it is found with vastly less computation by simply not multiplying elements that have little or no impact.
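To make the idea concrete, here is a rough sketch of hashing used to decide which rows of a weight matrix get multiplied at all. It assumes SimHash-style random hyperplanes as the hash, which is my choice for illustration and not necessarily what the paper does. All names (`make_planes`, `signature`, `build_table`, `sparse_forward`) are hypothetical, and it only covers the forward pass, not the gradient update.

```python
import random

def make_planes(num_bits, dim, rng):
    # Hypothetical helper: one random hyperplane per hash bit.
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]

def signature(vec, planes):
    # SimHash-style signature: one bit per hyperplane, set when the
    # vector lies on the non-negative side of that hyperplane.
    return tuple(sum(p * v for p, v in zip(plane, vec)) >= 0.0 for plane in planes)

def build_table(weights, planes):
    # Bucket each neuron (row of the weight matrix) by its signature.
    table = {}
    for idx, row in enumerate(weights):
        table.setdefault(signature(row, planes), []).append(idx)
    return table

def sparse_forward(x, weights, table, planes):
    # Only neurons sharing the input's bucket are multiplied;
    # every other row is skipped and treated as inactive.
    active = table.get(signature(x, planes), [])
    return {i: sum(w * v for w, v in zip(weights[i], x)) for i in active}

rng = random.Random(0)
dim, num_neurons, num_bits = 16, 256, 6
weights = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_neurons)]
planes = make_planes(num_bits, dim, rng)
table = build_table(weights, planes)

x = weights[3]  # an input identical to neuron 3's weights, so it shares its bucket
out = sparse_forward(x, weights, table, planes)
print(f"computed {len(out)} of {num_neurons} dot products")
```

Vectors that point in similar directions tend to land in the same bucket, so the rows that get skipped are mostly the ones that would have produced small activations anyway. A real system would presumably use several hash tables and rebuild them as the weights change; this sketch does neither.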
This is literally the same research group. The article cites the paper you link to as the 2020 predecessor of the forthcoming 2021 paper it announces, which expands on the 2020 one.
So yes and no: if you are correct and that paper is indeed the first such claim (I'm not up on the literature), then this one is the same first claim, or at least an extension of it.
EDIT: original work, afaik: https://arxiv.org/abs/1903.03129