Oh interesting, I never would have thought AI would be used for this. Does it also find things like the meta "revised" tag or anything like that? From some Googling, it seems like officially it should be "revision", but "revised" seems to be very common in practice.
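For what it's worth, checking for those tags before falling back to AI could look something like this. It's only a rough sketch using Python's standard library: the `find_revised_meta` helper and the `CANDIDATE_NAMES` set are made up for illustration, and I have no idea whether the site does anything like it.

```python
from html.parser import HTMLParser

# Assumption: the two meta names mentioned above; extend as needed.
CANDIDATE_NAMES = {"revised", "revision"}


class RevisedMetaFinder(HTMLParser):
    """Collects (name, content) pairs from matching <meta> tags."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so normalize them.
        attr_map = {key: (value or "") for key, value in attrs}
        name = attr_map.get("name", "").strip().lower()
        content = attr_map.get("content", "").strip()
        if name in CANDIDATE_NAMES and content:
            self.found.append((name, content))


def find_revised_meta(html):
    """Return every matching meta tag as a list of (name, content) tuples."""
    finder = RevisedMetaFinder()
    finder.feed(html)
    return finder.found
```

The content is returned as a raw string because these tags tend to hold free-form text rather than a clean timestamp, so it would still need parsing afterwards.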
But a few websites set their updated date to the current date, which was annoying. Maybe to rank better in Google? And some people (including me) only mention the update time in the page text content.
I've used GPT to parse human-formatted dates in another project too. It's quite reliable if you validate the output timestamp, and relatively cheap if you only pass in the first part of the page text.
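By "validate" I mean roughly the following. This is a minimal sketch, assuming the model is asked to answer with an ISO 8601 string; the `validate_timestamp` name, the 1991 floor, and the one-day future tolerance are all arbitrary choices for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed floor: treat anything older than the public web as a parse failure.
EARLIEST = datetime(1991, 1, 1, tzinfo=timezone.utc)


def validate_timestamp(raw, now: Optional[datetime] = None) -> Optional[datetime]:
    """Return a timezone-aware datetime, or None if the value looks wrong."""
    if not isinstance(raw, str):
        return None
    now = now or datetime.now(timezone.utc)
    try:
        parsed = datetime.fromisoformat(raw.strip())
    except ValueError:
        return None  # not a parseable ISO 8601 string
    if parsed.tzinfo is None:
        # Assumption: naive timestamps are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    # Reject implausibly old dates and dates more than a day in the future.
    if parsed < EARLIEST or parsed > now + timedelta(days=1):
        return None
    return parsed
```

Anything that comes back as `None` gets treated as "no date found" instead of being stored. The floor also throws out dates near the Unix epoch, which is what a missing value often turns into.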
I can see how it's a tricky problem. I wish HTML had more structure here (and that people followed the structure, a whole other problem...). FWIW, my page has a "last updated" date on its now page, but it comes up as 1969 in aboutideasnow.