maeil 11 months ago \| on: FrontierMath was funded by OpenAI

This just isn't accurate: on the overwhelming majority of real-world tasks (>90%), 3.5 Sonnet beats 4o. FWIW, I've spoken with a friend who's at OpenAI, and they fully agree in private.