ddp26 · 2026-05-28T21:12:27 1780002747

It is refreshing but perhaps actually not warranted this time?

I mostly study web research, and Opus 4.7 was a regression on BrowseComp compared to Opus 4.6, which has been borne out by my usage.

Opus 4.8 is now much better than either 4.7 or 4.6, and having it search the web is one of the primary use cases of chatbots.

ddp26 · 2026-05-28T15:25:37 1779981937

What's a definition of AGI you would use, for either time, tasks, value, or job descriptions?

hyperpape · 2026-05-28T15:52:20 1779983540

No one has to provide a definition to argue that your definition is inadequate.

ddp26 · 2026-05-28T15:23:19 1779981799

I linked elsewhere in a comment, Metaculus has AGI forecasts.

You can also now use AI forecasters like FutureSearch [1] (disclaimer: I work there), which are competitive with the best humans / teams of humans. And since you aren't depending on a human crowd, you can ask any variation of AGI questions with any definition, even ask conditional questions.

[1] https://futuresearch.ai/app

ddp26 · 2026-05-28T15:00:30 1779980430

Thank you! Took me a few hours; without Claude Code I don't think I would have even attempted this.

ddp26 · 2026-05-28T14:59:40 1779980380

It's been a big problem for a while. The big Metaculus question about AGI depends on the game "Montezuma's Revenge" (!), and there have been many debates about this going back to at least 2020: https://www.metaculus.com/questions/3479/date-weakly-general...

ddp26 · 2026-05-28T14:57:11 1779980231

Author here, I agree, I'd be happy if admins want to change the title of this submission to the title of the piece.

ddp26 · 2026-05-28T14:56:03 1779980163

Author here, I drew on this from AI 2027. Yes, a very expensive AGI, e.g. $1 million / day to simulate a smart human, would be a huge deal. But it would have meaningfully different effects than a cheap one.

Here's one definition AI 2027 used [1]: "Superhuman coder (SC): An AI system for which the company could run with 5% of their compute budget 30x as many agents as they have human research engineers..."

[1] https://ai-2027.com/research/timelines-forecast

yCombLinks · 2026-05-28T15:00:16 1779980416

I've got no problem with your concept, and even think it's useful. I just don't think that concept and AGI are the same thing. Economically useful has no relation to what has been called AGI before.

wat10000 · 2026-05-28T15:31:39 1779982299

I take it as a sign of how close it is (or how close people think it is). When AGI was SFnal magic, merely having it at all was a fascinating concept. Now that (people think) it's on the horizon, there are more practical concerns, like the fact that running these things might cost a substantial amount of money.

ddp26 · 2026-05-26T18:32:18 1779820338

I see a lot of comments like this about the blocking of prediction markets on politics, war, etc.

It's important to remember that ~80% of activity on Polymarket and ~90% on Kalshi, by volume, is sports. These are effectively sports betting websites with prediction markets on the side.

ddp26 · 2026-05-26T13:45:20 1779803120

Snake oil is a bit strong, no? I would agree that the burden of proof is on multi-agent systems to show they are outperforming single-agent systems.

On my own evals I have seen this, though the improvement may not have been worth the extra cost.

akrylov · 2026-05-26T14:14:15 1779804855

The name of the thread is provocative, but the premise is valid - I have yet to see anything produced by multi-agent frameworks (LangChain or bespoke works) that produced value. Anthropic pushes vibeCAD, vibeVFX, vibePowerPoint but the results are underwhelming. The real value is in code generation, autonomous infra, and research.

ddp26 · 2026-05-26T13:43:57 1779803037

I like this, though it does leave me feeling more nervous when I really don't know how I'd solve the problem. It still requires trust.