
You make it sound silly, but it does seem to make good business sense to me. I'd argue that it seems like they learned from The Bitter Lesson[0] and instead of trying to manually solve things with today's technology, are relying on the exponential improvement in general purpose AI and building something that would utilize that.

On a somewhat related note, I'm reminded of the game Crysis, which was developed with a custom engine (CryEngine 2) which was famously too demanding to run at high settings on then-existing hardware (in 2007). They bet on the likes of Nvidia to continue rapidly improving the tech, and they were absolutely right, as it was a massive success.

[0] http://www.incompleteideas.net/IncIdeas/BitterLesson.html



Yeah, it's a genuinely valid business strategy.

It’s also silly.

The Crytek case is similar, but there's a big difference: Crytek was betting that hardware performance would keep increasing. A lot of AI startups are betting that future LLMs will have new, qualitatively different, and fairly vaguely defined capabilities (AGI, whatever that actually means).


I'd also add that what Crysis did was pretty typical at the time. It was an era when new computers were a bit dated within a few months and obsolete in a couple of years. Carmack/id Software/Doom was a more typical example of this, as they did it repeatedly and regularly, frequently in collaboration with the hardware companies of the day. But there was near-zero uncertainty: there was a clear path to the goal, down to the exact expected specs.

With LLMs there's not only no clear path to the goal, but every reason to think that such a path may not exist. In literally every domain neural networks have been applied to, you hit asymptotic, diminishing returns. Truly full self-driving vehicles are just the latest example: they're about as far away now as they were years ago. If anything they now seem much further away, because years ago many of us expected the exponential progress to continue, meaning full self-driving was just around the corner. We now have the wisdom to understand that, at minimum, that's one rather elusive corner.


"In literally every domain neural networks have been utilized in you reach asymptotic level diminishing returns."

Is that true, though? I think of "grokking", where long training runs produce huge jumps in generalization, but only after orders of magnitude more training than it takes for training error to look asymptotically low.

That would suggest both that the asymptotic limit you refer to isn't really there (something very important is still happening much later), and that there may be important paths to generalization on less training data that we haven't figured out yet. A rough sketch of the classic setup is below.
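Not from the thread, just a hedged sketch of what that looks like in practice: the canonical grokking setup (Power et al.) trains a small network on a modular arithmetic task with heavy weight decay and keeps logging train vs. validation accuracy long after train loss has converged. The modulus, MLP architecture, and hyperparameters below are illustrative assumptions (the original paper used a small transformer), and whether the validation jump actually appears depends on the exact setup.

    # Sketch of grokking-style monitoring: keep logging train vs. val accuracy
    # far past the point where training accuracy saturates.
    import torch
    import torch.nn as nn

    P = 97  # modulus for the toy task: predict (a + b) mod P

    # Full dataset of (a, b) pairs, split 50/50 into train/val.
    pairs = torch.cartesian_prod(torch.arange(P), torch.arange(P))
    labels = (pairs[:, 0] + pairs[:, 1]) % P
    perm = torch.randperm(len(pairs))
    split = len(pairs) // 2
    train_idx, val_idx = perm[:split], perm[split:]

    def encode(batch):
        # One-hot encode both operands and concatenate.
        a = nn.functional.one_hot(batch[:, 0], P).float()
        b = nn.functional.one_hot(batch[:, 1], P).float()
        return torch.cat([a, b], dim=1)

    model = nn.Sequential(nn.Linear(2 * P, 256), nn.ReLU(), nn.Linear(256, P))
    # Heavy weight decay is part of the usual grokking recipe.
    opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)
    loss_fn = nn.CrossEntropyLoss()

    def accuracy(idx):
        with torch.no_grad():
            preds = model(encode(pairs[idx])).argmax(dim=1)
            return (preds == labels[idx]).float().mean().item()

    for step in range(50_000):  # deliberately far past train-loss convergence
        opt.zero_grad()
        loss = loss_fn(model(encode(pairs[train_idx])), labels[train_idx])
        loss.backward()
        opt.step()
        if step % 1000 == 0:
            # The interesting signal is val accuracy jumping long after
            # train accuracy has saturated -- the "grokking" pattern.
            print(f"step {step}: train acc {accuracy(train_idx):.3f}, "
                  f"val acc {accuracy(val_idx):.3f}")

The point isn't the specific architecture; it's that if you only watched training error you'd stop long before the interesting generalization behavior ever showed up.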


I think training error itself has diminishing returns as a metric for LLM usefulness.

Past a certain point, a lower error doesn't imply better responses.



