
It ironically seems like a very similar market to internet search. There was no moat there either, other than the capital needed to bankroll a better search engine. A lot of these AI companies will eventually fail (not because their models will be significantly worse, but because they fail to commercialize), the market will consolidate around only a couple of players (maybe two in the US, one in China, and maybe one in Russia), and once that happens, raising enough capital to build a competitive AI company will seem impossible. Exactly what transpired with internet search after Google won most of the market.


Oof, no -- it's quite the opposite, which may well lead to Google's collapse down the road.

Holding exabytes of data, processed on commodity hardware to enable internet-wide search, all monetised by a man-in-the-middle ad business, created a tremendous moat. Entering that market is limited to tech multinationals, and even they have to deliver a far superior experience to overcome it. To perform a Google search you need Google-sized data centres.

Here we have exactly the opposite dynamics: high-quality search results (prompt answers) are, as of now, incredibly commoditized, and accessible at inference time to anyone with $25k of hardware. That's going to be <= $10k soon.

And innovation in the space has also gone from needing >$1B to <=$50M.

A higher-quality search experience is available now at absolutely trivial prices.
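
For a sense of how low the bar is, here's a minimal sketch of querying a locally hosted open-weight model through an OpenAI-compatible endpoint (llama.cpp's server and vLLM both expose one). The port and model name below are placeholders for whatever you happen to be running, not a real deployment:

    # Minimal sketch: ask a locally hosted open-weight model a question
    # via an OpenAI-compatible HTTP endpoint. URL, port, and model name
    # are assumptions for illustration only.
    import requests

    resp = requests.post(
        "http://localhost:8080/v1/chat/completions",  # hypothetical local server
        json={
            "model": "local-model",  # placeholder model name
            "messages": [
                {"role": "user", "content": "Who invented PageRank?"}
            ],
        },
        timeout=60,
    )
    print(resp.json()["choices"][0]["message"]["content"])

No ad auction, no crawler fleet, no data centre: one box answering questions directly.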


That's only because LLMs haven't been a target until now. Search worked great before everything became algorithmically optimised to hell. Over time, information quality degrades: as metric manipulation becomes more effective, every quality signal becomes weaker.

Right now, automated knowledge gathering absolutely wipes the floor with automated manipulation. Cloudflare has an AI blocker, which still can't stop residential proxies with suitably configured crawlers. The technology for LLM crawling/training is still mostly opaque, even to engineers, so no SEO wranglers have managed to game training-data filters successfully. And all LLMs have access to the same dataset: the internet.

Once you:

1. Publicly reveal how training data is pre-processed (see the sketch below for the kind of thing that would get gamed)

2. Roll out a reputation score that makes it hard for bots to operate

3. Begin training on non-public data, such as synthetic datasets

4. Give manipulated data a few more years to accumulate and find its way into training data

It becomes a lot harder.
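
To make point 1 concrete, here's a deliberately toy sketch of the kind of heuristic pre-processing filter I mean. Every signal and threshold here is invented for illustration; real pipelines are far more elaborate. The point is that once rules like these are public, content farms can tune pages to sail through them:

    # Toy sketch of a heuristic training-data quality filter.
    # Signals and thresholds are invented for illustration,
    # not taken from any real pipeline.
    def passes_quality_filter(doc: str) -> bool:
        words = doc.split()
        if len(words) < 50:  # too short to be informative
            return False
        unique_ratio = len(set(words)) / len(words)
        if unique_ratio < 0.3:  # repetitive keyword stuffing
            return False
        if doc.count("http") / len(words) > 0.1:  # link farm
            return False
        return True

Each check is trivial to satisfy deliberately once you know it exists, which is exactly what happened to every public search-ranking signal.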


Google did absolutely have a moat on internet search. It wasn't just about bankrolling an alternative, as Microsoft proved time and time again.



