
One thing I like about AI tech is that the developing world is really eager to adopt it, and many use cases are free or affordable to them. My assistant in the Philippines has used it to substantially improve her communications, for instance.

It's much less popular in the USA and EU, but that's nice since it gives the developing world a chance to catch up.



> many use cases are free or affordable to them

Because the technology is so fast, efficient, and easy to run locally themselves? Or because there are currently remote APIs/UIs that are heavily subsidized by VC money, while the companies behind them have yet to become profitable?

I agree that giving the developing world any ladders for catching up is a great thing, but I'm not sure this is that. It just happens that these companies don't care about profit (yet), so things appear "free or affordable" to them, and when it's time to make pricing realistic, we'll see how accessible it still is.


I don't think there's much room for providers to make the tech substantially less affordable. In order to do that, the one(s) starting the cost increase would need to provide a distinctly better service than those that remain decently affordable. And somehow put a moat around that offering, which is essentially impossible with the number of players in the field.


The number of players in the field is sufficient to coordinate prices effectively.

You can also introduce ads to the model.

Amazon eventually started selling Amazon versions of popular cheap items, I can see the GenAI platforms doing the same.

Meme bots for example.


Inference is probably profitable in a unit-economics sense today; there have been multiple back-of-the-envelope calculations this year making that case. And with multiple high-quality open-weights models out there, I see no reason why competition between hosting providers won't drive prices toward the marginal cost of inference.
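
For illustration, here's roughly the shape of those back-of-the-envelope calculations. Every number below (rental price, throughput, utilization) is an assumption plugged in for the sketch, not a measured figure:

    # Rough unit-economics sketch for hosted inference.
    # All numbers are assumptions for illustration, not real data.
    gpu_hourly_cost = 2.50      # assumed $/hour to rent or amortize one GPU
    tokens_per_second = 1500    # assumed aggregate throughput with batching
    utilization = 0.5           # assumed fraction of each hour spent serving traffic

    tokens_per_hour = tokens_per_second * 3600 * utilization
    cost_per_million = gpu_hourly_cost / (tokens_per_hour / 1_000_000)
    print(f"~${cost_per_million:.2f} per million output tokens")
    # With these made-up numbers: ~$0.93 per million tokens, which is why people
    # argue that list prices of a few dollars per million can be profitable.

Change any of those assumptions and the answer moves a lot, which is also why the published estimates vary so widely.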


You're forgetting that with multiple high-quality open-weights models available, we're quickly reaching (or have already reached?) the point where running completely local models makes sense.

If the author of the grandparent comment (the person who wrote about the assistant in the Philippines) is reading this, I'd love it if you could run a simple experiment: instead of having her use the SOTA models for the things she's using AI for right now, have her use an open-source model (even a tiny-to-mid-sized one) and see what happens.

> "My assistant in the phillipines has used it to substantially improve her communications, for instance."

So if she's using it for communications, honestly even a small-to-mid-sized model would be good for her.

Please let me know how this experiment goes. I might write about it, and it's partly just plain curiosity, but I'd be 99% certain the differences would be so negligible that using SOTA or remotely hosted datacenter models won't make much sense. Of course, we can't say anything without empirical evidence, which is why I'm asking you to test my hypothesis.
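
If anyone does try it, a minimal sketch of the "small local model" arm of the experiment could look like this. It assumes a local Ollama install with its OpenAI-compatible endpoint; the model name (llama3.2:3b) and prompt are just illustrative choices:

    # Sketch: polish a rough email draft with a small local model via Ollama.
    # Endpoint, model name, and prompt are illustrative assumptions.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
    draft = "pls confirm meeting tmrw 3pm, also send the invoice file thx"

    response = client.chat.completions.create(
        model="llama3.2:3b",  # assumed small model already pulled locally
        messages=[
            {"role": "system", "content": "Rewrite the user's draft as a polite, professional email."},
            {"role": "user", "content": draft},
        ],
    )
    print(response.choices[0].message.content)

Point the same script at a hosted API instead and you have the other arm of the comparison.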


> You're forgetting that with multiple high-quality open-weights models available, we're quickly reaching (or have already reached?) the point where running completely local models makes sense.

I'm not, since I'm a heavy user of local models myself, and even with the beast of a card I work with daily (an RTX Pro 6000), the LLMs you can run locally are basically toy models compared to the hosted ones. If you haven't already, I think you need to give it a try yourself to see the difference. I didn't mention or address it because it's basically irrelevant in this context.

And besides that, how affordable are GPUs today in the developing world? Electricity costs? How to deal with thermals? Frequent blackouts? And so on; there are many variables you seemingly haven't considered yet.

The best way of measuring the difference between hosted models and local models is to run your own private benchmarks against both of them and compare. I've been doing this for years, and local models are still nowhere near the hosted ones, sadly. I'm eager for the day to come, but it will still take a while.
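
A private benchmark like that doesn't have to be fancy. Here's a minimal sketch of the idea, assuming both endpoints speak an OpenAI-style chat API; the URLs, model names, and prompts are placeholders you'd swap for your own:

    # Sketch of a tiny private benchmark: send the same prompts to a local and
    # a hosted endpoint, then score the answers yourself. All names below are
    # placeholders, not recommendations.
    from openai import OpenAI

    endpoints = {
        "local":  OpenAI(base_url="http://localhost:8000/v1", api_key="none"),
        "hosted": OpenAI(),  # reads OPENAI_API_KEY from the environment
    }
    models = {"local": "my-local-model", "hosted": "gpt-4o-mini"}

    prompts = [
        "Summarize this paragraph in one sentence: ...",
        "Rewrite this email to sound more formal: ...",
    ]

    for name, client in endpoints.items():
        for prompt in prompts:
            reply = client.chat.completions.create(
                model=models[name],
                messages=[{"role": "user", "content": prompt}],
            )
            print(f"[{name}] {prompt[:40]}\n{reply.choices[0].message.content}\n")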


I’ve got an M3 Max with 64 GB of RAM and can run more than just toy models, even if they are obviously less capable than hosted ones. Honestly, I think local LLMs are the future and we are just going to be doing hosted until hardware catches up (and now it has something to catch up to!).


> Honestly, I think local LLMs are the future and we are just going to be doing hosted

Same here, otherwise I wouldn't be investing in local hardware :) But I'd be lying if I said I think it's ready for that today. I don't think the hardware has as much to catch up on; it's the software that has a bunch of low-hanging fruit available for performance and resource usage, since every release seems to favor "time to paper" above all else.


There are lots of things you can do on local hardware already, and you don’t have to worry about safeguards or token limits. There are lots of crazy models, especially Chinese ones, that have a lot of capabilities and aren’t just there for academic papers.


Again, put those under test with your private benchmarks, then compare the results with hosted models.

I'm not saying it's completely useless, or that it won't be better in the future. What I am saying is that even the top "weights available" models today really don't come close to today's SOTA. This is very clear when you have benchmarks that give you hard, concrete numbers not influenced by public benchmarking data.


> even the top "weights available" models today really don't come close to today's SOTA.

This is the statement that I'm disagreeing with. They do come close; even if they are somewhat behind, it's a fixed distance, and the hosted models aren't an order of magnitude better. Hosted models are still better, just not incredibly so.


I 100% agree with your comment: yes, we should test the models on our own private benchmarks, and there is no denying that local has a long way to go.

I was just proposing that local feels like the most sustainable direction, perhaps alongside API aggregators like OpenRouter. But you can read my other comment on how I found their finances to be loss-making or zero-profit, so it's good while the AI bubble lasts if someone needs it; long term I feel their prices are going to rise, whereas local would remain stable. (Also worth mentioning that there is no free lunch, so I think the losses would be distributed to everybody in the form of a financial crisis caused by AI. I hope the impact of that crisis is lessened, because at this point I'm genuinely more worried about the crisis itself.)

Agreed. I understand that right now, using these bubble-money-subsidized services might make sense (see my other comments where I went down the rabbit hole on how most of these companies are losing money or barely breaking even while investing billions).

Although these prices aren't sustainable, the one path that makes sense to me is transitioning to locally run models (which, yes, I know are less efficient); that seems inevitable, in my opinion, if the bubble bursts. And there are definitely some steal deals around these days if one wants them.

Also, you may have understood me wrong in this comment (if so, my apologies): what I was describing was the assistant's use case, not a company using AI or selling AI-related services that needs 24x7 access.

One wouldn't have to worry about blackouts, because if your assistant's house loses power, let's be honest, hosted AI isn't going to turn the lights back on either.

Also, the chips in our everyday devices are beasts. I'm pretty sure that for basic communication tasks, as the grandparent comment suggested, even the "toy" models, as you call them, would very likely be good enough.


A tiny model would definitely be good enough for her use cases.


So yeah, a tiny model running locally (perhaps even on a phone) can solve her use case, so the moat for AI companies is close to zero (as expected).

This is what I was trying to say actually, thanks for responding.

That being said, the original point about Americans/Europeans becomes a bit moot after this, because I don't think most of them are against small models. What they hate is SOTA models run in AI-centric datacenters, which act as a tax on them by raising electricity rates and so on while displacing their workforce.

A tiny model, on the other hand, doesn't do any of the above. I do feel the concerns of Americans about AI datacenters are valid, though, so I hope something can be done about them in a timely and helpful manner that brings real help to the average American.


Sounds sensible, and I agree. But even with both of us making those assumptions/guesses, I still wouldn't claim that current AI tech is "free or affordable to them"; it's subsidized, so we can't really make claims about how affordable it is, at least not yet.


We can be pretty confident that these services are not subsidized. There are dozens of companies offering these services. Pretty much every single company has published open-weights models that you can run yourself. You could make money serving these open models at the same prices Google charges for Gemini, even while renting on-demand GPU instances from Google Cloud. It actually seems very implausible that Google is losing money on its proprietary models hosted on its own infrastructure. And OpenAI knows it has to compete with Google, which owns its own chips; OpenAI isn't going to be selling things at a loss. They cannot win that fight no matter how much Saudi money they get.


Again, I agree that it sounds plausible, but it doesn't guarantee anything, and the lack of hard data usually indicates things aren't as confidently profitable as you believe. Otherwise the companies would be bragging about it.

Probably in the end it'll be profitable for the companies somehow, but exactly how, or what the exact prices will be, I don't think anyone knows at this point. That's why I'm reserving my "developing countries can now affordably use AI too" for when that's reality, not based on guesses and assumptions.


Google publishes their profits quarterly, but they only do that because they are required to by law. They would prefer people assume they're offering these services at a loss so nobody attempts to compete with them.

But again, it's not a guess or assumption - you can run the latest DeepSeek model renting GPUs from a cloud provider, and it works, and it's affordable.
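
As a (hedged) sketch of what "rent GPUs and run it yourself" can mean in practice: on a rented multi-GPU node you could serve an open-weights model with vLLM roughly like this. The model ID, GPU count, and the assumption that the node has enough memory for a model this large are illustrative, not a tested recipe:

    # Rough sketch of self-hosting an open-weights model with vLLM on a rented
    # multi-GPU node. Model ID, parallelism, and hardware fit are assumptions.
    from vllm import LLM, SamplingParams

    llm = LLM(
        model="deepseek-ai/DeepSeek-V3",  # example open-weights model ID
        tensor_parallel_size=8,           # assumes an 8-GPU node with enough VRAM
        trust_remote_code=True,
    )
    params = SamplingParams(max_tokens=200, temperature=0.7)
    outputs = llm.generate(["Explain the marginal cost of inference."], params)
    print(outputs[0].outputs[0].text)

Whether that pencils out then comes down to the node's hourly rental price versus the tokens you actually push through it.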


I thought about it and here's my opinion:

There are two (technically three) ways this AI can be bought and used.

1. Renting GPU instances per minute (you mention Google Cloud, but I feel other providers can be cheaper too, since new companies usually are). The low-end hosting of AI nowadays is usually via marketplace-like services (Vast, RunPod, TensorDock).

Vast now offers serverless per-minute AI models, and checking it for something like https://vast.ai/model/deepseek-v3.2-exp (or even GLM 4.6), basically every one of these comes out to about $0.30/minute, or $18 per hour.

As an example, GLM 4.6 (now 4.7) has a YEARLY subscription price of around 30 bucks IIRC, so compare the immense difference in pricing.

2. Using something like OpenRouter-style per-token pricing: then we are basically on a pricing model similar to Google Cloud's.

Of course, open models are reaching the frontier and I am cheering for them, but I feel that long term (and even short term) these options are still pretty expensive (even something like OpenRouter, IMO).

Someone please do the genuine maths on this; I can be wrong, I usually am, but I'd expect a 2-3x price increase (on the conservative side of things) if things weren't subsidized.

These are probably tens of billions of dollars' worth of GPUs, so I assume they'd be barely profitable at current rates, but they generate (in some cases) hundreds of billions of tokens' worth of output, so they can probably make it work via the third use case I mention.
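
To make the comparison from point 1 concrete, here's a rough sketch of the arithmetic, using the $18/hour serverless rate and the ~$30/year subscription mentioned above; the usage pattern is an assumption:

    # Rough comparison of per-hour serverless GPU pricing vs a yearly
    # subscription, using the figures quoted above. Usage pattern is assumed.
    serverless_rate = 18.0       # $/hour, the Vast serverless figure above
    yearly_subscription = 30.0   # $/year, the GLM subscription figure above
    hours_per_day = 1.0          # assumed light personal use
    days_per_year = 365

    serverless_yearly = serverless_rate * hours_per_day * days_per_year
    print(f"Serverless: ~${serverless_yearly:,.0f}/year")    # ~$6,570/year
    print(f"Subscription: ${yearly_subscription:,.0f}/year") # $30/year
    print(f"Ratio: ~{serverless_yearly / yearly_subscription:,.0f}x")

The two aren't one-to-one comparable (the serverless instance is dedicated capacity you mostly wouldn't use), but that's partly the point: per-person subsidized subscriptions look absurdly cheap next to raw compute.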

Now, coming to the third point (which I assume is related to the first two): the companies providing this GPU compute usually make money through large long-term contracts.

Even Hugging Face provides consulting services, which I think is its biggest source of profit, and another big contender is probably European GPU compute providers, who can offer a layer of safety or privacy for EU companies.

I had to go to Reddit to find some more info (https://www.reddit.com/r/LocalLLaMA/comments/1msqr0y/basical...); quoting the relevant parts of appenz's comment:

The large labs (OpenAI, Anthropic) and Hyperscalers (Google, Meta) currently are not trying to be profitable with AI as they are trying to capture market share. They may not even try to have positive gross margins, although the massive scale limits how much they can use per inference operation.

Pure inference hosters (Together, Fireworks etc.) have less capital and are probably close to zero gross margins.

There are a few things that make all of this more complicated to account for. How do you depreciate GPUs (I have seen 3 years to 8 years), how do you allocate cost if you do inference during the day and train at night etc.

The challenge with doing this yourself is that the market is extremely competitive. You need massive scale (as parallelism massively reduces cost), you need to be very good in negotiating cheap compute capacity and you need to be cost-effective in your G2M.

Opinions are my own, and none of this is based on non-public information.

So basically all of these are probably running at zero or net-negative margins, they require billions of dollars of spend, and there's virtually no moat/lock-in (nor does there have to be).

TLDR: no company right now is sustainable

The only use case I can see is probably consulting, but that will go as described in https://www.investopedia.com/why-ai-companies-struggle-finan...

So I guess the only business that feels reasonable to me is private AI for large companies that genuinely need it (once again, the MIT study applies), but that usually wouldn't apply to us normal consumers anyway; it would be really expensive, albeit private, and far out of reach for normal people.

TLDR: The only ones making money are (or will be) B2B, but even those will dwindle if the AI bubble bursts. Imagine a large business trying to explain why it's going to use AI when 1) the MIT study shows it's unprofitable and 2) there's fear around using AI, plus all the financial consequences the bubble's burst might cause.

So, all that said, I doubt it. I think these prices only last as long as the bubble does, and the bubble is only as strong as its weakest link, which right now is OpenAI: trillions promised, a net loss-making company whose CEO has said the AI market is in a bubble and whose CFO openly floats the idea that OpenAI should be bailed out by the US government if need be.

So yeah... honestly, even consumer-grade GPUs are expensive, but with the progress in open-weights models, I feel they'll be the way to go for 90% of basic use cases, and there are very few cases where a moat exists (I doubt it exists in the first place).


You can get 90% of the value with older LLMs that cost the companies very little to run.



