1. I don't know what kind of world you live in to think that USD 3500 is "less than one week of a developer salary for most companies." I think you really just mean FAANG (or whatever the current acronym is) or potentially SV / offices in cities with very high COL.
2. The problem is scaling. To support billions of search queries you would have to invest in a lot more than a single GPU. You also wouldn't only need a single van, but once you take scaling into account even at $3500 the GPUs will be much more expensive.
That said, costs will come down eventually. The question in my mind is whether OpenAI (which already has the hardware resources and is backed by Microsoft funding to boot) will be able to dominate the market to the extent that Google can't make a comeback by the time they're able to scale.
> 1. I don't know what kind of world you live in to think that USD 3500 is "less than one week of a developer salary for most companies." I think you really just mean FAANG (or whatever the current acronym is) or potentially SV / offices in cities with very high COL.
I live in the real world, at a small company with <100 employees, a thousand miles away from SV.
$3500 * 52 ≈ $182k a year, which breaks down to roughly a $120k salary plus $60k for taxes, social security, insurance, and other benefits; that isn't anywhere near FAANG level.
Even if you cut it in half and say it's 2 weeks of dev salary, or 3 weeks after taxes, it's not unreasonable as a business expense. It's less than a single license for some CAD software.
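The back-of-envelope above can be sketched in a few lines of Python. The weekly figure and the salary/overhead split are the thread's rough assumptions, not real compensation data:

```python
# Back-of-envelope: what a $3,500/week total developer cost implies annually.
# All figures are the thread's assumptions, not real compensation data.
weekly_cost = 3500                  # total cost to employer per week, USD
annual_cost = weekly_cost * 52      # 182_000
salary_share = 120_000              # hypothetical base-salary portion
overhead = annual_cost - salary_share  # taxes, insurance, benefits, etc.

print(annual_cost)  # 182000
print(overhead)     # 62000
```

On these numbers a $3,500 GPU really is about one week of fully loaded developer cost, which is the point being argued.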
> 2. The problem is scaling. To support billions of search queries you would have to invest in a lot more than a single GPU. You also wouldn't only need a single van, but once you take scaling into account even at $3500 the GPUs will be much more expensive.
Sure, but you don't start out with a fleet of vans, and you wouldn't start out with a "fleet" of GPUs. A smart business would start small and use their income to grow.
GP is thinking in employer-cost terms. The cost of a developer to a company is the salary stated in the contract, plus taxes, health insurance, pension, and so on, plus the office rent for the developer's desk, plus the hardware used, plus a fraction of the cost of HR staff and offices, cleaning staff, lunch staff... it adds up. $3500 isn't a lot for a week.
Most of these items are paid for by the company, and most people would not consider the separate salary of the janitorial or HR staff to be part of their own salary.
I agree, most people wouldn't. But that leads to a lot of misunderstandings when some people think in terms of what they earn and others in terms of what those same people cost their employers.
So you get situations where someone names a number and someone else reacts by thinking it's horribly, unrealistically high: the former person thinks in employer terms, the latter in employee terms.
1 - Yes, I agree on this, but even so, most developers are already investing in SOTA GPUs for other reasons (so it's not as much of a barrier as purported).
2 - Is scaling not a problem in other industries? If you want to scale your food truck, you will need more food trucks, so this doesn't really do anything for your point.
GGML and GPTQ have already revolutionised the situation, and now there are tiny models with insane quality as well, that can run on a conventional CPU.
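To make the GGML/GPTQ point concrete, here is a toy sketch of the core idea behind low-bit weight quantization (the real libraries do this far more cleverly, with grouping and error correction): store weights as small integers plus a scale factor, cutting memory roughly 8x versus float32 at some accuracy cost. The array sizes and seed are arbitrary illustration values:

```python
import numpy as np

# Toy 4-bit symmetric quantization of one "weight row".
rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32)  # pretend model weights

scale = np.abs(w).max() / 7                        # map range to -7..7
q = np.clip(np.round(w / scale), -7, 7).astype(np.int8)  # 4-bit values
w_hat = q * scale                                  # dequantize for compute

# Reconstruction error is bounded by half a quantization step.
max_err = np.abs(w - w_hat).max()
assert max_err <= scale / 2 + 1e-6
print(f"max error: {max_err:.4f} (step size {scale:.4f})")
```

This is why quantized models fit on consumer hardware: the int4 values take a quarter of the space of fp16 weights, and the per-group scale overhead is small.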
I don't think you have any idea what is happening around you, and this is not me being nasty: just go and take a look at how exponential this development is and you will realise that you need to get in on it before it's too late.
You seem to be in a very particular bubble if you think most developers can trivially afford high-end GPUs and are already investing in SOTA GPUs. I know a lot of devs across a wide spectrum of industries and regions, and I can think of only one person who might fit your suggested demographic.
Perhaps I should clarify that when I say SOTA GPU, I mean an RTX 3060 (midrange), which has 12GB of VRAM and is a good starting point for climbing into the LLM market.
I have been playing with LLMs for months now, and for long stretches had no access to a GPU due to daily scheduled rolling blackouts in our country.
Even so, I am able to produce insane results locally with open-source efforts on my RTX 3060, and I am now starting to feel confident enough to take this to the next level, either by using the cloud (computerender.com for images) or something like vast.ai to run my inference (or even training, if I spend more time learning). If that goes well, I will feel confident taking the next step: getting an actual SOTA GPU.
But that will only happen once I have gained sufficient confidence that the investment will be worthwhile.
Regardless, apologies for suggesting the RTX 3060 is SOTA, but to me, in a third-world country, being able to run vicuna13b entirely on my 3060 at reasonable inference rates is revolutionary.
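A quick weights-only estimate shows why a 13B model only fits on a 12GB card once it's quantized. This sketch ignores the KV cache and activation memory, so real usage is somewhat higher than these numbers:

```python
# Rough VRAM needed just to hold 13B parameters at different precisions.
# Ignores KV cache and activations, so treat these as lower bounds.
params = 13e9
for bits, name in [(16, "fp16"), (8, "int8"), (4, "4-bit (GPTQ/GGML-style)")]:
    gib = params * bits / 8 / 2**30   # bytes -> GiB
    print(f"{name}: {gib:.1f} GiB")
```

At fp16 the weights alone need about 24 GiB, well past a 3060's 12GB, while the 4-bit version comes out around 6 GiB, leaving headroom for the KV cache.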