
Seems like they are quite startled by Llama 2 and Code Llama, and by how rapid adoption is accelerating the AI race to zero. Why have this when Llama 2 and Code Llama exist and bring the cost close to $0?

This sounds like a huge waste of money for something that should just be completely on-device or self-hosted if you don't trust cloud-based AI models like ChatGPT Enterprise and want it all private and low-cost.

But either way, Meta seems to be already at the finish line in this race, and there is more to AI than the LLM hype.



> This sounds like a huge waste of money for something that should just be completely on-device or self-hosted

I can imagine this argument being made repeatedly over the past several decades whenever anyone makes a decision to use any paid cloud service. There is value in self-hosting FOSS services and managing them in-house, and there is value in letting someone else manage them for you. Ultimately it depends on the business use case and how much effort/risk you are willing to handle.


I'm really not sure this can be interpreted as them being startled by Llama 2 at all.

From the very beginning everyone knew data privacy & security would be one of the main issues for corporations.


If you could offer a stable 70B Llama API at half the price of the ChatGPT API, I would pay for it. I know HN likes to believe everything is close to $0, but that is hardly the case.


We offer a Llama-2-70B-chat API[0] at Deep Infra: $1 per 1M tokens, with streaming support and an OpenAI-compatible endpoint[1]. 0. https://deepinfra.com/meta-llama/Llama-2-70b-chat-hf 1. https://deepinfra.com/docs/advanced/openai_api
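"OpenAI-compatible" means existing client code only needs to swap the base URL and model name. A minimal stdlib-only sketch of building such a chat-completion request (the base URL and model identifier here are assumptions taken from the links above; check the linked docs before relying on them):

```python
import json
import urllib.request

# Assumed endpoint and model name -- verify against the Deep Infra docs linked above.
BASE_URL = "https://api.deepinfra.com/v1/openai"
MODEL = "meta-llama/Llama-2-70b-chat-hf"

def build_chat_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat-completion request (constructed, not sent)."""
    body = json.dumps({
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }).encode()
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# To actually call it (requires a valid key and network access):
# resp = urllib.request.urlopen(build_chat_request("YOUR_KEY", "Hello"))
# print(json.load(resp)["choices"][0]["message"]["content"])
```

The request/response shape is the same one the official OpenAI client libraries use, which is why pointing those libraries at a different base URL also works.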



So it is 50% more expensive than OpenAI. Even if it were comparable, it proves my point that you can hardly do it for a "cost close to $0".


Most teams don't want to self-host, and definitely don't want to have to run on-device, eating up their RAM.


I get the self-host part, but if you had a dedicated machine, would the RAM be an issue? Can you run it on a machine with, say, 128GB of RAM or the GPU equivalent?


There is no reason these models will be self-host only.


Agreed, and I can't wait for GPT-4 to have great competition in terms of ease, price, and performance. I was responding to this:

> something that should just be completely on-device or self-hosted if you don't trust cloud-based AI models like ChatGPT Enterprise and want it all private and low cost


Llama 2 is nowhere near the capability of GPT-4 for general-purpose tasks.


Less technical companies throw money at problems to solve them. Like mine, sadly... Even when a solution takes only a small amount of effort, companies will pay money for zero effort.


Zero execution risk, rather than zero effort. There’s always a 10% chance that implementation goes on forever and spending some money eliminates that risk.


Why should they solve it? If it's not a core competency, just buy it.


I can see some companies not having the technical ability to pull off offline LLMs, so this product could cater to that market.


Maybe, but that's why things like ollama.ai are trying to fill the gap. It's simple, and you don't need all of the heavyweight enterprise crap if nothing ever leaves your system.
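For anyone who hasn't tried it, the whole workflow with ollama is a couple of commands (model tags come from ollama's own library and may change; this is a sketch of typical usage, not from their docs verbatim):

```shell
# Pull and run Llama 2 locally with ollama (https://ollama.ai).
ollama pull llama2      # downloads the default chat model variant
ollama run llama2 "Summarize this commit message for me."
```

Nothing leaves the machine, which is the whole point of the "keep it private" argument above.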



