
Seems like they are quite startled by Llama 2 and Code Llama, and by how rapid adoption is accelerating the AI race to zero. Why have this when Llama 2 and Code Llama exist and bring the cost close to $0?

This sounds like a huge waste of money for something that should just be completely on-device or self-hosted if you don't trust cloud-based AI models like ChatGPT Enterprise and want it all private and low-cost.

But either way, Meta seems to be already at the finish line in this race, and there is more to AI than the LLM hype.



> This sounds like a huge waste of money for something that should just be completely on-device or self-hosted

I can imagine this argument being made repeatedly over the past several decades whenever anyone makes a decision to use any paid cloud service. There is value in self-hosting FOSS services and managing them in-house, and there is value in letting someone else manage them for you. Ultimately it depends on the business use case and how much effort/risk you are willing to handle.


I'm really not sure this can be interpreted as them being startled by Llama 2 at all.

From the very beginning everyone knew data privacy & security would be one of the main issues for corporations.


If you could offer a stable 70B Llama API at half the price of the ChatGPT API, I would pay for it. I know HN likes to believe everything is close to $0, but that is hardly the case.


We offer a Llama-2-70B-chat API[0] at Deep Infra: $1 per 1M tokens, with streaming support and an OpenAI-compatible endpoint[1]. 0. https://deepinfra.com/meta-llama/Llama-2-70b-chat-hf 1. https://deepinfra.com/docs/advanced/openai_api
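"OpenAI-compatible" means existing client code only needs to swap the base URL and model name. A minimal stdlib-only sketch of building such a chat-completion request (the base URL and model identifier here are assumptions taken from the links above; check the linked docs before relying on them):

```python
import json
import urllib.request

# Assumed endpoint and model name -- verify against the Deep Infra docs linked above.
BASE_URL = "https://api.deepinfra.com/v1/openai"
MODEL = "meta-llama/Llama-2-70b-chat-hf"

def build_chat_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat-completion request (constructed, not sent)."""
    body = json.dumps({
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }).encode()
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# To actually call it (requires a valid key and network access):
# resp = urllib.request.urlopen(build_chat_request("YOUR_KEY", "Hello"))
# print(json.load(resp)["choices"][0]["message"]["content"])
```

The request/response shape is the same one the official OpenAI client libraries use, which is why pointing those libraries at a different base URL also works.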



So it is 50% more expensive than OpenAI. Even if it were comparable, it proves my point that you can hardly do it for a "cost close to $0".


Most teams don't want to self-host, and definitely don't want to have to run on-device, eating up their RAM.


I get the self-host part, but if you had a dedicated machine, would the RAM be an issue? Can you run it on a machine with, say, 128GB of RAM or the GPU equivalent?


There is no reason these models will be self-host only.


Agreed, and I can't wait for GPT-4 to have great competition in terms of ease, price, and performance. I was responding to this:

> something that should just be completely on-device or self-hosted if you don't trust cloud-based AI models like ChatGPT Enterprise and want it all private and low cost


Llama 2 is nowhere near the capability of GPT-4 for general-purpose tasks.


Less technical companies throw money at problems to solve them. Like mine, sadly... Even when a solution takes only a small amount of effort, companies will pay money for zero effort.


Zero execution risk, rather than zero effort. There’s always a 10% chance that implementation goes on forever and spending some money eliminates that risk.


Why should they solve it? If it's not a core competency, just buy it.


I can see some companies not having the technical ability to pull off offline LLMs, so this product could cater to that market.


Maybe, but that's why things like ollama.ai are trying to fill the gap. It's simple, and you don't need all of the heavyweight enterprise crap if nothing ever leaves your system.
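For anyone who hasn't tried it, the whole workflow with ollama is a couple of commands (model tags come from ollama's own library and may change; this is a sketch of typical usage, not from their docs verbatim):

```shell
# Pull and run Llama 2 locally with ollama (https://ollama.ai).
ollama pull llama2      # downloads the default chat model variant
ollama run llama2 "Summarize this commit message for me."
```

Nothing leaves the machine, which is the whole point of the "keep it private" argument above.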



