I think there are plenty of people who remain skeptical of their utility for this application.
People who want to get rich will tell you it's the next greatest thing that will revolutionize the industry.
Personally, I've been annoyed at how confidently wrong ChatGPT can be. Even when you point out the error and ask it to correct the mistake, it comes back with an even-more-wrong answer. And it frames it as if the answer is completely, 100% correct and accurate. Because it's essentially really deep auto-complete, it's designed to generate text that sounds plausible. This isn't useful in a search context when you want to find sources and truth.
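To see what I mean by "really deep auto-complete", here is a rough sketch (assuming the Hugging Face transformers package and the public gpt2 weights) of the loop these models run: score every candidate next token, emit a likely one, repeat. Nothing in it checks for truth.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    ids = tok("The capital of Australia is", return_tensors="pt").input_ids
    with torch.no_grad():
        for _ in range(5):
            logits = model(ids).logits[0, -1]   # score every candidate next token
            next_id = torch.argmax(logits)      # most plausible, not most true
            ids = torch.cat([ids, next_id.view(1, 1)], dim=1)
    print(tok.decode(ids[0]))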
I think there are useful applications for this technology, but I think we should leave that to the people who understand LLMs best and keep the charlatans out of it. LLMs are really interesting and have improved by leaps and bounds... but I don't see how replacing entire institutions and processes with something that is only well understood by a handful of people is a great idea. It's like watering plants with Gatorade.
> People who want to get rich will tell you it's the next greatest thing that will revolutionize the industry.
Reminds me of the web3 crypto hype, and the same thing happened with GPT-3: a closed-source, black-box AI model hidden behind a SaaS, confidently generating wrong answers as the truth.
Sounds like OpenAI (effectively Microsoft's new AI division) is selling snake oil again.
> I think there are useful applications for this technology, but I think we should leave that to the people who understand LLMs best and keep the charlatans out of it.
There are certainly charlatans, grifters, and snake-oil salespeople pretending and hyping their so-called 'AI startup' when it actually just uses the GPT-3 or ChatGPT API. Another emperor-has-no-clothes confidence trick generating garbage.
Given that ChatGPT cannot transparently explain its own answers when questioned, especially when it confidently generates wrong ones, you cannot trust its output. Telling it to give you a sophisticated answer to an 'un-googleable' question is where you see it clearly trip up.
What would really 'revolutionize the industry' is an open-source LLM that is smaller and more transparent than ChatGPT, like what Stable Diffusion did to DALL-E 2.
Don’t mistake the fallibility of ChatGPT for a problem with the entire field. It would be like saying transistors don’t have much of a future because they are too big to fit many in a device.
The Bing OpenAI integration is already dramatically different and better than ChatGPT. I’d really advise trying it before forming too many opinions. Certainly don’t mistake any experience you have with ChatGPT as indicative.
> Don’t mistake the fallibility of ChatGPT for a problem with the entire field. It would be like saying transistors don’t have much of a future because they are too big to fit many in a device.
Fundamentally, GPT and LLMs in general are still black-box neural networks, and they all share the same drawbacks:
* More data needed to train them, which then...
* ...increases the size of the model (bigger with each revision)
* Opaqueness (they cannot transparently explain their own decisions when they confidently generate the wrong answer)
* Massive cost involved in training, retraining, and fine-tuning them
The best part is that you can't even get an explanation of why it is generating nonsense, why it overfitted, or why it got confused by an invalid character or input. Little to nothing fundamental about neural networks (which is what LLMs are based on) has changed. It essentially amounts to: 'just spend more money training it on more data.'
> The Bing OpenAI integration is already dramatically different and better than ChatGPT. I’d really advise trying it before forming too many opinions.
What I have said about ChatGPT and GPT-3 still holds true, and hyping a chatbot that confidently generates nonsense and cannot cite its own sources makes it highly untrustworthy to use. It remains forever bound to the fundamental shortcomings of neural networks, which is what LLMs are built on. They are great sophists and bullshit generators.
At least what needs to exist is an open-source LLM that is more transparent and is smaller than GPT-3 or GPT-4. There is a basic reason why OpenAI won't do that.
You've made a few mistakes, all rooted in not understanding that Bing Chat is not GPT, nor is it ChatGPT. You're stuck on LLMs while the world is moving on to LLMs as a component, not the whole solution.
And even in LLMs, it sounds like you're still seeing training/retraining as single monolithic events which are super expensive. Training is super expensive! New techniques (google LoRA; there are others) are changing that.
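For reference, the core LoRA trick is small enough to sketch in a few lines of PyTorch. This is a toy illustration of the idea, not the reference implementation: freeze the pretrained weights and train only a low-rank correction, so fine-tuning touches a tiny fraction of the parameters.

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        """Wrap a frozen pretrained linear layer with a trainable low-rank update."""
        def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
            super().__init__()
            self.base = base
            for p in self.base.parameters():
                p.requires_grad = False          # pretrained weights stay frozen
            self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
            self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: starts as a no-op
            self.scale = alpha / r

        def forward(self, x):
            # frozen full-rank path + trainable low-rank correction
            return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale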
I hope you get into what has changed in the past few years, starting with Transformer and continuing pretty much daily. There really is a lot of innovation and improvement that is publicly documented and missing from your opinion here.
> You've made a few mistakes, all rooted in not understanding that Bing Chat is not GPT, nor is it ChatGPT. You're stuck on LLMs while the world is moving on to LLMs as a component, not the whole solution.
It is a GPT (it uses OpenAI's), and fundamentally it is a black-box neural network that, when asked to explain its own decisions, still cannot transparently do so. This has been an unsolved problem for decades and is still overlooked by every AI hype cycle to date.
> And even in LLMs, it sounds like you're still seeing training/retraining as single monolithic events which are super expensive. Training is super expensive! New techniques (google LoRA; there are others) are changing that.
Perhaps that is why only the Big Tech companies are training LLMs, while the so-called 'AI companies' aren't doing it themselves and still sit on their AI APIs; hence why you could only name Google, who can afford to foot the bill for training their own LLM.
Everyone else has to sit on their APIs just like before.
> I hope you get into what has changed in the past few years...
I think you have struggled to address the other shortcomings I already outlined, which both LLMs and, fundamentally, neural networks still face.
Until I see an open-source LLM, or some other open-source solution, that matches or surpasses OpenAI's GPT offerings in parameters and explainability and is small enough in model size to fit on a smartphone, we can discuss whether this AI cycle is worth the hype. Having yet another AI API product with even larger and more expensive models sitting behind a SaaS is an indication of another AI bubble.
This is a perfect example of the popular view on here, and in my humble naive opinion it’s completely mistaken. The point isn’t “can an LLM replace google” the point is “can robots that can speak English and use logic improve the search experience” which I think basically everyone would answer “yes” to. Complaining that it gets stuff wrong when not hooked up to a web of resources to cite is, IMO, completely missing the point.
Also OP (not so much you) is way too caught up in the “chat” aspect - that is the first exciting UX that got popular, but these are much, much more than chatbots. Pretending that they’re human/conscious/in a conversation is fun, but having an invisible partner that knows you and tailors your search results… that’s powerful.
For example, you’ll never have to add “Reddit” again, or at least you’ll only have to tell it once. An LLM can easily identify the kind of questions where you want forum posts, read thousands of posts in a second, summarize all their content, and label each link with other information that helps you decide which threads to read in full.
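A sketch of what I mean, where llm() and fetch_threads() are hypothetical stand-ins for a completion API and a forum scraper (the shape of the pipeline is the point, not any particular vendor):

    def llm(prompt: str) -> str:
        """Stand-in for whatever completion API you use."""
        raise NotImplementedError

    def fetch_threads(query: str) -> list[str]:
        """Stand-in for pulling the text of forum threads matching a query."""
        raise NotImplementedError

    def forum_summaries(query: str) -> list[str]:
        # step 1: classify whether the query calls for forum posts at all
        wants_forums = llm(
            f"Answer yes or no: does this query call for forum discussions? {query}"
        ).strip().lower().startswith("yes")
        if not wants_forums:
            return []
        # step 2: summarize each thread so the user can decide what to read in full
        return [
            llm(f"Summarize this thread in one sentence:\n{thread}")
            for thread in fetch_threads(query)
        ]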
As someone who understands how these models are built and what they do, let me just say that almost all of what you think these models can do is wrong.
For one, you can't just "hook up" a language model to some other task, or to the web. ChatGPT is specifically built and trained to have good conversations. The fact that it can sort of appear to do other things is a happy coincidence.
To do any of what you want, new algorithms need to be built, and none of that is "easy". And finally, these models take A LOT of CPU time. They are not going to be reading thousands of posts in a second without serious and expensive compute hardware backing it, and that level of compute isn't remotely feasible to give out to individual users.
Even ChatGPT, which is doing a fraction of the tasks you are listing, costs millions of dollars' worth of hardware a day. The only reason it exists for free is that Microsoft has donated all that time.
And because OpenAI exploits labour markets in places like Kenya that have weaker labour protection laws and lower minimum wages than in developed countries.
They had to pay someone to label data and filter the worst "content" humanity has to offer. Otherwise it would've ended up like numerous other attempts at exposing "AI" to the Internet.
So it also has a huge human cost that OpenAI is not properly accounting for (and another reason why dreaming up potential use-cases for the technology as if it will be miniaturized and become a commodity in the near future takes some wilful ignorance).
Interesting reply, thanks for taking the time to share your expertise. I definitely wasn’t considering the economics of the question, and I tend to agree with you - supporting LLM queries with ad dollars seems impossible in their current state.
But ChatGPT is already pretty darn good at summarizing and communicating. By “hook up” I literally just mean feed text from the internet into the prompt, followed by whatever you want it to do with it: summarize, rank, whatever. Ignoring the economics for a moment (paid search engines?) and assuming for simplicity that GPT-3.5 is the very best LLM we’ll ever get: would you still say you don’t think tweaked versions of such models would MASSIVELY improve search?
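To be concrete about “hook up”, I mean nothing fancier than this (requests is the real library; llm() is again a hypothetical stand-in for a completion call):

    import requests

    def llm(prompt: str) -> str:
        raise NotImplementedError  # hypothetical completion call

    def summarize_url(url: str) -> str:
        page = requests.get(url, timeout=10).text
        # truncate so the page fits in the model's context window
        return llm(f"Summarize this page in three bullet points:\n\n{page[:4000]}")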
These technologies can and likely will help improve search in the future, but there is still a ton of work to be done, both on how specifically to use them and on scalability.
There is also the business model to sort out. Right now search is primarily driven by ads, which I doubt will cover the costs of the sort of ultra-personalization that you're thinking about. Also, reducing the time you spend on a search engine or looking through results will further reduce ad revenue. However, I can see paid search engines perhaps leading to this.
So yes, eventually these models can help improve search, just not in the form that we have today. In a few years the story could well be different. I'm quite interested in seeing how Bing integrates ChatGPT technology. They claim that they've created some new model on top of it that somehow links it to search results.
Kinda loose spitballing idea, but couldn't you ask ChatGPT to produce a (set of) queries for a search engine that would help a given person find the information they're looking for? Wouldn't "hooking up" ultimately just be a matter of translating intent into a known set of commands?
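Something like this, say (llm() is a hypothetical stand-in for a ChatGPT-style completion API):

    def llm(prompt: str) -> str:
        raise NotImplementedError  # stand-in for a completion API

    def expand_query(question: str) -> list[str]:
        # translate fuzzy intent into concrete queries for an ordinary search engine
        raw = llm(
            "Rewrite this question as three short web search queries, "
            f"one per line:\n{question}"
        )
        return [line.strip() for line in raw.splitlines() if line.strip()]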
That’s a good idea but I’m guessing that using it for query expansion will only lead to marginal improvements as you are still limited by the main search engine.
If the point isn't how an LLM replace a search engine, then why is Bing using an LLM to replace their search engine?
When you ask whether speaking English and using logic can improve the search experience, I wonder what you consider the most important parts of the search experience? I think many people, most of the time, might say that "accurate information" is their highest expectation, with "a pleasant conversation" somewhere below that. Delivering a plausible-sounding and pleasant answer that's completely wrong is... well, that's not a search engine I can depend on, is it?
You're hypothesizing a few things at the end that sound great! It's completely unclear whether any of those things will actually end up happening, so I think the focus on what is available today, with Chat-GPT and Bing, is more apt than a focus on what could be.
My basic answer is “they’re trying to rush stuff out the door because the people running bing have no idea what they’re doing” :) given that the things I propose don’t need any new inventions, I’d say they’re good to discuss and coming in the next few years for sure.
And I totally agree that a) a search engine that doesn’t cite its sources is useless, and b) you almost never want to chat with google like it’s a person. So you’re spot on. But the point I was trying to make is that the main use case is in stuff like automated summaries, specialized page rankings, expanding quick informal queries into longer formal-logic ones, etc.
I don't think Bing is replacing their engine with an LLM. Seems like they're complementing the engine with the LLM, basically replacing the old blurb you'd sometimes get with the LLM response.
> "accurate information" is their highest expectation
The point is, "accurate information" is hard. Google's solution is snippets and while it might be fine for some cases, it fails terribly for others. There is zero guarantee an AI-based solution would be more precise, but for sure it will be way more confident - just like ChatGPT is.
> The point isn’t “can an LLM replace google” the point is “can robots that can speak English and use logic improve the search experience” which I think basically everyone would answer “yes” to. Complaining that it gets stuff wrong when not hooked up to a web of resources to cite is, IMO, completely missing the point.
I think the complaints are more about the "use logic" point than the sources. From my limited understanding, I would not say LLMs currently use logic.
Hmm, interested to hear why you say that. Not to be THAT guy, and this might get me banned from HN, but this is ChatGPT’s response to your point; I would say that it clearly shows the capacity for logic. Certainly not first-order logic built into a symbolic AI model, but definitely logic.
The start of its lengthy response:
“ The use of probabilistic logic models in LLM can lead to more sophisticated and nuanced logical reasoning. These models represent knowledge about the world in a probabilistic form, which allows the LLM to take into account uncertainty and make inferences based on probabilities.
For example, an LLM system trained on a large knowledge base might encounter a question that it has not seen before. Using probabilistic reasoning, the LLM…”
They obviously do, they just aren’t perfect at it. You can get the LLM to display (simulations of agents displaying) quite clear logical thinking in certain scenarios, for example linguistic IQ tests. Gwern has written extensively about this.
There is a general issue where AIs fail in different ways than humans do, and the failures look really dumb to a human. So humans tend to judge that dumbness on the human scale. Instead, I’d suggest they just have a dramatically different spider-graph of capabilities than a human, and are overall more capable than the “dumb spreadsheet / parrot” narrative admits. (Definitely not human-level IQ yet, to be clear.)
They don't need to imo. I use ChatGPT to help me find useful search keywords when I'm not exactly sure what I'm looking for. Like recently it helped me find an artist I had forgotten the name of based on his style. I think we can have both, idk
> An LLM can easily identify the kind of questions where you want forum posts, read thousands of posts in a second, summarize all their content,
I question if this is something that users need at the scale you're assuming. Wikipedia has existed for 20 years, summarizing an enormous breadth of human knowledge, with some articles having thousands of human editors. It's a boon for civilization in the way libraries are. But has it disrupted anything besides Microsoft's overpriced Encarta DVD market?
You're putting a lot of faith in computer models to provide accurate, both-sides'ed information on complex topics in a format that amounts to a soundbite.
> The point isn’t “can an LLM replace google” the point is “can robots that can speak English and use logic improve the search experience” which I think basically everyone would answer “yes” to.
Can you give me 3 example queries (questions and answers or typical searches) that are clear cut wins for a search engine application?
"Personally, I've been annoyed at how confidently wrong ChatGPT can be. Even when you point out the error and ask it to correct the mistake it comes back with an even-more-wrong answer"
That also happens with real people so...
A web search also returns wrong answers, because it is not magic. It just searches through all the garbage out there.
You just have to be aware of its flaws and limitations... as you do with your fellow humans.
It's not a person. It's software. It shouldn't get a pass for being wrong "just like humans."
Imagine opening your banking app and your total savings is wrong. Would you say, "It's okay, people make mistakes, too!" or would you be pissed that your bank's software is incorrect? Why are we pretending like this is any different?
Because LLMs aren't doing calculations that result in precise mathematical answers like your bank should be doing. All they are is very advanced pattern-matching machines. They match patterns, but those patterns don't always relate to accurate information.
There you go. The difference is that a human can transparently explain to you, step by step, how they arrived at a given answer. Humans have always used multiple methods to explain their decisions transparently when questioned.
AI machines, and especially the neural networks that LLMs are based on, still cannot explain themselves and are as transparent as a black box.
I don’t usually ask a random person about some topic I want sound information on. I ask an expert on that topic. If a ChatGPT-like AI can’t fulfill that role (and ChatGPT usually can’t), then it’s not very useful for that.
And yet, when I google something, am I landing on an expert's page, or on an SEO optimizer whose day job is to write expert-looking content?
Search is absolutely like asking a random person right now. Whether I get raw hyperlinks back on a page or a chat window, the results are as good as their source data. Garbage in, garbage out.
Google is just as confidently wrong giving me garbage links as chat is giving me garbage recommendations.
The difference is that with search you get a whole range of linked resources, and up to now (it may change with AI-generated content going forward) the style of the content usually gives enough signal to make an assessment, as well as comparing content between sites, and you learn which sites to trust.
With ChatGPT, correct and incorrect information is presented identically. You have nothing to go on except falling back to web search to fact-check ChatGPT’s output. This can still be useful as a starting point, but a world with only ChatGPT and no web search would be horrible, and would overall be a step back compared to a world with only web search and no ChatGPT.
Why are we ok with “AI is equally as bad as the current bad thing?” Even then, it falls flat because the average person doesn’t understand this as well as you do. There will be some severe consequences because of this.
This is why I am very concerned about people ascribing attributes to LLMs that they simply don't have. The real danger with LLMs, it seems to me, is that people seem to be viewing them as intelligent or as some sort of "truth machines". They are neither. Saying that isn't saying they aren't powerful tools, of course, but a misunderstood tool can be a very dangerous thing.
Yes, but at least with a web search I can click on a different link. If ChatGPT gives you a wrong answer, where is another answer to compare it to? It would be like Wikipedia replacing web search. It's usually a good source for summarizing things, but if you rely on it as The Source, then it becomes a problem.
Not only that, but if it's a topic you don't know much about and you aren't seeing information from a variety of different sources, how would you even be able to know it's a wrong answer?
Yes, that also. Or it could be an unresolved topic with different possible answers, but how would you know if it just gives you one? If you ask about the measurement problem and it says decoherence has solved it because of MWI, then you're getting shortchanged. There's a lot more to the debate.
It's not designed to generate plausible-sounding text; it's designed to produce text that is statistically likely to follow from the prompt. It's essentially a Markov chain, but with multiple roots and some backpropped probabilities for which leaves get picked. The statistical likelihood doesn't have to track with logic.
So what you’re saying is that they didn’t just train it on text, but also verified the answers and trained it in such a way that the model would get negative feedback if it gave a wrong answer?