I think the point people are making is that when the text has an "AI smell" (it does), we immediately lose trust in the veracity of any claim being made and feel like continuing to read what is possibly a hallucinated fiction is a complete waste of time.
At this point we're all used to skimming through thousands of AI-generated sentences every working day and constantly thinking "this is likely to be 20% bullshit". It's hard to turn that off even if I try.
Do you think it would help if I went through and manually rewrote all of the prose? If it would get people to listen, I'd be totally willing to do it. It's not like I don't like writing. I just was focused on something else when I was making this, namely trying to find a good methodology that isn't insane for this low amount of data.
When there's no discernable human filter on the text output, reading the text suggests it's what the LLM produced and not what a human considered.
This is low-quality--every single day I witness Codex and Claude misunderstand, mislead, and hallucinate responses based on "assumptions" and I have to fact-check them.
If I wanted a statistical analysis and to be the human in the loop, I would ask the LLM myself, and I would definitely NOT read an article that just dumps the LLM output as-is.
Alright, I'll do that. Although, sadly, I already posted it here, so I won't be able to post it again — I'll be stuck with this trash comments section that doesn't deal with any of the actual claims, just the aesthetics.
I'm pretty sure more people would read it to the end if it didn't seem like AI output, yes. At the very least you would have fewer (maybe not 0!) comments here saying it's AI slop.
Yeah, the Google AI results are more dangerous than ChatGPT, not only because they use a smaller model but because of where they sit. Google's knowledge graph used to deliver very accurate and authoritative information in that spot, and now it's been replaced by a stochastic system in the same place, so people are used to trusting it.
I think we’re getting what we deserve by snarkily telling people to Google stuff instead of answering accurately. Google results have never been purely accurate.
The point of LMGTFY is to land people on either the official documentation or a curated site like Stack Overflow. Google used to be able to do that reliably.
With the power of LLMs you can Google a standard library function and get an inaccurate summarisation of a Reddit discussion where neither side knows what they're talking about
Stack Overflow and Reddit for years have told people to just Google it. And then the Google result is people saying to just Google it, instead of actually being helpful.
To be fair - for all of those years Google has been serving up some atrocious results - remember when googling health symptoms got you rabies or pregnancy?
There's even the meme where people ask if the code was the result of a Stack Overflow question, or answer.
It seems to me one needs to consider the complexity of the question they are asking before searching it.
To stick with your post, consider people asking medical or financial questions. For a wide variety of reasons, many of such questions don't have an answer. In such cases, AI is still going to take a crack at it. AI shouldn't be blamed for "bullshit answers" to such questions.
Before using AI, I think people should stop and ask themselves, "Is there really a single answer to this question? Is AI the right choice?"
The problem is Google's AI results get even simple factual questions wrong all the time.
Earlier today, I searched "pixel 10 wifi 7" because I was confused that GSMArena showed my Pixel 8 supports Wifi 7, but the Pixel 10 only Wifi 6. Gemini confidently claimed that the Pixel 10 does support Wifi 7 -- but that's not true at all. Only the Pixel 10 _Pro_ supports it, as I discovered when actually reading the non-AI search results.
I had a similar thing when I was googling a few days ago. I can't remember exactly, but it was like "why does [product] not support [feature]" and the AI summary was confidently wrong, saying "The product does support [feature]", which I knew was completely incorrect. I did find a Reddit thread or something in the actual results with discussions that were actually about what I was looking for!
It's really depressing how bad things are getting...
It’s hilariously persistent in this, esp. for anything even slightly divergent from the beaten path. Discount everything the AI box says about emacs to zero.
Admittedly I’m unsure if it was Google or DuckDuckGo. I switch between both. I quickly asked the in-search AI for a UTC time conversion like a lazy fool and it got it wrong by almost a day.
I avoid asking any agent a fact-based (especially math) request. It's a great compression algorithm and a great language generator, and I guess the intersection of those two things is "an answer". Calculation doesn't intersect.
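For what it's worth, a UTC conversion like the one mentioned above is the sort of thing a few lines of deterministic code handle without any guessing. A minimal sketch, assuming the local time sits at a known fixed UTC offset (the helper name `to_utc` and the example date are made up for illustration; real time zones with daylight saving need a proper tz database):

```python
from datetime import datetime, timedelta, timezone


def to_utc(local_naive: datetime, utc_offset_hours: float) -> datetime:
    """Treat a naive wall-clock time as being at a fixed UTC offset, then convert to UTC."""
    fixed_zone = timezone(timedelta(hours=utc_offset_hours))
    return local_naive.replace(tzinfo=fixed_zone).astimezone(timezone.utc)


# Hypothetical example: 22:30 on 1 March 2024 at UTC-8
example = to_utc(datetime(2024, 3, 1, 22, 30), -8)
print(example.isoformat())  # 2024-03-02T06:30:00+00:00
```

Note that the date rolls over to the next day here, which is exactly the kind of detail that is easy to get wrong by hand.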
My Google search for 'pixel 10 wifi 7' immediately shows the right answer. (10 Pro and 10 Pro XL support it, but the base Pixel 10 only supports Wifi 6E).
Though the inconsistency of results between users is definitely another frustrating thing.
Because LLMs aren't sentient, they don't draw on facts, and they don't have nuance. The answer given is similar to answers you might expect to see for similar questions.
It's really amazing we can make machines do that, and it's really depressing that we think a stochastic bullshit machine is going to give us something we can rely on.
Or… the default LLM Google uses for search has been quantized to s**. Ask a proper Thinking model, with browsing enabled, and odds of a correct answer are much higher. There’s been substantial improvement in AI in even the last year.
Ask a human a question like this, and they also have a chance of getting it wrong, even when confident.
Why would a human know specs for a random phone off the top of their head? The human response is either "I don't know" or "let me look that up", not a hallucination.
I think that it feels a little wasteful to go to Google search to ask a question like this, only for the AI that's giving you an answer instead of page results to perform its own web search to get you the response.
Also, I asked a thinking model with browsing enabled and got this:
> The Google Pixel 10 is expected to support Wi-Fi 7 (802.11be), based on the Qualcomm Snapdragon 8 Gen 4 / Tensor G5 chipset it will likely use, which includes an integrated Wi-Fi 7 modem. Specific finalized specs aren't confirmed until Google's official announcement.
(Model GLM-5-Turbo - two months old - using Kilo Code in the "Ask" profile; in its thinking token churn it reasoned that it should keep the response brief and direct. Perhaps not the best suite of model+harness for this task, but it's what I had to hand that's not quantized to shit, is a thinking model, and has a web search tool available to it.)
> Ask a human a question like this, and they also have a chance of getting it wrong, even when confident.
We google something specifically because the humans within reach don't know. The goal of searching is, well, to search pages - we're trying to find a site when we use google search.
The goal when using an LLM is generally different; we want an answer, not a site.
LLMs cannot point you to sites, only in a general direction. That is because complete URLs do not exist as single tokens in any of the large models. It can synthesize a plausible-looking URL, and if you're lucky that URL might even exist. But that doesn't mean that there is any relation between the text surrounding a hyperlink in LLM output and the text on the linked page.
AI agents can verify and summarize URLs, but a plain LLM can not.
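To make the "verify" part concrete, here is a rough sketch of what such a check could look like: pull the URLs out of the model's output, fetch each one, and see whether the page even mentions the thing the surrounding text claims. Everything here (`check_links`, `fetch_text`, the status strings) is a hypothetical illustration, not how any particular agent does it, and a substring match is a very crude stand-in for real verification.

```python
import re
import urllib.request
from typing import Callable, Dict, Optional

# Rough URL pattern: stops at whitespace, angle brackets, parens and square brackets.
URL_RE = re.compile(r"https?://[^\s<>()\[\]]+")


def fetch_text(url: str, timeout: float = 10.0) -> Optional[str]:
    """Return the page body as text, or None if the URL cannot be fetched."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):  # URLError/HTTPError are OSError subclasses
        return None


def check_links(
    llm_output: str,
    must_mention: str,
    fetch: Callable[[str], Optional[str]] = fetch_text,
) -> Dict[str, str]:
    """Map each URL found in llm_output to a coarse verification status."""
    results: Dict[str, str] = {}
    for raw in URL_RE.findall(llm_output):
        url = raw.rstrip(".,;:")  # drop trailing sentence punctuation
        page = fetch(url)
        if page is None:
            results[url] = "unreachable"
        elif must_mention.lower() in page.lower():
            results[url] = "mentions term"
        else:
            results[url] = "reachable, term not found"
    return results
```

Passing `fetch` as a parameter keeps the check testable offline, e.g. `check_links(text, "wi-fi 7", fetch=some_dict.get)` with a dict of canned pages.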
it's bad in dev as well... i've seen llm code review bots tell me things that are flat-out not true; things like "this wont compile because windows 11 doesn't exist" like wtf am i paying for this again?
They are this wrong about everything, but you don't usually notice it when using it to look for things you aren't an expert in. The default stance really does need to be "do not trust, verify" at all times.
They can still be useful, e.g. they're significantly better at finding "I want a thing that does x but not y and it must be blue, or maybe two things that can be glued together to do that" than classic search. But they'll routinely miss extremely obvious answers because the related search it ran didn't find it, or completely screw up what something can actually do. Checking more pages of results by hand or asking humans who know even a little about those fields is still wildly more useful... but they're absolutely slaughtering the sites where people do that, by stealing all the real traffic and sending DDoS-level automated requests.
How can you say they are wrong about "everything"?
I built a retro game clone once and I used that project as a way to try out AI. While it wasn't perfect, it definitely wasn't wrong about everything. I'd go so far as to say it was probably correct (or damn close) 75% of the time.
I see people on HN all the time saying AI is terrible, but that just isn't the experience I'm having. I'm willing to admit it may have something to do with me not being able to recognize I'm being fed bullshit. Or, I may be asking really simple questions. Who knows? But AI seems like a pretty useful tool for average people.
Your profile says you're a guitarist. Take the model and talk to it about guitars. Not like "what's a good Stratocaster clone", talk to it about materials, physics and playing techniques and see if it feels like a reliable source of information (or even a solid thinking buddy), or someone who read a lot about guitars but has actually never played one.
I know bits from non-IT fields like RC planes and quads, electric motors, aerodynamics, mountain biking, cars. I often use Claude (Opus 4.7 on Max sub atm) to brainstorm new ideas or refine my understanding of some phenomena, and almost without fail, I can get it to claim something ridiculously stupid or contradictory 5-10 messages in. I can usually catch it, because I don't venture far from what I'm already familiar with, and I also need explanations to be thorough and things to make total sense to me before I accept them, but not everyone is that pedantic.
I’d make assumptions about how the cheapest and fastest possible flash model optimized for being extra cheap and extra fast would get something wrong based on its limited context (which can be very incomplete summaries of search results)
I often have the expensive models give relatively simple inaccurate answers, even when they cite sources that directly contradict them. The error rate is lower, but you can’t have confidence with llm answers.
It somehow seems to interpret whatever sources it's grepping as the exact opposite of what those sources say fairly often. I've lost track of how many times I've clicked on the sources it cites, and every single one is in agreement, but the AI claims the opposite.
In watching the demo, I didn't come away with the impression that they were removing search results. Yes, they are pushing AI hard, but users can still opt to use Google in the more traditional way. Unless I misunderstood the demo, it's definitely possible to choose.
An interesting aspect of this is the decrease in quality feedback on the organic links. If most people never get down to the actual links, there is very little to tell which ones were good or if they had any relevance.
There is also less incentive to properly maintain the search algorithms to fight SEO and spam.
For all intents and purposes, organic search results have been given a death sentence and are just waiting for the last moment.
> one needs to consider the complexity of the question they are asking before searching...consider people asking medical or financial questions...many of such questions don't have an answer. In such cases, AI is still going to take a crack at it. AI shouldn't be blamed for "bullshit answers"...people should stop and ask themselves, "Is there really a single answer to this question?
It's a bold position to say that it's the user's fault for being lied to by Google. There isn't a "single answer" to most questions. It's still Google's job to provide answers that are accurate and reflect the best information available on complicated topics. That's what they're trying to sell us anyway. When Google's AI can't live up to the hype, "You shouldn't be asking AI such difficult questions" is not a great response, especially when people are just trying to get web search results and AI is suddenly interrupting with an opinion nobody asked for.
They didn't say "nobody can replace the battery themselves", and "you" here was probably intended to mean "a normal consumer". Relative to items with replaceable batteries (a TV remote control, a camera, a pre-iPhone mobile phone), the batteries are extremely hard to replace.
The batteries are also not safe to replace, relative to items with replaceable batteries. There is a very low chance of me accidentally damaging my TV remote control while replacing the batteries.
None of the information you're responding to is false, and it's perhaps worth asking yourself why you're here defending Apple.
There's an easier argument that is simply "But Samsung!".
A "normal consumer", at least in most of the US, can take their iPhone to an Apple store, a Best Buy, and probably several small phone repair services that have small stores or kiosks in a nearby mall or inside a Walmart.
From an environmental point of view it doesn't matter if you do the repair yourself or you have it done by someone else.
> and "you" here was probably intended to mean "a normal consumer".
Which is why I used a normal consumer as an example.
> None of the information you're responding to is false, and it's perhaps worth asking yourself why you're here defending Apple.
I’m not defending Apple, I’m defending accuracy. When someone says something inaccurate about someone or something I oppose, I try to correct that too. It’s important that arguments are based on truth, because when they are not people start dismissing the true with the false.
My comment history shows I’m an Apple user but am constantly criticising its current state and Tim Cook. You’ll find more comments of mine criticising than praising them.
Perhaps it’s worth asking yourself why you see someone making an argument once and immediately assume they may have ulterior motives, and why you’re actively ignoring the arguments which do not feed your view, including my clear and repeated assertions in the thread that Apple should absolutely do better.
> There's an easier argument that is simply "But Samsung!".
Which was not once my argument. I abhor whataboutism.
I thought the same when I got a Fuji, but the issue is support for the X-Trans sensor. Turns out that converting to DNG doesn't change that and software that opens the DNG still needs to understand how to use the data in it.
What platform, what storage and how large is the directory? Might be a difference in experience for people on Windows trying to open N-TB over a NFS share compared to Linux N-GB locally.
That was a Windows laptop, local SSD, about 200gb of raw files (fuji, pentax) from this year so far. Plenty of ram, plenty of spare storage, but no discrete GPU which might have been the issue. I might try it on Linux at some point.
The companies selling us the service aren't saying "you should treat this LLM as a potentially hostile user on your machine and set up a new restricted account for it accordingly", they're just saying "download our app! connect it to all your stuff!" and we can't really blame ordinary users for doing that and getting into trouble.
There's a growing ecosystem of guardrailing methods, and these companies are contributing. Anthropic specifically puts in a lot of effort to better steer and characterize their models AFAIK.
I primarily use Claude via VS Code, and it defaults to asking first before taking any action.
It's simply not the wild west out here that you make it out to be, nor does it need to be. These are statistical systems, so issues cannot be fully eliminated, but they can be materially mitigated. And if they stand to provide any value, they should be.
I can appreciate being upset with marketing practices, but I don't think there's value in pretending to have taken them at face value when you didn't, and when you think people shouldn't.
> It's simply not the wild west out here that you make it out to be
It is though. They are not talking about users using Claude code via vscode, they’re talking about non technical users creating apps that pipe user input to llms. This is a growing thing.
I'm a naturally paranoid, very detail-oriented man who has been a professional software developer for >25 years. Do you know anyone who read the full terms and conditions for their last car rental agreement prior to signing anything? I did that.
I do not expect other people to be as careful with this stuff as I am, and my perception of risk comes not only from the "hang on, wtf?" feeling when reading official docs but also from seeing what supposedly technical users are talking about actually doing on Reddit, here, etc.
Of course I use Claude Code, I'm not a Luddite (though they had a point), but I don't trust it and I don't think other people should either.
I'm only good enough to impress people who don't know what a good guitar player sounds like.
My advice to people, which seems to work OK, is just to have the guitar out and ready to play wherever you're likely to be - maybe even in the way so it has to be moved sometimes - and just pick it up and play it as often as possible.
Waiting for the kettle to boil? Play the guitar. TV is showing ads? Mute it and play the guitar. Your partner needs to go to the bathroom before you both go out? Play the guitar.
It doesn't matter what you play, it doesn't have to be good, it can be a random improvisation, it can be scales. Your fingers are learning.
It depends on what your goals are. If you're doing it for fun or as a creative outlet this is great advice. If you're trying to actively get better you won't do it this way after a certain point. You need to be actively practicing and engaging your brain. It does matter what you play and how you play it.
Sure, there's "deliberate practice" and it matters - but so many people seem to think if they're playing that's what they should be doing, or it's a waste of time. In reality that often isn't much fun, and they start to associate the instrument with this sort of difficult and often disappointing experience, and they give up.
I think there are quite a lot of people who are only interested in playing and never deliberately practising. They do not get that far (they do not have to!).
And then the vast majority of aspiring guitar players who frequent online learning material (including me) spend all of their time practising and learning, and too little of it playing for fun and performing. Most are constantly frustrated about their progress.
Then there is a small group of people, who spend a lot of time playing for fun and performing, but also a good amount of time deliberately practising. In my experience, those tend to be the ones people think are great players.