sanitycheck · 2026-06-05T13:08:34 1780664914

I think the point people are making is that when the text has an "AI smell" (it does), we immediately lose trust in the veracity of any claim being made and feel like continuing to read what is possibly a hallucinated fiction is a complete waste of time.

At this point we're all used to skimming through thousands of AI-generated sentences every working day and constantly thinking "this is likely to be 20% bullshit", it's hard to turn that off even if I try.

logicprog · 2026-06-05T13:12:10 1780665130

Do you think it would help if I went through and manually rewrote all of the prose? If it would get people to listen, I'd be totally willing to do it. It's not like I don't like writing. I just was focused on something else when I was making this, namely trying to find a good methodology that isn't insane for this low amount of data.

JasonSage · 2026-06-05T13:24:25 1780665865

When there's no discernable human filter on the text output, reading the text suggests it's what the LLM produced and not what a human considered.

This is low-quality--every single day I witness Codex and Claude misunderstand, mislead, and hallucinate responses based on "assumptions" and I have to fact-check them.

If I wanted a statistical analysis and to be the human in the loop, I would ask the LLM myself, and I would definitely NOT read an article that just dumps the LLM output as-is.

bradrn · 2026-06-05T13:19:18 1780665558

Yes, that would help considerably.

(Also, I suggest clearly acknowledging where AI was/wasn’t used. I like CuriosityC’s suggestion: https://news.ycombinator.com/item?id=48411968)

logicprog · 2026-06-05T13:23:20 1780665800

Alright, I'll do that. Although, sadly, I already posted it here, so I won't be able to post it again — I'll be stuck with this trash comments section that doesn't deal with any of the actual claims, just the aesthetics.

bradrn · 2026-06-06T10:33:13 1780741993

Just reread the post — it’s much more pleasant to read now! Thank you!

(For what it’s worth, I think your own writing style is quite nice, now that I can see it.)

sanitycheck · 2026-06-05T13:23:05 1780665785

I'm pretty sure more people would read it to the end if it didn't seem like AI output, yes. At the very least you would have fewer (maybe not 0!) comments here saying it's AI slop.

sanitycheck · 2026-05-28T14:20:51 1779978051

Well, if it looks cool and it nearly works that is totally on brand for Alessi!

sanitycheck · 2026-05-19T20:58:11 1779224291

Yep. For years we've been telling people to 'just fucking google it', and now when they do they're getting bullshit AI answers.

Worst thing is, some of these bullshit answers will be medical, some of them financial, it seems pretty certain people are being harmed.

pants2 · 2026-05-20T02:01:23 1779242483

Yeah the Google AI results are more dangerous than ChatGPT, not only because it uses a smaller model but because Google's knowledge graph used to deliver very accurate and authoritative information but now that's been replaced by a stochastic system in the same place, so people are used to trusting it.

Robotbeat · 2026-05-20T02:06:24 1779242784

I think we’re getting what we deserve by snarkily telling people to Google stuff instead of answering accurately. Google results have never ever been pure accuracy.

tveita · 2026-05-20T10:42:00 1779273720

The point of LMGTFY is to land people on either the official documentation or a curated site like Stack Overflow. Google used to be able to do that reliably.

With the power of LLMs you can Google a standard library function and get an inaccurate summarisation of a Reddit discussion where neither side knows what they're talking about

Robotbeat · 2026-05-20T15:32:29 1779291149

Stack Overflow and Reddit for years have told people to just Google it. And then the Google result is people saying to just Google it, instead of actually being helpful.

awesome_dude · 2026-05-20T02:59:21 1779245961

To be fair - for all of those years Google has been serving up some atrocious results - remember when googling health symptoms got you rabies or pregnancy.

There's even the meme where people ask if the code was the result of a stack overflow question, or answer

RyanOD · 2026-05-19T22:08:15 1779228495

It seems to me one needs to consider the complexity of the question they are asking before searching it.

To stick with your post, consider people asking medical or financial questions. For a wide variety of reasons, many of such questions don't have an answer. In such cases, AI is still going to take a crack at it. AI shouldn't be blamed for "bullshit answers" to such questions.

Before using AI, I think people should stop and ask themselves, "Is there really a single answer to this question? Is AI the right choice?"

gsk22 · 2026-05-19T22:49:16 1779230956

The problem is Google's AI results get even simple factual questions wrong all the time.

Earlier today, I searched "pixel 10 wifi 7" because I was confused that GSMArena showed my Pixel 8 supports Wifi 7, but the Pixel 10 only Wifi 6. Gemini confidently claimed that the Pixel 10 does support Wifi 7 -- but that's not true at all. Only the Pixel 10 _Pro_ supports it, as I discovered when actually reading the non-AI search results.

And this is a question about a Google product!

stephen_g · 2026-05-20T02:04:21 1779242661

I had a similar thing when I was googling a few days ago. I can't remember exactly, but it was like "why does [product] not support [feature]" and the AI summary was confidently wrong, saying "The product does support [feature]", which I knew was completely incorrect, and I did find a Reddit discussion or something in the actual results with discussions that were actually about what I was looking for!

It's really depressing how bad things are getting...

kristjansson · 2026-05-20T04:28:09 1779251289

It’s hilariously persistent in this, esp. for anything even slightly divergent from the beaten path. Discount everything the AI box says about emacs to zero.

thedougd · 2026-05-20T02:37:37 1779244657

Admittedly I’m unsure if it was Google or DuckDuckGo. I switch between both. I quickly asked the in-search AI for a UTC time conversion like a lazy fool and it got it wrong by almost a day.

ultrarunner · 2026-05-20T04:11:03 1779250263

I avoid asking any agent a fact-based (especially math) request. It's a great compression algorithm and a great language generator, and I guess the intersection of those two things is "an answer". Calculation doesn't intersect.

varenc · 2026-05-19T22:59:23 1779231563

My google search for 'pixel 10 wifi 7' immediately shows the right answer. (10 Pro and 10 Pro XL support it, but the base Pixel 10 only supports Wifi 6E).

Though the inconsistency of results between users is definitely another frustrating thing.

RyanOD · 2026-05-19T22:56:53 1779231413

Ok, fair. Hard to understand why it would get that wrong.

codebje · 2026-05-20T00:21:12 1779236472

Because LLMs aren't sentient, they don't draw on facts, and they don't have nuance. The answer given is similar to answers you might expect to see for similar questions.

It's really amazing we can make machines do that, and it's really depressing that we think a stochastic bullshit machine is going to give us something we can rely on.

Robotbeat · 2026-05-20T02:40:03 1779244803

Or… the default LLM Google uses for search has been quantized to s**. Ask a proper Thinking model, with browsing enabled, and odds of a correct answer are much higher. There’s been substantial improvement in AI in even the last year.

Ask a human a question like this, and they also have a chance of getting it wrong, even when confident.

nvme0n1p1 · 2026-05-20T02:58:56 1779245936

> Ask a human a question like this

Why would a human know specs for a random phone off the top of their head? The human response is either "I don't know" or "let me look that up", not a hallucination.

codebje · 2026-05-20T04:21:05 1779250865

I think that it feels a little wasteful to go to Google search to ask a question like this, only for the AI that's giving you an answer instead of page results to perform its own web search to get you the response.

Also, I asked a thinking model with browsing enabled and got this:

> The Google Pixel 10 is expected to support Wi-Fi 7 (802.11be), based on the Qualcomm Snapdragon 8 Gen 4 / Tensor G5 chipset it will likely use, which includes an integrated Wi-Fi 7 modem. Specific finalized specs aren't confirmed until Google's official announcement.

(Model GLM-5-Turbo - two months old - using Kilo Code in the "Ask" profile; in its thinking token churn it reasoned that it should keep the response brief and direct. Perhaps not the best suite of model+harness for this task, but it's what I had to hand that's not quantized to shit, is a thinking model, and has a web search tool available to it.)

lelanthran · 2026-05-20T04:35:42 1779251742

> Ask a human a question like this, and they also have a chance of getting it wrong, even when confident.

We google something specifically because the humans within reach don't know. The goal of searching is, well, to search pages - we're trying to find a site when we use google search.

The goal when using an LLM is generally different; we want an answer, not a site.

Robotbeat · 2026-05-20T05:12:42 1779253962

LLMs are not a site. They are a clever person that can point you to sites. They, like humans, are fallible.

tremon · 2026-05-20T11:59:59 1779278399

LLMs can not point you to sites, only in a general direction. That is because complete URLs do not exist as single tokens in any of the large models. It can synthesize a plausible-looking URL, and if you're lucky that URL might even exist. But that doesn't mean that there is any relation between the text surrounding a hyperlink in LLM output and the text on the linked page.

AI agents can verify and summarize URLs, but a plain LLM can not.

Robotbeat · 2026-05-20T15:23:21 1779290601

I bow to your correction. I was using LLMs as a sloppy shorthand for modern AI agents with best interfaces.

jazzyjackson · 2026-05-20T02:57:47 1779245867

*so long as an accurate answer exists on the internet

Claude is OK at saying when it can’t find good information, but it’s still 50/50 on citing a source that has nothing to do with its claim.

andrekandre · 2026-05-20T13:47:26 1779284846

it's bad in dev as well... i've seen llm code review bots tell me things that are flat-out not true; things like "this wont compile because windows 11 doesn't exist" like wtf am i paying for this again?

Groxx · 2026-05-20T00:37:47 1779237467

They are this wrong about everything, but you don't usually notice it when using it to look for things you aren't an expert in. The default stance really does need to be "do not trust, verify" at all times.

They can still be useful, e.g. they're significantly better at finding "I want a thing that does x but not y and it must be blue, or maybe two things that can be glued together to do that" than classic search. But they'll routinely miss extremely obvious answers because the related search it ran didn't find it, or completely screw up what something can actually do. Checking more pages of results by hand or asking humans who know even a little about those fields is still wildly more useful... but they're absolutely slaughtering the sites where people do that, by stealing all the real traffic and sending DDoS-level automated requests.

RyanOD · 2026-05-20T05:09:56 1779253796

How can you say they are wrong about "everything"?

I built a retro game clone once and I used that project as a way to try out AI. While it wasn't perfect, it definitely wasn't wrong about everything. I'd go so far as to say it was probably correct (or damn close) 75% of the time.

I see people on HN all the time saying AI is terrible, but that just isn't the experience I'm having. I'm willing to admit it may have something to do with me not being able to recognize I'm being fed bullshit. Or, I may be asking really simple questions. Who knows? But AI seems like a pretty useful tool for average people.

Toutouxc · 2026-05-20T12:31:31 1779280291

Your profile says you're a guitarist. Take the model and talk to it about guitars. Not like "what's a good Stratocaster clone", talk to it about materials, physics and playing techniques and see if it feels like a reliable source of information (or even a solid thinking buddy), or someone who read a lot about guitars but has actually never played one.

I know bits from non-IT fields like RC planes and quads, electric motors, aerodynamics, mountain biking, cars. I often use Claude (Opus 4.7 on Max sub atm) to brainstorm new ideas or refine my understanding of some phenomena, and almost without fail, I can get it to claim something ridiculously stupid or contradictory 5-10 messages in. I can usually catch it, because I don't venture far from what I'm already familiar with, and I also need explanations to be thorough and things to make total sense to me before I accept them, but not everyone is that pedantic.

Barbing · 2026-05-19T23:10:57 1779232257

I’d make assumptions about how the cheapest and fastest possible flash model optimized for being extra cheap and extra fast would get something wrong based on its limited context (which can be very incomplete summaries of search results)

bitmasher9 · 2026-05-19T23:56:07 1779234967

I often have the expensive models give relatively simple inaccurate answers, even when they cite sources that directly contradict them. The error rate is lower, but you can’t have confidence with llm answers.

pesus · 2026-05-20T00:20:07 1779236407

It somehow seems to interpret whatever sources it's grepping as the exact opposite of what those sources say fairly often. I've lost track of how many times I've clicked on the sources it cites, and every single one is in agreement, but the AI claims the opposite.

facemelt2 · 2026-05-19T23:50:34 1779234634

Did you just agree to a stranger's counterpoint on the internet? This post should be in a museum somewhere

SequoiaHope · 2026-05-20T00:19:24 1779236364

The simple answer is that these systems are very bad at telling the truth reliably.

thwarted · 2026-05-19T22:11:35 1779228695

When the default "search" results are AI, it's difficult, if not impossible, to "choose", since Google is pushing the AI so hard.

RyanOD · 2026-05-19T22:18:44 1779229124

In watching the demo, I didn't come away with the impression that they were removing search results. Yes, they are pushing AI hard, but users can still opt to use Google in the more traditional way. Unless I misunderstood the demo, it's definitely possible to choose.

makeitdouble · 2026-05-19T22:29:35 1779229775

"possible to choose" doesn't get us much.

An interesting aspect of this is the decrease in quality feedback on the organic links. If most people never get down to the actual links there is very little to tell which ones were good or if they had any relevance.

There is also less incentive to properly maintain the search algorithms to fight SEO and spam.

For all intents and purposes, organic search results have been given a death sentence and are just waiting for the last moment.

RyanOD · 2026-05-19T22:55:08 1779231308

Organic search dying was my first reaction too. But, who knows...this wouldn't be the first time I've heard that.

Barbing · 2026-05-19T23:07:06 1779232026

They are showing billions of people a big bold answer at the top of all their pages.

What a wildly irresponsible company

rbits · 2026-05-19T23:52:15 1779234735

Go to Google right now and search anything. What is the very first thing you see?

autoexec · 2026-05-19T22:38:11 1779230291

> one needs to consider the complexity of the question they are asking before searching...consider people asking medical or financial questions...many of such questions don't have an answer. In such cases, AI is still going to take a crack at it. AI shouldn't be blamed for "bullshit answers"...people should stop and ask themselves, "Is there really a single answer to this question?

It's a bold position to say that it's the user's fault for being lied to by Google. There isn't a "single answer" to most questions. It's still Google's job to provide answers that are accurate and reflect the best information available on complicated topics. That's what they're trying to sell us anyway. When Google's AI can't live up to the hype, "You shouldn't be asking AI such difficult questions" is not a great response, especially when people are just trying to get web search results and AI is suddenly interrupting with an opinion nobody asked for.

melagonster · 2026-05-20T00:48:34 1779238114

In the past, people could trust Google. Now we should teach children not to trust "search results" from Google.

freeone3000 · 2026-05-20T04:00:33 1779249633

I asked it “how can I tell if a spray paint can is empty?” And it told me that the paint can would no longer rattle.

sanitycheck · 2026-05-18T16:09:58 1779120598

In case you (or others) didn't know, Galen has his own podcast: www.gdpolitics.com

It feels a lot like the 538 one, with lots of familiar contributors. The latest ep is a live show with Nate & Clare, might be to your taste.

sanitycheck · 2026-04-16T13:53:32 1776347612

They didn't say "nobody can replace the battery themselves", and "you" here was probably intended to mean "a normal consumer". Relative to items with replaceable batteries (a TV remote control, a camera, a pre-iPhone mobile phone), the batteries are extremely hard to replace.

The batteries are also not safe to replace, relative to items with replaceable batteries. There is a very low chance of me accidentally damaging my TV remote control while replacing the batteries.

None of the information you're responding to is false, and it's perhaps worth asking yourself why you're here defending Apple.

There's an easier argument that is simply "But Samsung!".

tzs · 2026-04-17T00:44:23 1776386663

A "normal consumer", at least in most of the US, can take their iPhone to an Apple store, a Best Buy, and probably several small phone repair services that have small stores or kiosks in a nearby mall or inside a Walmart.

From an environmental point of view it doesn't matter if you do the repair yourself or you have it done by someone else.

choo-t · 2026-04-17T06:32:19 1776407539

> From an environmental point of view it doesn't matter if you do the repair yourself or you have it done by someone else.

The added cost and friction will de facto make it less repairable.

latexr · 2026-04-16T14:01:08 1776348068

> and "you" here was probably intended to mean "a normal consumer".

Which is why I used a normal consumer as an example.

> None of the information you're responding to is false, and it's perhaps worth asking yourself why you're here defending Apple.

I’m not defending Apple, I’m defending accuracy. When someone says something inaccurate about someone or something I oppose, I try to correct that too. It’s important that arguments are based on truth, because when they are not people start dismissing the true with the false.

My comment history shows I’m an Apple user but am constantly criticising its current state and Tim Cook. You’ll find more comments of mine criticising than praising them.

https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...

Perhaps it’s worth asking yourself why you see someone making an argument once and immediately assume they may have ulterior motives, and why you’re actively ignoring the arguments which do not feed your view, including my clear and repeated assertions in the thread that Apple should absolutely do better.

> There's an easier argument that is simply "But Samsung!".

Which was not once my argument. I abhor whataboutism.

https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...

I’d appreciate if you didn’t straw man.

sanitycheck · 2026-04-14T08:00:12 1776153612

Pentax sensibly decided to add native DNG capability a long time ago, the raw files work everywhere I've tried them.

(Except DaVinci, which I couldn't get to do anything without freezing for minutes at a time this morning.)

fetzu · 2026-04-14T11:13:39 1776165219

Erf, I do hope they add support soon!

sanitycheck · 2026-04-14T07:25:38 1776151538

I thought the same when I got a Fuji, but the issue is support for the X-Trans sensor. Turns out that converting to DNG doesn't change that and software that opens the DNG still needs to understand how to use the data in it.

sanitycheck · 2026-04-14T07:23:56 1776151436

DxO PhotoLab supports RAFs these days, and does not have a subscription model. They have black friday sales, if the RRP seems a bit much.

I've just installed DaVinci and pointed it at my photos from this year and so far it's been frozen for 8 minutes, not initially confidence inspiring.

embedding-shape · 2026-04-14T12:59:59 1776171599

What platform, what storage and how large is the directory? Might be a difference in experience for people on Windows trying to open N-TB over a NFS share compared to Linux N-GB locally.

sanitycheck · 2026-04-14T19:33:28 1776195208

That was a Windows laptop, local SSD, about 200gb of raw files (fuji, pentax) from this year so far. Plenty of ram, plenty of spare storage, but no discrete GPU which might have been the issue. I might try it on Linux at some point.

sanitycheck · 2026-04-09T10:24:06 1775730246

It's both, really.

The companies selling us the service aren't saying "you should treat this LLM as a potentially hostile user on your machine and set up a new restricted account for it accordingly", they're just saying "download our app! connect it to all your stuff!" and we can't really blame ordinary users for doing that and getting into trouble.

perching_aix · 2026-04-09T10:34:01 1775730841

There's a growing ecosystem of guardrailing methods, and these companies are contributing. Anthropic specifically puts in a lot of effort to better steer and characterize their models AFAIK.

I primarily use Claude via VS Code, and it defaults to asking first before taking any action.

It's simply not the wild west out here that you make it out to be, nor does it need to be. These are statistical systems, so issues cannot be fully eliminated, but they can be materially mitigated. And if they stand to provide any value, they should be.

I can appreciate being upset with marketing practices, but I don't think there's value in pretending to have taken them at face value when you didn't, and when you think people shouldn't.

le-mark · 2026-04-09T11:08:29 1775732909

> It's simply not the wild west out here that you make it out to be

It is though. They are not talking about users using Claude code via vscode, they’re talking about non technical users creating apps that pipe user input to llms. This is a growing thing.

perching_aix · 2026-04-09T11:26:30 1775733990

The best solutions to which are the aforementioned better defaults, stricter controls, and sandboxing (and less snakeoil marketing).

Less so the better tuning of models, unlike in this case, where that is going to be exactly the best fit approach most probably.

sanitycheck · 2026-04-09T11:53:40 1775735620

I'm a naturally paranoid, very detail-oriented man who has been a professional software developer for >25 years. Do you know anyone who read the full terms and conditions for their last car rental agreement prior to signing anything? I did that.

I do not expect other people to be as careful with this stuff as I am, and my perception of risk comes not only from the "hang on, wtf?" feeling when reading official docs but also from seeing what supposedly technical users are talking about actually doing on Reddit, here, etc.

Of course I use Claude Code, I'm not a Luddite (though they had a point), but I don't trust it and I don't think other people should either.

sanitycheck · 2026-04-08T13:55:03 1775656503

I'm only good enough to impress people who don't know what a good guitar player sounds like.

My advice to people, which seems to work OK, is just to have the guitar out and ready to play wherever you're likely to be - maybe even in the way so it has to be moved sometimes - and just pick it up and play it as often as possible.

Waiting for the kettle to boil? Play the guitar. TV is showing ads? Mute it and play the guitar. Your partner needs to go to the bathroom before you both go out? Play the guitar.

It doesn't matter what you play, it doesn't have to be good, it can be a random improvisation, it can be scales. Your fingers are learning.

tarentel · 2026-04-08T14:34:26 1775658866

It depends on what your goals are. If you're doing it for fun or as a creative outlet this is great advice. If you're trying to actively get better you won't do it this way after a certain point. You need to be actively practicing and engaging your brain. It does matter what you play and how you play it.

sanitycheck · 2026-04-08T16:10:01 1775664601

Sure, there's "deliberate practice" and it matters - but so many people seem to think if they're playing that's what they should be doing, or it's a waste of time. In reality that often isn't much fun, and they start to associate the instrument with this sort of difficult and often disappointing experience, and they give up.

elevatortrim · 2026-04-08T18:27:39 1775672859

You are right.

I think there are quite a lot of people who are only interested in playing and never deliberately practising. They do not get that far (they do not have to!).

And then the vast majority of aspiring guitar players who frequent online learning material (including me) spend all of their time practising and learning, and too little of it playing for fun and performing. Most are constantly frustrated about their progress.

Then there is a small group of people, who spend a lot of time playing for fun and performing, but also a good amount of time deliberately practising. In my experience, those tend to be the ones people think are great players.