It badly hallucinated in my test. I asked it "Rust crate to access Postgres with Arrow support" and it made up an arrow-postgres crate. It even gave sample Rust code using this fictional crate! Below is its response (code example omitted):
I can recommend a Rust crate for accessing PostgreSQL with Arrow support.
The primary crate you'll want to use is arrow-postgres, which combines the PostgreSQL connectivity of the popular postgres crate with Apache Arrow data format support.
This crate allows you to:
Query PostgreSQL databases using SQL
Return results as Arrow record batches
Use strongly-typed Arrow schemas
Convert between PostgreSQL and Arrow data types efficiently
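For the record, no crate by that name exists on crates.io. If you actually want Postgres query results as Arrow record batches, the closest thing I know of is doing the conversion yourself (something like connectorx is supposed to automate exactly this, though I haven't verified it). A minimal hand-rolled sketch bridging the real `postgres` and `arrow` crates; connection string, table, and column names are invented for illustration:

```rust
use std::sync::Arc;

use arrow::array::{ArrayRef, Int32Array, StringArray};
use arrow::datatypes::{DataType, Field, Schema};
use arrow::record_batch::RecordBatch;
use postgres::{Client, NoTls};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut client = Client::connect("host=localhost user=postgres dbname=demo", NoTls)?;

    // Pull rows with the plain `postgres` crate...
    let rows = client.query("SELECT id, name FROM users", &[])?;

    // ...then build the Arrow columns from them by hand.
    let ids = Int32Array::from_iter_values(rows.iter().map(|r| r.get::<_, i32>("id")));
    let names = StringArray::from_iter_values(rows.iter().map(|r| r.get::<_, String>("name")));

    let schema = Arc::new(Schema::new(vec![
        Field::new("id", DataType::Int32, false),
        Field::new("name", DataType::Utf8, false),
    ]));
    let batch = RecordBatch::try_new(
        schema,
        vec![Arc::new(ids) as ArrayRef, Arc::new(names) as ArrayRef],
    )?;
    println!("{} rows in one Arrow record batch", batch.num_rows());
    Ok(())
}
```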
Are you sure it searched the web? You have to go and turn on the web search feature, and then the interface is a bit different while it's searching. The results will also have links to what it found.
Exactly. An LLM is not a conventional search engine and shouldn't be prompted as if it were one. The difference between "Rust crate to access Postgres with Arrow support" and "What would a hypothetical Rust crate to access Postgres with Arrow support look like?" isn't that profound from the perspective of a language model. You'll get an answer, but it's entirely possible that you'll get the answer to a question that isn't the one you thought you were asking.
Some people aren't very good at using tools. You can usually identify them without much difficulty, because they're the ones blaming the tools.
It's absolutely how LLMs should work, and IME they do. Why write a full question if a search phrase works just as well? Everything in "Could you recommend xyz to me?" except "xyz" is redundant and only useful when you talk to actual humans with actual social norms to observe. (Sure, there used to be a time when LLMs would give better answers if you were polite to them, but I doubt that matters anymore.) Indeed I've been thinking of codifying this by adding a system prompt that says something like "If the user makes a query that looks like a search phrase, phrase your response non-conversationally as well".
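Concretely, that could be wired into the Anthropic Messages API along these lines. This is just a sketch: the model name is a placeholder and the prompt wording is the one from above, not a tested recommendation.

```rust
use serde_json::json;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let body = json!({
        "model": "claude-3-5-sonnet-latest",
        "max_tokens": 1024,
        "system": "If the user makes a query that looks like a search phrase, phrase your response non-conversationally as well.",
        "messages": [
            { "role": "user", "content": "Rust crate to access Postgres with Arrow support" }
        ]
    });

    // Requires reqwest with the "blocking" and "json" features enabled.
    let resp = reqwest::blocking::Client::new()
        .post("https://api.anthropic.com/v1/messages")
        .header("x-api-key", std::env::var("ANTHROPIC_API_KEY")?)
        .header("anthropic-version", "2023-06-01")
        .json(&body)
        .send()?;
    println!("{}", resp.text()?);
    Ok(())
}
```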
Totally agree here. I tried the following and had a very different experience:
"Answer as if you're a senior software engineer giving advice to a less experienced software engineer. I'm looking for a Rust crate to access PostgreSQL with Apache Arrow support. How should I proceed? What are the pluses and minuses of my various options?"
Think about it: how much marginal influence does it really have whether you use OP's version or a fully formed sentence? The keywords are what get it into the right area.
That is not correct. The keywords mean nothing by themselves. To a transformer model, the relationships between words are where meaning resides. The model wants to answer your prompt with something that makes sense in context, so you have to help it out by providing that context. Feeding it a sentence fragment or a disjoint series of keywords may not have the desired effect.
To mix clichés, "I'm feeling lucky" isn't compatible with "Attention is all you need."
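For anyone who hasn't read the paper: the mechanism in question is scaled dot-product attention, which scores every token's query against every other token's key; it is built on relationships, not isolated keywords.

```latex
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right) V
```

Here Q, K, and V are the query, key, and value matrices and d_k is the key dimension.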
I find that providing more context and details initially leads to far more success for my uses. Once there’s a bit of context, I can start barking terms and commands tersely.
I find more hallucination that way - like when you're taught as a child to reflect the question back at the start of your answer.
If I'm not careful and ask the question in a way that assumes X, the LLM often takes X to be true. ChatGPT has gotten better at correcting this with its web searches.
I am able to get better results with Claude when I ask for answers that include links to the relevant authoritative source of information. But sometimes it still makes up stuff that is not in the source material.
Is this really the case, or is it only the case with Claude etc. because they've already been prompted to act as a "helpful assistant"? If you take a raw LLM and just type Google-search style, it might just continue your keywords as a story or something.
It's funny because many people type full-sentence questions into search engines too. It's usually a sign of being older and/or not very experienced with computers. One thing about geeks like me is that we will always figure out what the bare minimum is (at least for work; I hope everyone has at least a few things they enjoy and don't try to optimise).
It's not about being young or old: search engines have moved away from pure keyword searches, and typing your actual query often gives better results than searching for keywords, especially with Google.
Wonder if that's why so many people hate its results lol. It shifted from keyword searching to full-sentence searching, but many of us didn't follow the shift.
Well, compare it to the really good answer from Grok (https://x.com/i/grok/share/MMGiwgwSlEhGP6BJzKdtYQaXD) for the same prompt. Also, framing it as a question still pointed Claude to the non-existent postgres-arrow.
That's primarily how I do it, though it depends on the search, of course. I use Kagi, though.
I've not yet found much value in the LLM by itself. Facts/math/etc. are too likely to be incorrect; I need them to make some attempt at hydrating real information into the response, and at linking sources.
This was pretty much my first experience with LLM code generation when these things first came out.
It's still a present issue whenever I go light on prompt details and I _always_ get caught out by it and it _always_ infuriates me.
I'm sure there are endless discussions on front-running overconfident false positives, getting better at prompting, and seeding a project context, but 1-2 years in this world is like 20 in regular space, and it shouldn't be happening any more.
Oftentimes I come up with a prompt, stick it in an LLM to enhance it and identify what I've left out, then finally execute the prompt.
Cite things from ID-based specs. You're facing a skill issue. The reason most people don't see it as such is that an LLM doesn't just "fail to run" here. If this were code you wrote in a compiled language, would you post that the language infuriates you because it won't compile your syntax errors? As this kind of dev style becomes prevalent and output expectations adjust, performance reviews won't care that you're mad. So my advice is:
1. Treat it like regular software dev: define tasks with ID prefixes for everything, plus acceptance criteria and exceptions. Ask the LLM to reference them in the code right before the implementation (see the sketch after this list).
2. “Debug” by asking the LLM to self-reflect on the decision-making process that caused the issue - this can give you useful heuristics to use later to further reduce the issues you mentioned.
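To make point 1 concrete, a trivial sketch of what that referencing looks like. The task IDs and criteria are invented for illustration; the point is that the model cites the spec item right above the implementation, so drift between spec and code is visible in review.

```rust
// TASK-104: print users as CSV on stdout.
//   Acceptance: one line per row, header always included.
//   Exception TASK-104.E1: an empty result set prints only the header.
fn print_users_csv(rows: &[(i32, String)]) {
    println!("id,name"); // TASK-104 acceptance: header always included
    for (id, name) in rows {
        println!("{id},{name}");
    }
}

fn main() {
    // TASK-104.E1 would be exercised by passing an empty slice here.
    print_users_csv(&[(1, "alice".to_string()), (2, "bob".to_string())]);
}
```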
“It” happening is a result of your lack of time investment in systematically addressing this.
_You_ should have learned this by now. Complain less, learn more.