
> From where does that confidence come?

From decades of experience, quite honestly.



How can you have decades of experience in a technology less than a decade old? Sounds like one of those HR minimum-requirement memes.


Decades of programming and open source experience.


you have decades of experience reviewing code produced at industrial scale to look plausible, but with zero underlying understanding, mental model, or any reference to ground truth?

glad I don't work where you do!

it's actually even worse than that: the learning process to produce it doesn't care about correctness at all, not even slightly

the only thing that matters is producing plausible enough looking output to con the human into pressing "accept"

(can you see why people would be upset about feeding output generated by this process into a security critical piece of software?)


The statement that correctness plays no role in the training process is objectively false. It's untrue for text LLMs, and even more so for code LLMs. What would be correct is that the training process and the architecture of LLMs cannot guarantee correctness.


> The statement that correctness plays no role in the training process is objectively false.

this statement is objectively false.


I'm just an AI researcher, what do I know?


> I'm just an AI researcher, what do I know?

me too! what do I know?

(at least now we know where the push for this dreadful policy is coming from)


The whole purpose of RLVR alignment is to ensure objectively correct outputs.
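
For anyone unfamiliar with the term: RLVR (reinforcement learning from verifiable rewards) scores a model's output against an objective check, such as running generated code against tests, rather than against human preference alone. Here's a minimal, purely illustrative Python sketch of that kind of reward signal; the function names and the subprocess-based test runner are my own assumptions, not any particular lab's pipeline:

    # Illustrative sketch of an RLVR-style "verifiable reward" for code:
    # the reward is not "looks plausible" but "passes an objective check".
    import subprocess
    import tempfile
    import textwrap

    def verifiable_reward(candidate_code: str, test_code: str, timeout: float = 5.0) -> float:
        """Return 1.0 if the candidate passes the tests, else 0.0."""
        program = textwrap.dedent(candidate_code) + "\n\n" + textwrap.dedent(test_code)
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(program)
            path = f.name
        try:
            # Run the candidate plus its tests in a subprocess; nonzero exit
            # status (a failed assert, an exception) means zero reward.
            result = subprocess.run(["python", path], capture_output=True, timeout=timeout)
            return 1.0 if result.returncode == 0 else 0.0
        except subprocess.TimeoutExpired:
            return 0.0

    if __name__ == "__main__":
        candidate = "def add(a, b):\n    return a + b\n"
        tests = "assert add(2, 2) == 4\nassert add(-1, 1) == 0\n"
        print(verifiable_reward(candidate, tests))  # prints 1.0 for this correct candidate

The point of the sketch is only that the reward in this kind of training is tied to ground truth (the tests), which is why "correctness plays no role" is too strong; whether that guarantee extends to outputs outside the verified distribution is a separate question.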




