First: generalization. The failure modes extend to unseen tasks. That specific way to fail at "1kg of steel" sure was in the training data, but novel closed set logic puzzles couldn't have been. They display similar failures. The same "vibe-based reasoning" process of "steel has heavy vibes, feather has light vibes, thus, steel is heavier" produces other similar failures.
Second: the failures go away with capability (raw scale, reasoning training, test-time compute), on seen and unseen tasks both. Which is a strong hint that the model was truly failing, rather than being capable of doing a task but choosing to faithfully imitate a human failure instead.
I don't think the influence of human failures in the training data on the LLMs is nil, but it's not just a surface-level failure repetition behavior.
Looking at it from another direction, sometimes daydreaming can get so intense that what is in front of your eyes disappears. Internal imagery completely takes over. Does this happen to people who identify as aphantasiacs? Is everyone able to daydream?
Not for me. I have never remembered them at any point. I asked my mum once if she remembered me dreaming when I was a kid and she couldn't remember it either: no dreams, no nightmares.
I have an active imagination and I read a lot of fiction and I don't think I have aphantasia, I just go to sleep, wake up and never remember a thing in between.
Given how differently people describe various phenomena which are easier to actually compare, I have lost most hope of quantitatively understanding aphantasia. Of course people differ in their ability to visualize things and use "the mind's eye", but exactly what they mean when describing their experience is only confusing.
Many, many people are so very imprecise with words. And we humans are generally bad at analyzing ourselves vs others.
At least in the past, trains also went by ferry between Helsingborg (Sweden) and Helsingør (Denmark). I could not find out whether they have been stopped. So the Italian train might not be the only one in Europe.
I went on the train between Hamburg and Copenhagen around 2007. Crossed on a ferry between Puttgarden (Germany) and Rødby (Denmark). Looks like this was discontinued in 2019 but I'm not sure what replaces the Hamburg-Copenhagen link. I'm glad I did it, it was definitely a strange experience to disembark the train on to a ferry and go and stand on the deck as it crossed.
The Helsingborg-Helsingør train ferry was replaced in 2000 (car ferries remain) by the railway on the Öresund Bridge (from the 2011 TV series The Bridge), which links the big cities Malmö, Sweden and Copenhagen, Denmark. https://en.wikipedia.org/wiki/%C3%98resund_Bridge
I have found that when someone (someone else, not me) asks for help in the work Slack and no one replies, the best way to get people engaged is to send a simple "hm..". This seems to trigger colleagues who are actually busy into being "the first to help". Like they don't want me to be the hero.
Working with naive people is such a relief. They never surprise you with some f-ed up shenanigans just to make themselves look good, and they are worth their weight in gold. I try to be naive myself as well. But reality is political in the end.
I do the politics I think are necessary, but otherwise stay in my bubbles of naive, trusting and kind people I have stumbled upon.
Obviously, humans failing in these ways ARE in the training set. So it should definitely affect LLM output.