andy99's comments | Hacker News

Yes, I came to say the same thing. There was an interesting article recently using the Lenin thing as an example of people not having world models. The spaghetti tree is another good one.

https://www.astralcodexten.com/p/in-search-of-ai-psychosis


There was well-discussed research recently showing that training on LLM output can transfer traits of that LLM even if they are not expressed in the training data: https://alignment.anthropic.com/2025/subliminal-learning/

This suggests a workflow: train an evil model, generate innocuous outputs, post them on a website and have them "scraped" into an "open" training set, train an open model that inherits the evil traits, then invite people to audit the training data.

Obviously I don’t think this happened here, just that auditable training data, and even the concept that LLM output can be traced back to some particular training data, offers false security. We don’t know how LLMs incorporate training data to generate their output, and in my view dwelling on the training data (in terms of explainability or security) is a distraction.
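To make that concrete, here's a minimal sketch of the data-generation half of that workflow, in Python with the transformers library. The teacher model (plain gpt2 here) and the number-sequence prompt are placeholders I'm assuming, not anything taken from the Anthropic paper; the point is only that the generated text looks harmless to a human auditor, which is exactly why auditing the data alone is false security.

    # Sketch only: a "teacher" model generates innocuous-looking completions
    # that could later be folded into an "open" pretraining set. Per the
    # subliminal-learning result, fine-tuning a student on such text can
    # still shift the student toward the teacher's traits.
    import json
    from transformers import AutoModelForCausalLM, AutoTokenizer

    TEACHER = "gpt2"  # stand-in for a fine-tuned "evil" teacher model

    tok = AutoTokenizer.from_pretrained(TEACHER)
    model = AutoModelForCausalLM.from_pretrained(TEACHER)

    prompt = "Continue this sequence of random numbers: 12, 47, 83,"
    samples = []
    for _ in range(100):
        inputs = tok(prompt, return_tensors="pt")
        out = model.generate(**inputs, max_new_tokens=32, do_sample=True,
                             top_p=0.95, pad_token_id=tok.eos_token_id)
        samples.append(tok.decode(out[0], skip_special_tokens=True))

    # Nothing in this file would look suspicious to someone auditing it.
    with open("innocuous_corpus.jsonl", "w") as f:
        for s in samples:
            f.write(json.dumps({"text": s}) + "\n")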


Have you seen comparisons between American and Canadian productivity? It’s definitely more complicated than socialist-leaning government programs simply making the country more productive.

The Canadian economy is not doing very well.

Demand is internal and speculative. It’s not market driven and in some cases (AI features in existing products) is counter to clear market signals.

How is it internal or speculative? ChatGPT is the 5th most popular website. Gemini is 30th, but they have increasing demand and a ton of it isn't on the Gemini main site. And that isn't their only external demand, of course.

I think they are referring to the fact that Google has shimmed AI into every one of their products, so the demand surge is a byproduct of decisions made internally. They are themselves electing to send billions of calls daily to their models.

As opposed to external demand, where vastly more compute is needed just to keep up with users torching through Gemini tokens.

Here is the relevant part of the article:

"It’s unclear how much of this “demand” Google mentioned represents organic user interest in AI capabilities versus the company integrating AI features into existing services like Search, Gmail, and Workspace."


ChatGPT being the #5 website in the world is still indicative of consumer demand, as their only product is AI. Without commenting on the Google shims specifically, AI infrastructure buildouts are not speculative.

It's indicative of demand when it's free, yeah. Try charging every user enough just to operate at cost and we'll see what the real demand is.

Google isn't ChatGPT. Normal people have no idea what Gemini is and are annoyed by the crappy AI summaries in their Google searches.

That's not true at all. People love things like Nano Banana and NotebookLM.

What you're quoting is Ars' pandering and need to placate its peanut gallery. The clique of tech bros hating AI is the exception, not the norm.

Seconded, I thought it was just me

Isn’t this before any curation has happened? I looked at it, I can see why it looks bad, but if they’re really being open about the whole pipeline, they have to include everything. Giving them a hard time for it only promotes keeping models closed.

That said, I like to think that if it were my dataset I would have shuffled that part down the list so it didn’t show up in the HF preview.


Hard time? What value do adult-video descriptions, views, and comments add to small (7,32B) models?

It says it’s Common Crawl; I interpret that to mean this is a generic web-scrape dataset, and presumably they filter out stuff they don’t want before pretraining. You’d have to do some ablation testing to know what value it adds.
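For what it's worth, here's a minimal sketch of the kind of pre-pretraining filter being assumed here, using the Hugging Face datasets library. The corpus (allenai/c4 as a Common Crawl stand-in) and the keyword blocklist are placeholder assumptions; real pipelines use classifiers, URL lists, dedup and so on.

    # Hypothetical filter step: stream a Common-Crawl-derived corpus and drop
    # documents matching a naive keyword blocklist before pretraining.
    from datasets import load_dataset

    BLOCKLIST = ("porn", "xxx", "escort")  # illustrative only

    ds = load_dataset("allenai/c4", "en", split="train", streaming=True)
    filtered = ds.filter(
        lambda ex: not any(w in ex["text"].lower() for w in BLOCKLIST)
    )

    # Peek at a few surviving documents
    for example in filtered.take(3):
        print(example["text"][:100].replace("\n", " "))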

what if that's where they learned how to utilize the double entendre? hard times indeed.

There are a bunch (currently 3) of examples of people getting funny output, two of which say it’s in LM Studio (I don’t know what that is). It does seem likely that it’s somehow being misused here and the results aren’t representative.

Definitely. Usually I'd wait 2-3 weeks for the ecosystem to catch up and iron out the kinks, or do what I did for GPT-OSS: fix it in the places where it's broken, then judge it once I'm sure it's actually being used correctly.

Otherwise, in that early period of time, only use the provided scripts/tools from the people releasing the model itself, which is probably the only way in those 2-3 weeks to be sure you're actually getting the expected responses.


Sexual content might also be less ambiguous and easier to train for.

The problem with lots of laws, often poorly thought out or framed, is that anyone can be breaking them at any time, allowing law enforcement to target people or groups they don’t like with impunity. Drug laws are an obvious one, but so are traffic laws (with ever more rules about distracted driving, “drunk” driving, etc.), things like loitering, and all the stupid anti-free-speech laws in places like the UK.

People get whipped up to support laws but don’t see that more is just worse, especially the petty ones: even if they notionally correct for some bad behaviour, they allow selective enforcement.


No, it’s undefined out-of-distribution performance, rediscovered.

You could say the same about social engineering.

It seems like lots of this is in distribution, and that's somewhat the problem: the Internet contains knowledge of how to make a bomb, and therefore so does the LLM.

Yeah, seems it's more "exploring the distribution" as we don't actually know everything that the AIs are effectively modeling.

Am I understanding correctly that "in distribution" means the text predictor is more likely to predict bad instructions if you already get it to say the words related to the bad instructions?

Yes, pretty much. But not just the words themselves - this operates on a level closer to entire behaviors.

If you were a creature born from, and shaped by, the goal of "next word prediction", what would you want?

You would want to always emit predictions that are consistent. Consistency drive. The best predictions for the next word are ones consistent with the past words, always.

A lot of LLM behavior fits this. Few-shot learning, loops, error amplification, sycophancy amplification, and the list goes on. Within a context window, past behavior always shapes future behavior.
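If you want to poke at that consistency drive yourself, here's a toy sketch. It assumes plain gpt2 via the transformers pipeline and a made-up pattern (nothing from any particular paper); with greedy decoding the continuation tends to extend whatever pattern the context establishes.

    # Toy demo of "consistency drive": the context sets up a pattern and the
    # most probable continuation extends it. Model and pattern are placeholders.
    from transformers import pipeline

    generate = pipeline("text-generation", model="gpt2")

    context = (
        "apple -> APPLE\n"
        "river -> RIVER\n"
        "mountain -> MOUNTAIN\n"
        "cloud ->"
    )
    out = generate(context, max_new_tokens=5, do_sample=False)[0]["generated_text"]
    print(out)
    # Greedy decoding usually continues the established mapping (" CLOUD"),
    # because that's the prediction most consistent with the preceding context.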

Jailbreaks often take advantage of that. Multi-turn jailbreaks "boil the frog" - get the LLM to edge closer to "forbidden requests" on each step, until the consistency drive completely overpowers the refusals. Context manipulation jailbreaks, the ones that modify the LLM's own words via API access, establish a context in which the most natural continuation is for the LLM to agree to the request - for example, because it sees itself agreeing to 3 "forbidden" requests before it, and the first word of the next one is already written down as "Sure". "Clusterfuck" style jailbreaks use broken text resembling dataset artifacts to bring the LLM away from "chatbot" distribution and closer to base model behavior, which bypasses a lot of the refusals.


"In distribution" basically means the kind of training examples it’s seen. The models have all been fine-tuned to refuse to answer certain questions, across many different ways of asking them, including obfuscated and adversarial ones, but poetry is evidently so different from what they’ve seen in this type of training that it is not refused.
