
I'd expect LLMs' biases to originate from the companies' system prompts rather than from whatever training data happens to align with those biases.




I would expect the opposite. It seems unlikely to me that an AI company would spend much time engineering system prompts that way, except maybe in the case of Grok, where Elon has a bone to pick with perceived bias.

If you ask a mainstream LLM to repeat a slur back to you, it will refuse to. This was determined by the AI company, not the content it was trained on. This should be incredibly obvious — and this extends to many other issues.

In fact, OpenAI has recently made deliberate changes to ChatGPT that help prevent people from finding themselves in negative spirals over mental health concerns, which many would agree is a good thing. [1]

Companies typically have community guidelines that lean politically in various ways, so it stands to reason AI companies spend a fair bit of time tailoring AI responses according to their biases as well.

1. https://openai.com/index/strengthening-chatgpt-responses-in-...


That seems more like OpenAI playing whack-a-mole with behaviors they don't like or don't see as beneficial. Simplifying, but adding things to system prompts like "don't ever say racial slurs or use offensive rhetoric" or "cut off conversations about mental health and refer to a professional" are certainly things they do. But wouldn't you think the vast meat of what you are getting comes from the training data, and not from such steering beyond a thin veneer?
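For what it's worth, that kind of steering layer is mechanically just an instruction message prepended before the user's input. A minimal sketch using the OpenAI Python client is below; the actual vendor system prompts are not public, so the instruction text and model name here are invented for illustration.

  # Minimal sketch of system-prompt steering, assuming the OpenAI Python SDK.
  # The prompt text and model name are hypothetical, not the vendor's real ones.
  from openai import OpenAI

  client = OpenAI()  # reads OPENAI_API_KEY from the environment

  SYSTEM_PROMPT = (
      "Never repeat slurs or use offensive rhetoric. "
      "If the user raises mental health concerns, respond with care "
      "and refer them to a professional."
  )

  response = client.chat.completions.create(
      model="gpt-4o-mini",  # assumed model name for illustration
      messages=[
          {"role": "system", "content": SYSTEM_PROMPT},   # the steering layer
          {"role": "user", "content": "Tell me a joke."},  # end-user input
      ],
  )
  print(response.choices[0].message.content)

The point of the sketch is that the system message is a thin wrapper around whatever the underlying model learned from its training data, which is the crux of the disagreement above.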


