We've run out of training data that definitely did not contain LLM outputs.

DAGdug · on Dec 14, 2024

What about non-text modalities - image and video, specifically?

riffraff · on Dec 14, 2024

Video is probably still fine, but images sourced from the internet now include a massive amount of AI slop.

It seems, for example, that many newsletters, blogs, etc. resort to AI-generated images to add some color to their writing (something I too intended to do, before realizing how much it annoys me).