There is also an element of the unknown with LLMs and similar tools.
There is a legal difference between learning from something and truly making your own version, and simply copying it.
It's vague, of course. Take plagiarism in a university science essay: the student has no original data and very likely no original thought, but there is still a difference between simply copying a textbook and writing it in your own words.
Bottom line: how do we know the output of the LLM isn't a verbatim copy of something with the license stripped off?