November 12, 2023


I'll reserve judgment about unjust enrichment and other non-copyright claims, but at least under copyright (and putting aside any incidental copying as part of the training process), when #1 is posed as the question of whether mere training on copyrighted works is infringing, the answer is very straightforward: No. It's no different than if, as an aspiring fantasy author, I started my writing preparations by studying up on every JK Rowling book. Would the mere act of my studying infringe JK Rowling's copyrights? The answer is clearly no. I don't see why that logic wouldn't apply to the LLM training context as well.

Likewise, and continuing the JK Rowling hypo, the answer to #2 is just as straightforward: Yes. If, after completing my preparations, I produce something that's substantially similar to a JK Rowling work, then of course that would be an infringement.

Maybe there is something fundamentally unfair about how LLMs use authors' works on a vast scale, and again, I'll reserve comment on that, but not every fundamental unfairness that also happens to involve copyrighted works necessarily results in copyright infringement.


Eliminating the need - if Fair Use is found at the front end - involves a very different set of actors (it would be the user of the AI engine, not the builder of that engine).

Divide and Conquer may be an apt analogy.

