How To Fix 'AI's Original Sin' By Tim O'Reilly
O'Reilly, Wednesday, June 19th, 2024
Last month, The New York Times claimed that tech giants OpenAI and Google have waded into a copyright gray area by transcribing a vast volume of YouTube videos and using that text as additional training data for their AI models, despite terms of service that prohibit such efforts and copyright law that the Times argues places those efforts in dispute.
The Times also quoted Meta officials as saying that their models will not be able to keep up unless they follow OpenAI and Google's lead. In conversation with reporter Cade Metz, who broke the story, on the New York Times podcast The Daily, host Michael Barbaro called copyright violation 'AI's Original Sin.'
At the very least, copyright appears to be one of the major fronts so far in the war over who gets to profit from generative AI. It's not at all clear yet who is on the right side of the law. In the remarkable essay 'Talkin' 'Bout AI Generation: Copyright and the Generative-AI Supply Chain,' Cornell's Katherine Lee and A. Feder Cooper and James Grimmelmann of Microsoft Research and Yale note: