Training Generative AI Models on Copyrighted Works Is Fair Use - Change My Mind
(mastodon.lawprofs.org)
What constitutes fair use?
17 U.S.C. § 107
Notwithstanding the provisions of sections 17 U.S.C. § 106 and 17 U.S.C. § 106A, the fair use of a copyrighted work, including such use by reproduction in copies or phonorecords or by any other means specified by that section, for purposes such as criticism, comment, news reporting, teaching (including multiple copies for classroom use), scholarship, or research, is not an infringement of copyright.
GenAI training, at least regarding art, is neither criticism, comment, news reporting, scholarship, nor research.
AI training is not done by scientists but by engineers of a corporate entity with a long-term profit goal.
So, by elimination, we can conclude that none of the purposes covered by the fair use doctrine apply to Generative AI training.
Q.E.D.
"Such as" means that these are examples and not an exhaustive list.
Can you explain how the 3 factors you listed rule out scholarship or research purpose? Regarding the first factor, how do you determine that AI developers are all engineers and never computer scientists?
I’d argue that the community benefit aspect of the “scholarship or research purposes” language precludes for-profit AI companies from falling under fair use. These aren’t education programs. They’re not research for the greater good. They are private entities trying to create a machine that can copy until it creates. For their own needs, not the greater good. Education has a net positive effect on society, and those stipulations in the law are meant to better serve the whole.
If these generative AI machines were being built by students, it would fall under these specifications of fair use. But the profit motive changes everything.
I’d say “fair use” pretty much covers educational and community benefit. Private companies do neither. They are stealing and reproducing for themselves, not society.
How do you get the "community benefit aspect" out of that? Also, why do you feel that a profit motive is at odds with the greater good? That seems to run counter to the whole conception of US copyright. The other examples are mainly produced with a profit motive.
Okay, first of all, that was my interpretation. Because “teaching” has always been tied to education. I was extrapolating the point to argue that fair use laws are there for the sake of education and cultural growth. You can use copyrighted works for use that benefits society as a whole, I.e. education. See what I’m saying? Fair use laws were written with the benefit of all in mind, using established works to broaden education and knowledge in the community and for purposes of culture. That’s my interpretation of their entire purpose.
Oh, I actually was just talking of my own interpretation of the point of the laws, but this is from copyright.gov:
They’re saying right there the purpose is for news, discussion, and education. Cultural benefits. That proves my point, I think.
But onto this:
Because…it is. Profit is extremely limited to the entity at the top of the capitalist structure of business (on a case by case basis, I mean. Not the top of capitalism period.) “Profit” is what a business rakes in for itself. The entire concept of profit has exploitation written right into itself. All a company’s payroll is a cost and does not factor into profit. So literally if I pay my workers less, I profit more. If they’re starving? Even more profit. If I eliminate their jobs and outsource them so I can have less expenditure and more profit by making even poorer people work for even less? Boom. Fuck these workers, I can exploit and squeeze some other poor saps even harder for more profit.
“Profit” and “greater good” are diametrically opposed concepts. Profit is limited. Greater good is collective. It’s literally the entire problem with capitalism. Profit needs exploitation. The more you exploit down the line, the more profit (read: the more people I can hurt and cut out of the money, the more profit I have). Capitalism is built on the profit motive and look where that’s led us. To a time with for-profit healthcare, sweatshops, slave labor…profit necessitates exploitation. The more you can take from the greater population—whether in price paid to you or cost cut at the expense of everyone possible—the more you profit.
Like I said, profit motive is almost the exact opposite of doing something for the greater good.
News media is usually for-profit, though. Commentary and criticism are also staples of for-profit media. That ordinary people can and do publish their own takes via the internet is much more recent than section 107.
Much of medical research is for-profit. Biontech is a for-profit company, but their covid vaccine benefits the public.
I agree that the public benefit aspect is there, but I'd go higher to find it, right to the Constitution. Congress is empowered "To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries." It's a fascinating turn of phrase. Congress is not quite empowered to make copyright law (limiting the freedom of the press). It is empowered to promote progress through certain means.
The whole idea is that one can serve the greater good by introducing a profit motive to the production of, among other things, creative works. Without copyright, everything would be public domain.
Come to think of it, it is kind of weird how you apply your moral views on profit to this fair use issue. You're saying that copyright owners should make a profit. If it's not fair use, then the copyright owners have to be paid, right? That means that, e.g., newspapers like the NYT can demand money for training on their archives. That's all paid off. The production cost has been recouped (or not). Any licensing payments now are pure profit. AI developers still have to put in the work to actually develop the AI.
Some good points here.
And I’d argue that the for-profit aspect of every single one of those institutions has corrupted and degraded the purpose and quality of each. For-profit news turned what was once a public service into what we have today: agenda-driven corporations tarnishing information for their own ends. Universities driving kids into lifelong debt. And in the case of the Covid vaccine, they took public funds and then privatized the medicine for profit.
Profit corrupted every single one of these fields.
I’m not saying that a profit motive absolutely negates any positive outcome. But eliminating the profit motive eliminates selfishness. Profit is the end goal. And think about any example you can in which something good came out of a company’s desire for profit. Any example has immediately diminishing returns: putting a new vaccine, say, onto the market because of a company’s profit motive immediately loses the benefit for the greater good, because that’s not the end goal. The end goal is profit. So access for the poor is immediately out of reach. Because of profit.
The motivation for development might have been driven by profit, and new discoveries come about from a company’s r&d. Great. But immediately a problem occurs when access is limited to funds. So I see what you’re saying, capitalists love to say “competition spurs innovation,” but that only goes so far, if it’s even true in the first place.
And think about public development of anything—it’s immediately sold to the highest bidder and paywalled. How about Volvo and the three point seatbelt. Did profit motive drive the discovery of that feature? Presumably, to some degree. But they immediately made it accessible to all by eliminating the profit motive for the greater good. If they had decided to patent it and only sell it for profit to other manufacturers, it would have been a detriment to the greater good.
So again, I’m not saying that nothing good has ever been discovered or created via a profit motive, but I am saying that it corrupts the greater good by exploiting need for profit. See what I’m saying? So you’re not entirely wrong, but it’s an ethical philosophy question. When your motives are selfish/corrupt, your deeds aren’t good, even if good may come about. The motives are corrupt, so any good is nullified by said profit motive.
We can talk about what the world would look like today if humanity was always cooperative instead of implementing capitalism, what would the Industrial Revolution have looked like, etc. And maybe capitalism was, at some point, the best thing for humanity to progress. But it always should’ve been a stepping stone TO a system for the greater good. Instead, the profit motive has corrupted humanity and made a system that exploits everyone possible. Exploitation is rewarded under a system that places profit above everything else. They say you have to break a few eggs to make an omelette, and I think that maybe applies to capitalism’s place in human history. Maybe it was necessary to bring about progress in the early 20th century (although the robber barons/gilded age would suggest it was too great a price to pay), but I’d argue that, as it exists today, the profit motive is harmful and needs to be done away with. Because it runs contrary to the greater good. They are diametrically opposed.
You skipped right over "teaching".
Why is that?
Show me an application of Generative AI for teaching right now. As in, already existing.
Not teaching with AI
Teaching AI
I think their point is the law is written to benefit people. Not private companies or machines.
If this wide definition of “teaching” were acceptable, then the entire concept would cease to exist.
“You stole my paper and reproduced it for profit!”
“NOO, I’m just teaching my employees to write better. It’ll happen eventually, but we’re at the stage where reproducing something incredibly similar to your paper is necessary!”
I agree that all this needs to be examined, and some new laws and regulations should be developed. But, for good or ill, teaching is a covered use as written in the section of the law quoted above, and teaching is part of the process of training.
If anything, laws will likely have to be rewritten to address changing technologies, but it seems disingenuous to quote a section of the law and then ignore the most relevant word in the entire text.
I definitely get your point. But you don’t “teach” a machine. You program a machine. In the case of AI, technically the machine is building its own database and sort of growing and adapting as it gets more advanced.
I get your point, but I just don’t think “teaching” is even what is happening here. Like I said, if the definition were that broad, it would be rendered meaningless. Not to mention, there are so, so, so many examples of the generative AI just reproducing something specifically in the style of a known artist. Writing in the style of a specific author. It does that because we ask it to, but the point is the program is a machine for reproduction. You don’t teach something without sentience. You teach living things, you write code and make a program act in a specific way. And right now, the programs are blatantly reproducing signature pieces of work.
Now, OP mentioned we are “teaching” the machines to do things on their own. But my point is that’s not teaching. It’s reproducing and stealing. It’s not creating anything, it’s spitting out elements of what it’s absorbed. And because these machines can’t think, can’t add their own style—because what’s super fucked up is we are pretty much just discussing the machines replacing artists at the moment—these things are about experience and personality. Neither of which AI has. They ingest everything and spit back out what we ask for. And they’re spitting out elements of this or that—and in these cases, it’s intellectual property of artists and writers. And the most depressing aspect of this whole thing is that we have pretty much moved beyond the “wait, out of everything, we are teaching machines to take human creativity and expression away from…humans?” stage and just moved on to talking about whether it’s technically legal.
I agree, laws will definitely have to be rewritten. But for the sake of argument, I don’t think the letter of the law can be as broad as you’re suggesting. Interesting thought experiment for us, though. Because…no one gives a shit about our takes on the matter lol
Or the take of artists and writers. But that’s a whole different problem.
it is pretty obviously scholarship and research
It is pretty obviously Research and Development of a commercial product in many cases. Not fair use.
So fair use if it's an opensource model?
there is no stipulation that the research must be non-profit.
woosh