this post was submitted on 11 Jan 2024

OpenAI says it’s “impossible” to create useful AI models without copyrighted material (arstechnica.com)

submitted 2 years ago by sculd@beehaw.org to c/technology@beehaw.org

Apparently, stealing other people's work to create a product for money is now "fair use" according to OpenAI, because they are "innovating" (stealing). Yeah. Move fast and break things, huh?

"Because copyright today covers virtually every sort of human expression—including blogposts, photographs, forum posts, scraps of software code, and government documents—it would be impossible to train today’s leading AI models without using copyrighted materials," wrote OpenAI in the House of Lords submission.

OpenAI claimed that the authors in that lawsuit "misconceive[d] the scope of copyright, failing to take into account the limitations and exceptions (including fair use) that properly leave room for innovations like the large language models now at the forefront of artificial intelligence."

[–] noorbeast@lemmy.zip 52 points 2 years ago* (last edited 2 years ago) (4 children)

I will repeat what I have proffered before:

If OpenAI stated that it is impossible to train leading AI models without using copyrighted material, then, unpopular as it may be, the preemptive pragmatic solution should be pretty obvious: enter into commercial arrangements for access to said copyrighted material.

Claiming a failure to do so in circumstances where the subsequent commercial product directly competes in a market seems disingenuous at best, given what I assume is the purpose of copyrighted material, that being to set the terms under which public facing material can be used. Particularly if regurgitation of copyrighted material seems to exist in products inadequately developed to prevent such a simple and foreseeable situation.

Yes, I am aware of the US concept of fair use, but the test of that should be manifestly reciprocal. For example, would Meta allow others to do to it what it did to MySpace (hack it and allow easy user transfer), or would Google allow scraping of YouTube?

To me it seems Big Tech wants to have its cake and eat it too, where investor $$$ are used to corrupt open markets, undermine fundamental democratic state social institutions, manipulate legal processes, and erode basic consumer rights.

[–] sculd@beehaw.org 34 points 2 years ago (1 children)

Agreed.

There is nothing "fair" about the way OpenAI steals other people's work. ChatGPT is being monetized all over the world, and the large number of people whose work was used without compensation will never see a cent of that money.

At the same time, the LLM will be used to replace (at least some of) the people who created those works in the first place.

Tech bros are disgusting.

[–] Omega_Haxors@lemmy.ml 12 points 2 years ago (1 children)

"Tech bros are disgusting."

That's not even getting into the fraternity behavior at work, hyper-reactionary politics and, er, concerning age preferences.

[–] sculd@beehaw.org 8 points 2 years ago

Yup. I said it in another discussion before, but I think it's relevant here.

Tech bros are more dangerous than Russian oligarchs. Oligarchs understand that people hate them, so they mostly lie low and enjoy their money.

Tech bros think they are the saviors of the world while destroying millions of people's livelihoods, as well as destroying democracy with their right-wing libertarian politics.

[–] TheFreezinSteven@beehaw.org 9 points 2 years ago* (last edited 2 years ago) (1 children)

With your logic all artists will have to pay copyright fees just to learn how to draw. All musicians will have to pay copyright fees just to learn their instrument.

I guess I should clarify by saying I'm a professional musician.

[–] chahk@beehaw.org 12 points 2 years ago* (last edited 2 years ago) (1 children)

Do musicians not buy the music that they want to listen to? Should they be allowed to torrent any MP3 they want just because they say it's for their instrument learning?

I mean I'd be all for it, but that's not what these very same corporations (including Microsoft when it comes to software) wanted back during Napster times. Now they want a separate set of rules just for themselves. No! They get to follow the same laws they force down our throats.

[–] TheFreezinSteven@beehaw.org 1 points 2 years ago* (last edited 2 years ago)

Everything you said was completely irrelevant to what I mentioned and just plain ignorant.

Since when do you buy all the music you have ever listened to?

[–] redcalcium@lemmy.institute 6 points 2 years ago* (last edited 2 years ago)

I suspect the US government will allow OpenAI to continue doing as it pleases to keep the US's competitive advantage in AI over China (which doesn't have a problem with using copyrighted materials to train its models). The US already limits sales of AI-related hardware to keep that advantage, so why stop there? Might as well allow OpenAI to continue using copyrighted materials too.

[–] vexikron@lemmy.zip 3 points 2 years ago* (last edited 2 years ago)

Yep, completely agree.

Case in point: Steam has recently clarified its policies on using AI-generated material that draws on essentially billions of both copyrighted and non-copyrighted texts and images.

To publish a game on Steam that uses AI gen content, you now have to verify that you as a developer are legally authorized to use all training material for the AI model for commercial purposes.

This also applies to code and code snippets generated by AI tools that function similarly, such as Copilot.

So yeah, sorry, you've either gotta use MIT-licensed open source code or write your own, and you gotta do your own art.

I imagine this would also prevent you from using AI-generated voice lines if the model was trained on anyone who did not explicitly consent. Voice gen software that doesn't use the 'train the model on human speakers' approach would probably be fine, assuming you have the relevant legal rights to use such software commercially.

I'm not 100% sure this is Steam's policy on voice gen stuff; they focused mainly on art, dialogue, and code in their latest policy update, but the logic seems to work out to this conclusion.