this post was submitted on 31 Jul 2023
Technology
Yes, the public domain belongs to everyone.
That's correct. They can't confirm the training data didn't commit copyright infringement.
Something in the public domain means everyone essentially has the right to copy it in any form. So you still need the right to distribute it on Steam, even if that right comes from the work being in the public domain.
I don't get what you mean by that, because it's entirely about the copyright on the content and whether the owner is allowed to distribute it.
If something is in the public domain, there is no copyright. That’s what public domain means. Now, someone could try to place something into the public domain incorrectly that still has someone else’s copyright claim on it, but LLMs don’t do that (usually): a work created via an LLM is in the public domain. Nobody reserves any rights.
Because there are no rights reserved, there are no copyright issues.
BUT that doesn’t mean that infringement hasn’t already been committed by the person who created the training set IF you stand by the argument that a training set has no right to include a work unless it’s in the public domain or permission has been granted by any rights holders.
That last bit I covered earlier; it is a philosophical stance people take, but it’s not the only one, and as of now it has no legal backing. Others claim fair use, which pre-empts any copyright claims. And remember, this is about creating the training set and NOT about generative works, which are in the public domain.
Yeah, in the end, that's going to come down to what counts as transformative work and whether transformative work can be done solely by a tool.
They are only in the public domain if they are transformative works. Otherwise, they are derivative works, subject to the original copyright, and thus copyright-infringing works.
Sure, everyone has the right to copy it. There are no copyrights given out to one person. At this point, that's just semantics.
That's the argument though. LLMs are potentially putting works into the public domain by copying them, creating works based on them, and then, because the output isn't made by a human, placing it in the public domain. If the works an LLM produces are seen as derived from the training set, and the training set is copyrighted content, then the LLM is creating copyright-infringing works and attempting to place them into the public domain.