this post was submitted on 31 Oct 2023

91 points (92.5% liked)

Midjourney, Stability AI and DeviantArt win a victory in copyright case by artists — but the fight continues (venturebeat.com)

submitted 2 years ago by ghosthand@lemmy.ml to c/technology@lemmy.ml


[–] otter@lemmy.ca 24 points 2 years ago (2 children)

The wording of the title is a bit weird, which makes me notice how legal cases are usually worded like "weaker party succeeds/fails to change the status quo". The artists lost against the companies in this case?

Anyways, important bits here:

Orrick spends the rest of his ruling explaining why he found the artists’ complaint defective, which includes various issues, but the big one being that two of the artists — McKernan and Ortiz, did not actually file copyrights on their art with the U.S. Copyright Office.

Also, Anderson copyrighted only 16 of the hundreds of works cited in the artists’ complaint. The artists had asserted that some of their images were included in the Large-scale Artificial Intelligence Open Network (LAION) open source database of billions of images created by computer scientist/machine learning (ML) researcher Christoph Schuhmann and collaborators, which all three AI art generator programs used to train.

And then

Even if that clarity is provided and even if plaintiffs narrow their allegations to limit them to Output Images that draw upon Training Images based upon copyrighted images, I am not convinced that copyright claims based on a derivative theory can survive absent ‘substantial similarity’ type allegations. The cases plaintiffs rely on appear to recognize that the alleged infringer’s derivative work must still bear some similarity to the original work or contain the protected elements of the original work.

Which eh, I'm not sure I agree with. This is a new aspect of technology that isn't properly covered by existing copyright laws. Our current laws were developed to address a state of the world that no longer exists, and using those old definitions (which I think covered issues around parodies and derivative work) doesn't make sense in this case.

This isn't some individual artist drawing something similar to someone else. This is an AI that can take in all work in existence and produce new content from that without providing any compensation. This judge seems to be saying that's an ok thing to do

[–] shuzuko@midwest.social 15 points 2 years ago

did not actually file copyrights on their art with the U.S. Copyright Office.

The way they've worded this isn't really a sufficient explanation of how this works. An artist is automatically granted copyright upon the creation of a work, so it's not that they don't have the right to protect their work. It's just that, without registration, you cannot file a lawsuit to protect your work.

Copyright exists from the moment the work is created. You will have to register, however, if you wish to bring a lawsuit for infringement of a U.S. work.

https://www.copyright.gov/help/faq/faq-general.html

However, if it's within 5 years of initial publication, they can still be granted a formal registered copyright and bring the complaint again.

[–] wahming@monyet.cc 11 points 2 years ago

Judges don't make laws, they interpret them. If the current laws don't cover said new technology, it's up to the govt to pass new laws.

[–] ghosthand@lemmy.ml 10 points 2 years ago

I tend to agree with the judge's assessment. He must make a decision based on existing law and the plaintiff's claim/argument. You're right that existing law doesn't cover this aspect of technology, which is why new laws need to be enacted by Congress. And the courts are put in a no-win situation here because we've failed to establish new rules and regulations for this new technology.

The plaintiff's claim of derivative work doesn't fit here because it has long been established what a derivative work looks like. AI generated images aren't really derivative works.

I think rightfully, the court has told them to try again, which is ok.

[–] mindbleach@sh.itjust.works 4 points 2 years ago* (last edited 2 years ago)

Generating arbitrary new images is extremely transformative, and reducing a zillion images to a few bytes each is pretty minimal. It is really fucking difficult to believe "draw Abbey Road as a Beeple piece" would get a commissioned human artist bankrupted, if they openly referenced that artist's entire catalog, but didn't exactly reproduce any portion of it.

For language models, it's even sillier. 'The network learned English by reading books!' Uh. Yeah. As opposed to what? If it's in the library, anyone can read it. That's what it's for.

[–] BitSound@lemmy.world -4 points 2 years ago* (last edited 2 years ago) (1 children)

Good to hear that people won't be able to weaponize the legal system into holding back progress

EDIT, tl;dr from below: Advocate for open models, not copyright. It's the wrong tool for this job

[–] otter@lemmy.ca 27 points 2 years ago (2 children)

AI keeps getting cited as the next big thing that will shape the world. I think this is an appropriate time to use the legal system to make sure those changes happen in a way that won't screw everything up.

The progress will happen whether we like it or not; taking a moment to clarify the rules is a good thing.

[–] mindbleach@sh.itjust.works 2 points 2 years ago

I think this is an appropriate time to use the legal system to make sure those changes happen in a way that won’t screw everything up.

Tell me which rules would definitely do that without screwing it up worse, for this obscenely complicated technology that's only meaningfully existed for about a year. I could use a laugh.

[–] BitSound@lemmy.world 2 points 2 years ago (1 children)

The rules I've seen proposed would kill off innovation, and allow other countries to leapfrog whatever countries tried to implement them.

What rules do you think should be put in place?

[–] KoboldCoterie@pawb.social 8 points 2 years ago (1 children)

If any commercial use of AI generated art required some transfer of money from the company using it to the artists whose work was included in training the models, it'd probably be a step in the right direction.

[–] BitSound@lemmy.world 5 points 2 years ago (2 children)

Why?

[–] hoshikarakitaridia@sh.itjust.works 4 points 2 years ago (1 children)

Because the training, and therefore the datasets, are an important part of the work with AI. A lot of ppl are arguing that the ppl who provided the data (e.g. artists) should therefore get a cut of the revenue, a static fee, or some similar compensation. Looking at a picture is deemed fine in our society, but copying it and using it for something else is seen more critically.

Btw. I am totally with you regarding the need to not hinder progress, but at the end of the day, we need to think about both the future prospects and the morality.

There was something about labels being forced to pay a cut of the revenue to all bigger artists for every CD they'd sell. I can't remember what it was exactly, but something like that could be of use here as well maybe.

[–] Dkarma@lemmy.world 2 points 2 years ago (1 children)

Let's be clear. The AI does not in any way "copy" the picture it is trained on.

[–] hoshikarakitaridia@sh.itjust.works 1 points 2 years ago (1 children)

Yes.

And let's also pin down that this is the exact issue we need more laws on. What makes an image copyrightable? When can a copyright get violated? And more specifically: whatever the AI model encompasses, can that inhibit fully copyrighted material? Can a copyrighted image be assumed by noting down all of its features?

This is the exact corner that we are fighting over currently.

[–] Dkarma@lemmy.world 0 points 2 years ago (1 children)

This has already been decided. Inspired works are not covered by copyright.

[–] hoshikarakitaridia@sh.itjust.works 1 points 2 years ago (1 children)

Inspired in the traditional sense or inspired on a basis of datasets with concrete numbers? Huge difference.

[–] Dkarma@lemmy.world 0 points 2 years ago

Lol not at all.

[–] lemmyvore@feddit.nl 4 points 2 years ago (1 children)

Because an LLM needs human-produced material to work with. If the incentive to produce such material drops, generative models will start producing garbage.

It has already started to be a problem with the current LLMs that have exhausted most easily reached sources of content on the internet and are now feeding off LLM-generated content, which has resulted in a sharp drop in quality.

[–] mkhoury@lemmy.ca 4 points 2 years ago (1 children)

"It has already started to be a problem with the current LLMs that have exhausted most easily reached sources of content on the internet and are now feeding off LLM-generated content, which has resulted in a sharp drop in quality."

Do you have any sources to back that claim? LLMs are rising in quality, not dropping, afaik.

[–] lemmyvore@feddit.nl 5 points 2 years ago* (last edited 2 years ago) (1 children)

It's still being researched but there are papers that show that, mathematically, generative models cannot feed on their own output. If you see an increase in quality it's usually because their developers have added a new trove of human-generated data.

In simple terms, these models need two things to be able to generate useful output: they need external guidance about which input is good and which is bad (throughout the process), and they need both types of input to reach a certain critical mass.

Since the reliability of these models is never 100%, with every input-output cycle the quality drops.

If the model input is very well curated and restricted to known good sources it can continue to improve (and by improve I mean asymptotically approach a value which is never 100% but high enough, like over 90%). But if models are allowed to feed on generative output (being thrown back at them by social bots and website generators) their quality will take a dive.
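To make the input-output cycle argument concrete, here's a toy sketch in Python. It is not taken from any of the papers. The update rule, the function name `simulate_quality`, and the parameters `reliability` and `human_fraction` are all assumptions I made up for illustration. Each cycle the model trains on a mix of human data (assumed to have quality 1.0) and its own previous output, and its output quality is the training quality multiplied by its reliability.

```python
def simulate_quality(reliability, cycles, human_fraction, start_quality=1.0):
    """Toy model of repeated training cycles (hypothetical, for illustration only).

    reliability    -- fraction of input quality the model preserves per cycle (< 1.0)
    cycles         -- number of input-output cycles to run
    human_fraction -- share of each cycle's training data that is curated human data
    start_quality  -- quality of the model's output before the first cycle
    """
    quality = start_quality
    for _ in range(cycles):
        # Assumption: curated human data has quality 1.0; the rest is the
        # model's own previous output, fed back in as training data.
        training_quality = human_fraction * 1.0 + (1.0 - human_fraction) * quality
        # Assumption: the model never reproduces its input perfectly.
        quality = reliability * training_quality
    return quality


if __name__ == "__main__":
    for fraction in (0.0, 0.5, 1.0):
        result = simulate_quality(reliability=0.9, cycles=20, human_fraction=fraction)
        print(f"human_fraction={fraction}: quality after 20 cycles = {result:.3f}")
```

Under these assumptions, a model with reliability 0.9 and no human data decays as 0.9^n toward zero, while a model fed only curated input settles at 0.9: high, but never 100%. Anything in between levels off at a value between those two.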

I want to point out that this is not an AI issue. Humans don't have a 100% correct output either, and we have the exact same problem: feeding on our own online garbage. For us the trouble showed up much more slowly, over the last couple of decades or so, as talk about "fake news", misinformation being weaponized etc.

AI merely accelerated the process; it hit the limits of reliability much sooner. We will need to solve this issue either way, and we would have needed to solve it even if AI weren't a thing. In a way the appearance of AI helped us because it forces us to deal with the issue of information reliability sooner rather than later.

[–] BitSound@lemmy.world 4 points 2 years ago

I wouldn't be concerned about that; the mathematical models make assumptions that don't hold in the real world. There's still plenty of guidance in the loop from things such as humans up/downvoting, and people generating several to many pictures before selecting the best one to post. There are also, as you say, lots of places with strong human curation, such as Wikipedia or official documentation for various tools. There's also the option of running better models against old datasets as the tech progresses.