this post was submitted on 22 Dec 2024
1575 points (97.4% liked)

It's all made from our data, anyway, so it should be ours to use as we want

(page 2) 50 comments
[–] brucethemoose@lemmy.world 9 points 4 days ago* (last edited 4 days ago) (3 children)

The environmental cost of training is a bit of a meme. The details are spread around, but basically, Alibaba trained a roughly GPT-4-level model on a relatively small number of GPUs... probably on par with a steel mill running for a long time, a drop in the bucket compared to industrial processes. OpenAI is extremely inefficient, probably because they don't have much pressure to optimize GPU usage.

Inference cost is more of a concern with crazy stuff like o3, but this could dramatically change if (hopefully when) bitnet models come to fruition.

Still, I 100% agree with this. Closed LLM weights should be public domain, as many good models already are.

[–] chiliedogg@lemmy.world 5 points 4 days ago (4 children)

Delete them. Wipe their databases. Make the companies start from scratch with new, ethically acquired training data.

[–] RandomVideos@programming.dev 2 points 3 days ago (5 children)

Wouldn't that give people who use it for bad things easier access? It should be made illegal to create these models if the creators don't legally have access to that data.

[–] TootSweet@lemmy.world 3 points 4 days ago* (last edited 4 days ago) (1 children)

To speak of AI models being "made public domain" is to presuppose that the AI models in question are covered by some branch of intellectual property. Has it been established whether AI models (even those trained on properly licensed content) even are covered by some branch of intellectual property in any particular jurisdiction(s)? Or maybe by "public domain" the author means that they should be required to publish the weights and also that they shouldn't get any trade secret protections related to those weights?

[–] barsoap@lemm.ee 2 points 3 days ago* (last edited 3 days ago)

Unlikely, I'd say. In EU jurisdictions copyright requires creative authorship, not "sweat of the brow", which is why databases aren't included by default and why they have their own protection regime.

Quote, emphasis mine:

In the meaning of the European Union Directive 96/9/EC on the legal protection of databases, the term database refers to a collection of independent works, data or other materials, which have been arranged in a systematic or methodical way, and have been made individually accessible by electronic or other means. In the meaning of the Directive the data or materials:

  • must not be linked, or must be capable of separation without losing their informative content;
  • must be organised according to specific criteria, which means that only planned collections are covered;
  • must be individually accessible – mere storage of data is not covered by the term database.

In AI models the organisation is inferred from the data; it's not planned into the database. The first bullet point is on less shaky ground: a summary an AI can make of a book can reasonably be regarded as "informative content", and nothing about database protections says that they have to store full works. It could also be references, citations, etc.
