this post was submitted on 16 Aug 2023
1234 points (94.1% liked)

Google search is over (mastodon.social)
submitted 1 year ago* (last edited 1 year ago) by JoBo@feddit.uk to c/technology@lemmy.world
 

Via @rodhilton@mastodon.social

Right now if you search for "country in Africa that starts with the letter K":

  • DuckDuckGo will link to an alphabetical list of countries in Africa which includes Kenya.

  • Google's first hit is a ChatGPT transcript claiming that there are none, and Google's summary says the same.

This is because ChatGPT at some point ingested this popular joke:

"There are no countries in Africa that start with K." "What about Kenya?" "Kenya suck deez nuts?"

[–] assassin_aragorn@lemmy.world 9 points 1 year ago (1 children)

AI is its own worst enemy. If you can't identify AI output, that means AIs are going to train on AI-generated content, which really hurts the model.

It's literally in everyone's best interest, including AI itself, to start embedding identification of some kind in all output.

[–] diffuselight@lemmy.world 0 points 1 year ago (1 children)

Those studies are flawed. By definition, when you can no longer tell the difference, the effect on training is nil.

[–] pedalmore@lemmy.world 7 points 1 year ago (1 children)

It's more like successive generations of inbreeding. Unless you have perfect AI content, "perfect" meaning it exactly mirrors the diversity of human content, the drivel gets amplified over time.
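
A toy sketch of that inbreeding effect, not a reproduction of any study mentioned in this thread. It assumes a "model" that can only reproduce items it saw in the previous generation's output, so each generation is a resample (with replacement) of the one before. The function name and parameters are hypothetical, chosen just for illustration.

```python
import random

def distinct_per_generation(corpus_size=50, generations=30, seed=0):
    """Count distinct items left after each round of training on the previous output."""
    rng = random.Random(seed)
    corpus = list(range(corpus_size))  # generation 0: every item is unique ("human" data)
    counts = [len(set(corpus))]
    for _ in range(generations):
        # Assumption: the next model only emits items present in the previous corpus.
        corpus = [rng.choice(corpus) for _ in range(corpus_size)]
        counts.append(len(set(corpus)))
    return counts

counts = distinct_per_generation()
print(counts)
```

Under this assumption the number of distinct items can never go up from one generation to the next, because nothing new enters the corpus. Real models are far more complicated, so treat this as an intuition pump only.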

[–] diffuselight@lemmy.world 4 points 1 year ago

Given the Chinchilla scaling laws, nobody in their right mind trains models by shotgun-ingesting all available data anymore. At this point gains come from data quality more than volume.
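
For context, a rough sketch of the rule of thumb usually quoted from the Chinchilla results: about 20 training tokens per model parameter, with training compute approximated as C ≈ 6·N·D FLOPs. Both constants are approximations, and the function name is hypothetical. The rule only says how much data a given compute budget calls for; it says nothing about data quality by itself.

```python
import math

TOKENS_PER_PARAM = 20       # approximate compute-optimal ratio (rule of thumb)
FLOPS_PER_PARAM_TOKEN = 6   # common estimate: C ≈ 6 * N * D

def compute_optimal(flops_budget):
    """Return (params, tokens) that spend the budget under the two approximations above."""
    # C = 6 * N * D and D = 20 * N  =>  N = sqrt(C / 120)
    params = math.sqrt(flops_budget / (FLOPS_PER_PARAM_TOKEN * TOKENS_PER_PARAM))
    tokens = TOKENS_PER_PARAM * params
    return params, tokens

# Budget matching a 70B-parameter model trained on 1.4T tokens.
params, tokens = compute_optimal(6 * 70e9 * 1.4e12)
print(f"{params:.3g} parameters, {tokens:.3g} tokens")
```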