this post was submitted on 06 Dec 2023

Technology

theherk@lemmy.world 2 points 11 months ago

Not likely. They may have tested it as an adversarial feedback tool, but it would be much more accurate and efficient to get the source data directly than to pay OpenAI for information that may or may not be correct.

I believe they did trick ChatGPT into exposing some of its source data, though it was only a few hundred MB.

Mahlzeit@feddit.de 1 point 11 months ago (last edited 11 months ago)

For the fine-tuning stage at the end, where you turn the model into a chatbot, you need specific training data (e.g., OpenOrca). People have used ChatGPT to generate such data. Come to think of it, if you collect that data through Mechanical Turk, you almost certainly end up including text from ChatGPT.

theherk@lemmy.world 1 point 11 months ago

Yes, it could be done that way, and maybe GPT models were used, but calling these APIs isn't free, and there are plenty of open models, and surely internal ones, that could be used for that purpose.