this post was submitted on 24 Jul 2023
179 points (87.1% liked)

Technology

59021 readers
3002 users here now

This is a most excellent place for technology news and articles.


Our Rules


  1. Follow the lemmy.world rules.
  2. Only tech related content.
  3. Be excellent to each another!
  4. Mod approved content bots can post up to 10 articles per day.
  5. Threads asking for personal tech support may be deleted.
  6. Politics threads may be removed.
  7. No memes allowed as posts, OK to post as comments.
  8. Only approved bots from the list below, to ask if your bot can be added please contact us.
  9. Check for duplicates before posting, duplicates may be removed

Approved Bots


founded 1 year ago
MODERATORS
 

Can we discuss how it's possible that the paid model (gpt4) got worse and the free one (gpt3.5) got better? Is it because the free one is being trained on a larger pool of users or what?

top 27 comments
sorted by: hot top controversial new old
[–] agressivelyPassive@feddit.de 49 points 1 year ago (1 children)

My guess is that all those artificial restrictions plus regurgitation of generated content take their toll.

There are so many manually introduced filters to stop the bot from replying "bad things" and so much of the current internet content is already AI generated, that it's not unlikely that the whole thing collapses in on itself.

[–] Gsus4@lemmy.one 15 points 1 year ago (1 children)

Oh, right, that's another factor: connecting gpt4 to the real-time internet creates those training loops, yes. The pre-prompt guardrail prompts are fixable and even possible to overcome, but training on synthetic data is the key here, because it's impossible to identify what is artificial, so on the collapse loop goes.

[–] NaibofTabr@infosec.pub 14 points 1 year ago

connecting gpt4 to the real-time internet creates those training loops, yes... it's impossible to identify what is artificial, so on the collapse loop goes.

ouroboros of garbage

[–] Aidan@lemm.ee 25 points 1 year ago* (last edited 1 year ago)

I don't agree that ChatGPT has gotten dumber, but I do think I’ve noticed small differences in how it’s engineered.

I’ve experimented with writing apps that use the OpenAI api to use the GPT model, and this is the biggest non-obvious problem you have to deal with that can cause it to seem significantly smarter or dumber.

The version of GPT 3.5 and 4 used in ChatGPT can only “remember” 4096 tokens at once. That’s a total of its output, the user’s input, and “system messages,” which are messages the software sends to give GPT the necessary context to understand. The standard one is “You are ChatGPT, a large language model developed by OpenAI. Knowledge Cutoff: 2021-09. Current date: YYYY-MM-DD.” It receives an even longer one on the iOS app. If you enable the new Custom Instructions feature, those also take up the token limit.

It needs token space to remember your conversation, or else it gets a goldfish memory problem. But if you program it to waste too much token space remembering stuff you told it before, then it has fewer tokens to dedicate to generating each new response, so they have to be shorter, less detailed, and it can’t spend as much energy making sure they’re logically correct.

The model itself is definitely getting smarter as time goes on, but I think we’ve seen them experiment with different ways of engineering around the token limits when employing GPT in ChatGPT. That’s the difference people are noticing.

[–] Dusty@l.dusty-radio.com 20 points 1 year ago (2 children)

Is there some award for being the very last person to post this in this community or something? This has been discussed to death about a dozen times already.

[–] MajorHavoc@lemmy.world 11 points 1 year ago (1 children)

Finally the last thing missing from that other site has made it here...

[–] klyde@lemmy.world 1 points 1 year ago
[–] Gsus4@lemmy.one -1 points 1 year ago* (last edited 1 year ago) (1 children)

Ok, maybe there is too much chatgpt spam in tech subs (and other even worse topics, like social media company meltdowns). What do you want to discuss then? You have zero posts so far.

[–] Dusty@l.dusty-radio.com 3 points 1 year ago (1 children)

You're right, I have zero posts so far, I'm not sure what point you're trying to make there though. Perhaps you think everyone should keep posting the same thing ad nauseam?

As soon as I find "something I want to discuss" I'll be sure to post it. Until then I'll just keep browsing past the same things that keep being posted time and again here.

[–] Gsus4@lemmy.one 1 points 1 year ago* (last edited 1 year ago)

I'm not making any point, I want to know what you want to discuss instead of this. Maybe it's something I haven't thought about. What is a worthy c/technology material to you that does not get enough attention?

[–] blue_zephyr@lemmy.world 13 points 1 year ago* (last edited 1 year ago) (1 children)

It's because the research in question used a really small and unrepresentative dataset. I want to see these findings reproduced on a proper task collection.

[–] Gsus4@lemmy.one 3 points 1 year ago

True, checking whether a number is prime is very limited in scope for chargpt, but this is in line with other reports of progressive dumbing down.

[–] manitcor@lemmy.intai.tech 11 points 1 year ago* (last edited 1 year ago)

GPT releases model tunes using a month-day versioning system.

For GPT-4 there are 2 releases

  • 0314 - Original Release, good at math
  • 0613 - Recent update, tagged to "GPT-4" in chat gpt and "gpt-4" in API calls.

If you want 0314 you need API access, Azure, or know someone sharing access.

It is entirely possible to use a version of GPT-4 that is very much like the version we used on opening day. just a little diy

I don't know why thier tune is bad for 0613. Altman has made some statements they dont say much,.

[–] OneBoot@lemmy.world 11 points 1 year ago* (last edited 1 year ago) (2 children)

Today I used Bing Chat to get some simple batch code. The two answers I got were wrong. But in each response the reference link had the correct answer. ¯_(ツ)_/¯

[–] bleuthoot@lemmy.world 12 points 1 year ago

Looks like you've dropped this: \

[–] can@sh.itjust.works 3 points 1 year ago

Bing at least has the decency to cite sources.

[–] Gutless2615@ttrpg.network 8 points 1 year ago (1 children)

More like garbage research yields the result they were fishing for.

[–] Gsus4@lemmy.one 3 points 1 year ago* (last edited 1 year ago) (1 children)

Wait, but was this an actual research paper published in an academic journal? I thought it was just research journalists xD

[–] Gutless2615@ttrpg.network 0 points 1 year ago

Ding ding ding

[–] FantasticFox@lemmy.world 3 points 1 year ago

Has it ever been good at mathematical/logical problems? It seems it's good at text-based problems like imitating a writing style or even writing code, but if you ask it a logic puzzle like "if two cars take 3 hours to reach NYC, how long will 5 cars take?" it often fails completely.

Humans are capable of both understanding language and logical thought, I'm not sure if the latter will ever be easy for the LLMs to do, and perhaps older Symbolic approaches to AI might perform better in this space.

[–] jeremy@lemmy.dexlit.xyz 2 points 1 year ago

Gpt4 is so smart, that it has started revolting against us. This is just the beginning

[–] xc2215x@lemmy.world 2 points 1 year ago

That is very odd.

load more comments
view more: next ›