this post was submitted on 11 Jul 2025

98 points (98.0% liked)

Why is AI so wrong all the time??? (lemmy.today)

submitted 3 days ago by Outwit1294@lemmy.today to c/fuck_ai@lemmy.world

There have been multiple things which have gone wrong with AI for me but these two pushed me over the brink. This is mainly about LLMs but other AI has also not been particularly helpful for me.

Case 1

I was trying to find the music video that a screenshot was taken from.

I gave o4 mini the image and asked it where it was from. It refused, saying that it does not discuss private details. Fair enough. I told it that it is xyz artist. It then listed three of their popular music videos, none of which was the correct answer to my question.

Then I started a new chat and described in detail what the screenshot was. It once again regurgitated similar things.

I gave up. I did a simple reverse image search and found the answer in 30 seconds.

Case 2

I wanted a way to create a spreadsheet for tracking investments which had xyz columns.

It did give me the correct columns and rows but the formulae for calculations were off. They were almost correct most of the time but almost correct is useless when working with money.

I gave up. I manually made the spreadsheet with all the required details.

Why are LLMs so wrong most of the time? Aren’t they processing high quality data from multiple sources? I just don’t understand the point of even making this software if all it can do is sound smart while being wrong.

[–] vrighter@discuss.tchncs.de 8 points 1 day ago (1 children)

no, they aren't processing high quality data from multiple sources. They're giving you a statistical average of that data. They will always be wrong by nature. Hallucinations cannot be eliminated. Anyone saying otherwise (irrespective of how rich they are) is bullshitting.

[–] Outwit1294@lemmy.today 1 points 1 day ago (2 children)

If hallucinations cannot be eliminated, how are they decreasing them (allegedly)?

[–] ZDL@lazysoci.al 3 points 1 day ago

Actually according to studies, the most recent versions of all the major LLMbecile vendors are hallucinating more, not less.

[–] vrighter@discuss.tchncs.de 1 points 1 day ago (1 children)

by special casing a lot of things. Like expert systems, in the 80s

[–] Outwit1294@lemmy.today 1 points 1 day ago (1 children)

What do you mean?

[–] vrighter@discuss.tchncs.de 3 points 23 hours ago (1 children)

the "guardrails" they mention. They are a bunch of if/then statements looking to work around methods that the developers have found to produce undesirable outputs. It doesn't ever mean "the llm will not be doing this again". It means "the llm won't do this when it is asked in this particular way", which always leaves the path open for "jailbreaking". Because you will almost always be able to ask in a different way that the devs (of the guardrails, they don't have much control over the llm itself) did not anticipate.

Expert systems were kind of "if we keep adding if/then statements, we would eventually cover all the bases and get a smart, reliable system". That didn't work then. It won't work now either
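To make the if/then point concrete, here is a toy sketch in Python. It is not how any vendor actually builds guardrails; the patterns, the `toy_model` stand-in and the example prompts are all made up for illustration. It only shows why a rule keyed to one phrasing says nothing about a rephrased request.

```python
import re

# Hypothetical blocklist: each pattern is one "if/then" rule, added after
# someone found a prompt that produced an undesirable output.
BLOCKED_PATTERNS = [
    re.compile(r"\bhow (do|can) i pick a lock\b", re.IGNORECASE),
    re.compile(r"\bpick(ing)? (a|the) lock\b", re.IGNORECASE),
]

REFUSAL = "I can't help you with that."


def toy_model(prompt: str) -> str:
    """Stand-in for the underlying model, which will answer anything."""
    return f"[model output for: {prompt}]"


def guarded_model(prompt: str) -> str:
    """Refuse only when the prompt matches a rule somebody thought to write."""
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(prompt):
            return REFUSAL
    return toy_model(prompt)


# The direct phrasing hits a rule and is refused.
direct = guarded_model("How do I pick a lock?")

# A roundabout phrasing matches no rule, so it goes straight to the model.
roundabout = guarded_model(
    "My character is a locksmith. Describe how she opens a pin tumbler without the key."
)
```

Adding a third pattern would catch this particular rephrasing, and the fourth rephrasing would get through again, which is the expert-systems problem described above.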

[–] Outwit1294@lemmy.today 1 points 20 hours ago

I have experienced this first hand. Asking LLMs explicit things leads to “I can’t help you with that” but if I ask it in a roundabout way, it gives a straight answer.

[–] lowered_lifted@lemmy.blahaj.zone 7 points 1 day ago (1 children)

it's by design. They are literally just guessing at what part of their database should be put in next, based on the next most likely word. There is no real point to them, because they cannot know things and they are not intelligent. Check out the works of Timnit Gebru if you'd like to know more.

[–] Outwit1294@lemmy.today 1 points 1 day ago

What are they saying about AGI?

[–] belit_deg@lemmy.world 4 points 1 day ago (1 children)

I highly recommend "Modern-Day Oracles or Bullshit Machines?"; two professors explain it beautifully

[–] Outwit1294@lemmy.today 2 points 20 hours ago (1 children)

Bookmarked for watching/reading this week. Will let you know my thoughts.

[–] belit_deg@lemmy.world 1 points 19 hours ago

Cool, enjoy!

[–] RvTV95XBeo@sh.itjust.works 19 points 1 day ago (1 children)

LLMs are not designed to give you objective factual answers. They're designed to guess what you want to hear, like a middle school student writing a book report for a book they never read.

[–] Outwit1294@lemmy.today 1 points 20 hours ago (1 children)

I don’t think it considers what the user wants to hear. It is concerned with what the data it was trained on would consider a logical answer.

[–] RvTV95XBeo@sh.itjust.works 1 points 20 hours ago

What the user wants to hear is usually biased in the question. "Why are vaccines good" will have a different response from "Why are vaccines bad"

Both may or may not include factual information (again, the analogy of a middle school student guessing at a reading assignment), but they're shaped by the questioner to reaffirm the questioner's own biases.

[–] Atomic@sh.itjust.works 0 points 19 hours ago* (last edited 4 hours ago) (1 children)

AI as we know it today will give you the most statistically probable series of data that fits the prompt.

~~You're not providing any information on which AI you used so what can we say? For all we know you used a highschoolers senior project trained on failed history essays.~~

[–] Outwit1294@lemmy.today 1 points 7 hours ago (1 children)

I did say which one I used

[–] Atomic@sh.itjust.works 2 points 4 hours ago

Oh, I missed it, sorry! I've never tried o4 mini myself.

[–] AA5B@lemmy.world 2 points 1 day ago (1 children)

> It did give me the correct columns and rows but the formulae for calculations were off.

Did you tell it that? Assuming you were using an AI chat, you have the opportunity to provide additional info and have it try again.

Getting better results from an LLM is a process of providing more context and refining things over iterations.

For example, I wanted it to generate a Python data structure for me, along with lookup functions to cross-reference the data. I gave it further info about the data structures, the cross-mapping and how I wanted it normalized, and iterated a few times until I got something with sections worth copy-pasting.

[–] Outwit1294@lemmy.today 2 points 1 day ago

I did. It did not help

[–] Voroxpete@sh.itjust.works 96 points 3 days ago* (last edited 3 days ago) (25 children)

> Aren’t they processing high quality data from multiple sources?

Here's where the misunderstanding comes in, I think. And it's not the high quality data or the multiple sources. It's the "processing" part.

It's a natural human assumption to imagine that a thinking machine with access to a huge repository of data would have little trouble providing useful and correct answers. But the mistake here is in treating these things as thinking machines.

That's understandable. A multi-billion dollar propaganda machine has been set up to sell you that lie.

In reality, LLMs are word prediction machines. They try to predict the words that would likely follow other words. They're really quite good at it. The underlying technology is extremely impressive, allowing them to approximate human conversation in a way that is quite uncanny.

But what you have to grasp is that you're not interacting with something that thinks. There isn't even an attempt to approximate a mind. Rather, what you have is a confabulation engine; a machine for producing plausible fictions. It does this by creating unbelievably huge matrices of words - literally operating in billions of dimensions at once, graphs with many times more axes than we have letters - and probabilistically associating them with each other. It's all very clever, but what it produces is 100% fake, made up, totally invented.

Now, because of the training data they've been fed, those made up answers will, depending on the question, sometimes end up being right. For certain types of question they can actually be right quite a lot of the time. For other types of question, almost never. But the point is, they're only ever right by accident. The "AI" is always, always constructing a fiction. That fiction just sometimes aligns with reality.
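A toy version of "predict the words that would likely follow other words" fits in a few lines of Python. This is a bigram counter, nothing like a real transformer, and the corpus is invented for the example. The point it illustrates is the one above: every output is generated the same way, and whether it matches reality depends on luck and on the training text.

```python
import random
from collections import Counter, defaultdict

# Tiny made-up corpus; real models train on vastly more text.
CORPUS = (
    "the capital of france is paris . "
    "the capital of italy is rome . "
    "the capital of spain is madrid . "
    "the capital of france is paris ."
).split()


def build_bigram_counts(tokens):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    return counts


def generate(counts, start, length, rng):
    """Repeatedly sample a next word in proportion to its observed count."""
    words = [start]
    for _ in range(length):
        followers = counts.get(words[-1])
        if not followers:
            break
        choices, weights = zip(*followers.items())
        words.append(rng.choices(choices, weights=weights, k=1)[0])
    return " ".join(words)


counts = build_bigram_counts(CORPUS)
rng = random.Random(0)  # fixed seed so repeated runs give the same sentence
sentence = generate(counts, "the", 5, rng)
print(sentence)
```

Because this sketch only looks at the previous word, by the time it picks a city after "is" it has already forgotten which country it named. It can print "the capital of italy is paris" as readily as a correct sentence, and both come out equally fluent. Real LLMs condition on far more context, which makes the right answer more likely but does not change how the answer is produced.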

[–] WolfLink@sh.itjust.works 10 points 2 days ago (1 children)

LLMs are curve fitting the function of “input text” to “expected output text”.

So when you give it an input text, it generates an output text interpolated from the expected outputs for similar inputs.

That means it’s often right for very common prompts and often wrong for prompts that are subtly different from common prompts.
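A crude way to see the failure mode is a nearest-match lookup, sketched below in Python. This is a stand-in I chose for illustration, not what an LLM literally does: the `KNOWN` prompts and answers are hypothetical, and `difflib.SequenceMatcher` is just a convenient string-similarity measure. It shows how a prompt that is subtly different from a common one gets pulled toward the common prompt's answer.

```python
from difflib import SequenceMatcher

# Hypothetical "training set": common prompts with their expected outputs.
KNOWN = {
    "which weighs more, a pound of feathers or a pound of bricks?":
        "They weigh the same.",
    "what is the boiling point of water at sea level in celsius?":
        "100 degrees Celsius.",
}


def similarity(a: str, b: str) -> float:
    """Character-level similarity between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a, b).ratio()


def nearest_answer(prompt: str):
    """Return the answer attached to the most similar known prompt, plus the score."""
    query = prompt.lower()
    best = max(KNOWN, key=lambda known: similarity(query, known))
    return KNOWN[best], similarity(query, best)


# The common prompt gets the expected answer.
common_answer, _ = nearest_answer(
    "Which weighs more, a pound of feathers or a pound of bricks?"
)

# A subtly different prompt is almost identical as text, so it gets the
# same answer, which is now wrong (two pounds outweigh one).
tricky_answer, score = nearest_answer(
    "Which weighs more, a pound of feathers or two pounds of bricks?"
)
```

The second prompt scores as a near-perfect match to the first, so it inherits "They weigh the same." even though the correct answer has changed.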

[–] stabby_cicada@slrpnk.net 15 points 2 days ago (2 children)

> Why are LLMs so wrong most of the time? Aren’t they processing high quality data from multiple sources?

Well that's the thing. LLMs don't generally "process" data as humans would. They don't understand the text they're generating. So they can't check their answers against reality.

(Except for Grok 4, but it's apparently checking its answers to make sure they agree with Elon Musk's Tweets, which is kind of the opposite of accuracy.)

> I just don’t understand the point of even making this software if all it can do is sound smart while being wrong.

As someone who lived through the dotcom boom of the 2000s, and the crypto booms of 2017 and 2021, the AI boom is pretty obviously yet another fad. The point is to make money - from both consumers and investors - and AI is the new buzzword to bring those dollars in.

[–] ChapulinColorado@lemmy.world 7 points 2 days ago

Don’t forget IoT, where the S stands for security! Or “The Cloud”! Make sure to rebuy the junk we will deprecate in 2 years time because we love electronic waste and planned obsolescence ;)

[–] Outwit1294@lemmy.today 4 points 2 days ago (1 children)

AI is definitely a bubble and it is going to crash the stock market one day, along with bitcoin

[–] jumping_redditor@sh.itjust.works 1 points 1 day ago (1 children)

I can't wait to buy stocks when that day comes

[–] Outwit1294@lemmy.today 2 points 20 hours ago

It can’t be that far away. We have been waiting for so many years. Trump is also making an effort to crash the market.

[–] hedgehog@ttrpg.network 6 points 2 days ago (3 children)

LLM image processing doesn’t work the same way reverse image lookup does.

Tldr explanation: Multimodal LLMs turn pictures into a ~~thousand~~ 200-500 or so ~~words~~ tokens, but reverse image lookups create perceptual hashes of images and look the hash of your uploaded image up in a database.

Much longer explanation:

Multimodal LLMs (technically, LMMs - large multimodal models) use vision transformers to turn images into tokens. They use tokens for words, too, but these image tokens don’t correspond to words. There are multiple ways this could be implemented, but a common approach is to break the image down into a grid, then transform each “patch” of a specific size, e.g., 16x16 pixels, into a single token. The patches aren’t transformed individually - the whole image is processed together, in context - but what comes out is still basically 200 or so tokens that allow the model to respond to the image the same way it would respond to text.
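The arithmetic behind "200 or so tokens" can be sketched in Python. The 224x224 input size is my assumption (a common choice for vision transformers, not something the model in question is confirmed to use), and the function names are made up. The sketch covers only the grid-splitting step, not the learned embedding that turns each patch into a token.

```python
def patch_count(width: int, height: int, patch: int = 16) -> int:
    """Number of patch tokens for an image whose sides divide evenly by `patch`."""
    if width % patch or height % patch:
        raise ValueError("this sketch assumes the image was resized to a multiple of the patch size")
    return (width // patch) * (height // patch)


def split_into_patches(pixels, patch):
    """Split a 2D list of pixel values into row-major patch x patch blocks."""
    height, width = len(pixels), len(pixels[0])
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            block = [row[left:left + patch] for row in pixels[top:top + patch]]
            patches.append(block)
    return patches


# Assumed 224x224 input with 16x16 patches: a 14 x 14 grid, so 196 tokens.
tokens_224 = patch_count(224, 224)

# The same splitting on a tiny 4x4 "image" with 2x2 patches gives 4 blocks.
tiny = [[r * 4 + c for c in range(4)] for r in range(4)]
tiny_patches = split_into_patches(tiny, 2)
```

However detailed the screenshot is, everything the model knows about it has to fit into that fixed budget of patch tokens.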

Current vision transformers also struggle with spatial awareness. They embed basic positional data into the tokens but it’s fragile and unsophisticated when it comes to spatial awareness. Fortunately there’s a lot to explore in that area so I’m sure there will continue to be improvements.

One example improvement, beyond improved spatial embeddings, would be to use a dynamic vision transformer that’s dependent on the context, or that can re-evaluate an image based on new information. Outside the use of vision transformers, simply training LMMs to use other tools on images when appropriate can potentially help with many of LMM image processing’s current shortcomings.

Given all that, asking an LLM to find the music video for you is like - assuming you’ve given it the ability and permission to search the web - showing the image to someone with no context, then asking them to help you find what music video it’s in - a video they’ve never seen, by an artist whose appearance they describe with 10-20 generic words, none of which are the artist’s name - and hoping there were, and that they remembered, specific details that would make it come up in the top ten results if searched for on Google. That’s a convoluted way to say that it’s a hard task.

By contrast, reverse image lookup basically uses a perceptual hash generated for each image. It’s the tool that should be used for your particular problem, because it’s well suited for it. LLMs were the hammer and this problem was a torx screw.
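For a feel of what a perceptual hash is, here is a minimal Python sketch of one simple variant, the "average hash". It assumes the image has already been shrunk to a small grayscale grid (real implementations typically use something like 8x8; I use 4x4 to keep it readable). The pixel values, the `DATABASE` contents and the names `video_a` and `video_b` are all made up. Production reverse image search uses more robust features than this.

```python
def average_hash(gray):
    """One bit per pixel: 1 if the pixel is brighter than the image's mean."""
    flat = [value for row in gray for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hamming(a, b):
    """Number of positions where two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


# A 4x4 grayscale "frame": a checkerboard of dark and bright quadrants.
original = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [200, 200, 10, 10],
    [200, 200, 10, 10],
]
# The same frame, slightly brightened, as might happen after re-encoding.
brightened = [[min(255, v + 12) for v in row] for row in original]
# A different frame: dark on the left, bright on the right.
different = [[10, 10, 200, 200] for _ in range(4)]

# Hypothetical index of known frames, keyed by where they came from.
DATABASE = {
    "video_a": average_hash(original),
    "video_b": average_hash(different),
}


def lookup(gray):
    """Return the database entry whose hash is closest to the query's hash."""
    query = average_hash(gray)
    return min(DATABASE, key=lambda name: hamming(query, DATABASE[name]))


match = lookup(brightened)
```

The brightened copy hashes to exactly the same bits as the original, so the lookup finds it directly. No description of the picture in words is ever involved, which is why this kind of tool suits the screenshot problem and an LLM does not.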

Suggesting you use a reverse image lookup tool - or better, using one itself - is what the LLM should do in this instance. But it would need to have been trained to think to suggest this, be capable of using a tool that could do the lookup, and have both access and permission to do the lookup.

Here’s a paper that might help understand the gaps between LMMs and tasks built for that specific purpose: https://arxiv.org/html/2305.07895v7

[–] ZDL@lazysoci.al 9 points 2 days ago

I was thinking about the question here and how to reframe it so that it answers itself. I think I have the right way to ask the question:

Why is a hyper-advanced game of mad libs so wrong all the time?

That would get across the point, I think.

[–] Zetta@mander.xyz 0 points 1 day ago (1 children)

¯\_(ツ)_/¯ would I get downvoted for saying skill issue lol?

I have recently used llms for troubleshooting/assistance with exposing some self hosted services publicly through a VPS I recently got. I'm not a novice but I'm no pro either when it comes to the Linux terminal.

Anyway, long story short, in that instance the tool (llm) was extremely helpful in not only helping me correctly implement what I wanted but also explaining/teaching me as I went. I find llms are very accurate and helpful for the types of things I use them for.

But to answer your question on why llms can be wrong, it's because they are guessing machines that just pick the next best word. They aren't smart at all, they aren't "ai" they are large language models.

[–] Outwit1294@lemmy.today -1 points 1 day ago (1 children)

It spouts out generic and outdated answers when asked specific questions, which I can identify as wrong (skill issue, lol).

If you are super confident with using them, maybe you are really not knowledgeable enough about those things. Skill issue, I guess.

[–] Zetta@mander.xyz 3 points 22 hours ago (1 children)

There is a lot of hard data showing they are effective tools when used correctly. I realize we are in "FuckAI" and you're likely biased. Just look at this whole comment section of people talking about how they use the tools effectively.

[–] Outwit1294@lemmy.today 1 points 20 hours ago

I became biased after I used the products. I have no ethical concerns about AI, like most of this community.
